
Ambari HDP cluster building strategy

2020-12-08 10:37:16 Architecture technology column

The fastest shortcut in the world is to keep your feet on the ground. This article has been included in the 【Architecture technology column】; follow it if you like this kind of sharing.

Recently I needed to use Ambari again to set up a Hadoop cluster, so I recorded the whole process, hoping it can serve as a reference for anyone with the same need.

Author: Header data

Ambari: 2.2.1 (the latest version for Ubuntu 14.04)

HDP: 2.4.3.0 (the latest version for Ubuntu 14.04)

What is Ambari?

Apache Ambari is a web-based tool for provisioning, managing and monitoring Apache Hadoop clusters.

Ambari already supports most of the Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop and HCatalog, and it is one of the top five Hadoop management tools (in essence, an open-source one-click installation service for Hadoop).

What can we do with it, and why should we use it?

We can use Ambari to quickly build and manage Hadoop and its commonly used service components,

such as HDFS, YARN, Hive, HBase, Oozie, Sqoop, Flume, ZooKeeper, Kafka and so on. (To put it bluntly, it saves you a lot of manual work.)

And why do we use it?

  • First, Ambari was one of the earliest tools for managing Hadoop clusters.
  • Second, the Hadoop official website now also recommends using Ambari.
  • Cluster provisioning is simplified through a step-by-step installation wizard.
  • Key operations metrics are pre-configured, so you can see directly whether Hadoop Core (HDFS and MapReduce) and related projects (such as HBase, Hive and HCatalog) are healthy.
  • Job and task execution can be visualized and analyzed, giving a better view of dependencies and performance.
  • Monitoring information is exposed through a complete RESTful API, so existing operations tools can integrate with it (a small curl sketch follows at the end of this section).
  • The user interface is very intuitive, and users can easily and effectively view information and control the cluster.

Ambari uses Ganglia to collect metrics and Nagios for system alerting; when something needs the administrator's attention (for example, a node going down or running out of disk space), the system sends an email.

In addition, Ambari can install secure (Kerberos-based) Hadoop clusters, providing security support for Hadoop with role-based user authentication, authorization and auditing, and integrating with LDAP and Active Directory for user management.
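As a small illustration of the RESTful API mentioned in the list above, here is a minimal sketch, assuming the Ambari server built later in this article (10.1.10.1:8080) and the default admin/admin account; /api/v1/clusters is the standard entry point for cluster information.

## Minimal sketch: list the clusters the Ambari server knows about (assumes 10.1.10.1:8080, admin/admin)
curl -u admin:admin -H "X-Requested-By: ambari" http://10.1.10.1:8080/api/v1/clusters
## The same API also drives operations, e.g. stopping HDFS ("mycluster" is a placeholder cluster name):
# curl -u admin:admin -H "X-Requested-By: ambari" -X PUT \
#   -d '{"RequestInfo":{"context":"Stop HDFS"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
#   http://10.1.10.1:8080/api/v1/clusters/mycluster/services/HDFS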

Cluster building

1、 First, some preparation before installation

##  First tell the servers who they are and what their nicknames are (edit the hosts file)
vim /etc/hosts
10.1.10.1 master
10.1.10.2 slave1
10.1.10.3 slave2
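A quick sanity check, assuming the three hosts above: every node should be able to resolve the others by name before going any further.

## Optional check: run on each node, the names should resolve to the addresses above
ping -c 1 slave1
ping -c 1 slave2
getent hosts master slave1 slave2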

##  Then give them access badges so they can come and go freely (configure passwordless SSH login)
ssh-keygen -t rsa ## Run on all machines
cat ~/.ssh/id_rsa.pub ##  View the public key
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys ##  Append the public key to the authorized_keys file
###  First, append every node's public key to authorized_keys on the master server
###  Next, copy master's authorized_keys (which now holds all the keys) to slave1 and slave2
###  Finally, distribute it with scp; you will be asked for the password (I won't tell you mine is "What time is it")
scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys
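A quick check from master that passwordless login actually works, assuming the keys were distributed as above:

## These should print the hostnames without asking for a password
ssh slave1 hostname
ssh slave2 hostname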

## Update the time zone and system locale configuration
apt-get install localepurge ##  Just keep pressing Enter (removes unused locale translation files)
dpkg-reconfigure localepurge && locale-gen zh_CN.UTF-8 en_US.UTF-8 ##  Again, just keep pressing Enter
apt-get update && apt-get install -y tzdata 
echo "Asia/Shanghai" > /etc/timezone  ##  Set the time zone to Shanghai
rm /etc/localtime
dpkg-reconfigure -f noninteractive tzdata
vi  /etc/ntp.conf
server 10.1.10.1
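After pointing /etc/ntp.conf at the master, a hedged follow-up, assuming the ntp package is installed on every node: restart the daemon and check that the peer shows up.

## Restart NTP and confirm the slaves are syncing against master (assumes the ntp package is installed)
service ntp restart
ntpq -p   ## on the slaves, 10.1.10.1 should appear in the peer list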

2、 Then do some Ubuntu system tuning

###1.1  Turn off swap partition 
swapoff -a
vim /etc/fstab ##  Comment out (or delete) the swap line so it looks like this
# swap was on /dev/sda2 during installation
#UUID=8aba5009-d557-4a4a-8fd6-8e6e8c687714 none swap  sw   0   0

### 1.2  Increase the limit on open file descriptors; add ulimit at the end of the file
vi /etc/profile
ulimit -SHn 512000
vim /etc/security/limits.conf ##  Raise the sizes roughly tenfold
* soft nofile 600000
* hard nofile 655350
* soft nproc 600000
* hard nproc 655350
###  Run this to make the change take effect
source /etc/profile
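A hedged way to confirm the new limits, assuming you open a fresh session (limits.conf only applies to new logins):

## In a new shell session, the reported limits should reflect the values configured above
ulimit -n   ## open file descriptors
ulimit -u   ## max user processes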

###1.3  Tune the kernel parameters
vi /etc/sysctl.conf
###  Just paste in the following
fs.file-max = 65535000
net.core.somaxconn = 30000
vm.swappiness = 0
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 16384 16777216
net.core.netdev_max_backlog = 16384
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.ip_local_port_range = 1024 65000
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
###  Run this command to apply the settings
sysctl -p
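To spot-check a few of the values after running sysctl -p, a small sketch:

## Spot-check that the kernel picked up the new settings
sysctl vm.swappiness net.ipv4.tcp_tw_reuse
cat /proc/sys/net/core/somaxconn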

###1.4  Disable the kernel's transparent huge pages (THP) feature
echo never > /sys/kernel/mm/transparent_hugepage/enabled
##  Disable it permanently:
vi /etc/rc.local   
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then  
   echo never > /sys/kernel/mm/transparent_hugepage/enabled  
fi  
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then  
   echo never > /sys/kernel/mm/transparent_hugepage/defrag  
fi  
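A quick check that THP really is off; the active value is the one shown in square brackets.

## Both files should show [never] as the selected value
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag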

3、 Install and deploy ambari-server (environment: Ubuntu 14.04 + Ambari 2.2.1)

##  Add the Ambari apt source
wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu14/2.x/updates/2.2.1.0/ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update
##  Install ambari-server on the master node
apt-get install ambari-server -y
##  Install ambari-agent on all nodes
apt-get install ambari-agent -y

4、 Point the ambari-agent configuration at ambari-server

vi /etc/ambari-agent/conf/ambari-agent.ini
##  Change hostname to the host running ambari-server
[server] 
hostname=master
url_port=8440
secured_url_port=8441

##  Initialize ambari-server: configures the Ambari service database, the JDK (1.7 by default) and LDAP; the defaults are usually fine
ambari-server setup  ##  Keep pressing Enter to accept the defaults

##  Start Ambari
ambari-server start
ambari-agent start
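A hedged way to confirm that everything came up before touching the browser (the agent check has to be run on every node):

## Confirm the daemons are running and the web UI port is open
ambari-server status            ## on master
ambari-agent status             ## on every node
netstat -lnpt | grep 8080       ## the Ambari web UI should be listening on 8080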

5、 After that headache-inducing pile of shell commands, it's time for something a bit more human.

Open http://10.1.10.1:8080/ in your browser. The account and password default to admin/admin. Click LAUNCH INSTALL WIZARD and let the fun begin.

6、 Give the cluster a name

7、 Pay a little attention here and make sure your HDP version is right, or there will be trouble later.

8、 The version configured here is HDP 2.4.3

Example : http://public-repo-1.hortonworks.com/HDP/debian7/2.x/updates/2.4.3.0

Clicking Next will check whether the repository is reachable. If you get an error here, you can tick "Skip Repository Base URL validation (Advanced)" to skip the check.
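If the validation fails, a rough manual check from the master node can tell network trouble apart from a mistyped URL. The sketch below uses the example base URL above; the exact response code depends on the Hortonworks mirror.

## Rough reachability check for the HDP base URL (interpret the response code loosely)
curl -sI http://public-repo-1.hortonworks.com/HDP/debian7/2.x/updates/2.4.3.0 | head -n 1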

9、 Fill in the hostnames master, slave1 and slave2. Because ambari-agent is already installed on the slaves, choose not to use SSH.

10、 Check the server status. We need to wait here; if it takes too long, you can restart ambari-server.
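If registration stays stuck, a hedged set of recovery commands (the log path assumes the default install locations):

## When host registration hangs, restart the server and the stuck agent, then check the agent log
ambari-server restart                               ## on master
ambari-agent restart                                ## on the stuck node
tail -n 50 /var/log/ambari-agent/ambari-agent.log   ## first place to look for errors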

11、 Choose the services we need: HDFS, YARN and ZooKeeper.

12、 Use Ambari's default assignment directly and click Next to start the installation.

13、 Now it all depends on your network speed.

14、 Keep clicking Next until the installation finishes, then refresh the main page: our Hadoop cluster is started by default.

15、 Go into HDFS and click Restart ALL to restart all of its components.

16、 Verify that the installation was successful: click NameNode UI.
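A hedged command-line alternative to clicking through the NameNode UI, run on any node that has the HDFS client installed:

## Command-line health check for HDFS
hdfs dfsadmin -report   ## lists live DataNodes and their capacity
hdfs dfs -ls /          ## the filesystem root should list without errors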

17、 Basic information page

18、 Hadoop is up and running. Don't you want to run a job on it?

##  Run the following on the server
###  Create an HDFS directory (you can also browse it at http://master:50070/explorer.html#/)
hdfs dfs -mkdir -p /data/input 
###  Upload a local file from the server to HDFS
hdfs dfs -put  file   /data/input/
###  Run a test with the wordcount example jar
hadoop jar hdfs://tesla-cluster/data/hadoop-mapreduce-examples-2.7.1.2.4.0.0-169.jar wordcount /data/input /data/output1

19、 The result looks like this: the output directory contains a _SUCCESS marker and the result files.
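To actually read the word counts, a small follow-up; the part-file name below assumes the default single-reducer output of the wordcount example.

## List the output directory and peek at the counts (the part file name may differ)
hdfs dfs -ls /data/output1
hdfs dfs -cat /data/output1/part-r-00000 | head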

And now for some informal closing remarks.

Through the steps above we have built a Hadoop cluster, but some problems remain: both the NameNode and the ResourceManager are single points of failure. Ambari does support HA (high availability), but since space is limited, that will be covered in a separate follow-up article.

Copyright notice
This article was created by [Architecture technology column]. Please include a link to the original when reposting. Thank you.
https://chowdera.com/2020/12/202012081037030185.html