
Installing a Big Data Cluster (CDH6) on CentOS 7

Note: this article was last updated on June 20, 2018.

1. Install CentOS 7

2. Configure the network

vi /etc/sysconfig/network-scripts/ifcfg-ens33
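The article does not show what goes into the interface file. As a rough sketch, a static address usually needs the keys printed below. The helper name static_ifcfg is hypothetical, the netmask is assumed to be a /24 (matching the subnet used later for NTP), and the gateway 192.168.200.1 is only a placeholder to replace with your own.

```shell
# Hypothetical helper: print the static-IP keys for an ifcfg file.
# $1 = IP address, $2 = gateway (both site-specific).
static_ifcfg() {
    cat <<EOF
BOOTPROTO=static
ONBOOT=yes
IPADDR=$1
NETMASK=255.255.255.0
GATEWAY=$2
EOF
}

# Example: the master from this guide, with a placeholder gateway.
static_ifcfg 192.168.200.125 192.168.200.1
```

Merge these keys into ifcfg-ens33 by hand, then restart networking (systemctl restart network) for them to take effect.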

3. Configure hostnames, and disable the firewall and SELinux

vi /etc/hosts

### Add the following entries
192.168.200.125    master.cdh6
192.168.200.126    slave.cdh6
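### Stop the firewall
systemctl stop firewalld
### Keep the firewall from starting at boot
systemctl disable firewalld
### Temporary: put SELinux in permissive mode until the next reboot
setenforce 0
### Permanent: set SELINUX=disabled in /etc/selinux/config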
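The permanent SELinux change can also be scripted instead of edited by hand. This is a minimal sketch, assuming GNU sed and a config file that already contains a line starting with SELINUX=; the function name disable_selinux_in is hypothetical.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file.
# Only the line starting with "SELINUX=" is rewritten, so SELINUXTYPE= is left alone.
disable_selinux_in() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Usage on a real host (as root):
#   disable_selinux_in /etc/selinux/config
```

The change in the file only takes effect after a reboot, which is why setenforce 0 is still needed for the running system.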

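4. Set the open-file limit and the per-user process limit

### Show all current limits (the "open files" row is the open-file limit)
ulimit -a
### Show the maximum number of user processes
ulimit -u
vi /etc/security/limits.conf
### Add the following lines: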
* soft nofile 65535
* hard nofile 65535
* soft nproc 32000
* hard nproc 32000

Reduce swapping to disk:

### Persist vm.swappiness=0 in /etc/sysctl.conf so the kernel avoids swapping; run sysctl -p to apply it without a reboot
echo "vm.swappiness=0" >> /etc/sysctl.conf
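One caveat with appending via >> is that rerunning the step adds a duplicate line each time. A small sketch of an idempotent append follows; the helper name append_once is hypothetical, and it only detects an exact, whole-line match.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
# $1 = file, $2 = line to ensure is present
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage on a real host (as root):
#   append_once /etc/sysctl.conf 'vm.swappiness=0'
#   append_once /etc/security/limits.conf '* soft nofile 65535'
```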

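5. Synchronize time across the cluster

### Install ntp on every node. First check whether it is already installed:
rpm -q ntp
### If it is not installed, install it:
yum install -y ntp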

master:

vi /etc/ntp.conf
### Uncomment this line and change the address to your cluster's subnet
restrict 192.168.200.0 mask 255.255.255.0 nomodify notrap

### Comment out the default pool servers
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

### Add the following so the master uses its local clock as the time source
server 127.127.1.0
fudge  127.127.1.0  stratum  10


vi /etc/sysconfig/ntpd
### Add this line so the hardware clock is kept in sync with the system clock
SYNC_HWCLOCK=yes
### Optional: set the time zone to Asia/Shanghai
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

slave:

Run crontab -e and add the entry below, which syncs against the master once a minute. The job is stored under /var/spool/cron, which only root can read.
* * * * * /usr/sbin/ntpdate master.cdh6

Start ntpd:

systemctl start ntpd
systemctl status ntpd
systemctl enable ntpd
systemctl is-enabled ntpd

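6. Passwordless SSH login

---------- Configure passwordless SSH between hosts ----------
Suppose master.cdh6 needs to log in to slave.cdh6.
On master.cdh6, first generate a key pair:
ssh-keygen   (press Enter at each prompt)
Then copy the generated public key to both hosts:
ssh-copy-id   master.cdh6
ssh-copy-id   slave.cdh6
If you see "ssh-copy-id: command not found", install the client tools first (yum -y install openssh-clients).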
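It helps to confirm that key-based login really works before moving on. The sketch below relies on the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password. The function name check_passwordless_ssh and the SSH override variable are hypothetical; the variable exists only so another client command can be substituted.

```shell
# Hypothetical helper: report which hosts accept key-based login.
# Returns non-zero if any host fails.
check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so a missing key shows up as a failure.
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Usage on master.cdh6:
#   check_passwordless_ssh master.cdh6 slave.cdh6
```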

7. Install the repo, GPG key, and JDK

### Install the repo file
wget https://archive.cloudera.com/cm6/6.0.0/redhat7/yum/cloudera-manager.repo -P /etc/yum.repos.d/

### Import the GPG key
rpm --import https://archive.cloudera.com/cm6/6.0.0/redhat7/yum/RPM-GPG-KEY-cloudera

### Install the JDK
yum install -y oracle-j2sdk1.8
### Configure the Java environment variables
vi /etc/profile
### Append at the end of the file:
export JAVA_HOME=/usr/java/jdk1.8.0_141-cloudera
export PATH=$PATH:$JAVA_HOME/bin
### Apply the environment variables
source /etc/profile

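8. Upload the CDH parcel files to the master.cdh6 node

Place the CDH6 parcel files in /opt/cloudera/parcel-repo/ on the master node; create the directory if it does not exist.

Notes

1. Rename CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha256 to CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha. This step is required; otherwise the system downloads CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel again.

2. Make sure the hash inside CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha is the one for your CDH version; the matching hash is listed in manifest.json.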
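A mismatch between the parcel and its hash file can be caught before Cloudera Manager sees it. The sketch below assumes the .sha file holds a single bare hex digest, and it picks the tool from the digest length (40 characters for SHA-1, 64 for SHA-256). The function name verify_parcel_hash is hypothetical.

```shell
# Hypothetical helper: compare a parcel's digest with the one stored in its .sha file.
# $1 = parcel file, $2 = .sha file containing a bare hex digest
verify_parcel_hash() {
    parcel="$1"
    sha_file="$2"
    # Strip whitespace and line endings around the stored digest.
    expected=$(tr -d ' \n\r' < "$sha_file")
    case ${#expected} in
        40) actual=$(sha1sum "$parcel" | cut -d' ' -f1) ;;
        64) actual=$(sha256sum "$parcel" | cut -d' ' -f1) ;;
        *)  echo "unrecognized digest length: ${#expected}" >&2; return 2 ;;
    esac
    [ "$expected" = "$actual" ]
}

# Usage on the master node:
#   cd /opt/cloudera/parcel-repo
#   verify_parcel_hash CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel \
#       CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha && echo "hash matches"
```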

9. Install MySQL on slave.cdh6

wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm

rpm -ivh mysql-community-release-el7-5.noarch.rpm

yum update

yum install -y mysql-server

systemctl start mysqld

systemctl enable mysqld
### Initialize MySQL (set the root password and remove the defaults)

/usr/bin/mysql_secure_installation

[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] Y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
All done!

10. Install the MySQL JDBC driver

wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz

tar zxvf mysql-connector-java-5.1.46.tar.gz

mkdir -p /usr/share/java/

cd mysql-connector-java-5.1.46

cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar

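Note:

The jar must be renamed from mysql-connector-java-5.1.46-bin.jar to mysql-connector-java.jar; otherwise it is not recognized when the CM database is initialized.

Create the databases:

The databases to create are scm, amon, rman, hue, metastore, sentry, nav, navms, oozie, and hive.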

CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
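Since the ten statements differ only in the database name, they can be generated in a loop rather than typed one by one. This is a small sketch; the function name cdh_create_db_sql is hypothetical, and it only prints the SQL so it can be reviewed before being piped into the mysql client.

```shell
# Hypothetical helper: print one CREATE DATABASE statement per database used in this guide.
cdh_create_db_sql() {
    for db in scm amon rman hue metastore sentry nav navms oozie hive; do
        printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$db"
    done
}

# Usage on slave.cdh6:
#   cdh_create_db_sql | mysql -u root -p
```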

Grant remote access:

mysql> use mysql;

mysql> grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;

mysql> grant all privileges on *.* to 'scm'@'master.cdh6' identified by '123456' with grant option;

mysql> flush privileges;

11. Install CM

Install with yum:

The JDK must be installed first.

yum install cloudera-manager-server

12. Initialize the database

12.1 If the database and CM are on the same server

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm

12.2 If the database and CM are on different servers

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h slave.cdh6 --scm-host master.cdh6 scm scm

12.3 If your database is Oracle

/opt/cloudera/cm/schema/scm_prepare_database.sh -h cm-oracle.example.com oracle orcl sample_user sample_pass

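13. Start the CM service

systemctl start cloudera-scm-server

### Watch the log
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

### Check the listening ports; 7180 should appear once the server is up.
### If netstat is missing, install it with: yum install net-tools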
netstat -nltp
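The server can take a while to start, so it is convenient to test for the port rather than scan the netstat output by eye. The sketch below reads netstat -nltp output on standard input and checks the local-address column (the fourth field); the function name listening_on is hypothetical.

```shell
# Hypothetical helper: succeed if the netstat output on stdin shows a listener on the given port.
# $1 = port number
listening_on() {
    # Field 4 of "netstat -nlt" lines is the local address, e.g. 0.0.0.0:7180 or :::7180.
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Usage on master.cdh6:
#   netstat -nltp | listening_on 7180 && echo "CM web UI is up"
```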

14. Log in

http://192.168.200.125:7180

Username: admin

Password: admin

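15. Problems you may run into

15.1 Swappiness setting

Cloudera recommends setting /proc/sys/vm/swappiness to 0. The current setting is 60. Use the sysctl command to change the setting at runtime and edit /etc/sysctl.conf so that it persists after a reboot. You can continue with the installation, but you may run into problems: Cloudera Manager will report that your hosts are unhealthy because of swapping. The following hosts are affected:

Temporary fix

Run echo 0 > /proc/sys/vm/swappiness.

Permanent fix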

sysctl -w vm.swappiness=0

echo vm.swappiness = 0 >> /etc/sysctl.conf

15.2 Transparent huge pages

Disable transparent huge pages:

echo never > /sys/kernel/mm/transparent_hugepage/defrag

echo never > /sys/kernel/mm/transparent_hugepage/enabled
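To confirm the setting took effect, read the same files back: the kernel lists all modes and marks the active one in square brackets, for example "always madvise [never]". The sketch below extracts the bracketed value; the function name thp_mode is hypothetical. Values written this way do not survive a reboot, so the check is worth repeating after a restart.

```shell
# Hypothetical helper: print the active transparent-huge-page mode.
# $1 = file to read (defaults to the "enabled" control file)
thp_mode() {
    # The active mode is the word inside square brackets.
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Usage:
#   thp_mode
#   thp_mode /sys/kernel/mm/transparent_hugepage/defrag
```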

15.3 Upgrade a software dependency

Starting with CDH 6, PostgreSQL-backed Hue requires the Psycopg2 version to be at least 2.5.4, see the documentation for more information. This warning can be ignored if hosts will not run CDH 6, or will not run Hue with PostgreSQL. The following hosts have an incompatible Psycopg2 version of '2.5.1'

Solution: the warning can be ignored, or upgrade Psycopg2:

yum install python-pip

pip install --upgrade psycopg2

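15.4 Hosts reported as unhealthy while installing parcels

Solution

Delete the cm_guid file in the agent directory, then restart the agent service on the failed node.

find / -name cm_guid
/var/lib/cloudera-scm-agent/cm_guid

### Delete it
rm -f /var/lib/cloudera-scm-agent/cm_guid

### Restart the agent
systemctl restart cloudera-scm-agent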