1. Preparing for deployment
Docker runs well on Linux (CentOS, Ubuntu, and so on); either distribution is fine. Install the latest stable release to avoid unnecessary trouble. Most existing tutorials target Spark releases before 1.4; deploying 1.6.0 is not very different, but there are still a few issues to watch for.
Here we use the sequenceiq/spark:1.6.0 image. Once Docker is installed you can pull it; before pulling, make sure the machine can reach the registry over HTTPS. Because of my company's server proxy I lost a day and a half on this step and eventually gave up on that machine.
I will not cover installing Docker itself; there are plenty of tutorials for that. Note that deploying with Docker avoids many of the problems of a bare-metal installation, such as Java and Python version conflicts.
2. Pull the Spark image
docker pull sequenceiq/spark:1.6.0
If the network is unobstructed, this downloads an image of about 2.87 GB, so it takes quite a while. If HTTPS access fails, configure the Docker daemon's proxy by adding http_proxy and https_proxy to /etc/default/docker. The Docker daemon uses its own environment, so a proxy configured only at the system level may not take effect.
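A minimal sketch of that daemon proxy file, where proxy.example.com:3128 is a placeholder for your real proxy address:

# /etc/default/docker (proxy.example.com:3128 is a placeholder)
export http_proxy="http://proxy.example.com:3128"
export https_proxy="http://proxy.example.com:3128"
# restart the daemon so it picks up the new environment
service docker restart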
When the pull succeeds, Docker prints the image digest/ID, indicating the pull has completed.
3. Build the image
docker build --rm -t sequenceiq/spark:1.6.0 .
Before running this command, download the Dockerfile from the project repository on Docker Hub/GitHub and place it in the directory where you run the command. Download address: https://github.com/sequenceiq/docker-spark/blob/master/Dockerfile
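A sketch of this step from an empty working directory, assuming the raw-file URL below still points at the same Dockerfile:

mkdir spark-image && cd spark-image
wget https://raw.githubusercontent.com/sequenceiq/docker-spark/master/Dockerfile
docker build --rm -t sequenceiq/spark:1.6.0 .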
4. Start the containers
I set up one master and two workers. Create the master container:
docker run --name master -it -p 8088:8088 -p 8042:8042 -p 8085:8080 -p 4040:4040 -p 7077:7077 -p 2022:22 -v /data:/data -h master sequenceiq/spark:1.6.0 bash
Explanation: --name sets the container's name; -it allocates an interactive terminal; -p maps a host port to a container port; -v mounts a host directory into the container, so the host's /data directory is shared with the container; -h sets the container's hostname. Create the two worker containers (run these in two different terminals):
docker run --name worker1 -it -h worker1 sequenceiq/spark:1.6.0 bash
docker run --name worker2 -it -h worker2 sequenceiq/spark:1.6.0 bash
The three containers are now up. Ping them from one another; if they cannot reach each other, stop the host firewall: service iptables stop (this must be redone after every restart).
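To find each container's IP address for the ping test (and for the hosts file in section 5.2), you can ask Docker from the host; a small sketch, assuming the containers are on the default bridge network:

docker inspect -f '{{.NetworkSettings.IPAddress}}' master
docker inspect -f '{{.NetworkSettings.IPAddress}}' worker1
docker inspect -f '{{.NetworkSettings.IPAddress}}' worker2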
5. Configure the cluster
5.1 Stop the Hadoop cluster in each container
cd /usr/local/hadoop-2.6.0/sbin/
./stop-all.sh
5.2 Configure the hosts file in each container (must be redone after every restart)
vi /etc/hosts
172.17.0.1 master
172.17.0.2 worker1
172.17.0.3 worker2
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
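Since these entries are lost whenever a container restarts, it can help to keep them in a small script on the shared /data mount and re-run it after each restart; a sketch, assuming the IPs above are still correct:

cat >> /etc/hosts <<'EOF'
172.17.0.1 master
172.17.0.2 worker1
172.17.0.3 worker2
EOF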
5.3 Configure Hadoop's slaves file in each container
vi /usr/local/hadoop-2.6.0/etc/hadoop/slaves
Set it to:
worker1
worker2
5.4 Configure core-site.xml in each container
vi /usr/local/hadoop-2.6.0/etc/hadoop/core-site.xml
Set it to:
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>
</configuration>
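To confirm the setting is picked up, you can have Hadoop echo it back (the hdfs getconf utility ships with Hadoop 2.6):

/usr/local/hadoop-2.6.0/bin/hdfs getconf -confKey fs.defaultFS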
5.5 Configure Spark's slaves file in each container
cd /usr/local/spark-1.6.0-bin-hadoop2.6/conf
cp slaves.template slaves
vi slaves
Set it to:
worker1
worker2
5.6 Configure Spark's spark-env.sh in each container
cd /usr/local/spark-1.6.0-bin-hadoop2.6/conf/
cp spark-env.sh.template spark-env.sh
vi spark-env.sh
Add:
export JAVA_HOME=/usr/java/default
export SPARK_MASTER_IP=master
export SPARK_WORKER_CORES=1
# export SPARK_WORKER_INSTANCES=1
export SPARK_EXECUTOR_INSTANCES=3
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=1g
export MASTER=spark://${SPARK_MASTER_IP}:${SPARK_MASTER_PORT}
5.7 Configure Spark's YARN client files in each container (must be redone after every restart)
cd /usr/local/spark-1.6.0-bin-hadoop2.6/yarn-remote-client/
1> Edit core-site.xml:
<configuration>
  <property><name>fs.default.name</name><value>hdfs://master:9000</value></property>
  <property><name>dfs.client.use.legacy.blockreader</name><value>true</value></property>
</configuration>
2> Edit yarn-site.xml:
<configuration>
  <property><name>yarn.resourcemanager.scheduler.address</name><value>master:8030</value></property>
  <property><name>yarn.resourcemanager.address</name><value>master:8032</value></property>
  <property><name>yarn.resourcemanager.webapp.address</name><value>master:8088</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address</name><value>master:8031</value></property>
  <property><name>yarn.resourcemanager.admin.address</name><value>master:8033</value></property>
  <property>
    <name>yarn.application.classpath</name>
    <value>/usr/local/hadoop/etc/hadoop,
      /usr/local/hadoop/share/hadoop/common/*,
      /usr/local/hadoop/share/hadoop/common/lib/*,
      /usr/local/hadoop/share/hadoop/hdfs/*,
      /usr/local/hadoop/share/hadoop/hdfs/lib/*,
      /usr/local/hadoop/share/hadoop/mapreduce/*,
      /usr/local/hadoop/share/hadoop/mapreduce/lib/*,
      /usr/local/hadoop/share/hadoop/yarn/*,
      /usr/local/hadoop/share/hadoop/yarn/lib/*,
      /usr/local/hadoop/share/spark/*</value>
  </property>
</configuration>
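Once the cluster is running (section 6), these client files are what let spark-submit target YARN instead of the standalone master. A minimal sketch of such a submission, assuming the image's Spark is already wired to this YARN configuration and the bundled examples jar is at the path shown:

cd /usr/local/spark-1.6.0-bin-hadoop2.6
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode client \
  lib/spark-examples-1.6.0-hadoop2.6.0.jar 10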
5.8 Configure the Hadoop cluster in each container (must be redone after every restart)
1> Create the storage directories. In the Hadoop home directory (/usr/local/hadoop-2.6.0/), run:
mkdir -p dfs/name
mkdir -p dfs/data
mkdir -p tmp
2> Edit core-site.xml:
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>
  <property><name>io.file.buffer.size</name><value>131702</value></property>
  <property><name>hadoop.tmp.dir</name><value>/usr/local/hadoop-2.6.0/tmp</value></property>
</configuration>
3> Edit hdfs-site.xml:
<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/usr/local/hadoop-2.6.0/dfs/name</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/usr/local/hadoop-2.6.0/dfs/data</value></property>
  <property><name>dfs.namenode.secondary.http-address</name><value>master:9001</value></property>
  <property><name>dfs.webhdfs.enabled</name><value>true</value></property>
</configuration>
4> Edit mapred-site.xml (create it from the template first: cp mapred-site.xml.template mapred-site.xml):
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>master:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>master:19888</value></property>
</configuration>
5> Edit yarn-site.xml (pay attention to the memory and CPU settings here):
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
  <property><name>yarn.resourcemanager.address</name><value>master:8032</value></property>
  <property><name>yarn.resourcemanager.scheduler.address</name><value>master:8030</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address</name><value>master:8031</value></property>
  <property><name>yarn.resourcemanager.admin.address</name><value>master:8033</value></property>
  <property><name>yarn.resourcemanager.webapp.address</name><value>master:8088</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>2048</value></property>
  <property><name>yarn.nodemanager.resource.cpu-vcores</name><value>1</value></property>
  <property>
    <name>yarn.application.classpath</name>
    <value>/usr/local/hadoop/etc/hadoop,
      /usr/local/hadoop/share/hadoop/common/*,
      /usr/local/hadoop/share/hadoop/common/lib/*,
      /usr/local/hadoop/share/hadoop/hdfs/*,
      /usr/local/hadoop/share/hadoop/hdfs/lib/*,
      /usr/local/hadoop/share/hadoop/mapreduce/*,
      /usr/local/hadoop/share/hadoop/mapreduce/lib/*,
      /usr/local/hadoop/share/hadoop/yarn/*,
      /usr/local/hadoop/share/hadoop/yarn/lib/*</value>
  </property>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
    <description>Number of seconds after an application finishes before the nodemanager's DeletionService will delete the application's localized file directory and log directory. To diagnose Yarn application problems, set this property's value large enough (for example, to 600 = 10 minutes) to permit examination of these directories. After changing the property's value, you must restart the nodemanager in order for it to have an effect. The roots of Yarn applications' work directories is configurable with the yarn.nodemanager.local-dirs property, and the roots of the Yarn applications' log directories is configurable with the yarn.nodemanager.log-dirs property.</description>
  </property>
</configuration>
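Since every container needs the same Hadoop configuration, one way to avoid editing the files three times is to edit them on the master and copy them out; a sketch, assuming the image's built-in SSH keys allow passwordless root SSH between the containers:

for host in worker1 worker2; do
  scp /usr/local/hadoop-2.6.0/etc/hadoop/{core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml,slaves} \
      $host:/usr/local/hadoop-2.6.0/etc/hadoop/
done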
6. Start the Spark cluster
6.1 Start the Hadoop cluster on the master
cd /usr/local/hadoop-2.6.0/sbin/
./start-all.sh
Check with jps on the master:
bash-4.1# jps
2441 NameNode
2611 SecondaryNameNode
2746 ResourceManager
3980 Jps
And on each worker:
bash-4.1# jps
1339 DataNode
1417 NodeManager
2290 Jps
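As an extra check that both DataNodes registered with the NameNode, you can ask HDFS for a cluster report (dfsadmin is part of the stock Hadoop tooling):

/usr/local/hadoop-2.6.0/bin/hdfs dfsadmin -report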
6.2 Start the Spark cluster on the master
cd /usr/local/spark-1.6.0-bin-hadoop2.6/sbin/
./start-all.sh
Check on the master:
bash-4.1# jps
2441 NameNode
2611 SecondaryNameNode
2746 ResourceManager
3392 Master
3980 Jps
And on each worker:
bash-4.1# jps
1339 DataNode
1417 NodeManager
2290 Jps
7. Run the cluster
If you have configured the slaves file on the master, you only need to run ./sbin/start-all.sh from the Spark directory on the master; otherwise you have to start the master and each slave one by one. Open http://<your master IP>:8080 in a browser (port 8080 is mapped to host port 8085 by the docker run command above) to see the status of each Spark node.
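To sanity-check the standalone cluster you can submit the bundled SparkPi example; a minimal sketch, assuming the examples jar of this Spark build is at the path shown:

cd /usr/local/spark-1.6.0-bin-hadoop2.6
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master spark://master:7077 \
  lib/spark-examples-1.6.0-hadoop2.6.0.jar 10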
8. Deployment problems and solutions
8.1 Workers keep failing to start
Cause: the JDK bundled with the system was not removed cleanly. Remove it:
rpm -qa | grep java
rpm -e --nodeps java_cup-0.10k-5.el6.x86_64
rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
8.2 Repeated "JAVA_HOME is not set" errors
Run echo $JAVA_HOME to find your Java path, then on every node add export JAVA_HOME=<your Java path> to /usr/local/spark-1.6.0-bin-hadoop2.6/conf/spark-env.sh.
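A one-line sketch of that fix, assuming the JDK in this image lives at /usr/java/default as in section 5.6:

echo 'export JAVA_HOME=/usr/java/default' >> /usr/local/spark-1.6.0-bin-hadoop2.6/conf/spark-env.sh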