【mesos:2】Running Hadoop (CDH4.2.1) on Mesos

Introduction


mesos/hadoop · GitHub

Using the repository linked above as a reference, I set up Hadoop (CDH4.2.1) to run on Mesos.

Environment

3 physical servers (x86_64)

CentOS 6.4 (upgraded from 6.2 via yum update)

OpenJDK 6

CDH 4.2.1 (CDH 4.2.1 Documentation Archive)

  • 1 server: NameNode + Secondary NameNode + JobTracker + DataNode
  • 2 servers: DataNode
  • replication factor 1

Installation

  1. Build hadoop-mesos (see the sketch after this list). *1
  2. Copy the built hadoop-mesos-x.x.x.jar into Hadoop's lib directory on all 3 servers. *2
  3. Put a Hadoop (MR1) tarball containing the Mesos jar onto HDFS.
  4. Edit mapred-site.xml.
  5. Set an environment variable in /etc/init.d/hadoop-0.20-mapreduce-jobtracker.
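
The build in step 1 is not captured in the log below. A minimal sketch, assuming the mesos/hadoop project builds with Maven (the clone directory and output path are my assumptions; check the project's README for the exact procedure):

# Assumed build procedure -- verify against the mesos/hadoop README.
git clone https://github.com/mesos/hadoop.git hadoop-mesos
cd hadoop-mesos
mvn package          # should leave hadoop-mesos-0.0.x.jar under target/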

Installation log

Deploying the built hadoop-mesos-0.0.3.jar

[root@uldata14 ~]# cp hadoop-mesos-0.0.3.jar /usr/lib/hadoop-0.20-mapreduce/lib/
[root@uldata14 ~]#  tar zxf mr1-2.0.0-mr1-cdh4.2.1.tar.gz
[root@uldata14 ~]# cp hadoop-mesos-0.0.3.jar hadoop-2.0.0-mr1-cdh4.2.1/lib/
[root@uldata14 ~]# tar czf hadoop-2.0.0-mr1-cdh4.2.1.tar.gz hadoop-2.0.0-mr1-cdh4.2.1
[root@uldata14 ~]# cp hadoop-2.0.0-mr1-cdh4.2.1.tar.gz /tmp
[root@uldata14 ~]# su - hadoop
[hadoop@uldata14 ~]$ sudo -u hdfs hadoop fs -put /tmp/hadoop-2.0.0-mr1-cdh4.2.1.tar.gz /hadoop-2.0.0-mr1-cdh4.2.1.tar.gz
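
To double-check that the tarball really landed on HDFS (my addition; this step is not in the original log):

sudo -u hdfs hadoop fs -ls /hadoop-2.0.0-mr1-cdh4.2.1.tar.gz
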
[root@uldata14 ~]# cat /etc/hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <value>/var/lib/hadoop-mapreduce/cache/mapred/</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>10.29.254.14:8021</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.MesosScheduler</value>
  </property>
  <property>
    <name>mapred.mesos.taskScheduler</name>
    <value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
  </property>
  <property>
    <name>mapred.mesos.master</name>
    <value>10.29.254.14:5050</value>
  </property>
  <property>
    <name>mapred.mesos.executor.uri</name>
    <value>hdfs://10.29.254.14:8020/hadoop-2.0.0-mr1-cdh4.2.1.tar.gz</value>
    <description>
      This is the URI of the Hadoop on Mesos distribution.
      NOTE: You need to MANUALLY upload this yourself!
    </description>
  </property>
</configuration>
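
As I understand the mesos/hadoop design, MesosScheduler replaces the JobTracker's task scheduler and delegates the actual queueing to the class named in mapred.mesos.taskScheduler (here the stock JobQueueTaskScheduler). Before restarting anything, a quick well-formedness check of the edited file does no harm (my addition; xmllint ships with libxml2 on CentOS):

xmllint --noout /etc/hadoop/conf/mapred-site.xml && echo "mapred-site.xml is well-formed"
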
[root@uldata14 ~]# vi /etc/init.d/hadoop-0.20-mapreduce-jobtracker 

# mesos
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
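
For the JobTracker JVM to pick up the new scheduler and find the native Mesos library, it needs a restart. The restart is not in the original log; on CDH4 MR1 it would be something like:

service hadoop-0.20-mapreduce-jobtracker restart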

Test run

[hadoop@uldata14 ~]$ hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 10 10
Number of Maps  = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
.
.
.
Job Finished in 12.57 seconds
Estimated value of Pi is 3.20000000000000000000
On the JobTracker side, the log confirms the job was handed to the MesosScheduler:

2013-10-02 xx:xx:xx,821 INFO org.apache.hadoop.mapred.JobInProgress: job_201310021831_0004: nMaps=10 nReduces=1 max=-1
2013-10-02 xx:xx:xx,837 INFO org.apache.hadoop.mapred.MesosScheduler: Added job job_201310021831_0004
2013-10-02 xx:xx:xx,837 INFO org.apache.hadoop.mapred.JobTracker: Job job_201310021831_0004 added successfully for user 'hadoop' to queue 'default'
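
As an extra check (my addition; not part of the original log), the Hadoop framework should also show up on the Mesos master configured above. The master's web UI at http://10.29.254.14:5050 lists registered frameworks; assuming the state.json endpoint of that era's Mesos, something like this pulls the framework names:

curl -s http://10.29.254.14:5050/master/state.json | grep -o '"name":"[^"]*"'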

*1: As of 2013/09/30, the current version is 0.0.3.

*2: Copying it to just the node running the JobTracker might be enough.