Hello friends,
Let's see how to run a simple MapReduce program in a Linux environment.
It's a word-count program.
1. Create a file words.txt with a few words, as shown below.
words.txt
--------------------------------
cow india japan
america japan
hindu muslim christian
india cow
america america america
china
india
china pakistan
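If you prefer, the file can be created straight from the shell with a here-document (same content as above):

```shell
# Create words.txt from the shell (same content as shown above)
cat > words.txt <<'EOF'
cow india japan
america japan
hindu muslim christian
india cow
america america america
china
india
china pakistan
EOF
```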
2. Copy words.txt to HDFS (adjust the path as appropriate):
hadoop fs -copyFromLocal words.txt /user/cloudera/words.txt
3. Create wc_mapper.sh:
wc_mapper.sh
--------------------------
#!/bin/bash
# Mapper: emit "<word> 1" for every word read from stdin
while read line
do
    for word in $line
    do
        echo "$word 1"
    done
done
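Before submitting the job, the mapper logic can be sanity-checked locally; the loop below is the same code as wc_mapper.sh, inlined for a one-line test:

```shell
# Run the mapper logic on a single line of input
echo "cow india japan" | while read line; do
  for word in $line; do
    echo "$word 1"
  done
done
```

Each word should come out on its own line, paired with a count of 1.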
4. Create wc_reducer.sh:
wc_reducer.sh
------------------------
#!/bin/bash
# Reducer: input arrives sorted by word, so equal words are adjacent.
# Count consecutive occurrences and print "<word>\t<count>" each time
# the word changes.
cnt=0
old=''
new=''
start=0
while read line
do
    # First field of each line is the word; the second is always "1"
    new=$(echo "$line" | cut -d' ' -f1)
    if [ "$new" != "$old" ]; then
        # Flush the previous word's count (skipped before the first word)
        [ $start -ne 0 ] && echo -e "$old\t$cnt"
        old=$new
        cnt=1
        start=1
    else
        cnt=$(( cnt + 1 ))
    fi
done
# Flush the final word
echo -e "$old\t$cnt"
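Because Hadoop Streaming simply pipes data through these scripts, the whole job can be rehearsed locally, with `sort` standing in for Hadoop's shuffle-and-sort phase. Here's a minimal sketch with the mapper and reducer logic inlined as shell functions and fed a small subset of the input:

```shell
# Local dry run: mapper | sort | reducer (sort simulates the shuffle phase)
wc_mapper() {
  # Same logic as wc_mapper.sh: emit "<word> 1" per word
  while read line; do
    for word in $line; do echo "$word 1"; done
  done
}
wc_reducer() {
  # Same logic as wc_reducer.sh: count adjacent equal words
  cnt=0; old=''; start=0
  while read line; do
    new=$(echo "$line" | cut -d' ' -f1)
    if [ "$new" != "$old" ]; then
      if [ $start -ne 0 ]; then echo -e "$old\t$cnt"; fi
      old=$new; cnt=1; start=1
    else
      cnt=$(( cnt + 1 ))
    fi
  done
  echo -e "$old\t$cnt"
}
printf 'cow india japan\namerica japan\nindia cow\n' | wc_mapper | sort | wc_reducer
```

If the local counts look right, the scripts are ready to hand to Hadoop Streaming.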
5. Invoke MapReduce using the following command (adjust the jar path for your installation):
hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.3.0-mr1-cdh5.1.0.jar \
    -input /user/cloudera/words.txt \
    -output /user/cloudera/op_wc \
    -mapper wc_mapper.sh \
    -reducer wc_reducer.sh \
    -file wc_mapper.sh \
    -file wc_reducer.sh
6. Check the files created in HDFS:
$ hadoop fs -ls -R /user/cloudera/op_wc
-rw-r--r-- 1 cloudera cloudera 0 2015-02-18 03:27 /user/cloudera/op_wc/_SUCCESS
-rw-r--r-- 1 cloudera cloudera 61 2015-02-18 03:27 /user/cloudera/op_wc/part-00000
$ hadoop fs -cat /user/cloudera/op_wc/part-00000
america 4
christian 1
cow 2
hindu 1
india 2
japan 2
muslim 1
------------------------------------
Done! Enjoy. :)
If you run into trouble, ping me at dhanooj.world@gmail.com.