Hadoop YARN MR (MapReduce) streaming using a shell script

Hello friends,
Let's see how to run a simple MapReduce program in a Linux environment using Hadoop streaming, with plain shell scripts as the mapper and the reducer.
It's a word count program.

1. Create a file words.txt containing a few words, as shown below.

words.txt
--------------------------------
cow india japan
america japan
hindu muslim christian
india cow
america america america
china
india
china pakistan
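
If you prefer to create the file from the command line, a here-document is one way to do it. This is only a sketch; it writes words.txt into the current directory with the same content as above.

```shell
# Write the sample content to words.txt in the current directory.
# The quoted 'EOF' stops the shell from expanding anything inside the block.
cat > words.txt <<'EOF'
cow india japan
america japan
hindu muslim christian
india cow
america america america
china
india
china pakistan
EOF

# Quick sanity check: count the lines and words that were written.
line_count=$(grep -c '' words.txt)
word_count=$(tr -s ' ' '\n' < words.txt | grep -c '')
echo "lines: $line_count, words: $word_count"
```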

2. Copy words.txt to HDFS (use a path that is appropriate for your cluster).
hadoop fs -copyFromLocal words.txt /user/cloudera/words.txt

3. Create the mapper script, wc_mapper.sh.
wc_mapper.sh
--------------------------
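#!/bin/bash
# Mapper: print "<word> 1" for every whitespace-separated word read from stdin.
while read -r line
do
    # $line is left unquoted on purpose so the shell splits it into words.
    for word in $line
    do
        echo "$word 1"
    done
done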
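
Before going to the cluster, it helps to try the mapper on a couple of lines locally. The sketch below wraps the same loop in a function (wc_map is just a name picked for this example) so it can be run without creating any files.

```shell
# Same loop as wc_mapper.sh, wrapped in a function for a quick local try.
# wc_map is a name chosen for this sketch only.
wc_map() {
    while read -r line
    do
        for word in $line
        do
            echo "$word 1"
        done
    done
}

# Feed two sample lines; every word should come out on its own line with a 1.
mapped=$(printf 'cow india japan\namerica japan\n' | wc_map)
echo "$mapped"
```

With the script file in place, the equivalent check is to pipe a line into it, for example with echo "cow india japan" | bash wc_mapper.sh.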

4. Create the reducer script, wc_reducer.sh.
wc_reducer.sh
------------------------
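#!/bin/bash
# Reducer: the input arrives sorted by key, so equal words are on adjacent lines.
# Count the lines for each word and print "<word><TAB><count>".
cnt=0
old=''
while read -r new _
do
    if [ "$new" != "$old" ]; then
        # The word changed: print the count of the previous word, if there was one.
        if [ -n "$old" ]; then printf '%s\t%s\n' "$old" "$cnt"; fi
        old=$new
        cnt=1
    else
        cnt=$(( cnt + 1 ))
    fi
done
# Print the last word (nothing is printed when the input was empty).
if [ -n "$old" ]; then printf '%s\t%s\n' "$old" "$cnt"; fi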
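
The reducer only works because Hadoop sorts the mapper output by key before handing it over. The sketch below shows this locally: it uses an equivalent reducer written as a function (wc_reduce is a name made up for this example) and runs it once on unsorted and once on sorted input.

```shell
# Equivalent logic to wc_reducer.sh, as a function for local experiments.
# wc_reduce is a name chosen for this sketch only.
wc_reduce() {
    old=''
    cnt=0
    while read -r new _
    do
        if [ "$new" != "$old" ]; then
            if [ -n "$old" ]; then printf '%s\t%s\n' "$old" "$cnt"; fi
            old=$new
            cnt=1
        else
            cnt=$(( cnt + 1 ))
        fi
    done
    if [ -n "$old" ]; then printf '%s\t%s\n' "$old" "$cnt"; fi
}

# Unsorted input: "cow" is not adjacent to itself, so it is reported twice.
unsorted=$(printf 'cow 1\nindia 1\ncow 1\n' | wc_reduce)

# Sorted input (what the reducer sees in a real job): "cow" is counted once.
# LC_ALL=C gives a plain byte-wise sort, independent of the local language settings.
sorted=$(printf 'cow 1\nindia 1\ncow 1\n' | LC_ALL=C sort | wc_reduce)

echo "unsorted:"; echo "$unsorted"
echo "sorted:";   echo "$sorted"
```

The same idea lets you rehearse the whole job on your own machine: pipe words.txt through the mapper script, then sort, then the reducer script.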

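5. Invoke MapReduce using the following command (adjust the jar path and the HDFS paths for your setup). The output directory must not already exist, otherwise the job fails.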

hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.3.0-mr1-cdh5.1.0.jar \
    -input /user/cloudera/words.txt \
    -output /user/cloudera/op_wc \
    -mapper wc_mapper.sh \
    -reducer wc_reducer.sh \
    -file wc_mapper.sh \
    -file wc_reducer.sh
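
Since the paths differ from cluster to cluster, it can be handy to keep them in variables and review the final command before submitting it. The sketch below does that. The variable names, the run helper and the DRY_RUN switch are all made up for this example; the paths are simply the ones used in this post.

```shell
# Paths from this post; change them to match your cluster.
STREAMING_JAR=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.3.0-mr1-cdh5.1.0.jar
INPUT=/user/cloudera/words.txt
OUTPUT=/user/cloudera/op_wc

# DRY_RUN is a hypothetical switch for this sketch. With the default of 1 the
# command is only printed; run the script with DRY_RUN=0 to really execute it.
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run hadoop jar "$STREAMING_JAR" \
    -input "$INPUT" \
    -output "$OUTPUT" \
    -mapper wc_mapper.sh \
    -reducer wc_reducer.sh \
    -file wc_mapper.sh \
    -file wc_reducer.sh
```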

6. Check the files created in HDFS.
$ hadoop fs -ls -R /user/cloudera/op_wc
-rw-r--r--   1 cloudera cloudera          0 2015-02-18 03:27 /user/cloudera/op_wc/_SUCCESS
-rw-r--r--   1 cloudera cloudera         61 2015-02-18 03:27 /user/cloudera/op_wc/part-00000

$ hadoop fs -cat /user/cloudera/op_wc/part-00000
america    4
christian    1
cow    2
hindu    1
india    2
japan    2
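muslim    1

Note: this listing appears to come from a run on only the first five lines of words.txt (its 61 bytes match exactly these seven lines). With the full file from step 1 you should see india 3, plus two more lines, china 2 and pakistan 1.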
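
To double-check the job output, you can count the words locally with standard tools and compare. This is a rough sketch that uses a few sample lines instead of the real file; it prints the word, a tab and the count, which is the same layout the reducer writes.

```shell
# Count words with tr/sort/uniq and print "<word><TAB><count>".
# tr puts every word on its own line, uniq -c counts adjacent duplicates,
# and awk swaps the two columns that uniq -c prints.
expected=$(printf 'cow india japan\namerica japan\nindia cow\n' \
    | tr -s ' ' '\n' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }')
echo "$expected"
```

Run the same pipeline on your local words.txt and the result should match what hadoop fs -cat shows for part-00000.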

------------------------------------

Done! Enjoy :)

If you run into trouble, ping me at dhanooj.world@gmail.com

DhansWorld
