
MapReduce on YARN process

2021-05-04 12:08:46 Huang Wenchao


1. Preparation stage

1. The client runs the user's jar package.

2. The client obtains a new job ID via an RPC call to YarnRunner.getNewJobID().

3. It then checks whether the output path already exists (if it does, an error is thrown).

4. Next it computes the input splits (if the split information can be computed successfully, it continues).

5. The job resources (the jar package, the job configuration, and the split information) are uploaded to hdfs://.../staging/application_id.

6. The client calls YarnRunner.submit() via RPC to submit the job to the ResourceManager for execution.
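The split calculation in step 4 follows the rule used by Hadoop's FileInputFormat: the split size is max(minSize, min(maxSize, blockSize)), and the file is carved into chunks of that size. The sketch below illustrates that rule in Python (the function names and the simplification of ignoring Hadoop's 10% slop on the final split are mine, not Hadoop's API):

```python
def split_size(min_size, max_size, block_size):
    # FileInputFormat's rule: clamp the block size between the
    # configured minimum and maximum split sizes.
    return max(min_size, min(max_size, block_size))

def compute_splits(file_size, min_size=1, max_size=2**63 - 1,
                   block_size=128 * 1024 * 1024):
    # Carve the file into (offset, length) splits; the last split
    # may be smaller than the others.
    size = split_size(min_size, max_size, block_size)
    splits = []
    offset = 0
    while offset < file_size:
        length = min(size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits

# A 300 MB file with a 128 MB block size yields three splits:
# 128 MB + 128 MB + 44 MB.
print(len(compute_splits(300 * 1024 * 1024)))  # 3
```

Each split produced here corresponds to one map task in the execution stage below.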

2. Execution stage

1. The job request is converted into an MR AppMaster request.

2. The MR AppMaster request is submitted to the scheduler, which allocates containers.

3. A container is requested for the AppMaster (which NodeManager it should run on, how many CPU cores, and how much RAM).

4. An MR AppMaster is started on a NodeManager (it creates bookkeeping objects to track the progress and status of each task).

5. The AppMaster then reads the split information, the job configuration, the number of splits, and the number of reduce tasks.

6. The map tasks execute normally and produce partitioned output.

7. Reduce tasks are launched to pull the map output; there are exactly as many reduce tasks as partitions.

8. Wait for execution to finish.
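In step 6 the map output is divided into partitions; with Hadoop's default HashPartitioner, the partition index for a key is (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. A minimal Python sketch of that rule follows, with Python's hash() standing in for Java's hashCode():

```python
def hash_partition(key_hash, num_reduce_tasks):
    # Default HashPartitioner logic:
    # (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
    # Masking with 0x7FFFFFFF clears the sign bit so the
    # partition index is never negative.
    return (key_hash & 0x7FFFFFFF) % num_reduce_tasks

# With 3 reduce tasks, every key lands in partition 0, 1, or 2;
# each reduce task later pulls exactly one of those partitions.
for key in ["apple", "banana", "cherry"]:
    print(key, hash_partition(hash(key), 3))
```

This is why step 7 can start one reduce task per partition: the partitioner guarantees every key maps to exactly one of the numReduceTasks indices.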

3. Completion stage

1. The results are written to HDFS: each reduce task writes one file, named part-r-00000, part-r-00001, and so on.
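The output names follow the pattern part-r-NNNNN: one file per reduce task, indexed from zero with five-digit zero padding. A small sketch (reduce_output_name is an illustrative helper, not a Hadoop API):

```python
def reduce_output_name(task_index):
    # Reduce output files are named part-r-NNNNN, where NNNNN is
    # the zero-padded index of the reduce task that wrote the file.
    # (Map-only jobs write part-m-NNNNN instead.)
    return f"part-r-{task_index:05d}"

print(reduce_output_name(0))  # part-r-00000
print(reduce_output_name(1))  # part-r-00001
```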

Copyright notice
This article was written by Huang Wenchao. Please include a link to the original when reposting:
https://chowdera.com/2021/05/20210504120610107R.html
