HashShuffleManager
2022-01-15 02:23:54 【manba_ yqq】
1. Common mechanism
- Execution process
1. Each map task writes the results for different partitions into different buffers. Each buffer is 32 KB and serves as a data cache.
2. Each buffer ultimately corresponds to one small disk file.
3. Each reduce task pulls its corresponding disk files.
- Summary
1. The partitioner (HashPartitioner by default) decides which disk file each map task result is written to. Each reduce task then pulls its corresponding disk files from the map side.
2. Number of small disk files produced: M (number of map tasks) * R (number of reduce tasks)
- Problem: too many small files on disk cause the following issues:
1. During Shuffle Write, many objects are created for writing small files to disk.
2. During Shuffle Read, many objects are created for reading small files from disk.
3. Too many objects in JVM heap memory cause frequent GC. If GC still cannot free the memory needed to run, an OOM occurs.
4. Data transfer involves frequent network communication, which greatly increases the chance of a communication failure. Once network communication fails, a "shuffle file cannot find" error occurs and the task fails because of it. The TaskScheduler is not responsible for retrying in this case; the DAGScheduler is responsible for retrying the Stage.
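The common mechanism described above can be sketched as a small simulation. This is only an illustrative model, not Spark's implementation: the function names `hash_shuffle_write` and `shuffle_read` are hypothetical, plain lists stand in for buffers and disk files, and `hash(key) % num_reducers` stands in for HashPartitioner.

```python
def hash_shuffle_write(map_outputs, num_reducers):
    """Simulate shuffle write: one bucket ("file") per (map task, reduce task)."""
    files = {}
    for map_id, records in enumerate(map_outputs):
        # Each map task opens one bucket per reduce task up front.
        buckets = [[] for _ in range(num_reducers)]
        for key, value in records:
            # Stand-in for HashPartitioner: pick the bucket from the key's hash.
            buckets[hash(key) % num_reducers].append((key, value))
        for reduce_id, bucket in enumerate(buckets):
            files[(map_id, reduce_id)] = bucket
    return files


def shuffle_read(files, num_maps, reduce_id):
    """Simulate shuffle read: a reduce task pulls its file from every map task."""
    pulled = []
    for map_id in range(num_maps):
        pulled.extend(files[(map_id, reduce_id)])
    return pulled


# Hypothetical input: 3 map tasks, 2 reduce tasks, integer keys.
map_outputs = [
    [(1, "a"), (2, "b"), (3, "c")],
    [(4, "d"), (5, "e")],
    [(6, "f")],
]
files = hash_shuffle_write(map_outputs, num_reducers=2)
file_count = len(files)  # M * R = 3 * 2 = 6
reducer0 = shuffle_read(files, num_maps=3, reduce_id=0)
```

With 3 map tasks and 2 reduce tasks the sketch produces 6 "files", and reduce task 0 pulls one file from each map task, which is where the M * R growth in files and fetches comes from.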
2. Consolidation mechanism
- Summary
Number of small disk files produced: C (number of cores) * R (number of reduce tasks)
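To get a feel for the difference between the two formulas in this article, the arithmetic can be worked through in a few lines. The job sizes below (1000 map tasks, 16 cores, 200 reduce tasks) are hypothetical values chosen for illustration, and the function names are not part of any Spark API.

```python
def common_file_count(num_map_tasks, num_reduce_tasks):
    """Common mechanism: M * R small files."""
    return num_map_tasks * num_reduce_tasks


def consolidated_file_count(num_cores, num_reduce_tasks):
    """Consolidation mechanism: C * R small files."""
    return num_cores * num_reduce_tasks


# Hypothetical job: 1000 map tasks, 16 cores, 200 reduce tasks.
m, c, r = 1000, 16, 200
before = common_file_count(m, r)        # 1000 * 200 = 200000
after = consolidated_file_count(c, r)   # 16 * 200 = 3200
ratio = before / after                  # 62.5 times fewer files
```

Because the number of cores is usually much smaller than the number of map tasks, the file count under these assumptions drops from 200000 to 3200.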
Copyright notice
This article was written by [manba_ yqq]. Please include the original link when reposting. Thanks.
https://chowdera.com/2021/12/202112122242278947.html