Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

hadoop - Difference between `yarn.scheduler.maximum-allocation-mb` and `yarn.nodemanager.resource.memory-mb`?

What is the difference between yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb?

I see both of these in yarn-site.xml and I see the explanations here.

yarn.scheduler.maximum-allocation-mb is given the following definition: "The maximum allocation for every container request at the RM, in MBs. Memory requests higher than this will throw an InvalidResourceRequestException." Does this mean that only memory requests on the ResourceManager are limited by this value?

And yarn.nodemanager.resource.memory-mb is defined as: "Amount of physical memory, in MB, that can be allocated for containers." Does this mean the total amount for all containers across the entire cluster, summed together?

However, I still cannot distinguish between the two. The explanations make me think they are the same.

Even more confusing, their default values are exactly the same: 8192 MB. How do I tell the difference between them? Thank you.



1 Reply

Consider a scenario in which you are setting up a cluster where each machine has 48 GB of RAM. Some of this RAM should be reserved for the operating system and other installed applications.


yarn.nodemanager.resource.memory-mb:

Amount of physical memory, in MB, that can be allocated for containers. This is the amount of memory YARN can use on this node (it is a per-node value, not a cluster-wide total), so the property should be set lower than the total memory of the machine.

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>40960</value> <!-- 40 GB -->
</property>

The next step is to provide YARN guidance on how to break up the total resources available into Containers. You do this by specifying the minimum unit of RAM to allocate for a Container.

In yarn-site.xml

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name> <!-- RAM-per-container -->
  <value>2048</value>
</property>

yarn.scheduler.maximum-allocation-mb:

It defines the maximum memory allocation available for a single container, in MB.

This means the RM can only allocate memory to containers in increments of yarn.scheduler.minimum-allocation-mb, and never more than yarn.scheduler.maximum-allocation-mb. The maximum should not be larger than the total memory allocated to YARN on a node (yarn.nodemanager.resource.memory-mb).

In yarn-site.xml

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name> <!-- Max RAM-per-container -->
  <value>8192</value>
</property>

For MapReduce applications, YARN runs each map or reduce task in a container, and a single machine can host a number of containers. Suppose we want to allow a maximum of 20 containers on each node; we then need (40 GB total RAM) / (20 containers) = 2 GB minimum per container, which is set with the property yarn.scheduler.minimum-allocation-mb.
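The sizing arithmetic above can be checked with a small Python sketch. The function names are hypothetical and only restate the division from the paragraph; this is not how YARN itself computes anything.

```python
def min_allocation_for(node_memory_mb: int, target_containers: int) -> int:
    """Minimum container size (MB) that lets `target_containers` fit on one node.

    node_memory_mb stands in for yarn.nodemanager.resource.memory-mb.
    """
    if target_containers <= 0:
        raise ValueError("target_containers must be positive")
    return node_memory_mb // target_containers


def max_containers_per_node(node_memory_mb: int, min_allocation_mb: int) -> int:
    """Upper bound on concurrent containers on one node, counting memory only."""
    if min_allocation_mb <= 0:
        raise ValueError("min_allocation_mb must be positive")
    return node_memory_mb // min_allocation_mb


if __name__ == "__main__":
    node_mb = 40960  # 40 GB given to YARN on each node, as in the answer
    print(min_allocation_for(node_mb, 20))         # 2048
    print(max_containers_per_node(node_mb, 2048))  # 20
```

The 20-container figure is an upper bound reached only when every container is the minimum size; larger containers reduce the count accordingly.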

Likewise, we want to cap the memory a single container can use, which is controlled by the property yarn.scheduler.maximum-allocation-mb.

For example, if a job asks for 2049 MB of memory per map container (mapreduce.map.memory.mb=2049 set in mapred-site.xml), the RM will give it one 4096 MB (2 * yarn.scheduler.minimum-allocation-mb) container.
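The rounding in this example can be modelled as follows. This is a simplified Python sketch of the behaviour the answer describes, with a hypothetical function name; it is not YARN's actual implementation, and the exact rounding rule may depend on which scheduler is configured.

```python
def normalize_request(requested_mb: int, min_allocation_mb: int = 2048) -> int:
    """Round a memory request up to the next multiple of the minimum allocation.

    min_allocation_mb stands in for yarn.scheduler.minimum-allocation-mb.
    Every request receives at least one minimum-sized unit.
    """
    if requested_mb <= 0 or min_allocation_mb <= 0:
        raise ValueError("memory sizes must be positive")
    # Ceiling division using integers only, to avoid float rounding.
    units = -(-requested_mb // min_allocation_mb)
    return units * min_allocation_mb


if __name__ == "__main__":
    print(normalize_request(2048))  # 2048
    print(normalize_request(2049))  # 4096
```

Under this model, asking for a single megabyte over a multiple of the minimum costs a whole extra unit, which is why task memory settings are usually chosen as exact multiples of the minimum allocation.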

If you have a huge MR job that asks for a 9999 MB map container, the request exceeds yarn.scheduler.maximum-allocation-mb (8192 MB), and the job will be killed with an error message.
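The upper-bound check can be sketched in Python as well. The exception class below is a hypothetical stand-in named after the InvalidResourceRequestException mentioned in the property's documentation; the real check happens inside Hadoop, and the message a MapReduce job actually reports may differ.

```python
class InvalidResourceRequest(Exception):
    """Hypothetical stand-in for Hadoop's InvalidResourceRequestException."""


def validate_request(requested_mb: int, max_allocation_mb: int = 8192) -> int:
    """Reject a container request larger than the per-container maximum.

    max_allocation_mb stands in for yarn.scheduler.maximum-allocation-mb.
    Returns the request unchanged when it is within the limit.
    """
    if requested_mb > max_allocation_mb:
        raise InvalidResourceRequest(
            f"requested {requested_mb} MB, but the maximum allocation "
            f"is {max_allocation_mb} MB"
        )
    return requested_mb


if __name__ == "__main__":
    print(validate_request(8192))  # within the limit
    try:
        validate_request(9999)
    except InvalidResourceRequest as err:
        print("rejected:", err)
```

Note that the limit applies to each individual container request, while yarn.nodemanager.resource.memory-mb limits the sum of all containers running on one node.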

