Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - multiprocessing module shows the same memory usage for each child process as for the main process

I am using Python's multiprocessing module and am confused about its memory usage.

I first load some data in the main process and store it in global variables. According to the top command, this takes around 16 GB of main memory.

The child processes then work on this data, each processing it in a different way.

The multiprocessing itself works, i.e. every process shows its own CPU utilization, but every process also shows 16 GB of memory. Why is that? Shouldn't they all use the same memory, since the global variables are passed by reference? Any thoughts would be appreciated.

The output of the top command is as follows:

  PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
13908 admin  20  0 16.7g  16g  848 R 100.0 17.3   0:32.92 python
13429 admin  20  0 16.7g  16g 3336 S   0.0 17.3  15:06.97 python
13910 admin  20  0 16.7g  16g  848 R 100.3 17.3   0:32.94 python
13911 admin  20  0 16.7g  16g  840 R 100.0 17.3   0:33.02 python
13912 admin  20  0 16.7g  16g  836 R  99.6 17.3   0:33.00 python
13907 admin  20  0 16.7g  16g  796 R 100.0 17.3   0:33.06 python
13909 admin  20  0 16.7g  16g  796 R  99.6 17.3   0:32.93 python


1 Reply


Each process created by the multiprocessing module runs in its own address space. When the workers are created with fork (the behavior the top output above reflects), each new process starts out as an exact duplicate of the original (almost; see the footnote), and from then on its memory is logically independent of the original's. Each one therefore reports the same virtual size (16.7 GB) as the original.

The underlying physical pages are shared as much as possible, using "copy-on-write". As the copies run and change their virtual memory, the kernel copies each affected physical page only when it is first written to. Memory that is never written to stays shared between all the copies. top reports those shared pages in the RES column of every process that maps them, so the per-process figures cannot simply be added up. Even though each process appears to be using 16 GB of RAM, it is not really. If each process writes to most of the 16 GB of data, though, they will all end up with separate copies and use much more physical RAM.
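A minimal sketch of the behavior described above, assuming a Unix system where the "fork" start method is available. The names DATA, worker and run_demo are made up for illustration, and the list is a small stand-in for the 16 GB of globals. The child sees the parent's data without any explicit copy, and a write made in the child never shows up in the parent:

```python
import multiprocessing as mp

# Hypothetical stand-in for the large module-level data in the question.
DATA = list(range(100_000))

def worker(conn):
    # The forked child starts with the parent's DATA; nothing was passed explicitly.
    total_before_write = sum(DATA)
    # This write lands in the child's private copy of the affected page(s).
    DATA[0] = -1
    conn.send((total_before_write, DATA[0]))
    conn.close()

def run_demo():
    ctx = mp.get_context("fork")  # assumption: Unix; "fork" does not exist on Windows
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=worker, args=(child_conn,))
    proc.start()
    child_total, child_first = parent_conn.recv()
    proc.join()
    # The parent's DATA[0] is still 0: the child's write was never shared.
    return child_total, child_first, DATA[0]

if __name__ == "__main__":
    print(run_demo())
```

One caveat for CPython specifically: reading an object updates its reference count, which is itself a write to the page holding the object, so even "read-only" workers can cause some pages to be copied over time.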

If you want the processes to see each other's modifications, the multiprocessing module offers ways of sharing data explicitly (see the "shared memory" section in http://docs.python.org/library/multiprocessing.html). In that case you also need to think about how the locking works; see the documentation.
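As a rough illustration of explicit sharing with locking, here is a sketch that uses multiprocessing.Value, one of the shared-memory objects the module provides. The function names and the worker and iteration counts are arbitrary choices for the example, and the "fork" start method is again an assumption:

```python
import multiprocessing as mp

def bump(counter, times):
    for _ in range(times):
        # "+=" is a read followed by a write, so it is done under the lock.
        with counter.get_lock():
            counter.value += 1

def run_counter_demo(workers=4, times=1000):
    ctx = mp.get_context("fork")  # assumption: Unix, fork start method
    counter = ctx.Value("i", 0)   # one C int in shared memory, with its own lock
    procs = [ctx.Process(target=bump, args=(counter, times)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Unlike an ordinary global, the increments made by the children are visible here.
    return counter.value

if __name__ == "__main__":
    print(run_counter_demo())
```

Without the lock, two processes could read the same old value and both write back old + 1, losing an increment. multiprocessing.Array works the same way for a fixed-size block of values.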


Footnote: there is one tiny difference between the original and the clone after a fork or clone system call: the original gets back the ID of the clone, and the clone gets back zero.
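The return-value difference from the footnote can be seen directly with os.fork, which is Unix only. This is just a sketch; fork_once and the exit code 7 are arbitrary choices for the example:

```python
import os

def fork_once():
    pid = os.fork()
    if pid == 0:
        # Child: fork returned zero. Exit immediately with a recognizable code.
        os._exit(7)
    # Parent: fork returned the child's process ID.
    waited_pid, status = os.waitpid(pid, 0)
    return pid, waited_pid, os.WEXITSTATUS(status)

if __name__ == "__main__":
    print(fork_once())
```

The parent uses the returned ID to wait for the child, while the child only knows it is the child because it saw zero.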
