Reading from Numpy's policy for releasing memory it seems like numpy
does not have any special handling of memory allocation/deallocation. It simply calls free()
when the reference count goes to zero. In fact it's pretty easy to replicate the issue with any built-in python object. The problem lies at the OS level.
Nathaniel Smith has written an explanation of what is happening in one of his replies in the linked thread:
In general, processes can request memory from the OS, but they cannot
give it back. At the C level, if you call free()
, then what actually
happens is that the memory management library in your process makes a
note for itself that that memory is not used, and may return it from a
future malloc()
, but from the OS's point of view it is still
"allocated". (And python uses another similar system on top for
malloc()
/free()
, but this doesn't really change anything.) So the OS
memory usage you see is generally a "high water mark", the maximum
amount of memory that your process ever needed.
The exception is that for large single allocations (e.g. if you create
a multi-megabyte array), a different mechanism is used. Such large
memory allocations can be released back to the OS. So it might
specifically be the non-numpy
parts of your program that are producing
the issues you see.
So, it seems like there is no general solution to the problem .Allocating many small objects will lead to a "high memory usage" as profiled by the tools, even thou it will be reused when needed, while allocating big objects wont show big memory usage after deallocation because memory is reclaimed by the OS.
You can verify this allocating built-in python objects:
In [1]: a = [[0] * 100 for _ in range(1000000)]
In [2]: del a
After this code I can see that memory is not reclaimed, while doing:
In [1]: a = [[0] * 10000 for _ in range(10000)]
In [2]: del a
the memory is reclaimed.
To avoid memory problems you should either allocate big arrays and work with them(maybe use views to "simulate" small arrays?), or try to avoid having many small arrays at the same time. If you have some loop that creates small objects you might explicitly deallocate objects not needed at every iteration instead of doing this only at the end.
I believe Python Memory Management gives good insights on how memory is managed in python. Note that, on top of the "OS problem", python adds another layer to manage memory arenas, which can contribute to high memory usage with small objects.