Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

C# - 2D array with more than 65535^2 elements --> Array dimensions exceeded supported range

I have a 64-bit PC with 128 GB of RAM, and I'm using C# and .NET 4.5. I have the following code:

double[,] m1 = new double[65535, 65535];
long l1 = m1.LongLength;

double[,] m2 = new double[65536, 65536]; // Array dimensions exceeded supported range
long l2 = m2.LongLength;

I'm aware of <gcAllowVeryLargeObjects enabled="true" /> and I've set it to true.

Why can a multidimensional array not have more than 4294967295 elements? I saw the following answer https://stackoverflow.com/a/2338797/7556646.

I also checked the documentation for gcAllowVeryLargeObjects and saw the following remark:

The maximum number of elements in array is UInt32.MaxValue (4294967295).

I cannot understand why this limit exists. Is there a workaround? Is it planned to remove this limit in an upcoming version of .NET?

I need the elements laid out that way in memory because I want to compute, for example, a symmetric eigenvalue decomposition using Intel MKL.

[DllImport("custom_mkl", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
internal static extern lapack_int LAPACKE_dsyevd(
    int matrix_layout, char jobz, char uplo, lapack_int n, [In, Out] double[,] a, lapack_int lda, [In, Out] double[] w);


1 Reply


Disclaimer: This one turned out way longer than expected.

Why the CLR doesn't support large arrays

There are multiple reasons why the CLR doesn't support large arrays on the managed heap.

Some of them are technical, some of them might be "paradigmatic".

This blog post goes into some of the reasons why there is a limitation. Essentially, there was a decision to limit the maximum size of (capital-O) Objects because of memory fragmentation. The cost of handling larger objects was weighed against the fact that few use cases exist[ed] that require such large objects, and those that did would, in most cases, be due to a design mistake by the programmer. And since, for the CLR, everything is an Object, this limitation also applies to arrays. To enforce this limitation, array indexers were designed with signed integers.

But once you have made sure that your program design really requires such large arrays, you are going to need a workaround.

The above-mentioned blog post also demonstrates that you can implement big arrays without going into unmanaged territory.

But as Evk has pointed out in the comments, you want to pass the array as a whole to an external function via P/Invoke. That means you'll need the array on the unmanaged heap, or it will have to be marshaled during the call. And marshaling the whole thing is a bad idea with arrays this large.

Workaround

Since the managed heap is out of the question, you'll need to allocate space on the unmanaged heap and use that space for your array.

Let's say you need 8 GB worth of space:

long size = (1L << 33);
IntPtr basePointer = System.Runtime.InteropServices.Marshal.AllocHGlobal((IntPtr)size);

Great! Now you have a region in virtual memory where you can store up to 8 GB worth of data.
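As a rough sketch of how the byte count could be derived instead of hard-coded: the helper below computes the size of a rows x cols matrix of doubles with overflow checking and frees the block even if the work in between throws. The names UnmanagedSizing, MatrixBytes and WithMatrix are made up for this illustration and are not part of any library. Under that arithmetic a 32768 x 32768 matrix is the 8 GB from above, and the 65536 x 65536 matrix from the question would need 32 GB (1L << 35).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only.
public static class UnmanagedSizing
{
    // Bytes needed for a rows x cols matrix of doubles.
    // 'checked' turns a silent 64-bit overflow into an OverflowException.
    public static long MatrixBytes(long rows, long cols)
    {
        if (rows < 0 || cols < 0) throw new ArgumentOutOfRangeException();
        return checked(rows * cols * sizeof(double));
    }

    // Allocates the block, hands it to 'use', and always frees it afterwards.
    public static void WithMatrix(long rows, long cols, Action<IntPtr> use)
    {
        IntPtr p = Marshal.AllocHGlobal(new IntPtr(MatrixBytes(rows, cols)));
        try { use(p); }
        finally { Marshal.FreeHGlobal(p); }
    }
}
```

Keeping the allocation and the FreeHGlobal call in one place is a design choice that makes it harder to leak a block of this size.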

How do I turn this into an array?

Well, there are two approaches in C#:

The "Unsafe" Approach

This will let you work with pointers, and pointers can be indexed just like arrays. (In vanilla C, an array decays to a pointer in most expressions, so the two are often used interchangeably.)

If you have a good idea of how to realize 2D arrays via pointers, then this will be the best option for you.

Here is a pointer
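A minimal sketch of the unsafe approach, assuming the project is compiled with unsafe code enabled (the /unsafe compiler switch). The class and method names are invented for this example. It shows that a double* obtained from AllocHGlobal can be indexed with a 64-bit index, which is exactly what a managed array cannot offer here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example; requires compiling with /unsafe.
public static class UnsafeSketch
{
    // Fills 'count' doubles in unmanaged memory with 0, 1, 2, ... and returns their sum.
    public static unsafe double FillAndSum(long count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        IntPtr block = Marshal.AllocHGlobal(new IntPtr(checked(count * sizeof(double))));
        try
        {
            double* a = (double*)block;      // treat the block as an array of doubles
            for (long i = 0; i < count; i++)
                a[i] = i;                    // pointer indexing accepts a long index
            double sum = 0;
            for (long i = 0; i < count; i++)
                sum += a[i];
            return sum;
        }
        finally
        {
            Marshal.FreeHGlobal(block);      // no bounds checks anywhere: that part is on you
        }
    }
}
```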

The "Marshal" Approach

Here you don't need the unsafe context; instead you have to "marshal" your data between the managed heap and the unmanaged one. You'll still have to understand pointer arithmetic.

The two main functions you'll want to use are Marshal.PtrToStructure and its reverse, Marshal.StructureToPtr. With the first you get a copy of a value type (such as a double) out of a specified position on the unmanaged heap. With the second you put a copy of a value type onto the unmanaged heap.
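To make the two calls concrete, here is a small sketch that reads and writes single doubles at a 64-bit element index. MarshalSketch and its methods are names invented for this example, and it deliberately omits bounds checking to stay short. It assumes the generic Marshal overloads are available (.NET Framework 4.5.1 or later).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example; no bounds checking, the caller must stay inside the block.
public static class MarshalSketch
{
    // Address of element 'index' in a block of doubles starting at 'head'.
    private static IntPtr ElementAddress(IntPtr head, long index)
    {
        return new IntPtr(checked(head.ToInt64() + index * sizeof(double)));
    }

    public static void Write(IntPtr head, long index, double value)
    {
        // fDeleteOld is false: the target is raw memory, not a previously marshaled structure.
        Marshal.StructureToPtr<double>(value, ElementAddress(head, index), false);
    }

    public static double Read(IntPtr head, long index)
    {
        return Marshal.PtrToStructure<double>(ElementAddress(head, index));
    }
}
```

Each access copies one value across the managed/unmanaged boundary, so this is convenient for occasional reads and writes but likely slower than raw pointers in tight loops.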

Both approaches are "unsafe" in a sense. You'll need to know your pointers.

Common pitfalls include, but are not limited to:

  • Forgetting to check bounds rigorously
  • Mixing up the size of your elements
  • Messing up the alignment
  • Mixing up what kind of 2D array you want
  • Forgetting about padding with 2D arrays
  • Forgetting to free memory
  • Forgetting that memory has already been freed and using it anyway

You'll probably want to turn your 2D array design into a 1D array design.
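One way to picture the 2D-to-1D mapping is the pair of index formulas below. This is only a sketch with invented names. Which of the two layouts applies depends on what the native library expects, and a padded layout would use the padded row or column length (the "leading dimension") in place of cols or rows.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class Index2D
{
    // Row-major: the elements of one row are adjacent (the layout C and C# double[,] use).
    public static long RowMajor(long row, long col, long cols)
    {
        return checked(row * cols + col);
    }

    // Column-major: the elements of one column are adjacent (the layout Fortran uses).
    public static long ColumnMajor(long row, long col, long rows)
    {
        return checked(col * rows + row);
    }
}
```

For the 65536 x 65536 case from the question, the last element sits at flat index 4294967295, which no longer fits in a 32-bit signed index and is why the arithmetic is done in long.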


In any case, you will want to wrap it all in a class with the appropriate checks and destructors.

Basic Example for Inspiration

What follows is a generic class that is "like" an array, based on the unmanaged heap.

Features include:

  • It has an index accessor that accepts 64 bit integers.
  • It restricts the types that T can become to value types.
  • It has bounds checking and is disposable.

Note that I don't do any type checking, so if Marshal.SizeOf fails to return the correct number, we fall into one of the pits mentioned above.

Features that you'll have to implement yourself include:

  • 2D accessor and 2D array arithmetic (depending on what the other library expects; often it's something like p = x * size + y)
  • Exposed pointer for PInvoke purposes (Or an internal call)

So use this only as an inspiration, if at all.

using System;
using static System.Runtime.InteropServices.Marshal;

public class LongArray<T> : IDisposable where T : struct {
    private IntPtr _head;
    private readonly long _capacity;
    private readonly long _bytes;
    private readonly int _elementSize;

    public LongArray(long capacity) {
        // Validate the argument, not the field (the field is still 0 here).
        if(capacity < 0) throw new ArgumentOutOfRangeException(nameof(capacity), "The capacity can not be negative");
        _elementSize = SizeOf<T>();
        _capacity = capacity;
        // checked: throw instead of silently wrapping around on overflow.
        _bytes = checked(capacity * _elementSize);

        // Note: AllocHGlobal does not zero the memory it returns.
        _head = AllocHGlobal(new IntPtr(_bytes));
    }

    public T this[long index] {
        get {
            // Copies one T out of unmanaged memory.
            return PtrToStructure<T>(_getAddress(index));
        }
        set {
            // fDeleteOld must be false: the target is raw memory,
            // not a previously marshaled structure that needs cleanup.
            StructureToPtr<T>(value, _getAddress(index), false);
        }
    }

    protected bool disposed = false;
    public void Dispose() {
        if(!disposed) {
            FreeHGlobal(_head);
            disposed = true;
        }
    }

    protected IntPtr _getAddress(long index) {
        // ObjectDisposedException(string) expects the object name, not a message.
        if(disposed) throw new ObjectDisposedException(nameof(LongArray<T>));
        if(index < 0) throw new IndexOutOfRangeException("Negative indices are not allowed");
        if(index >= _capacity) throw new IndexOutOfRangeException("Index is out of bounds of this array");
        return new IntPtr(_head.ToInt64() + index * _elementSize);
    }
}
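
The two missing features listed earlier (a 2D accessor and an exposed pointer) could look roughly like the following. This is a self-contained sketch for doubles only, with an invented class name and an assumed row-major layout. It is not the answer's LongArray<T> and has not been run against any native library.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper for illustration; row-major layout is an assumption.
public sealed class UnmanagedMatrix : IDisposable
{
    private IntPtr _head;
    public long Rows { get; }
    public long Cols { get; }

    public UnmanagedMatrix(long rows, long cols)
    {
        if (rows < 0 || cols < 0) throw new ArgumentOutOfRangeException();
        Rows = rows;
        Cols = cols;
        // checked: fail loudly if the byte count overflows 64 bits.
        _head = Marshal.AllocHGlobal(new IntPtr(checked(rows * cols * sizeof(double))));
    }

    // Raw address to hand to native code; only valid until Dispose is called.
    public IntPtr Pointer
    {
        get
        {
            if (_head == IntPtr.Zero) throw new ObjectDisposedException(nameof(UnmanagedMatrix));
            return _head;
        }
    }

    public double this[long row, long col]
    {
        get { return Marshal.PtrToStructure<double>(Address(row, col)); }
        set { Marshal.StructureToPtr<double>(value, Address(row, col), false); }
    }

    private IntPtr Address(long row, long col)
    {
        if (row < 0 || row >= Rows) throw new IndexOutOfRangeException();
        if (col < 0 || col >= Cols) throw new IndexOutOfRangeException();
        long index = row * Cols + col;   // cannot overflow after the checks above
        return new IntPtr(Pointer.ToInt64() + index * sizeof(double));
    }

    public void Dispose()
    {
        if (_head != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_head);
            _head = IntPtr.Zero;
        }
    }
}
```

With something like this, the P/Invoke declaration would take an IntPtr for the matrix parameter instead of [In, Out] double[,], and the call site would pass matrix.Pointer, so nothing has to be copied or marshaled during the call.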
