python - PyTorch memory model: "torch.from_numpy()" vs "torch.Tensor()"

Question

Welcome To Ask or Share your Answers For Others

python - PyTorch memory model: "torch.from_numpy()" vs "torch.Tensor()"

posted Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - PyTorch memory model: "torch.from_numpy()" vs "torch.Tensor()"

I'm trying to have an in-depth understanding of how PyTorch Tensor memory model works.

# input numpy array
In [91]: arr = np.arange(10, dtype=float32).reshape(5, 2)

# input tensors in two different ways
In [92]: t1, t2 = torch.Tensor(arr), torch.from_numpy(arr)

# their types
In [93]: type(arr), type(t1), type(t2)
Out[93]: (numpy.ndarray, torch.FloatTensor, torch.FloatTensor)

# ndarray 
In [94]: arr
Out[94]: 
array([[ 0.,  1.],
       [ 2.,  3.],
       [ 4.,  5.],
       [ 6.,  7.],
       [ 8.,  9.]], dtype=float32)

I know that PyTorch tensors share the memory buffer of NumPy ndarrays. Thus, changing one will be reflected in the other. So, here I'm slicing and updating some values in the Tensor t2

In [98]: t2[:, 1] = 23.0

And as expected, it's updated in t2 and arr since they share the same memory buffer.

In [99]: t2
Out[99]: 

  0  23
  2  23
  4  23
  6  23
  8  23
[torch.FloatTensor of size 5x2]


In [101]: arr
Out[101]: 
array([[  0.,  23.],
       [  2.,  23.],
       [  4.,  23.],
       [  6.,  23.],
       [  8.,  23.]], dtype=float32)

But, t1 is also updated. Remember that t1 was constructed using torch.Tensor() whereas t2 was constructed using torch.from_numpy()

In [100]: t1
Out[100]: 

  0  23
  2  23
  4  23
  6  23
  8  23
[torch.FloatTensor of size 5x2]

So, no matter whether we use torch.from_numpy() or torch.Tensor() to construct a tensor from an ndarray, all such tensors and ndarrays share the same memory buffer.

Based on this understanding, my question is why does a dedicated function torch.from_numpy() exists when simply torch.Tensor() can do the job?

I looked at the PyTorch documentation but it doesn't mention anything about this? Any ideas/suggestions?

question from:https://stackoverflow.com/questions/48482787/pytorch-memory-model-torch-from-numpy-vs-torch-tensor

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-06T05:37:14+0000

from_numpy() automatically inherits input array dtype. On the other hand, torch.Tensor is an alias for torch.FloatTensor.

Therefore, if you pass int64 array to torch.Tensor, output tensor is float tensor and they wouldn't share the storage. torch.from_numpy gives you torch.LongTensor as expected.

a = np.arange(10)
ft = torch.Tensor(a)  # same as torch.FloatTensor
it = torch.from_numpy(a)

a.dtype  # == dtype('int64')
ft.dtype  # == torch.float32
it.dtype  # == torch.int64

Categories

python - PyTorch memory model: "torch.from_numpy()" vs "torch.Tensor()"

python - PyTorch memory model: "torch.from_numpy()" vs "torch.Tensor()"

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags