Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python 3.x - Saving a NumPy structured array to a *.mat file

I am using numpy.loadtxt to generate a structured NumPy array from a CSV data file, and I would like to save it to a MAT file for colleagues who are more familiar with MATLAB than Python.

Sample case:

import numpy as np
import scipy.io

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])
scipy.io.savemat('test.mat', mydata)

When I attempt to use scipy.io.savemat on this array, the following error is thrown:

Traceback (most recent call last):
  File "C:/Project Data/General Python/test.py", line 6, in <module>
    scipy.io.savemat('test.mat', mydata)
  File "C:\python35\lib\site-packages\scipy\io\matlab\mio.py", line 210, in savemat
    MW.put_variables(mdict)
  File "C:\python35\lib\site-packages\scipy\io\matlab\mio5.py", line 831, in put_variables
    for name, var in mdict.items():
AttributeError: 'numpy.ndarray' object has no attribute 'items'

I'm a Python novice (at best), but I'm assuming this is because savemat is set up to handle dicts and the structure of Numpy's structured arrays is not compatible.

I can get around this error by pulling my data into a dict:

tmp = {}
for varname in mydata.dtype.names:
    tmp[varname] = mydata[varname]

scipy.io.savemat('test.mat', tmp)

Which loads into MATLAB fine:

>> mydata = load('test.mat')

mydata = 

    foo: [1 2]
    bar: [1 2]

But this seems like a very inefficient method since I'm duplicating the data in memory. Is there a smarter way to accomplish this?


1 Reply

You can do scipy.io.savemat('test.mat', {'mydata': mydata}).

This creates a struct mydata with fields foo and bar in the file.
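
As a quick sanity check without MATLAB at hand, the file can be read back with scipy.io.loadmat. This is only a sketch: the temporary directory is my own choice so the example does not write into the working directory, and the question's 'test.mat' path would work just as well.

```python
import os
import tempfile

import numpy as np
import scipy.io

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])

# savemat expects a dict that maps MATLAB variable names to arrays,
# so the structured array is wrapped under the name 'mydata'.
# The temp directory is an assumption made for this example only.
path = os.path.join(tempfile.mkdtemp(), 'test.mat')
scipy.io.savemat(path, {'mydata': mydata})

# Read the file back and inspect the field names of the stored variable.
loaded = scipy.io.loadmat(path)
field_names = loaded['mydata'].dtype.names
print(field_names)
```

If the field names printed here match those of the original dtype, the struct layout made it into the file.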

Alternatively, you can pack your loop in a dict comprehension:

tmp = {varname: mydata[varname] for varname in mydata.dtype.names}

I don't think creating a temporary dictionary duplicates data in memory, because Python generally only stores references, and NumPy in particular tries to create views into the original data whenever possible. Indexing a structured array by a single field name, as mydata[varname] does, returns such a view.
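
If you want to check this yourself, np.shares_memory reports whether two arrays overlap in memory. A small sketch using the sample array from the question (the variable name shares is just for illustration):

```python
import numpy as np

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])

# Build the dict of per-field arrays exactly as in the comprehension above.
tmp = {varname: mydata[varname] for varname in mydata.dtype.names}

# For each field, ask NumPy whether it overlaps the original array's buffer.
shares = {name: np.shares_memory(col, mydata) for name, col in tmp.items()}
print(shares)

# Writing through the dict entry also changes the original array,
# which is what you would expect from a view rather than a copy.
tmp['foo'][0] = 99
print(mydata['foo'][0])
```

This only covers the dictionary itself. It says nothing about whatever buffers savemat may allocate internally while writing the file.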

