Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Pass hdf5 file to h5py as binary blob / string?

How can I bypass disk I/O in h5py? Currently I have to do something like this:

msg = socket.recv()
with open("tmp.hdf5", "wb") as fp:
    fp.write(msg)
f = h5py.File("tmp.hdf5", "r+")  # "r+" so the file can be altered

... # alter the file

f.close()  # flush the changes to disk before reading the bytes back
with open("tmp.hdf5", "rb") as fp:
    msg = fp.read()
socket.send(msg)

I want to do something like this:

msg = socket.recv()
f = h5py.File(msg, driver='core')
... # alter the file
msg = f.toString()
socket.send(msg)

My issue here is speed: disk I/O is too big a bottleneck. Is there a quick and easy way to create h5py File objects from strings, and subsequently extract the file as a string? I'm willing to go with something like Cython if it comes to it...

1 Reply

How can I bypass disk I/O in h5py?

Here is a working example:

"""HDF5 in memory file reading example."""

try:
    import contextlib
    import os
    import tempfile

    import h5py

    hdf5_data = (
        b'\x89HDF\r\n\x1a\n\x00\x00\x00\x00\x00\x08\x08\x00\x04\x00\x10\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff'
        b'\xff\xff\xff|\x05\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff'
        b'\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00`\x00\x00\x00\x00\x00\x00'
        b'\x00\x01\x00\x00\x00\x00\x00\x00\x00\x88\x00\x00\x00\x00\x00\x00\x00'
        b'\xa8\x02\x00\x00\x00\x00\x00\x00\x01\x00\x01\x00\x01\x00\x00\x00\x18'
        b'\x00\x00\x00\x00\x00\x00\x00\x11\x00\x10\x00\x00\x00\x00\x00\x88\x00'
        b'\x00\x00\x00\x00\x00\x00\xa8\x02\x00\x00\x00\x00\x00\x00TREE\x00\x00'
        b'\x01\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
        b'\xff\x00\x00\x00\x00\x00\x00\x00\x000\x04\x00\x00\x00\x00\x00\x00'
        b'\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00HEAP\x00\x00\x00\x00X'
        b'\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\xc8\x02'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00test\x00\x00'
        b'\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00H\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x01\x00\x06\x00\x01\x00\x00\x00\x00\x01\x00'
        b'\x00\x00\x00\x00\x00\x01\x00(\x00\x00\x00\x00\x00\x01\x02\x01\x00'
        b'\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00'
        b'\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00'
        b'\x00\x00\x03\x00\x10\x00\x01\x00\x00\x00\x10\x08\x00\x00\x04\x00\x00'
        b'\x00\x00\x00 \x00\x00\x00\x00\x00\x05\x00\x08\x00\x01\x00\x00\x00'
        b'\x02\x02\x02\x01\x00\x00\x00\x00\x08\x00\x18\x00\x01\x00\x00\x00\x03'
        b'\x01x\x05\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x12\x00\x08\x00\x00\x00\x00\x00\x01\x00\x00'
        b'\x00\xed\xf3\xa1Y\x00\x00p\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00SNOD\x01\x00\x01\x00\x08\x00\x00\x00\x00\x00\x00'
        b'\x00 \x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x00\x00\x00\x00\x00\x00\x0090\x00\x00'
    )

    file_access_property_list = h5py.h5p.create(h5py.h5p.FILE_ACCESS)
    file_access_property_list.set_fapl_core(backing_store=False)
    file_access_property_list.set_file_image(hdf5_data)

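    file_id_args = {
        'fapl': file_access_property_list,
        'flags': h5py.h5f.ACC_RDONLY,
        # The name is only a label for the in-memory file. Note that
        # tempfile._get_candidate_names() is a private helper.
        'name': next(tempfile._get_candidate_names()).encode(),
    }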

    h5_file_args = {'backing_store': False, 'driver': 'core', 'mode': 'r'}

    with contextlib.closing(h5py.h5f.open(**file_id_args)) as file_id:
        with h5py.File(file_id, **h5_file_args) as h5_file:
            assert h5_file['test'][0] == 12345

    assert not os.path.exists(file_id_args['name'])

except:
    print('Something went wrong!')
    raise

else:
    print('It works!! You can read HDF5 in memory!')

This seems like a hack, but it is what I found when digging around the h5py GitHub repository.

As you can see from assert not os.path.exists(file_id_args['name']), although I passed a file name (which should not exist in the file system), no file was actually created.
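
The example above only reads an image. For the other half of the question (getting the file back out as bytes), here is a minimal sketch, assuming your h5py build exposes `get_file_image()` on the low-level file id (it needs HDF5 1.8.9 or later). The names `build_image`, `read_image` and `unused.h5` are made up for illustration; the file name is only a label and should not end up on disk when `backing_store=False`.

```python
import h5py


def build_image() -> bytes:
    """Create a small HDF5 file in memory and return its raw bytes."""
    # "unused.h5" is a hypothetical label; with backing_store=False the core
    # driver is not expected to create it on disk.
    with h5py.File("unused.h5", "w", driver="core", backing_store=False) as f:
        f.create_dataset("test", data=[12345], dtype="int32")
        f.flush()  # make sure the in-memory image is up to date
        return f.id.get_file_image()


def read_image(image: bytes) -> int:
    """Open raw HDF5 bytes with the core driver, as in the example above."""
    fapl = h5py.h5p.create(h5py.h5p.FILE_ACCESS)
    fapl.set_fapl_core(backing_store=False)
    fapl.set_file_image(image)
    file_id = h5py.h5f.open(b"unused.h5", flags=h5py.h5f.ACC_RDONLY, fapl=fapl)
    with h5py.File(file_id) as f:
        return int(f["test"][0])


image = build_image()
print(len(image), read_image(image))
```

The bytes returned by `build_image()` are what you would pass to `socket.send()`.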

If anyone knows a less hacky way to do this, or how to simplify my code, please edit this response or leave a comment.
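
One possibly less hacky route, as far as I know: newer h5py releases (2.9 and later, I believe) accept a Python file-like object in place of a file name, so an `io.BytesIO` can stand in for the temporary file. The sketch below assumes that feature is available in your version. `make_blob` and `alter_in_memory` are hypothetical helper names, and the change made to the dataset is just a placeholder for whatever you actually alter.

```python
import io

import h5py


def make_blob() -> bytes:
    """Build a tiny HDF5 file in memory (stands in for socket.recv())."""
    buffer = io.BytesIO()
    with h5py.File(buffer, "w") as h5_file:
        h5_file.create_dataset("test", data=[12345])
    return buffer.getvalue()


def alter_in_memory(blob: bytes) -> bytes:
    """Open the HDF5 bytes, modify the file, and return the new bytes."""
    buffer = io.BytesIO(blob)
    with h5py.File(buffer, "r+") as h5_file:
        h5_file["test"][0] = 54321  # placeholder for the real changes
    # h5py leaves the BytesIO open, so its contents can still be read here.
    return buffer.getvalue()


msg = alter_in_memory(make_blob())
```

With this approach the receive/alter/send loop from the question becomes `socket.send(alter_in_memory(socket.recv()))`. I have not compared its speed with the core-driver approach.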
