python - What is the difference between the two ways of accessing the hdf5 group in SVHN dataset?


posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

I need to read the SVHN dataset and was trying to read the filename of the first image.

I am struggling to understand the structure of HDF5, and especially the hierarchy of the SVHN dataset.

What is the difference between these two approaches to reading the name of the image?

I came across method 1 in the getName() function of this script: https://github.com/bdiesel/tensorflow-svhn/blob/master/digit_struct.py

While experimenting with the HDF5 file, I came up with method 2, which showed the same result.

import h5py

# Both of these methods read the first character of the name of the 1st
# image in the SVHN dataset.
# Note: .value only exists in h5py versions before 3.0.
f = h5py.File(path_to_svhn_dataset, 'r')

# method 1
f[f['digitStruct']['name'][0][0]].value

# method 2
f[f['digitStruct']['name'].value[0].item()].value[0][0]

The first image is the file named "1.png". Both of the approaches above give the first character of the filename as the integer equivalent of ASCII '1', i.e. 49.



1 Reply

深蓝 · Answer 1 · 2021-10-17T00:56:05+0000

First, there is a minor difference in output from your 2 methods.
Method 1: returns the full array (of the encoded file name)
Method 2: only returns the first element (character) of the array

Let's deconstruct your code to understand what you have.
The first part deals with h5py data objects.

f['digitStruct'] -> returns a h5py group object
f['digitStruct']['name'] -> returns a h5py dataset object
f['digitStruct']['name'].name -> returns the name (path) of the dataset object

Note:
The /digitStruct/name dataset contains object references. Each array entry is a pointer to another HDF5 object (in this case, another dataset). For example (spaces added to separate the outer lookup from the inner reference):
f[ f['digitStruct']['name'][0][0] ] -> returns the object referenced at [0][0]
So the outer f[ obj_ref ] dereferences the reference, much as f['some/path'] looks up an object by name.

In this case, f['digitStruct']['name'][0][0] is a reference pointing to the dataset /#refs#/b. In other words, f['digitStruct']['name'][0][0] references the same object as f['#refs#']['b'] or f['/#refs#/b'].
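To make the dereferencing step concrete, here is a minimal sketch that builds a tiny in-memory HDF5 file with the same kind of layout and then resolves the reference. The file name demo_refs.h5 and the stored values are made up for illustration; only the /digitStruct/name and /#refs#/b paths mirror the ones discussed above, and the real SVHN file is not needed to run it.

```python
import h5py
import numpy as np

# In-memory file (nothing is written to disk); name and contents are hypothetical.
f = h5py.File("demo_refs.h5", "w", driver="core", backing_store=False)

# Target dataset: the character codes of "1.png" as a column vector.
codes = np.array([[ord(ch)] for ch in "1.png"], dtype=np.uint16)
target = f.create_dataset("#refs#/b", data=codes)

# A (1, 1) dataset of object references whose only entry points at the target.
names = f.create_dataset("digitStruct/name", shape=(1, 1), dtype=h5py.ref_dtype)
names[0, 0] = target.ref

ref = f["digitStruct"]["name"][0][0]  # an h5py.Reference, not the data itself
resolved = f[ref]                     # dereference: the dataset the reference points to
resolved_name = resolved.name         # path of the referenced dataset

print(type(ref).__name__, resolved_name, resolved.shape)
```

The key point is that indexing the name dataset yields a reference object, and a second lookup through the file is what turns that reference into a dataset.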

So much for h5py object references.
Let's continue to get the data from this object reference using Method 1.

f[f['digitStruct']['name'][0][0]].value -> returns the entire /#refs#/b dataset as a NumPy array.

However, dataset.value is deprecated (and was removed entirely in h5py 3.0), so NumPy-style indexing is preferred, like this: f[f['digitStruct']['name'][0][0]][:] (to get the entire array)
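As a rough illustration of the indexing alternatives to .value, the sketch below reads a small dataset in several ways. The in-memory file demo_index.h5 and the dataset name codes are hypothetical stand-ins for the referenced SVHN dataset.

```python
import h5py
import numpy as np

# Hypothetical in-memory file holding the character codes of "1.png".
f = h5py.File("demo_index.h5", "w", driver="core", backing_store=False)
data = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)
dset = f.create_dataset("codes", data=data)

whole_slice = dset[:]    # entire dataset as a NumPy array
whole_empty = dset[()]   # same result; this form also works for scalar datasets
first_two = dset[0:2]    # only the first two rows
first_item = dset[0, 0]  # a single element

print(whole_slice.shape, first_two.ravel(), first_item)
```

Slicing also lets you read just the part you need, which matters more for large datasets than for a five-character file name.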

Note: both of these return the entire array of encoded characters. From this point on, getting the name is plain Python and NumPy functionality. Use this to return the filename as a string (.tobytes() replaces the deprecated .tostring()):
f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii')
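One caveat worth checking on your own file: decoding the raw bytes as ASCII only gives a clean string when each character occupies exactly one byte. MATLAB v7.3 files commonly store characters as 16-bit integers (an assumption here, so check the dataset's dtype), in which case the raw buffer also contains zero padding bytes. A dtype-independent sketch, using a hand-built array in place of the real dataset:

```python
import numpy as np

# Stand-in for f[ref][:]; the uint16 dtype is an assumption about the file.
codes = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)

# Convert each character code to a character, regardless of the integer width.
filename = "".join(chr(code) for code in codes.flatten())
print(filename)  # 1.png

# For comparison: the raw buffer holds 2 bytes per character for uint16 data.
print(len(codes.tobytes()))  # 10
```

Going through chr() avoids any dependence on byte width or byte order, at the cost of a Python-level loop that is negligible for short file names.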

Now let's deconstruct the object reference you used for Method 2.

f['digitStruct']['name'].value -> returns the entire /digitStruct/name dataset as a NumPy array. It has 13,068 rows of object references.

f['digitStruct']['name'].value[0] -> is the first row

f['digitStruct']['name'].value[0].item() -> copies that array element to a python scalar

So both of these point to the same object:
Method 1: f['digitStruct']['name'][0][0]
Method 2: f['digitStruct']['name'].value[0].item()
Both are the same as f['#refs#']['b'] or f['/#refs#/b'] for this example.
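The equivalence of the two reference expressions can be checked on a small stand-in file. In this sketch the in-memory file demo_same_ref.h5 is hypothetical, and [()] takes the place of .value in Method 2 so that it runs on current h5py versions.

```python
import h5py
import numpy as np

# Hypothetical in-memory file with one reference pointing at /#refs#/b.
f = h5py.File("demo_same_ref.h5", "w", driver="core", backing_store=False)
target = f.create_dataset("#refs#/b", data=np.array([[49], [46]], dtype=np.uint16))
names = f.create_dataset("digitStruct/name", shape=(1, 1), dtype=h5py.ref_dtype)
names[0, 0] = target.ref

# Method 1: index the dataset directly, so only the first row is read.
ref_1 = f["digitStruct"]["name"][0][0]
# Method 2: load the whole reference array into memory, then pick one element.
ref_2 = f["digitStruct"]["name"][()][0].item()

print(f[ref_1].name, f[ref_2].name)
```

Both references resolve to the same dataset; the practical difference is that Method 2 reads every reference in the dataset before selecting the first one.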

Like Method 1, getting the string is Python and NumPy functionality.

f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii')

Yes, object references are complicated....
My recommendation:
Extract NumPy arrays from objects using NumPy indexing instead of .value (as shown in Modified Method 1 above).

Example code for completeness. Intermediate print statements used to show what's going on.

import h5py

# Both of these methods read the name of the 1st
# image in the SVHN dataset.
# Note: Dataset.value was removed in h5py 3.0, so [()] is used below
# wherever the original methods used .value.
f = h5py.File('test_digitStruct.mat', 'r')
print(f['digitStruct'])
print(f['digitStruct']['name'])
print(f['digitStruct']['name'].name)

# method 1
print('\ntest method 1')
print(f[f['digitStruct']['name'][0][0]])
print(f[f['digitStruct']['name'][0][0]].name)
# both of these get the entire array / filename:
print(f[f['digitStruct']['name'][0][0]][()])
print(f[f['digitStruct']['name'][0][0]][:])  # same as [()] above
# .tobytes() replaces the deprecated .tostring()
print(f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii'))

# method 2
print('\ntest method 2')
print(f[f['digitStruct']['name'][()][0].item()])
print(f[f['digitStruct']['name'][()][0].item()].name)

# this only gets the first array member / character:
print(f[f['digitStruct']['name'][()][0].item()][()][0][0])
print(f[f['digitStruct']['name'][()][0].item()][()][0][0].tobytes().decode('ascii'))
# this gets the entire array / filename:
print(f[f['digitStruct']['name'][()][0].item()][:])
print(f[f['digitStruct']['name'][()][0].item()][:].tobytes().decode('ascii'))

f.close()

The output from the last 2 print statements of each method is identical:

[[ 49]
 [ 46]
 [112]
 [110]
 [103]]
1.png
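Putting the pieces together, a small helper could wrap the two-step lookup. This is only a sketch: get_name is a hypothetical function (not part of h5py or of the linked script), and the demo file below is built in memory with two made-up entries so the code runs without the SVHN download.

```python
import h5py
import numpy as np

def get_name(f, index):
    """Return the file name stored for image `index` (hypothetical helper)."""
    ref = f["digitStruct/name"][index][0]  # object reference for this image
    codes = f[ref][()]                     # dereference and read the character codes
    return "".join(chr(code) for code in codes.flatten())

# Build a stand-in file with two names, laid out like the structure described above.
f = h5py.File("demo_names.h5", "w", driver="core", backing_store=False)
file_names = ["1.png", "2.png"]
names = f.create_dataset("digitStruct/name", shape=(len(file_names), 1),
                         dtype=h5py.ref_dtype)
for i, text in enumerate(file_names):
    codes = np.array([[ord(ch)] for ch in text], dtype=np.uint16)
    names[i, 0] = f.create_dataset(f"#refs#/{i}", data=codes).ref

print([get_name(f, i) for i in range(len(names))])
```

On the real file, the same helper would be called with the handle returned by h5py.File('test_digitStruct.mat', 'r'), assuming the layout matches what the answer describes.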
