First, there is a minor difference in output from your 2 methods.
Method 1: returns the full array (of the encoded file name)
Method 2: only returns the first element (character) of the array
Let's deconstruct your code to understand what you have.
The first part deals with h5py data objects.
f['digitStruct']
-> returns an h5py group object
f['digitStruct']['name']
-> returns an h5py dataset object
f['digitStruct']['name'].name
-> returns the name (path) of the dataset object
Note:
The /digitStruct/name dataset contains "Object References". Each array entry is a pointer to another h5py object (in this case another dataset).
For example (spaces added to set the inner object reference apart from the outer f[ ] lookup):
f[ f['digitStruct']['name'][0][0] ]
-> returns the object referenced at [0][0]
So, the outer f[ obj_ref ] dereferences the reference, just as f[ 'path' ] looks up an object by its name.
In the case of f['digitStruct']['name'][0][0], this is an object reference pointing to dataset /#refs#/b.
In other words, f['digitStruct']['name'][0][0] references the same object as:
f['#refs#']['b']
or f['/#refs#/b']
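To see the dereferencing step without the real SVHN file, here is a minimal sketch that builds a tiny in-memory file with the same shape of layout. The file name ref_demo.h5 and the contents of the datasets are made up for illustration; only the group and dataset paths mirror the ones discussed above.

```python
import h5py
import numpy as np

# In-memory file: driver='core' with backing_store=False writes nothing to disk.
# 'ref_demo.h5' is a hypothetical name used only for this sketch.
with h5py.File('ref_demo.h5', 'w', driver='core', backing_store=False) as f:
    # Target dataset: character codes, one per row (illustrative data).
    codes = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)
    target = f.create_group('#refs#').create_dataset('b', data=codes)

    # A 1x1 dataset of object references, shaped like /digitStruct/name.
    names = f.create_group('digitStruct').create_dataset(
        'name', shape=(1, 1), dtype=h5py.ref_dtype)
    names[0, 0] = target.ref

    obj_ref = f['digitStruct']['name'][0][0]  # an h5py object reference
    referenced = f[obj_ref]                   # dereference it to get the dataset

    path = referenced.name                    # path of the referenced dataset
    same = referenced == f['/#refs#/b']       # compare with a lookup by path

print(path)
print(same)
```

The point of the sketch is that f[obj_ref] and f['/#refs#/b'] land on the same dataset; the reference is just another way of addressing it.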
So much for h5py object references.
Let's continue to get the data from this object reference using Method 1.
f[f['digitStruct']['name'][0][0]].value
-> returns the entire /#refs#/b
dataset as a NumPy array.
However, dataset.value is deprecated (and was removed in h5py 3.0), so NumPy-style indexing is preferred, like this:
f[f['digitStruct']['name'][0][0]][:]
(to get the entire array)
Note: both of these return the entire array of encoded characters.
At this point, getting the name is plain Python and NumPy functionality.
Use this to return the filename as a string (tobytes() is the current name for the deprecated tostring()):
f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii')
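As an alternative to decoding raw bytes, the character codes can be converted one at a time. This is a sketch, not the only way to do it: codes_to_str is a hypothetical helper name, and the motivation rests on the assumption that MATLAB v7.3 files store character data as 16-bit integers, in which case the raw bytes would carry padding zeros between characters.

```python
def codes_to_str(codes):
    """Join a 2-D array (or nested list) of character codes into a string."""
    # int() accepts both plain Python ints and NumPy integer scalars.
    return ''.join(chr(int(code)) for row in codes for code in row)


# The codes from the example dataset: one character code per row.
example = [[49], [46], [112], [110], [103]]
print(codes_to_str(example))
```

For the codes shown, this yields 1.png. With a real file, the argument would be the array returned by f[obj_ref][:], and the result does not depend on the storage width of the codes.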
Now let's deconstruct the object reference you used for Method 2.
f['digitStruct']['name'].value
-> returns the entire /digitStruct/name
dataset as a NumPy array.
It has 13,068 rows of object references.
f['digitStruct']['name'].value[0]
-> is the first row
f['digitStruct']['name'].value[0].item()
-> copies that array element to a Python scalar
So both of these expressions give the same object reference:
Method 1: f['digitStruct']['name'][0][0]
Method 2: f['digitStruct']['name'].value[0].item()
Both point to the same dataset as f['#refs#']['b']
or f['/#refs#/b']
in this example.
Like Method 1, getting the string is plain Python and NumPy functionality.
f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii')
Yes, object references are complicated...
My recommendation:
Extract NumPy arrays from objects using NumPy indexing instead of .value (as in the [:] variant of Method 1 above).
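A small sketch of what that indexing looks like in practice, again on a throwaway in-memory file rather than the SVHN data. The file name indexing_demo.h5 and the dataset name codes are placeholders chosen for this example.

```python
import h5py
import numpy as np

# Hypothetical in-memory file; nothing is written to disk.
with h5py.File('indexing_demo.h5', 'w', driver='core', backing_store=False) as f:
    data = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)
    dset = f.create_dataset('codes', data=data)

    whole = dset[:]        # entire dataset as a NumPy array
    also_whole = dset[()]  # equivalent; this form also works for scalar datasets
    first = dset[0, 0]     # a single element, read without loading the rest

print(whole.shape)
print(first)
```

Both dset[:] and dset[()] return an ordinary NumPy array, so everything after that point is regular NumPy code with no h5py specifics.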
Example code for completeness. Intermediate print statements are used to show what's going on.
import h5py
# Both of these methods read the name of the 1st
# image in the SVHN dataset.
# Note: .value is kept to mirror the original methods. It was removed
# in h5py 3.0, where [()] or [:] has to be used instead.
f = h5py.File('test_digitStruct.mat', 'r')
print(f['digitStruct'])
print(f['digitStruct']['name'])
print(f['digitStruct']['name'].name)
# method 1
print('\ntest method 1')
print(f[f['digitStruct']['name'][0][0]])
print(f[f['digitStruct']['name'][0][0]].name)
# both of these get the entire array / filename:
print(f[f['digitStruct']['name'][0][0]].value)
print(f[f['digitStruct']['name'][0][0]][:])  # same as .value above
print(f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii'))
# method 2
print('\ntest method 2')
print(f[f['digitStruct']['name'].value[0].item()])
print(f[f['digitStruct']['name'].value[0].item()].name)
# this only gets the first array member / character:
print(f[f['digitStruct']['name'].value[0].item()].value[0][0])
print(f[f['digitStruct']['name'].value[0].item()].value[0][0].tobytes().decode('ascii'))
# this gets the entire array / filename:
print(f[f['digitStruct']['name'].value[0].item()][:])
print(f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii'))
f.close()
Output from last 2 print statements for each method is identical:
[[ 49]
 [ 46]
 [112]
 [110]
 [103]]
1.png