Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Tensorflow loading unlabeled local data

Let's say I am training an autoencoder, so I need to define both the input dataset and the target output. For that I need a dataset that is just images (no labels).

I've tried using flow_from_directory(), but it assigns a class to the dataset, and when passed into training, it will collide with the target data, producing an error.

So I guess what I need is to convert my local images into a dataset with a structure like tensorflow_datasets.mnist.

Folder structure:

/data
  /low
    -0.png
    -1.png
    -...
  /high
    -0.png
    -1.png
    -...

What I've tried:

from tensorflow import keras

low_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale=1/255.0,
    validation_split=0.2
)

# when path is directly to the image folder - no images found
# when path is to parent folder, specifying which folder to use - it assigns labels too
train_low_iterator = low_generator.flow_from_directory(
    # 'path to parent directory'
    'path to directory',
    target_size=(480, 270),
    batch_size=10,
    class_mode='input',
    subset='training',
    # add this when path is to parent
    # classes=['low']
)

validation_low_iterator = low_generator.flow_from_directory(
    'same as above',
    target_size=(480, 270),
    batch_size=10,
    class_mode='input',
    subset='validation',
    # same as above
    classes=['low']
)

# high_generator, train_high_iterator and validation_high_iterator
# are defined analogously, using the 'high' folder

Class_mode None

The source code says that if class_mode is None, the iterator won't yield labels.

But none of these examples worked (same issue as before: either nothing is found, or labels are yielded again):

iterator = generator.flow_from_directory(
    'parent_path',
    class_mode=None,
    classes=['something']
)
iterator = generator.flow_from_directory(
    'parent_path',
    classes=['something']
)
iterator = generator.flow_from_directory(
    'direct_path',
    class_mode=None
)
iterator = generator.flow_from_directory(
    'direct_path'
)

I've also tried image_dataset_from_directory():

train_low_dataset = keras.utils.image_dataset_from_directory(
    'path/low',
    labels = None,
    label_mode = None,
    color_mode = 'rgb',
    batch_size = 32,
    image_size = (480, 270),
    shuffle = False,
    validation_split = 0.2,
    subset = 'training'
)

This loads all the data and returns a dataset, but training fails right at the start with:

    raise ValueError("'y' argument is not supported when using "
ValueError: 'y' argument is not supported when using python generator as input.

I'm not able to resolve this right now, since I need both input and target data, for both training and validation.
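The error above comes from passing a second dataset as y: when the input to fit() is a dataset or generator, it has to yield (input, target) pairs by itself. One possible way to get there, sketched here rather than taken from the original post, is to zip the two label-free datasets with tf.data.Dataset.zip. The helper name paired_image_dataset and its defaults are made up for illustration, and the sketch assumes that both folders hold the same file names and that shuffle=False lists them in the same order in both folders.

```python
def paired_image_dataset(low_dir, high_dir, subset, image_size=(480, 270),
                         batch_size=8, validation_split=0.2):
    """Return a tf.data.Dataset that yields (low_batch, high_batch) tuples."""
    # Imported inside the function so the helper can be defined
    # even where TensorFlow is not installed.
    import tensorflow as tf

    def load(directory):
        # labels=None yields image batches only; shuffle=False keeps the
        # file order deterministic so the two folders stay aligned
        # (assumption: identical file names in both folders).
        return tf.keras.utils.image_dataset_from_directory(
            directory,
            labels=None,
            label_mode=None,
            color_mode="rgb",
            batch_size=batch_size,
            image_size=image_size,
            shuffle=False,
            validation_split=validation_split,
            subset=subset,
        )

    pairs = tf.data.Dataset.zip((load(low_dir), load(high_dir)))
    # Scale input and target to [0, 1], like rescale=1/255.0 in the question.
    return pairs.map(lambda low, high: (low / 255.0, high / 255.0))


# Usage (paths and `model` as in the question; note that no y is passed):
# train_ds = paired_image_dataset("path/low", "path/high", subset="training")
# val_ds = paired_image_dataset("path/low", "path/high", subset="validation")
# model.fit(train_ds, epochs=15, validation_data=val_ds)
```

Because the pairs are only aligned while both folders are read in the same order, any shuffling would have to be applied to the zipped dataset, not to the two inputs separately.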

Training

model.fit(
    train_low_iterator, train_high_iterator,
    epochs=15,
    batch_size=8,
    shuffle=True,
    validation_data=(validation_low_iterator, validation_high_iterator)
)

1 Reply

I tried to write a custom iterator function (a fancy for loop with yield at the end), but without success. I'll retry in the future and update this answer if I get it working properly.
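For reference, such a "for loop with yield" could look roughly like the sketch below. It is not the attempt mentioned above, only an illustration: the name paired_batches and the load_batch parameter are hypothetical, and it assumes that matching low/high images share the same file name.

```python
import os


def paired_batches(low_dir, high_dir, batch_size, load_batch):
    """Yield (input_batch, target_batch) tuples forever.

    `load_batch` is a caller-supplied function (an assumption of this
    sketch) that turns a list of file paths into one batch.
    """
    # Pair files by name so that low/0.png is matched with high/0.png.
    names = sorted(set(os.listdir(low_dir)) & set(os.listdir(high_dir)))
    if not names:
        raise ValueError("no file names are common to both folders")
    # Loop forever so several epochs can draw from the same generator;
    # fit() then needs steps_per_epoch to know where an epoch ends.
    while True:
        for start in range(0, len(names), batch_size):
            chunk = names[start:start + batch_size]
            low = load_batch([os.path.join(low_dir, n) for n in chunk])
            high = load_batch([os.path.join(high_dir, n) for n in chunk])
            yield low, high


# Usage with OpenCV and NumPy (num_batches = ceil(image count / batch size)):
# load = lambda paths: np.array([cv2.imread(p) / 255 for p in paths])
# model.fit(paired_batches("/data/low", "/data/high", 8, load),
#           steps_per_epoch=num_batches, epochs=10)
```

Yielding the input and the target together is what lets fit() be called without a separate y argument.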

What did work, however, was loading the images with a list comprehension and converting the result into a NumPy array.

Folder structure as before:

/data
  /low
    -0.png
    -1.png
    -...
  /high
    -0.png
    -1.png
    -...

Loading the data

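import os
import cv2
import numpy as np

# os.scandir() yields entries in arbitrary order, so sort by name to keep the low/high pairs aligned
low = np.array([cv2.imread(f.path) / 255 for f in sorted(os.scandir("/data/low"), key=lambda f: f.name)])
high = np.array([cv2.imread(f.path) / 255 for f in sorted(os.scandir("/data/high"), key=lambda f: f.name)])

# first 205 images for training, the rest for validation
train_low = low[:205]
validate_low = low[205:]
train_high = high[:205]
validate_high = high[205:]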
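The hard-coded index 205 only fits one particular dataset size. A minimal sketch of deriving the cut from a fraction instead, in the spirit of validation_split=0.2 from the question, is shown below; the helper name split_train_validation is made up for illustration, and it works on anything that supports len() and slicing, including NumPy arrays.

```python
def split_train_validation(items, validation_split=0.2):
    """Split a sequence or array into (train, validation) parts."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    # The last `validation_split` share of the items is held out.
    cut = len(items) - int(len(items) * validation_split)
    return items[:cut], items[cut:]


# Usage with the arrays loaded above:
# train_low, validate_low = split_train_validation(low)
# train_high, validate_high = split_train_validation(high)
```

Applying the same fraction to both arrays keeps the input and target splits aligned, as long as the two arrays have the same length and order.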

Training

model.fit(x=train_low, y=train_high,
    epochs=10,
    batch_size=1,
    shuffle=True,
    validation_data=(validate_low, validate_high),
)
