
Deep learning - two ways to load training data

2021-09-15 03:48:20 Catch wind

Mode one: load all pictures into memory

1. Advantages:
There is no need to organize the data into per-label folders beforehand.
2. Shortcomings:
High memory requirements. If the dataset is too large to fit in memory, the program (or even the system) can crash — e.g. 25,000 RGB images at 200×200 stored as float32 occupy roughly 25,000 × 200 × 200 × 3 × 4 bytes ≈ 11 GB.
3. Applicable file layout:
All pictures sit in a single folder, with the label encoded in the filename (e.g. cat.0.jpg, dog.0.jpg); they are not grouped into per-label folders.

4. Code implementation

  1. Define the data folder paths
TRAIN_DATA_PATH = 'E:/mldata/dogvscat/data/train/'
TEST_DATA_PATH = 'E:/mldata/dogvscat/data/test1/'
  2. Read the pictures and build the dataset
import os
import numpy as np
import pandas as pd
from keras.preprocessing import image

imgs_per_cat = 1000  # number of images per class (not used below)
image_size = (200, 200)
labels = []
train_images = []
for item in os.listdir(TRAIN_DATA_PATH):
    print("train----" + str(item))
    # Load each image at a fixed size, convert it to an array, and scale to [0, 1]
    img = image.load_img(os.path.join(TRAIN_DATA_PATH, str(item)), target_size=image_size)
    img = image.img_to_array(img)
    img = img / 255.
    train_images.append(img)
    # Filenames look like 'cat.0.jpg' / 'dog.0.jpg', so the prefix is the label
    labels.append(item.split('.')[0])
X = np.array(train_images)

# Option A: integer labels (0 = cat, 1 = dog)
y = []
for item in labels:
    if item == 'cat':
        y.append(0)
    else:
        y.append(1)
# Option B: one-hot labels via pandas (used for the split below)
new_labels = pd.get_dummies(labels)
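To see what pd.get_dummies produces, here is a minimal sketch with a hand-made label list standing in for the filename prefixes collected above:

```python
import pandas as pd

# Stand-in for the labels gathered from the filenames
labels = ['cat', 'dog', 'cat', 'dog']

# get_dummies builds one column per class, in alphabetical order ('cat', 'dog');
# each row is the one-hot encoding of the corresponding label
new_labels = pd.get_dummies(labels)
print(new_labels)
```

One-hot labels pair with a 2-unit softmax output and categorical cross-entropy, whereas the integer labels in option A pair with a single sigmoid unit and binary cross-entropy.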
  3. Use train_test_split from sklearn to split the training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, new_labels, random_state=42, test_size=0.2)
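A quick sketch of how the split behaves, using small zero-filled arrays as stand-ins for the real image tensor:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-ins for the real data: 10 fake 200x200 RGB images and 10 labels
X = np.zeros((10, 200, 200, 3))
y = np.array([0, 1] * 5)

# test_size=0.2 holds out 20% of the samples; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)
print(X_train.shape, X_test.shape)  # (8, 200, 200, 3) (2, 200, 200, 3)
```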
  4. Training (model is assumed to be a compiled Keras model whose output matches the one-hot labels)
model.fit(X_train, y_train, batch_size=32, epochs=100, validation_split=0.3)

Mode two: use Keras's ImageDataGenerator to load data in batches

1. Advantages:
Data is loaded in batches, so memory usage stays low regardless of dataset size.
2. Shortcomings:
The data must first be organized into per-label folders.
3. Applicable file layout:
The pictures are grouped into one folder per label (e.g. train/cats, train/dogs).
4. Implementation:

  1. Create the folder structure
import os

base_dir = 'E:/mldata/dogvscat/cats_and_dogs_small'
os.mkdir(base_dir)
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)
train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)
train_dogs_dir = os.path.join(train_dir, 'dogs')
os.mkdir(train_dogs_dir)
validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
os.mkdir(validation_dogs_dir)
test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)
test_dogs_dir = os.path.join(test_dir, 'dogs')
os.mkdir(test_dogs_dir)
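The twelve os.mkdir calls above can be compacted into a double loop with os.makedirs, which creates intermediate directories in one call. A runnable sketch (using a temporary directory as a stand-in for the real base path):

```python
import os
import tempfile

# Temp dir so the sketch runs anywhere; substitute your own base path
base_dir = os.path.join(tempfile.mkdtemp(), 'cats_and_dogs_small')

# One makedirs call per leaf directory builds the whole tree;
# exist_ok=True makes the script safe to re-run
for split in ('train', 'validation', 'test'):
    for cls in ('cats', 'dogs'):
        os.makedirs(os.path.join(base_dir, split, cls), exist_ok=True)

print(sorted(os.listdir(os.path.join(base_dir, 'train'))))  # ['cats', 'dogs']
```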
  2. Copy the files into the folders
import shutil

# original_dataset_dir must point at the folder holding the original unzipped dataset
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)
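The six near-identical copy loops can be factored into one helper. The function name copy_range is hypothetical (not part of the original code), and the demo below creates throwaway files so the sketch is runnable without the real dataset:

```python
import os
import shutil
import tempfile

def copy_range(prefix, start, stop, src_dir, dst_dir):
    """Copy files named '<prefix>.<i>.jpg' for i in [start, stop) from src_dir to dst_dir."""
    for i in range(start, stop):
        fname = '{}.{}.jpg'.format(prefix, i)
        shutil.copyfile(os.path.join(src_dir, fname), os.path.join(dst_dir, fname))

# Demo with empty placeholder files standing in for the real images
src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
for i in range(5):
    open(os.path.join(src_dir, 'cat.{}.jpg'.format(i)), 'w').close()

copy_range('cat', 0, 3, src_dir, dst_dir)
print(sorted(os.listdir(dst_dir)))  # ['cat.0.jpg', 'cat.1.jpg', 'cat.2.jpg']
```

With this helper, each of the six loops above becomes a single call such as copy_range('cat', 1000, 1500, original_dataset_dir, validation_cats_dir).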
  3. Use ImageDataGenerator to load the data
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')
  4. Training
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)
Note: in newer versions of Keras, model.fit accepts generators directly and fit_generator is deprecated.
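The steps_per_epoch and validation_steps values are not arbitrary: each should be the number of samples divided by the batch size, so that one epoch sees each image once. With the folder sizes built above, the arithmetic works out to the values used in the call:

```python
import math

train_samples = 2000       # 1000 cats + 1000 dogs copied into train/
validation_samples = 1000  # 500 cats + 500 dogs copied into validation/
batch_size = 20            # matches batch_size in flow_from_directory

# ceil() covers the case where the sample count is not a multiple of the batch size
steps_per_epoch = math.ceil(train_samples / batch_size)
validation_steps = math.ceil(validation_samples / batch_size)
print(steps_per_epoch, validation_steps)  # 100 50
```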

Copyright notice
This article was written by [Catch wind]. When reposting, please include a link to the original. Thanks.
https://chowdera.com/2021/09/20210909105348842A.html
