How to build an image dataset for CNN?



I don't understand how images are actually fed into a CNN. If I have a directory containing a few thousand images, what steps do I need to take to feed them to a neural network (for instance resizing, greyscaling, labeling, etc.)? I also don't understand how the labeling of an image works. What would this dataset actually look like? Can you inspect it at all, in some summarized form such as a table?










machine-learning python neural-network keras cnn






asked Apr 2 at 0:18 by 55thSwiss









  • Can we see some example images please? What are you trying to predict from these images? – JahKnows, Apr 2 at 0:46










  • I do not have sample images at this time, but I think my explanation will make it a little clearer. I have photos of a scene, the same scene every time. In some photos an object is present (although it may move position slightly, it will be roughly the same); in the other photos the object is not present. I just want the CNN to classify whether the object is present or not. – 55thSwiss, Apr 2 at 18:27
















2 Answers






Answer by JahKnows (answered Apr 2 at 0:36, edited Apr 2 at 0:46), score 1

This is a very packed question. Let's go through it step by step, and I will provide some examples of image processing for a CNN along the way.



Pre-processing the data



Pre-processing the data, such as resizing and converting to greyscale, is the first step of your machine learning pipeline. Most deep learning frameworks require all of your training data to have the same shape, so it is best to resize your images to some standard size.



Whenever training any kind of machine learning model, it is important to remember the bias-variance trade-off: the more complex the model, the harder it will be to train. That means it is best to limit the number of parameters in your model. You can lower the number of inputs by downsampling the images. Greyscaling is often used for the same reason: if the colors in the images do not contain any distinguishing information, converting to greyscale cuts the number of inputs to a third (one channel instead of three).



There are a number of other pre-processing methods which can be used depending on your data. It is also a good idea to do some data augmentation: altering your input data slightly, without changing the resulting label, to increase the number of instances you have for training your model.
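For illustration, here is a minimal sketch of data augmentation with Keras' ImageDataGenerator; the transformation ranges are arbitrary placeholder values rather than recommendations from this answer, and x_train/y_train stand for your image array and label array.

from keras.preprocessing.image import ImageDataGenerator

# Randomly rotate, shift and flip the training images on the fly.
# The exact ranges are placeholders; tune them for your own data.
augmenter = ImageDataGenerator(rotation_range=10,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               horizontal_flip=True)

# x_train has shape (n, height, width, channels); y_train holds the labels.
# The generator yields augmented batches, e.g. for model.fit_generator(...).
augmented_batches = augmenter.flow(x_train, y_train, batch_size=32)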



How to structure the data?



The shape of the variable which you will use as input for your CNN depends on the package you choose. I prefer TensorFlow, which is developed by Google. If you are planning on using a fairly standard architecture, there is a very useful wrapper library named Keras which makes designing and training a CNN very easy.



When using TensorFlow you will want to get your set of images into a numpy array. The first dimension indexes the instances, then come the image dimensions, and finally the last dimension is for channels.
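To make this concrete for a folder of images stored locally on disk (which is what the comments below ask about), here is a minimal sketch; the folder layout with one sub-folder per class, the 64x64 target size and the file extensions are illustrative assumptions, not something specified in the question.

import os
import numpy as np
from PIL import Image

data_dir = 'photos'                         # assumed root folder, one sub-folder per class
class_names = sorted(os.listdir(data_dir))  # e.g. ['no_object', 'object']

images, labels = [], []
for label, class_name in enumerate(class_names):
    class_dir = os.path.join(data_dir, class_name)
    for fname in sorted(os.listdir(class_dir)):
        if not fname.lower().endswith(('.png', '.jpg', '.jpeg')):
            continue
        img = Image.open(os.path.join(class_dir, fname)).convert('L')  # greyscale
        img = img.resize((64, 64))                  # one standard size for every image
        images.append(np.asarray(img, dtype='float32') / 255.)
        labels.append(label)

x = np.stack(images)[..., np.newaxis]  # shape (n, 64, 64, 1): instances, rows, cols, channels
y = np.array(labels)                   # shape (n,): integer class index per image
print(x.shape, y.shape)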



So, for example, if you are using the MNIST data shown below, then you are working with greyscale images which each have dimensions 28 by 28. The numpy array shape that you would feed into your deep learning model would then be (n, 28, 28, 1), where $n$ is the number of images in your dataset.



[image: sample MNIST digits]



How to label images?



For most data the labeling needs to be done manually. This is often called data collection and is the hardest and most expensive part of any machine learning solution. It is often best either to use readily available data, or to use less complex models and more pre-processing if labeled data is simply unavailable.
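A common way to handle the labeling of your own images without building the arrays by hand is to put each class in its own folder and let Keras derive the labels from the folder names. A minimal sketch, where the folder names, target size and class_mode are assumptions for illustration:

from keras.preprocessing.image import ImageDataGenerator

# Assumed layout on disk:
#   photos/object/*.jpg      -> label "object"
#   photos/no_object/*.jpg   -> label "no_object"
datagen = ImageDataGenerator(rescale=1. / 255)
train_batches = datagen.flow_from_directory('photos',
                                            target_size=(64, 64),   # every image resized to this
                                            color_mode='grayscale',
                                            class_mode='binary',    # two classes -> single 0/1 label
                                            batch_size=32)

print(train_batches.class_indices)  # mapping from folder name to label index
# train_batches yields (images, labels) tuples, e.g. for model.fit_generator(train_batches, ...)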




Here is an example of the use of a CNN for the MNIST dataset



First we load the data



from keras.datasets import mnist
import numpy as np

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

print('Training data shape: ', x_train.shape)
print('Testing data shape : ', x_test.shape)



Training data shape: (60000, 28, 28)
Testing data shape : (10000, 28, 28)




from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K


Then we need to reshape our data to add the channel dimension at the end of our numpy array. Furthermore, we will one-hot encode the labels, so the model will have 10 output neurons, each of which represents a different class.



# The known number of output classes.
num_classes = 10

# Input image dimensions
img_rows, img_cols = 28, 28

# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)


Now we design our model



model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])


Finally we can train the model



epochs = 4
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test_reshaped, y_test_binary))
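After training, a natural follow-up (not part of the original answer) is to check performance on the held-out test set:

# Returns [test loss, test accuracy] for the model compiled above.
score = model.evaluate(x_test_reshaped, y_test_binary, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])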





  • If I'm not using MNIST, how is the image directory loaded? Similar to mnist.load_data()? I will be using Keras with a TensorFlow backend. – 55thSwiss, Apr 2 at 18:28











  • @55thSwiss, what is the storage method for these images? – JahKnows, Apr 6 at 17:20










  • Locally on a hard drive. – 55thSwiss, Apr 8 at 15:42










  • @55thSwiss, can you post an example file here so I can write you a code snippet to load them up? – JahKnows, Apr 12 at 0:47










  • I am actually making some progress building a CNN, but it will likely take me another week or so to finish because I am only working on it in the evenings. I am unsure whether some of my methods are best practice; would I be able to show you the source code for a review when finished? Nothing serious, just in case I made obvious mistakes, etc. – 55thSwiss, Apr 12 at 19:33


















Answer by William Scott (answered Apr 2 at 0:36), score 0

A dataset just consists of features and labels. Here the features are your images and the labels are the classes.



Every Keras model has a fit() method, which takes in the features and labels and performs the training.



For the first layer, you need to specify the input dimensions of the image, and the output layer should be a softmax (if you're doing classification) whose dimension is the number of classes you have.



from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(64, 64, 3)))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10)


The above is the code for training a Keras Sequential model.



General Points:



  1. input_shape should be the shape of a single image in X_train (without the batch dimension).

  2. You can check this from X_train.shape (a numpy attribute); drop the first entry, which is the number of images.

  3. Convolutions are then applied, each with its respective activation.

  4. Dropout and pooling layers are optional.

  5. After the convolution layers, the data is flattened using Flatten().

  6. Then it is sent to a few fully connected (Dense) layers.

  7. The last Dense layer should have as many units as there are classes.

  8. The final activation is a softmax.

  9. Now compile the model with the loss, optimizer and metric.

  10. Then call fit().

Vote up ;) if you like it.
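Note that num_classes, x_train and y_train are not defined in the snippet above. A minimal sketch of placeholder inputs that would make it runnable, where the shapes and the value of num_classes are assumptions chosen to match input_shape=(64, 64, 3):

import numpy as np
import keras

num_classes = 2  # assumption: e.g. "object present" vs. "object absent"

# Random placeholder data purely to exercise the model; replace with real images and labels.
x_train = np.random.rand(100, 64, 64, 3).astype('float32')  # 100 RGB images of size 64x64
y_train = keras.utils.to_categorical(np.random.randint(num_classes, size=100),
                                     num_classes)            # one-hot labels, shape (100, 2)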






  • Thank you for the explanation. My problem is that, although there are many code snippets online for setting up a CNN as you described, what I am confused about is preparing the data. How are the images actually loaded? What does that look like? – 55thSwiss, Apr 2 at 18:30










