Overfitting in transfer learning with a small dataset

I am using Transfer Learning to perform image classification.



Base model used: ResNet50, pre-trained on the ImageNet dataset.
class_1 and class_2 are the two classes, each with 1000 samples (a small dataset), and the dataset is not similar to ImageNet.
The number of FC layers used here is 3, with sizes [1024, 512, 256].
I have used a dropout of 0.5 to reduce over-fitting.



When I trained the model for 100 epochs, I could clearly see that the model over-fits, with a training accuracy of 0.9985 and a testing accuracy of 0.875.



Is the number of FC layers too large, and is that what is causing the over-fitting?
How can I make the model generalise better?



The code used is given below:



from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Dense, Activation, Flatten, Dropout
from keras.models import Sequential, Model
from keras.optimizers import SGD, Adam
from keras.callbacks import TensorBoard
import keras
import matplotlib.pyplot as plt

HEIGHT = 300
WIDTH = 300
TRAIN_DIR = "/home/ubuntu/dataset/training_set/"
TEST_DIR = "/home/ubuntu/dataset/test_set/"
BATCH_SIZE = 8
class_list = ["class_1", "class_2"]
FC_LAYERS = [1024, 512, 256]
dropout = 0.5
NUM_EPOCHS = 100

def build_finetune_model(base_model, dropout, fc_layers, num_classes):
    # Freeze every layer of the pre-trained base model.
    for layer in base_model.layers:
        layer.trainable = False

    # Stack the new fully connected head on top of the frozen base.
    x = base_model.output
    x = Flatten()(x)
    for fc in fc_layers:
        x = Dense(fc, activation='relu')(x)
        x = Dropout(dropout)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    finetune_model = Model(inputs=base_model.input, outputs=predictions)
    return finetune_model

base_model = ResNet50(weights='imagenet',
                      include_top=False,
                      input_shape=(HEIGHT, WIDTH, 3))

# Data generators with augmentation (note: the same augmentation is applied to the test set).
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                   rotation_range=90,
                                   horizontal_flip=True,
                                   vertical_flip=False)

test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                  rotation_range=90,
                                  horizontal_flip=True,
                                  vertical_flip=False)

train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(HEIGHT, WIDTH),
                                                    batch_size=BATCH_SIZE)

test_generator = test_datagen.flow_from_directory(TEST_DIR,
                                                  target_size=(HEIGHT, WIDTH),
                                                  batch_size=BATCH_SIZE)

finetune_model = build_finetune_model(base_model,
                                      dropout=dropout,
                                      fc_layers=FC_LAYERS,
                                      num_classes=len(class_list))

adam = Adam(lr=0.00001)
finetune_model.compile(adam, loss="categorical_crossentropy", metrics=["accuracy"])

# Checkpoint on training accuracy and log to TensorBoard.
filepath = "./checkpoints" + "ResNet50" + "_model_weights.h5"
checkpoint = keras.callbacks.ModelCheckpoint(filepath, monitor="acc", verbose=1, mode="max")
cb = TensorBoard(log_dir="/home/ubuntu/")
callbacks_list = [checkpoint, cb]

print(train_generator.class_indices)

history = finetune_model.fit_generator(generator=train_generator, epochs=NUM_EPOCHS, steps_per_epoch=100,
                                       shuffle=True, callbacks=callbacks_list, validation_data=test_generator)


Update:

1. The weight file generated after training is 2.7 GB. Is that normal, considering the complexity of the model?
2. How should I select the steps_per_epoch value? Is there any standard?

Thank you,
KK










Tags: deep-learning, cnn, training, overfitting, transfer-learning






asked Mar 25 at 18:21 by KK2491; edited Mar 26 at 9:40




















1 Answer







First of all:

• I think you should reduce the number of FC layers and the number of nodes in them; for example, try a single FC layer with 256 or 512 units, or two FC layers with 256 and 512 units (a sketch of a slimmer head follows this list).

• Try a batch size of 30, and decrease the number of epochs to around 10 or 20; 100 epochs is too many for such a small dataset.
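For illustration, here is a minimal sketch of a slimmer head, taking the single 256-unit option. Swapping Flatten for GlobalAveragePooling2D is an extra tweak beyond the suggestion above: with a 300x300 input, ResNet50's convolutional output is roughly 10x10x2048, so Flatten feeds about 200k features into the first 1024-unit layer (on the order of 200 million weights in that layer alone), which is probably also why the checkpoint file mentioned in the update reaches 2.7 GB once the Adam optimizer state is saved with it.

from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model

def build_small_head(base_model, dropout_rate, num_classes):
    # Keep the pre-trained convolutional base frozen, as in the question.
    for layer in base_model.layers:
        layer.trainable = False
    # Average over the spatial dimensions: 2048 features instead of ~200k.
    x = GlobalAveragePooling2D()(base_model.output)
    x = Dense(256, activation='relu')(x)
    x = Dropout(dropout_rate)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=base_model.input, outputs=predictions)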


Secondly, there is more than one way to reduce overfitting:

1. Enlarge your dataset using augmentation techniques such as flips, scaling, and so on.

2. Use regularization techniques like dropout (you already do this), but play with the dropout rate; try values above and below 0.5.

3. One good technique in your case is early stopping: as soon as you see the model start to overfit, stop training (see the first sketch below).

4. Use cross-validation to train and test your model (see the second sketch below).
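A minimal early-stopping sketch with Keras; the patience value is an arbitrary choice, and restore_best_weights needs a reasonably recent Keras (2.2.3 or later):

from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',          # watch the validation loss
                           patience=3,                  # stop after 3 epochs without improvement
                           restore_best_weights=True,   # roll back to the best epoch
                           verbose=1)
callbacks_list = [checkpoint, cb, early_stop]           # add it to the callbacks from the question

And a minimal k-fold cross-validation sketch with scikit-learn. It assumes the images and labels have already been loaded into NumPy arrays X and y, so model.fit is used instead of the directory generators from the question:

import numpy as np
from sklearn.model_selection import KFold

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
fold_scores = []
for train_idx, val_idx in kfold.split(X):
    # Build and compile a fresh head for every fold.
    model = build_finetune_model(base_model, dropout, FC_LAYERS, len(class_list))
    model.compile(Adam(lr=1e-5), loss="categorical_crossentropy", metrics=["accuracy"])
    model.fit(X[train_idx], y[train_idx],
              batch_size=BATCH_SIZE, epochs=10,
              validation_data=(X[val_idx], y[val_idx]))
    _, val_acc = model.evaluate(X[val_idx], y[val_idx], verbose=0)
    fold_scores.append(val_acc)
print("mean validation accuracy:", np.mean(fold_scores))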



and there are many more...

• For more, see here, or here, or here.

Feel free to ask any questions.






answered Mar 26 at 11:31 by honar.cs; edited Mar 26 at 12:48

Comments:
• Thanks for the detailed explanation. – KK2491, Mar 26 at 14:04

• Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about early stopping (automatic, not manual) and cross-validation (is it k-fold cross-validation?) in a practical sense. I will go through them and post the results here. Also, I have updated my question; could you please have a look? – KK2491, Mar 26 at 14:38

• Yes, cross-validation here means k-fold cross-validation, and you can use the sklearn library to implement it. As for early stopping, as far as I know you can do it manually, or you can use the keras.callbacks.EarlyStopping function. – honar.cs, Mar 26 at 19:46

• Thank you, I will use keras.callbacks.EarlyStopping. One more thing: is there any standard for selecting the steps_per_epoch value? – KK2491, Mar 27 at 2:57

• Sorry, I don't have much information on how to select it, but steps_per_epoch is related to the batch size, and as far as I know there is no standard that tells you how to choose it; it is a hyper-parameter for which you should find the optimal value for your data. – honar.cs, Mar 27 at 6:37
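As a rough guide (a common convention rather than a hard standard), steps_per_epoch is usually set so that one epoch covers the whole training set once, i.e. the number of training samples divided by the batch size. A sketch using the generators from the question:

import math

# One "epoch" = one full pass over the training data.
steps_per_epoch = math.ceil(train_generator.samples / BATCH_SIZE)
validation_steps = math.ceil(test_generator.samples / BATCH_SIZE)

history = finetune_model.fit_generator(train_generator,
                                       epochs=NUM_EPOCHS,
                                       steps_per_epoch=steps_per_epoch,
                                       validation_data=test_generator,
                                       validation_steps=validation_steps,
                                       callbacks=callbacks_list)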











          Your Answer





          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "557"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47966%2fover-fitting-in-transfer-learning-with-small-dataset%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2












          $begingroup$

          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37















          2












          $begingroup$

          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37













          2












          2








          2





          $begingroup$

          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions






          share|improve this answer











          $endgroup$



          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Mar 26 at 12:48

























          answered Mar 26 at 11:31









          honar.cshonar.cs

          18313




          18313











          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37
















          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37















          $begingroup$
          Thanks for the detailed explanation.
          $endgroup$
          – KK2491
          Mar 26 at 14:04




          $begingroup$
          Thanks for the detailed explanation.
          $endgroup$
          – KK2491
          Mar 26 at 14:04












          $begingroup$
          Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
          $endgroup$
          – KK2491
          Mar 26 at 14:38




          $begingroup$
          Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
          $endgroup$
          – KK2491
          Mar 26 at 14:38












          $begingroup$
          Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
          $endgroup$
          – honar.cs
          Mar 26 at 19:46




          $begingroup$
          Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
          $endgroup$
          – honar.cs
          Mar 26 at 19:46












          $begingroup$
          Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
          $endgroup$
          – KK2491
          Mar 27 at 2:57




          $begingroup$
          Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
          $endgroup$
          – KK2491
          Mar 27 at 2:57












          $begingroup$
          sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
          $endgroup$
          – honar.cs
          Mar 27 at 6:37




          $begingroup$
          sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
          $endgroup$
          – honar.cs
          Mar 27 at 6:37

















          draft saved

          draft discarded
















































          Thanks for contributing an answer to Data Science Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47966%2fover-fitting-in-transfer-learning-with-small-dataset%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Luettelo Yhdysvaltain laivaston lentotukialuksista Lähteet | Navigointivalikko

          Adding axes to figuresAdding axes labels to LaTeX figuresLaTeX equivalent of ConTeXt buffersRotate a node but not its content: the case of the ellipse decorationHow to define the default vertical distance between nodes?TikZ scaling graphic and adjust node position and keep font sizeNumerical conditional within tikz keys?adding axes to shapesAlign axes across subfiguresAdding figures with a certain orderLine up nested tikz enviroments or how to get rid of themAdding axes labels to LaTeX figures

          Gary (muusikko) Sisällysluettelo Historia | Rockin' High | Lähteet | Aiheesta muualla | NavigointivalikkoInfobox OKTuomas "Gary" Keskinen Ancaran kitaristiksiProjekti Rockin' High