Overfitting in transfer learning with a small dataset
I am using transfer learning to perform image classification.

Base model: ResNet50, pre-trained on the ImageNet dataset. class_1 and class_2 are the classes, each with 1000 samples (a small dataset), and the dataset is not similar to ImageNet.

I use 3 FC layers with [1024, 512, 256] units and a dropout of 0.5 to reduce overfitting.

When I trained the model for 100 epochs, it clearly overfits: training accuracy reaches 0.9985 while testing accuracy stays at 0.875.

Is the number of FC layers too large, and is that what causes the overfitting? How can I make the model generalize better?

The code I used is given below:
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Dense, Flatten, Dropout
from keras.models import Model
from keras.optimizers import Adam
from keras.callbacks import TensorBoard
import keras

HEIGHT = 300
WIDTH = 300
TRAIN_DIR = "/home/ubuntu/dataset/training_set/"
TEST_DIR = "/home/ubuntu/dataset/test_set/"
BATCH_SIZE = 8
NUM_EPOCHS = 100
class_list = ["class_1", "class_2"]
FC_LAYERS = [1024, 512, 256]
dropout = 0.5

def build_finetune_model(base_model, dropout, fc_layers, num_classes):
    # Freeze the convolutional base so only the new head is trained.
    for layer in base_model.layers:
        layer.trainable = False
    x = base_model.output
    x = Flatten()(x)
    # Fully connected head: Dense + Dropout for each configured width.
    for fc in fc_layers:
        x = Dense(fc, activation='relu')(x)
        x = Dropout(dropout)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=base_model.input, outputs=predictions)

base_model = ResNet50(weights='imagenet',
                      include_top=False,
                      input_shape=(HEIGHT, WIDTH, 3))

train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                   rotation_range=90,
                                   horizontal_flip=True,
                                   vertical_flip=False)
# Note: the same random augmentation is applied to the test data here;
# normally the test generator should only run preprocess_input.
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                  rotation_range=90,
                                  horizontal_flip=True,
                                  vertical_flip=False)

train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(HEIGHT, WIDTH),
                                                    batch_size=BATCH_SIZE)
test_generator = test_datagen.flow_from_directory(TEST_DIR,
                                                  target_size=(HEIGHT, WIDTH),
                                                  batch_size=BATCH_SIZE)

finetune_model = build_finetune_model(base_model,
                                      dropout=dropout,
                                      fc_layers=FC_LAYERS,
                                      num_classes=len(class_list))

adam = Adam(lr=0.00001)
finetune_model.compile(adam, loss="categorical_crossentropy", metrics=["accuracy"])

filepath = "./checkpoints" + "ResNet50" + "_model_weights.h5"
# `monitor` takes a single metric name (a string), not a list.
checkpoint = keras.callbacks.ModelCheckpoint(filepath, monitor="acc", verbose=1, mode="max")
tensorboard = TensorBoard(log_dir="/home/ubuntu/")
callbacks_list = [checkpoint, tensorboard]

print(train_generator.class_indices)

history = finetune_model.fit_generator(generator=train_generator,
                                       epochs=NUM_EPOCHS,
                                       steps_per_epoch=100,
                                       shuffle=True,
                                       callbacks=callbacks_list,
                                       validation_data=test_generator)
Update:

1. The weight file generated after training is 2.7 GB. Is that normal, considering the complexity of the model?
2. How would I select the steps_per_epoch value? Is there any standard?

Thank you,
KK
deep-learning cnn training overfitting transfer-learning
asked Mar 25 at 18:21 by KK2491, edited Mar 26 at 9:40
1 Answer
First of all:

I think you should reduce both the number of FC layers and the number of units per layer: for example, a single FC layer with 256 or 512 units, or two FC layers with 256 and 512 units. Try that first.

Also try a batch size of around 30, and decrease the number of epochs to roughly 10 or 20; 100 epochs is far too many for a dataset this small.
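As a rough sketch of the slimmer head suggested above (illustrative only; it reuses base_model from the question, and the pooling layer and layer sizes are my own substitutions, not part of the original answer):

from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model

# One small FC layer on top of a pooled feature vector. Replacing
# Flatten with GlobalAveragePooling2D shrinks the first Dense layer's
# input from a huge flattened feature map down to a 2048-d vector,
# which cuts the parameter count (and the saved weight file) dramatically.
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(2, activation='softmax')(x)
small_model = Model(inputs=base_model.input, outputs=predictions)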
Secondly, there is more than one way to reduce overfitting:

1. Enlarge your dataset using augmentation techniques such as flips, scaling, etc.
2. Use regularization techniques like dropout (which you already do), but play with the dropout rate: try values above and below 0.5.
3. Early stopping is a good technique for your case: stop training at the epoch where the model starts to overfit.
4. Use cross-validation to train/test your model.

And many more... for more, see here, or here, or here.

Feel free to ask any questions.

answered Mar 26 at 11:31 by honar.cs, edited Mar 26 at 12:48
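A minimal sketch of point 3 with Keras's built-in callback (the patience value is illustrative, restore_best_weights needs Keras >= 2.2.3, and checkpoint and tensorboard refer to the callbacks already defined in the question's code):

from keras.callbacks import EarlyStopping

# Stop training once validation loss has not improved for `patience`
# consecutive epochs, and roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor='val_loss',
                           patience=5,
                           restore_best_weights=True,
                           verbose=1)
callbacks_list = [checkpoint, tensorboard, early_stop]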
Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure, in a practical sense, about early stopping (automatic, not manual) and cross-validation (is it K-fold cross-validation?). I will go through them and post the results here. Also, I have updated my question; could you please have a look?
– KK2491, Mar 26 at 14:38
Yes, cross-validation here means K-fold cross-validation, and you can use the sklearn library to implement it. As for early stopping, as far as I know you can do it manually, or you can use the keras.callbacks.EarlyStopping callback.
– honar.cs, Mar 26 at 19:46
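A minimal sketch of the K-fold idea with sklearn (the arrays X of image paths and y of labels are assumptions for illustration; with flow_from_directory you would instead split the files per fold, e.g. via a dataframe-based generator):

import numpy as np
from sklearn.model_selection import KFold

# X: array of image file paths, y: matching labels (both assumed here).
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    X_train, X_val = X[train_idx], X[val_idx]
    y_train, y_val = y[train_idx], y[val_idx]
    # Build and train a fresh model on each split, then average
    # the validation metrics across the 5 folds.
    print("fold", fold, "train:", len(train_idx), "val:", len(val_idx))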
Thank you, I will use keras.callbacks.EarlyStopping. One more thing: is there any standard for selecting the steps_per_epoch value?
– KK2491, Mar 27 at 2:57
Sorry, I don't have much information on how to select it, but steps_per_epoch is related to the batch size, and as far as I know there is no standard that tells you how to choose it; it is a hyper-parameter for which you have to find the optimal value for your data.
– honar.cs, Mar 27 at 6:37
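For what it's worth, the usual convention (and what Keras falls back to when steps_per_epoch is left unset for a Sequence-based generator such as the one flow_from_directory returns) is one full pass over the training data per epoch, i.e. ceil(num_samples / batch_size):

import math

num_train_samples = 2000   # 1000 per class, as in the question
batch_size = 8
steps_per_epoch = math.ceil(num_train_samples / batch_size)   # 250 steps = one full pass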