Overfitting in transfer learning with a small dataset

I am using Transfer Learning to perform image classification.



Base model used: ResNet50, pre-trained on the ImageNet dataset.
class_1 and class_2 are the two classes, each with 1000 samples (a small dataset), and the dataset is not similar to ImageNet.
The number of FC layers used here is 3, with sizes [1024, 512, 256].
I have used a dropout of 0.5 to reduce over-fitting.



When I trained the model for 100 epochs, I could clearly see that the model over-fits, with a training accuracy of 0.9985 and a testing accuracy of 0.875.



Is the number of FC layers too large, and is that what is causing the over-fitting?
How can I make the model generalise better?



The code used is given below:



from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Dense, Activation, Flatten, Dropout
from keras.models import Sequential, Model
from keras.optimizers import SGD, Adam
from keras.callbacks import TensorBoard
import keras
import matplotlib.pyplot as plt

HEIGHT = 300
WIDTH = 300
TRAIN_DIR = "/home/ubuntu/dataset/training_set/"
TEST_DIR = "/home/ubuntu/dataset/test_set/"
BATCH_SIZE = 8
class_list = ["class_1", "class_2"]
FC_LAYERS = [1024, 512, 256]
dropout = 0.5
NUM_EPOCHS = 100

def build_finetune_model(base_model, dropout, fc_layers, num_classes):
    # Freeze every layer of the pre-trained base model.
    for layer in base_model.layers:
        layer.trainable = False

    # Stack the new fully connected head on top of the frozen base.
    x = base_model.output
    x = Flatten()(x)
    for fc in fc_layers:
        x = Dense(fc, activation='relu')(x)
        x = Dropout(dropout)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    finetune_model = Model(inputs=base_model.input, outputs=predictions)
    return finetune_model

base_model = ResNet50(weights='imagenet',
                      include_top=False,
                      input_shape=(HEIGHT, WIDTH, 3))

# Data generators with augmentation (note: the same augmentation is applied to the test set).
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                   rotation_range=90,
                                   horizontal_flip=True,
                                   vertical_flip=False)

test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                  rotation_range=90,
                                  horizontal_flip=True,
                                  vertical_flip=False)

train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(HEIGHT, WIDTH),
                                                    batch_size=BATCH_SIZE)

test_generator = test_datagen.flow_from_directory(TEST_DIR,
                                                  target_size=(HEIGHT, WIDTH),
                                                  batch_size=BATCH_SIZE)

finetune_model = build_finetune_model(base_model,
                                      dropout=dropout,
                                      fc_layers=FC_LAYERS,
                                      num_classes=len(class_list))

adam = Adam(lr=0.00001)
finetune_model.compile(adam, loss="categorical_crossentropy", metrics=["accuracy"])

# Checkpoint on training accuracy and log to TensorBoard.
filepath = "./checkpoints" + "ResNet50" + "_model_weights.h5"
checkpoint = keras.callbacks.ModelCheckpoint(filepath, monitor="acc", verbose=1, mode="max")
cb = TensorBoard(log_dir="/home/ubuntu/")
callbacks_list = [checkpoint, cb]

print(train_generator.class_indices)

history = finetune_model.fit_generator(generator=train_generator, epochs=NUM_EPOCHS, steps_per_epoch=100,
                                       shuffle=True, callbacks=callbacks_list, validation_data=test_generator)


Update:

1. The weight file generated after training is 2.7 GB. Is that normal, considering the complexity of the model?
2. How should I select the steps_per_epoch value? Is there any standard?

Thank you,
KK










Tags: deep-learning, cnn, training, overfitting, transfer-learning






asked Mar 25 at 18:21 by KK2491; edited Mar 26 at 9:40




















1 Answer







First of all:

• I think you should reduce the number of FC layers and the number of nodes in them; for example, try a single FC layer with 256 or 512 units, or two FC layers with 256 and 512 units (a sketch of a slimmer head follows this list).

• Try a batch size of 30, and decrease the number of epochs to around 10 or 20; 100 epochs is too many for such a small dataset.
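For illustration, here is a minimal sketch of a slimmer head, taking the single 256-unit option. Swapping Flatten for GlobalAveragePooling2D is an extra tweak beyond the suggestion above: with a 300x300 input, ResNet50's convolutional output is roughly 10x10x2048, so Flatten feeds about 200k features into the first 1024-unit layer (on the order of 200 million weights in that layer alone), which is probably also why the checkpoint file mentioned in the update reaches 2.7 GB once the Adam optimizer state is saved with it.

from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model

def build_small_head(base_model, dropout_rate, num_classes):
    # Keep the pre-trained convolutional base frozen, as in the question.
    for layer in base_model.layers:
        layer.trainable = False
    # Average over the spatial dimensions: 2048 features instead of ~200k.
    x = GlobalAveragePooling2D()(base_model.output)
    x = Dense(256, activation='relu')(x)
    x = Dropout(dropout_rate)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=base_model.input, outputs=predictions)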


Secondly, there is more than one way to reduce overfitting:

1. Enlarge your dataset using augmentation techniques such as flips, scaling, and so on.

2. Use regularization techniques like dropout (you already do this), but play with the dropout rate; try values above and below 0.5.

3. One good technique in your case is early stopping: as soon as you see the model start to overfit, stop training (see the first sketch below).

4. Use cross-validation to train and test your model (see the second sketch below).
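A minimal early-stopping sketch with Keras; the patience value is an arbitrary choice, and restore_best_weights needs a reasonably recent Keras (2.2.3 or later):

from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',          # watch the validation loss
                           patience=3,                  # stop after 3 epochs without improvement
                           restore_best_weights=True,   # roll back to the best epoch
                           verbose=1)
callbacks_list = [checkpoint, cb, early_stop]           # add it to the callbacks from the question

And a minimal k-fold cross-validation sketch with scikit-learn. It assumes the images and labels have already been loaded into NumPy arrays X and y, so model.fit is used instead of the directory generators from the question:

import numpy as np
from sklearn.model_selection import KFold

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
fold_scores = []
for train_idx, val_idx in kfold.split(X):
    # Build and compile a fresh head for every fold.
    model = build_finetune_model(base_model, dropout, FC_LAYERS, len(class_list))
    model.compile(Adam(lr=1e-5), loss="categorical_crossentropy", metrics=["accuracy"])
    model.fit(X[train_idx], y[train_idx],
              batch_size=BATCH_SIZE, epochs=10,
              validation_data=(X[val_idx], y[val_idx]))
    _, val_acc = model.evaluate(X[val_idx], y[val_idx], verbose=0)
    fold_scores.append(val_acc)
print("mean validation accuracy:", np.mean(fold_scores))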



and there are many more...

• For more, see here, or here, or here.

Feel free to ask any questions.






answered Mar 26 at 11:31 by honar.cs; edited Mar 26 at 12:48

Comments:
• Thanks for the detailed explanation. – KK2491, Mar 26 at 14:04

• Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about early stopping (automatic, not manual) and cross-validation (is it k-fold cross-validation?) in a practical sense. I will go through them and post the results here. Also, I have updated my question; could you please have a look? – KK2491, Mar 26 at 14:38

• Yes, cross-validation here means k-fold cross-validation, and you can use the sklearn library to implement it. As for early stopping, as far as I know you can do it manually, or you can use the keras.callbacks.EarlyStopping function. – honar.cs, Mar 26 at 19:46

• Thank you, I will use keras.callbacks.EarlyStopping. One more thing: is there any standard for selecting the steps_per_epoch value? – KK2491, Mar 27 at 2:57

• Sorry, I don't have much information on how to select it, but steps_per_epoch is related to the batch size, and as far as I know there is no standard that tells you how to choose it; it is a hyper-parameter for which you should find the optimal value for your data. – honar.cs, Mar 27 at 6:37
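As a rough guide (a common convention rather than a hard standard), steps_per_epoch is usually set so that one epoch covers the whole training set once, i.e. the number of training samples divided by the batch size. A sketch using the generators from the question:

import math

# One "epoch" = one full pass over the training data.
steps_per_epoch = math.ceil(train_generator.samples / BATCH_SIZE)
validation_steps = math.ceil(test_generator.samples / BATCH_SIZE)

history = finetune_model.fit_generator(train_generator,
                                       epochs=NUM_EPOCHS,
                                       steps_per_epoch=steps_per_epoch,
                                       validation_data=test_generator,
                                       validation_steps=validation_steps,
                                       callbacks=callbacks_list)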











          Your Answer





          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "557"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47966%2fover-fitting-in-transfer-learning-with-small-dataset%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2












          $begingroup$

          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37















          2












          $begingroup$

          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37













          2












          2








          2





          $begingroup$

          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions






          share|improve this answer











          $endgroup$



          First of all:



          • I think you should reduce the number of FC layers and number of nodes of FC layers, for example, one FC with 256 or 512, or 2 FC with 256 and 512, try this.


          • Try to make your batch size 30, and decrease number of epochs to nearly 10 or 20, 100 epochs is too many for your small size dataset.


          Secondly, there is more than one way to reduce overfitting:



          1- Enlarge your data set by using augmentation techniques such as flip, scale,...



          2- Using regularization techniques like dropout (you already did it), but you can play with dropout rate, try more than or less than 0.5.



          3- One of the good techniques in your case is to do early stopping, in any epoch when you see that the model goes to overfit, stop it.



          4- Using cross-validation to train/test your model.



          and many more...



          • for more, see here, or here, or here.

          feel free to ask any questions







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Mar 26 at 12:48

























          answered Mar 26 at 11:31









          honar.cshonar.cs

          18313




          18313











          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37
















          • $begingroup$
            Thanks for the detailed explanation.
            $endgroup$
            – KK2491
            Mar 26 at 14:04










          • $begingroup$
            Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
            $endgroup$
            – KK2491
            Mar 26 at 14:38










          • $begingroup$
            Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
            $endgroup$
            – honar.cs
            Mar 26 at 19:46










          • $begingroup$
            Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
            $endgroup$
            – KK2491
            Mar 27 at 2:57










          • $begingroup$
            sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
            $endgroup$
            – honar.cs
            Mar 27 at 6:37















          $begingroup$
          Thanks for the detailed explanation.
          $endgroup$
          – KK2491
          Mar 26 at 14:04




          $begingroup$
          Thanks for the detailed explanation.
          $endgroup$
          – KK2491
          Mar 26 at 14:04












          $begingroup$
          Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
          $endgroup$
          – KK2491
          Mar 26 at 14:38




          $begingroup$
          Thanks for the detailed explanation. I will re-train the model with the inputs you have given. I am not sure about Early stopping (automatic not manual) and using cross-validation (is it K-fold cross validation ?) in practical sense. I will go through and post the results here. Also I have updated my question, could you please have a look.
          $endgroup$
          – KK2491
          Mar 26 at 14:38












          $begingroup$
          Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
          $endgroup$
          – honar.cs
          Mar 26 at 19:46




          $begingroup$
          Yes cross-validation is K-fold cross-validation and you can sklearn library to implement it, about early stopping, as I know you can do it manually, also you can use keras.callbacks.EarlyStopping function.
          $endgroup$
          – honar.cs
          Mar 26 at 19:46












          $begingroup$
          Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
          $endgroup$
          – KK2491
          Mar 27 at 2:57




          $begingroup$
          Thank you, I will use the keras.callbacks.EarlyStopping. And one more is there any standard to select steps_per_epoch value?
          $endgroup$
          – KK2491
          Mar 27 at 2:57












          $begingroup$
          sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
          $endgroup$
          – honar.cs
          Mar 27 at 6:37




          $begingroup$
          sorry, I don't have much info about how to select it, but I think steps_per_epoch is related to batch size, and as I know there is no such standard that tells you how to choose it, it a hyper-parameter that you should find the optimal one for your data.
          $endgroup$
          – honar.cs
          Mar 27 at 6:37

















          draft saved

          draft discarded
















































          Thanks for contributing an answer to Data Science Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47966%2fover-fitting-in-transfer-learning-with-small-dataset%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Luettelo Yhdysvaltain laivaston lentotukialuksista Lähteet | Navigointivalikko

          Adding axes to figuresAdding axes labels to LaTeX figuresLaTeX equivalent of ConTeXt buffersRotate a node but not its content: the case of the ellipse decorationHow to define the default vertical distance between nodes?TikZ scaling graphic and adjust node position and keep font sizeNumerical conditional within tikz keys?adding axes to shapesAlign axes across subfiguresAdding figures with a certain orderLine up nested tikz enviroments or how to get rid of themAdding axes labels to LaTeX figures

          Gary (muusikko) Sisällysluettelo Historia | Rockin' High | Lähteet | Aiheesta muualla | NavigointivalikkoInfobox OKTuomas "Gary" Keskinen Ancaran kitaristiksiProjekti Rockin' High