Neural Network Model using Transfer Learning not learning



I am a beginner in deep learning and am working on road crack detection using transfer learning. It is a binary classification problem with two classes: crack and no crack.

The class distribution is as follows:

Cracks: 600 images

No cracks: 480 images

I have also used data augmentation:



train_generator = train_datagen.flow(trainX, trainY, batch_size=16)
val_generator = test_datagen.flow(testX, testY, batch_size=16)


I am using VGG16, and I have frozen the lowest 4 layers like this:



vgg = vgg16.VGG16(include_top=False, weights='imagenet', input_shape=input_shape)
output = vgg.layers[-1].output
output = keras.layers.Flatten()(output)
vgg_model = Model(vgg.input, output)

for layer in vgg_model.layers[:4]:
    layer.trainable = False


After that, I added two hidden layers:



model = Sequential()
model.add(vgg_model)
model.add(Dense(256, activation='relu', input_dim=input_shape))
model.add(Dense(256, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=1e-6),
              metrics=['accuracy'])


But after 1-2 epochs nothing seems to change, neither the validation accuracy nor the loss. I also tried the SGD optimizer, but that didn't help. Adding more layers had no effect on accuracy or loss either. The maximum validation accuracy achieved is 62%.



I also tried testing images from my own dataset, and the model gets those wrong as well: it predicts every test image as crack, i.e. label 1.



Could someone suggest how I can improve this? Thanks!










Tags: neural-network, deep-learning, transfer-learning, vgg16






asked Apr 4 at 8:37 by Shreya (edited Apr 6 at 4:49)




3 Answers






Just take two images from your training data, one from the 'crack' class and one from the 'no crack' class. Then check whether your model can reach 100% training accuracy on them, i.e. whether it is able to overfit that tiny training set at all. If it cannot, something is fundamentally wrong with the model.
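A minimal sketch of this sanity check, assuming trainY is the one-hot-encoded label array from the question, trainX the matching image array, and model the compiled Keras model built there (the epoch count is only illustrative):

import numpy as np

# pick one example of each class from the one-hot-encoded training labels
idx_crack = np.where(trainY.argmax(axis=1) == 1)[0][0]
idx_nocrack = np.where(trainY.argmax(axis=1) == 0)[0][0]
tiny_X = trainX[[idx_crack, idx_nocrack]]
tiny_Y = trainY[[idx_crack, idx_nocrack]]

# a healthy model should be able to memorise these two images;
# more epochs may be needed at a learning rate as low as 1e-6
model.fit(tiny_X, tiny_Y, epochs=200, verbose=0)
loss, acc = model.evaluate(tiny_X, tiny_Y, verbose=0)
print('training accuracy on the two images: %.2f' % acc)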






– Sajid Ahmed, answered Apr 4 at 8:51
Comments:

• Yes, I tried with two images; the training accuracy then is 100%. – Shreya, Apr 4 at 9:24

• Try freezing some more of the lower layers (the ones closer to the input); start by freezing all the conv layers (see the sketch after these comments). Your dataset seems too small for learning that many parameters without overfitting. To check whether the model overfits, track the training accuracies along with the validation accuracies throughout the training epochs. Also consider lowering the learning rate when you unfreeze some conv layers starting from the top (closer to the output) so that the previously learned parameters are not completely overwritten. – Sajid Ahmed, Apr 4 at 9:35

• Actually my model is underfitting, because the training loss is always higher than the validation loss and the training accuracy is lower than the validation accuracy. – Shreya, Apr 4 at 10:06

• Use 'categorical_crossentropy' as the loss function instead of 'binary_crossentropy', since you are using 'softmax' as the activation in your output layer. I hope you have already one-hot encoded your classes before applying softmax. In short: binary_crossentropy pairs with a sigmoid activation and a scalar target; categorical_crossentropy pairs with a softmax activation and a one-hot encoded target. – Sajid Ahmed, Apr 4 at 11:23

• Yes, I one-hot encoded before using binary_crossentropy. I changed to categorical_crossentropy as well, but there is no change in accuracy; it is constant at 57%. – Shreya, Apr 4 at 13:02
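A minimal sketch of the freezing strategy suggested in the comments above, i.e. freeze the whole VGG16 base and train only the new dense head; the variable names follow the question, while the learning rate and head sizes are only illustrative:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# freeze the entire VGG16 base so that only the new dense head is trained
for layer in vgg_model.layers:
    layer.trainable = False

model = Sequential()
model.add(vgg_model)
model.add(Dense(256, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',  # softmax output with one-hot targets
              optimizer=Adam(lr=1e-4),          # illustrative; lower it again if conv layers are unfrozen later
              metrics=['accuracy'])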


















Transfer learning is done by chopping off the last layer of the pre-trained network (in your case VGG16), adding a dense layer sized for the number of classes you need, and then training the new model.



The reason your model is not working is that you are taking the output from the last layer of VGG16, which is activated by a softmax layer, and one cannot learn much from a softmax layer, especially for transfer learning.



Rewrite your code as:



from keras.models import Model
from keras.layers import Dense
from keras.optimizers import Adam

X = vgg_model.layers[-1].output  # flattened VGG16 feature vector
X = Dense(256, activation='relu')(X)
X = Dense(256, activation='relu')(X)
X = Dense(2, activation='softmax')(X)
newmodel = Model(vgg_model.layers[0].output, X)
newmodel.compile(loss='binary_crossentropy',
                 optimizer=Adam(lr=1e-6),
                 metrics=['accuracy'])
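A minimal training call for this rebuilt model, reusing the augmented generators from the question; the epoch count and the step counts derived from the batch size of 16 are placeholder values:

history = newmodel.fit_generator(
    train_generator,
    steps_per_epoch=len(trainX) // 16,   # one pass over the training images per epoch
    validation_data=val_generator,
    validation_steps=len(testX) // 16,
    epochs=20)                           # illustrative epoch count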





– ram nithin, answered Apr 5 at 10:14

Comments:
• I have already done this; I forgot to add it to my code snippet here. I will edit it. – Shreya, Apr 6 at 4:47


















The problem with your code is an inconsistency between the output layer and the goal you want it to serve. Softmax computes a probability distribution over the output neurons, whereas a binary classifier operates on a single output neuron. Hence softmax is not used for binary classification; we use a sigmoid instead.



So simply change the following:



model = Sequential()
model.add(vgg_model)
model.add(Dense(256, activation='relu', input_dim=input_shape))
model.add(Dense(256, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=1e-6),
              metrics=['accuracy'])


Talking about this model, I should say that the lower layers you have unfrozen are the ones that learn low-level features such as edges, so even though you have trained the lower layers, the network still might not be able to detect the cracks at a higher level. As a recommendation, I would suggest making the higher layers of the model trainable and reducing the complexity of your model. This should improve it.
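A minimal sketch of that recommendation, keeping the early VGG16 layers frozen, unfreezing only the topmost layers, and using a smaller head; the cut-off index, layer sizes, and learning rate are illustrative, not prescriptive:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# keep the early layers frozen and unfreeze only the top few layers of the base;
# the cut-off index -5 is illustrative and depends on how vgg_model was built
for layer in vgg_model.layers[:-5]:
    layer.trainable = False
for layer in vgg_model.layers[-5:]:
    layer.trainable = True

model = Sequential()
model.add(vgg_model)
model.add(Dense(64, activation='relu'))    # smaller head to reduce complexity
model.add(Dense(1, activation='sigmoid'))  # single sigmoid unit, as suggested above
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=1e-5),     # illustrative learning rate
              metrics=['accuracy'])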






– thanatoz, answered Apr 9 at 5:53 (edited Apr 9 at 7:50)

Comments:
• But I have frozen the lower layers, not the higher layers. – Shreya, Apr 10 at 11:24

• I tried to run your model, and somehow it didn't show the same problem you are facing; in my case the model was learning (accuracy was improving). Can you share your notebook for a better understanding? – thanatoz, Apr 10 at 12:04

• Yes, sure! How can I share it? – Shreya, Apr 10 at 14:58

• Upload it somewhere and add the link in a comment. – thanatoz, Apr 11 at 4:53










