Neural Network Model using Transfer Learning not learning
I am a beginner in deep learning, working on road crack detection using transfer learning. It is a binary classification problem with two classes: crack and no crack.
The distribution of the two classes is as follows:
Cracks - 600 images
No cracks - 480 images
I have also used data augmentation:
train_generator = train_datagen.flow(trainX, trainY, batch_size=16)
val_generator = test_datagen.flow(testX, testY, batch_size=16)
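For context, a minimal sketch of how such generators are typically created with Keras' ImageDataGenerator; the augmentation parameters below are assumptions for illustration, not necessarily the ones used in the question:
from keras.preprocessing.image import ImageDataGenerator

# Assumed augmentation settings for the training data.
train_datagen = ImageDataGenerator(rescale=1. / 255,
                                   rotation_range=20,
                                   horizontal_flip=True,
                                   zoom_range=0.1)
# Validation/test data is usually only rescaled, never augmented.
test_datagen = ImageDataGenerator(rescale=1. / 255)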
I am using VGG16 and have frozen the lower 4 layers like this:
vgg = vgg16.VGG16(include_top=False, weights='imagenet',
                  input_shape=input_shape)
output = vgg.layers[-1].output
output = keras.layers.Flatten()(output)
vgg_model = Model(vgg.input, output)
for layer in vgg_model.layers[:4]:
layer.trainable = False
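To verify which layers the slice above actually freezes, a quick hypothetical check is to print each layer's trainable flag:
for layer in vgg_model.layers:
    print(layer.name, layer.trainable)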
After that, I added two hidden layers and an output layer:
model = Sequential()
model.add(vgg_model)
model.add(Dense(256, activation='relu', input_dim=input_shape))
model.add(Dense(256, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=1e-6),
              metrics=['accuracy'])
But after 1-2 epochs nothing seems to change, neither the validation accuracy nor the loss. I tried the SGD optimizer as well, but that didn't help either. I also added more layers, but that had no effect on accuracy or loss. The maximum validation accuracy achieved is 62%.
I also tried testing images from my dataset, and the model gives wrong predictions there too: it predicts every test image as crack, i.e. label 1.
Could someone suggest how I can improve this?
Thanks!
neural-network deep-learning transfer-learning vgg16
asked Apr 4 at 8:37 by Shreya, edited Apr 6 at 4:49
3 Answers
Just take two images from your training data, one from the 'crack' class and one from the 'no crack' class. Then check whether your model can reach 100% training accuracy on them, i.e. whether it can overfit this tiny training set. If it cannot, something is fundamentally wrong with the model.
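A minimal sketch of this sanity check, assuming trainX/trainY are the one-hot encoded arrays from the question and that model has already been built and compiled:
import numpy as np

# Pick the first sample of each class (assumes one-hot labels, as stated in the question).
crack_idx = np.where(trainY.argmax(axis=1) == 1)[0][0]
no_crack_idx = np.where(trainY.argmax(axis=1) == 0)[0][0]
tiny_X = trainX[[crack_idx, no_crack_idx]]
tiny_Y = trainY[[crack_idx, no_crack_idx]]

# Train on just these two samples; accuracy should reach 1.0 within a few epochs.
model.fit(tiny_X, tiny_Y, epochs=50, verbose=0)
loss, acc = model.evaluate(tiny_X, tiny_Y, verbose=0)
print('training accuracy on 2 samples:', acc)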
answered Apr 4 at 8:51 by Sajid Ahmed
Yes, I tried with two images; the training accuracy is 100%.
– Shreya
Apr 4 at 9:24
Try freezing some more lower layers (the ones closer to the input). Start by freezing all the conv layers; your dataset seems too small to learn that many parameters without overfitting. To check whether the model overfits, track the training accuracies along with the validation accuracies throughout the training epochs. Also consider lowering the learning rate when you unfreeze some conv layers starting from the top (closer to the output node), so that the previously learned parameters are not completely overwritten.
– Sajid Ahmed
Apr 4 at 9:35
Actually my model is underfitting, because the training loss is always higher than the validation loss and the training accuracy is lower than the validation accuracy.
– Shreya
Apr 4 at 10:06
Use 'categorical_crossentropy' as the loss function instead of 'binary_crossentropy', since you are using 'softmax' as the activation in your output layer. I hope you have already one-hot encoded your classes before applying softmax. In short: binary_crossentropy pairs with a sigmoid activation and a scalar target; categorical_crossentropy pairs with a softmax activation and a one-hot encoded target.
– Sajid Ahmed
Apr 4 at 11:23
Yes, I one-hot encoded before using binary_crossentropy. I changed to categorical_crossentropy as well, but there is no change in accuracy; it is constant at 57%.
– Shreya
Apr 4 at 13:02
Transfer learning is done by chopping off the last layer of the pre-trained network (in your case VGG16), adding a dense layer sized for the number of classes you need, and then training the new model.
The reason your model is not working is that you are taking the output from the last layer of VGG16, which is activated by a softmax layer, and one cannot learn much from a softmax layer's output, especially for transfer learning.
Rewrite your code as:
from keras.models import Model
from keras.layers import Dense
from keras.optimizers import Adam  # needed for the compile call below

X = vgg_model.layers[-1].output  # flattened feature vector from the VGG16 base
X = Dense(256, activation='relu')(X)
X = Dense(256, activation='relu')(X)
X = Dense(2, activation='softmax')(X)

newmodel = Model(vgg_model.layers[0].output, X)
newmodel.compile(loss='binary_crossentropy',
                 optimizer=Adam(lr=1e-6),
                 metrics=['accuracy'])
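As a follow-up, the rewritten model would then be trained on the generators from the question, roughly as below; the step counts and epoch number are illustrative assumptions:
newmodel.fit_generator(train_generator,
                       steps_per_epoch=len(trainX) // 16,
                       epochs=30,
                       validation_data=val_generator,
                       validation_steps=len(testX) // 16)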
answered Apr 5 at 10:14 by ram nithin
I have already done this; I forgot to add it in my code snippet here. Will edit it.
– Shreya
Apr 6 at 4:47
The problem with your code is an inconsistency between the output layer and the loss you are optimising. Softmax computes a probability distribution over several neurons, while a binary classifier only needs a single output neuron; hence softmax is rarely used for binary classification, and sigmoid is used instead.
So simply change the following:
model = Sequential()
model.add(vgg_model)
model.add(Dense(256, activation='relu', input_dim=input_shape))
model.add(Dense(256, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=1e-6),
              metrics=['accuracy'])
Regarding this model, the lower layers you have unfrozen are the layers that learn low-level features such as edges, so even though you have trained the lower layers, the model still might not be able to detect the cracks at a higher level. As a recommendation, I would suggest making the higher layers of the model trainable and reducing the complexity of your model; this should improve it.
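A minimal sketch of that recommendation, assuming the vgg_model defined in the question: freeze the convolutional base entirely and train only a smaller head; the head size and learning rate here are illustrative assumptions, not prescribed values:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# Freeze every layer of the pre-trained base so only the new head is trained.
for layer in vgg_model.layers:
    layer.trainable = False

model = Sequential()
model.add(vgg_model)
model.add(Dense(64, activation='relu'))   # smaller head than before (assumed size)
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=1e-4),    # a larger LR is reasonable when only the head trains (assumption)
              metrics=['accuracy'])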
edited Apr 9 at 7:50, answered Apr 9 at 5:53 by thanatoz
But I have frozen the lower layers, not the higher layers.
– Shreya
Apr 10 at 11:24
I tried to run your model and somehow it didn't throw the same problem you are facing; in my case the model was learning (accuracy was improving). Can you share your notebook for better understanding?
– thanatoz
Apr 10 at 12:04
Yes, sure! How can I share?
– Shreya
Apr 10 at 14:58
Upload it somewhere and add the link in the comment.
– thanatoz
Apr 11 at 4:53