


High model accuracy vs very low validation accuracy





I'm building a sentiment analysis program in Python using a Keras Sequential model for deep learning.



My data consists of 20,000 tweets:



  • positive tweets: 9152 tweets

  • negative tweets: 10849 tweets

I wrote a Sequential model script to perform the binary classification as follows:



model=Sequential()
model.add(Embedding(vocab_size, 100, input_length=max_words))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(250, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model

print(model.summary())
history = model.fit(X_train[train], y1[train], validation_split=0.30, epochs=2, batch_size=128, verbose=2)


However, I get very strange results: the training accuracy is almost perfect (>90%),
whereas the validation accuracy is very low (around 3%), as shown below.



Train on 9417 samples, validate on 4036 samples
Epoch 1/2
- 13s - loss: 0.5478 - acc: 0.7133 - val_loss: 3.6157 - val_acc: 0.0243
Epoch 2/2
- 11s - loss: 0.2287 - acc: 0.8995 - val_loss: 5.4746 - val_acc: 0.0339


I tried increasing the number of epochs, but that only increases the training accuracy and lowers the validation accuracy.



Any advice on how to overcome this issue?



Update:



This is how I handle my data:



#read training data
pos_file=open('pos2.txt', 'r', encoding="Latin-1")
neg_file=open('neg3.txt', 'r', encoding="Latin-1")
# Load data from files
pos = list(pos_file.readlines())
neg = list(neg_file.readlines())
x = pos + neg
docs = numpy.array(x)
#read Testing Data
pos_test=open('posTest2.txt', 'r',encoding="Latin-1")
posT = list(pos_test.readlines())
neg_test=open('negTest2.txt', 'r',encoding="Latin-1")
negT = list(neg_test.readlines())
xTest = posT + negT
total2 = numpy.array(xTest)

CombinedDocs=numpy.append(total2,docs)

# Generate labels
positive_labels = [1 for _ in pos]
negative_labels = [0 for _ in neg]
labels = numpy.concatenate([positive_labels, negative_labels], 0)

# prepare tokenizer
t = Tokenizer()
t.fit_on_texts(CombinedDocs)
vocab_size = len(t.word_index) + 1
# integer encode the documents
encoded_docs = t.texts_to_sequences(docs)
#print(encoded_docs)

# pad documents to a max length of 140 words
max_length = 140
padded_docs = pad_sequences(encoded_docs, maxlen=max_length, padding='post')


Here I used Google public word2vec



# load the whole embedding into memory
embeddings_index = dict()
f = open('Google28.bin', encoding="latin-1")
for line in f:
    values = line.split()
    word = values[0]
    coefs = asarray(values[1:], dtype='str')
    embeddings_index[word] = coefs
f.close()

# create a weight matrix for words in training docs
embedding_matrix = zeros((vocab_size, 100))

for word, i in t.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector


#Convert to numpy
NewTraining=numpy.array(padded_docs)
NewLabels=numpy.array(labels)
encoded_docs2 = t.texts_to_sequences(total2)

# pad documents to a max length of 140 words

padded_docs2 = pad_sequences(encoded_docs2, maxlen=max_length, padding='post')


# Generate labels
positive_labels2 = [1 for _ in posT]
negative_labels2 = [0 for _ in negT]
yTest = numpy.concatenate([positive_labels2, negative_labels2], 0)
NewTesting=numpy.array(padded_docs2)
NewLabelsTsting=numpy.array(yTest)
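
Note that the model shown earlier never receives embedding_matrix. For reference, here is a minimal sketch (not part of the original script) of how a pre-trained matrix like the one built above is typically attached to a Keras Embedding layer; vocab_size, max_length and embedding_matrix are the variables defined above, and trainable=False is an illustrative choice:

# Sketch only: initialise the Embedding layer with the pre-trained vectors.
from keras.layers import Embedding

embedding_layer = Embedding(vocab_size,
                            100,                          # must match the width of embedding_matrix
                            weights=[embedding_matrix],   # pre-trained vectors built above
                            input_length=max_length,
                            trainable=False)              # freeze them, or True to fine-tune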









Tags: python, deep-learning, classification, keras, overfitting






asked Apr 4 '18 at 9:02 by Amy.Dj · edited Apr 5 '18 at 8:44

  • Try to deepen the architecture: try adding more filters and dense layers. – Aditya, Apr 4 '18 at 9:51











  • I added 3 dense layers (128, 64, 32) and it still produces similar results @Aditya. – Amy.Dj, Apr 4 '18 at 9:55






  • Keras does not shuffle the data before doing the training/validation split. This means that if the data appearing at the beginning (i.e. the data you train on) is very different from the data appearing at the end (i.e. the data you use for validation), the validation accuracy will be low. Try shuffling the data before feeding it to fit. – ncasas, Apr 4 '18 at 10:56










  • @ncasas Thank you a lot, it helped! The validation accuracy jumped to 78%, a great improvement. But is there a way to increase the accuracy further? – Amy.Dj, Apr 5 '18 at 8:17










  • I have a similar issue with my model. I'm trying to use the most basic Conv1D model to analyze review data and output a 1-5 class rating, so the loss is categorical_crossentropy. The model structure is: model = Sequential(); model.add(Embedding(vocab_size, 100, input_length=max_length)); model.add(Conv1D(filters=32, kernel_size=8, activation='relu')); model.add(MaxPooling1D(pool_size=2)); model.add(Flatten()); model.add(Dense(10, activation='relu')); model.add(Dense(5, activation='softmax')). Total params: 15,865,417; Trainable params: 15,865,417; Non-trainable params: 0 Tr – stranger, Mar 21 at 6:10
















4 Answers



















You should shuffle all of your data, split it into train, validation, and test sets, and then train again.






– Anh Phạm, answered Apr 4 '18 at 10:26 (edited Dec 9 '18 by Stephen Rauch)
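
As a minimal sketch of that advice, assuming the NewTraining and NewLabels arrays built in the question's preprocessing (the X_train[train] and y1[train] variables actually passed to fit are not shown being created), shuffling both arrays with the same permutation before fitting would look like this:

# Sketch only: shuffle features and labels together so that
# validation_split no longer takes a single-class tail of the data.
indices = numpy.random.permutation(len(NewTraining))
X_shuffled = NewTraining[indices]
y_shuffled = NewLabels[indices]

history = model.fit(X_shuffled, y_shuffled,
                    validation_split=0.30, epochs=2,
                    batch_size=128, verbose=2)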












  • I used (validation_split=0.30) in my code. Is that what you are referring to? @Anh Pham – Amy.Dj, Apr 4 '18 at 10:39










  • From keras.io/getting-started/faq/…: "If you set the validation_split argument in model.fit to e.g. 0.1, then the validation data used will be the last 10% of the data." My hypothesis is that the distribution of your data is not "the same" at the beginning and at the end ("the same" is loose phrasing, but hopefully you get the idea), so if the data are not shuffled, the training accuracy will be significantly different from the validation accuracy. – Anh Phạm, Apr 4 '18 at 12:10











  • Your model is fine; there must be a problem with the way you feed your data. Probably, as other people commented, your train and eval sets are differently distributed. I suggest you add the code where you handle the data so we can see whether we can help you. – TitoOrt, Apr 4 '18 at 17:57



















I had the same situation: high training accuracy and low val_acc.

It was caused by the validation_split parameter of Keras's model.fit.

This parameter takes the last section of the data as the validation set. So if your data is in order (e.g. all positive examples followed by all negative ones), the validation set will come from a single class. Try shuffling the training data first.






– 龔柏翰, answered Dec 9 '18 at 11:38 (edited Dec 9 '18 by Stephen Rauch)





















It seems that with validation_split, the validation accuracy does not behave as expected. Instead of using validation_split in your model's fit function, split your training data into training and validation sets yourself before calling fit, and then pass the validation data explicitly.

Instead of doing this:

history = model.fit(X_train[train], y1[train], validation_split=0.30, epochs=2, batch_size=128, verbose=2)

split your training data into training and validation sets by any method; say your validation data is (X_val, Y_val). Then replace the above line of code with this one:

history = model.fit(X_train[train], y1[train], validation_data=(X_val, Y_val), epochs=2, batch_size=128, verbose=2)





– Amruth Lakkavaram, answered Oct 25 '18 at 5:23 (edited Oct 25 '18 by Stephen Rauch)
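
For concreteness, one way to produce such an (X_val, Y_val) split is scikit-learn's train_test_split. This is only a sketch, assuming the NewTraining and NewLabels arrays from the question; the stratify and random_state arguments are illustrative choices:

# Sketch only: an explicit, shuffled, class-balanced train/validation split.
from sklearn.model_selection import train_test_split

X_tr, X_val, Y_tr, Y_val = train_test_split(
    NewTraining, NewLabels,
    test_size=0.30, shuffle=True, stratify=NewLabels, random_state=42)

history = model.fit(X_tr, Y_tr,
                    validation_data=(X_val, Y_val),
                    epochs=2, batch_size=128, verbose=2)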





















When a machine learning model has high training accuracy and very low validation accuracy, this is generally known as over-fitting. The reasons for it can be as follows:

1. The hypothesis function you are using is so complex that your model fits the training data almost perfectly but fails to do so on test/validation data.

2. The number of learnable parameters in your model is so large that, instead of generalizing from the examples, the model memorizes them, and hence it performs badly on test/validation data.

To address these problems, a number of solutions can be tried depending on your dataset:

1. Use a simpler cost and loss function.

2. Use regularization, which helps reduce over-fitting, e.g. Dropout (see the sketch after this answer).

3. Reduce the number of learnable parameters in your model.

These three solutions are the most likely to improve the validation accuracy of your model; if they still don't work, check whether your inputs have the right shapes and sizes.






– Syed Nauyan Rashid, answered Oct 25 '18 at 5:47 (edited Mar 21 by Blenzus)
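
As an illustration of point 2 above, here is a sketch of the question's model with Dropout added; the 0.5 rate and the placement of the layers are arbitrary choices, not something prescribed by this answer:

# Sketch only: the original architecture with Dropout for regularization.
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, Flatten, Dense, Dropout

model = Sequential()
model.add(Embedding(vocab_size, 100, input_length=max_words))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dropout(0.5))                     # drop half of the flattened activations while training
model.add(Dense(250, activation='relu'))
model.add(Dropout(0.5))                     # regularize the dense layer as well
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])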











