Why is my generator loss function increasing with iterations?
What does "Its cash flow is deeply negative" mean?
Is it possible to replace duplicates of a character with one character using tr
No sign flipping while figuring out the emf of voltaic cell?
Is there a way to bypass a component in series in a circuit if that component fails?
Is micro rebar a better way to reinforce concrete than rebar?
Would this house-rule that treats advantage as a +1 to the roll instead (and disadvantage as -1) and allows them to stack be balanced?
Is it ever safe to open a suspicious HTML file (e.g. email attachment)?
Inappropriate reference requests from Journal reviewers
If a black hole is created from light, can this black hole then move at the speed of light?
How to scale a tikZ image which is within a figure environment
Why does standard notation not preserve intervals (visually)
Why is information "lost" when it got into a black hole?
What happened in Rome, when the western empire "fell"?
How to get the end in algorithm2e
What connection does MS Office have to Netscape Navigator?
Measuring resistivity of dielectric liquid
Plot of histogram similar to output from @risk
Example of a Mathematician/Physicist whose Other Publications during their PhD eclipsed their PhD Thesis
Is there a difference between "Fahrstuhl" and "Aufzug"
A Man With a Stainless Steel Endoskeleton (like The Terminator) Fighting Cloaked Aliens Only He Can See
is it ok to reduce charging current for li ion 18650 battery?
Why don't programming languages automatically manage the synchronous/asynchronous problem?
Can MTA send mail via a relay without being told so?
Does soap repel water?
I'm trying to train a DC-GAN on the CIFAR-10 dataset. I'm using binary cross-entropy as the loss function for both the discriminator and the generator (the generator is trained through the concatenated model, with the discriminator's weights frozen). If I train using the Adam optimizer, the GAN trains fine. But if I replace the optimizer with SGD, the training goes haywire. The generator accuracy starts at some higher point and, over iterations, drops to 0 and stays there. The discriminator accuracy starts at some lower point and reaches somewhere around 0.5 (expected, right?). The peculiar thing is that the generator loss increases with iterations.

I thought maybe the step size was too high, so I tried changing it, and I also tried SGD with momentum. In all these cases, the generator loss may or may not decrease at the beginning, but it eventually increases for sure. So I think there is something inherently wrong with my model. I know training deep models is difficult, and GANs even more so, but there has to be some reason/heuristic as to why this is happening. Any inputs are appreciated. I'm new to neural networks and deep learning, and hence new to GANs as well.
Here is my code:
Cifar10Models.py
from keras import Sequential
from keras.initializers import TruncatedNormal
from keras.layers import Activation, BatchNormalization, Conv2D, Conv2DTranspose, Dense, Flatten, LeakyReLU, Reshape
from keras.optimizers import SGD


class DcGan:
    def __init__(self, print_model_summary: bool = False):
        self.generator_model = None
        self.discriminator_model = None
        self.concatenated_model = None
        self.print_model_summary = print_model_summary

    def build_generator_model(self):
        if self.generator_model:
            return self.generator_model
        self.generator_model = Sequential()
        self.generator_model.add(Dense(4 * 4 * 512, input_dim=100,
                                       kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))
        self.generator_model.add(Reshape((4, 4, 512)))
        self.generator_model.add(Conv2DTranspose(256, 3, strides=2, padding='same',
                                                 kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))
        self.generator_model.add(Conv2DTranspose(128, 3, strides=2, padding='same',
                                                 kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))
        self.generator_model.add(Conv2DTranspose(64, 3, strides=2, padding='same',
                                                 kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))
        self.generator_model.add(Conv2D(3, 3, padding='same',
                                        kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(Activation('tanh'))
        if self.print_model_summary:
            self.generator_model.summary()
        return self.generator_model

    def build_discriminator_model(self):
        if self.discriminator_model:
            return self.discriminator_model
        self.discriminator_model = Sequential()
        self.discriminator_model.add(Conv2D(128, 3, strides=2, input_shape=(32, 32, 3), padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))
        self.discriminator_model.add(Conv2D(256, 3, strides=2, padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))
        self.discriminator_model.add(Conv2D(512, 3, strides=2, padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))
        self.discriminator_model.add(Conv2D(1024, 3, strides=2, padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))
        self.discriminator_model.add(Flatten())
        self.discriminator_model.add(Dense(1, kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(Activation('sigmoid'))
        if self.print_model_summary:
            self.discriminator_model.summary()
        return self.discriminator_model

    def build_concatenated_model(self):
        if self.concatenated_model:
            return self.concatenated_model
        self.concatenated_model = Sequential()
        self.concatenated_model.add(self.generator_model)
        self.concatenated_model.add(self.discriminator_model)
        if self.print_model_summary:
            self.concatenated_model.summary()
        return self.concatenated_model

    def build_dc_gan(self):
        self.build_generator_model()
        self.build_discriminator_model()
        self.build_concatenated_model()
        self.discriminator_model.trainable = True
        optimizer = SGD(lr=0.0002)
        self.discriminator_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
        self.discriminator_model.trainable = False
        optimizer = SGD(lr=0.0001)
        self.concatenated_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
        self.discriminator_model.trainable = True
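For reference, the SGD-with-momentum variant mentioned above only needs a change to the optimizer construction inside build_dc_gan. A minimal sketch (the momentum value 0.9 is only an illustrative placeholder, not necessarily what was used in the runs):

    # Illustrative sketch: same compile calls as above, with momentum added to SGD
    optimizer = SGD(lr=0.0002, momentum=0.9)
    self.discriminator_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    self.discriminator_model.trainable = False
    optimizer = SGD(lr=0.0001, momentum=0.9)
    self.concatenated_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])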
Cifar10Trainer.py:
# Based on https://towardsdatascience.com/gan-by-example-using-keras-on-tensorflow-backend-1a6d515a60d0
import os
import datetime
import numpy
import time
from keras.datasets import cifar10
from keras.utils import np_utils
from matplotlib import pyplot as plt

import Cifar10Models

log_file_name = 'logs.csv'


class Cifar10Trainer:
    def __init__(self):
        self.x_train, self.y_train = self.get_train_and_test_data()
        self.dc_gan = Cifar10Models.DcGan()
        self.dc_gan.build_dc_gan()

    @staticmethod
    def get_train_and_test_data():
        (x_train, y_train), _ = cifar10.load_data()
        x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 3)
        # Generator output has tanh activation whose range is [-1,1]
        x_train = (x_train.astype('float32') * 2 / 255) - 1
        y_train = np_utils.to_categorical(y_train, 10)
        return x_train, y_train

    def train(self, train_steps=10000, batch_size=128, log_interval=10, save_interval=100,
              output_folder_path='./Trained_Models/'):
        self.initialize_log(output_folder_path)
        self.sample_real_images(output_folder_path)
        for i in range(train_steps):
            # Get real (Database) Images
            images_real = self.x_train[numpy.random.randint(0, self.x_train.shape[0], size=batch_size), :, :, :]
            # Generate Fake Images
            noise = numpy.random.uniform(-1.0, 1.0, size=[batch_size, 100])
            images_fake = self.dc_gan.generator_model.predict(noise)
            # Train discriminator on both real and fake images
            x = numpy.concatenate((images_real, images_fake), axis=0)
            y = numpy.ones([2 * batch_size, 1])
            y[batch_size:, :] = 0
            d_loss = self.dc_gan.discriminator_model.train_on_batch(x, y)
            # Train generator i.e. concatenated model
            noise = numpy.random.uniform(-1.0, 1.0, size=[batch_size, 100])
            y = numpy.ones([batch_size, 1])
            g_loss = self.dc_gan.concatenated_model.train_on_batch(noise, y)
            # Print Logs, Save Models, generate sample images
            if (i + 1) % log_interval == 0:
                self.log_progress(output_folder_path, i + 1, g_loss, d_loss)
            if (i + 1) % save_interval == 0:
                self.save_models(output_folder_path, i + 1)
                self.generate_images(output_folder_path, i + 1)

    @staticmethod
    def initialize_log(output_folder_path):
        log_line = 'Iteration No, Generator Loss, Generator Accuracy, Discriminator Loss, Discriminator Accuracy, ' \
                   'Time\n'
        with open(os.path.join(output_folder_path, log_file_name), 'w') as log_file:
            log_file.write(log_line)

    @staticmethod
    def log_progress(output_folder_path, iteration_no, g_loss, d_loss):
        log_line = '{0:05},{1:2.4f},{2:0.4f},{3:2.4f},{4:0.4f},{5}\n' \
            .format(iteration_no, g_loss[0], g_loss[1], d_loss[0], d_loss[1],
                    datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
        with open(os.path.join(output_folder_path, log_file_name), 'a') as log_file:
            log_file.write(log_line)
        print(log_line)

    def save_models(self, output_folder_path, iteration_no):
        self.dc_gan.generator_model.save(
            os.path.join(output_folder_path, 'generator_model_{0}.h5'.format(iteration_no)))
        self.dc_gan.discriminator_model.save(
            os.path.join(output_folder_path, 'discriminator_model_{0}.h5'.format(iteration_no)))
        self.dc_gan.concatenated_model.save(
            os.path.join(output_folder_path, 'concatenated_model_{0}.h5'.format(iteration_no)))

    def sample_real_images(self, output_folder_path):
        filepath = os.path.join(output_folder_path, 'CIFAR10_Sample_Real_Images.png')
        i = numpy.random.randint(0, self.x_train.shape[0], 16)
        images = self.x_train[i, :, :, :]
        plt.figure(figsize=(10, 10))
        for i in range(16):
            plt.subplot(4, 4, i + 1)
            image = images[i, :, :, :]
            image = numpy.reshape(image, [32, 32, 3])
            plt.imshow(image)
            plt.axis('off')
        plt.tight_layout()
        plt.savefig(filepath)
        plt.close('all')

    def generate_images(self, output_folder_path, iteration_no, noise=None):
        filepath = os.path.join(output_folder_path, 'CIFAR10_Gen_Image{0}.png'.format(iteration_no))
        if noise is None:
            noise = numpy.random.uniform(-1, 1, size=[16, 100])
        # Generator output has tanh activation whose range is [-1,1]
        images = (self.dc_gan.generator_model.predict(noise) + 1) / 2
        plt.figure(figsize=(10, 10))
        for i in range(16):
            plt.subplot(4, 4, i + 1)
            image = images[i, :, :, :]
            image = numpy.reshape(image, [32, 32, 3])
            plt.imshow(image)
            plt.axis('off')
        plt.tight_layout()
        plt.savefig(filepath)
        plt.close('all')


def main():
    cifar10_trainer = Cifar10Trainer()
    cifar10_trainer.train(train_steps=10000, log_interval=10, save_interval=100)
    del cifar10_trainer.dc_gan
    return


if __name__ == '__main__':
    start_time = time.time()
    print('Program Started at {0}'.format(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(start_time))))
    main()
    end_time = time.time()
    print('Program Ended at {0}'.format(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(end_time))))
    print('Total Execution Time: {0}s'.format(datetime.timedelta(seconds=end_time - start_time)))
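The curves shown below come from the values written to logs.csv by initialize_log/log_progress above. A minimal sketch of one way to plot them (this uses pandas, which is not part of the code above; the column names are taken from the header written in initialize_log):

    import pandas
    from matplotlib import pyplot as plt

    # Read the training log produced by Cifar10Trainer; strip the spaces after the commas in the header
    log = pandas.read_csv('./Trained_Models/logs.csv', skipinitialspace=True)
    plt.plot(log['Iteration No'], log['Generator Loss'], label='Generator Loss')
    plt.plot(log['Iteration No'], log['Discriminator Loss'], label='Discriminator Loss')
    plt.xlabel('Iteration')
    plt.ylabel('Binary cross-entropy')
    plt.legend()
    plt.show()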
Some of the graphs are below (loss/accuracy curves; only the captions are reproduced here):

Discriminator Optimizer: Adam(lr=0.0001, beta1=0.5); Generator Optimizer: Adam(lr=0.0001, beta1=0.5)
Discriminator Optimizer: SGD(lr=0.0001); Generator Optimizer: SGD(lr=0.0001)
Discriminator Optimizer: SGD(lr=0.0001); Generator Optimizer: SGD(lr=0.001)
Discriminator Optimizer: SGD(lr=0.0001); Generator Optimizer: SGD(lr=0.0005)
Note:
This question was originally asked on Stack Overflow and then re-asked here, as per suggestions there.
Edit1:
Adding some generated images for reference
Images generated by Adam Optimizer
Images generated by SGD Optimizer
python deep-learning keras optimization gan
asked Mar 24 at 5:18 by Nagabhushan S N, edited Mar 24 at 12:01
1 Answer
I think that there are several issues with your model:
First of all, your generator's loss is not really a generator loss. You have one binary cross-entropy loss function for the discriminator, and you have another binary cross-entropy loss function for the concatenated model, whose output is again the discriminator's output (on generated images).
The "generator loss" you are showing is the discriminator's loss when dealing with generated images. You want this loss to go up; it means that your model successfully generates images that your discriminator fails to catch (as can be seen in the overall discriminator accuracy, which is at 0.5).
Another issue is that you should add some generator regularization in the form of an actual generator loss ("generator objective function"). You can read about the different options in GAN Objective Functions: GANs and Their Variations.
A final issue that I see is that you are passing the generated images through a final hyperbolic tangent activation function, and I don't really understand why. The generator in your case is supposed to generate a "believable" CIFAR-10 image, which is a 32x32x3 tensor with values in the range [0,255] or [0,1], while your generator's output has a potential range of [-1,1] (as you state in your code).
answered Mar 24 at 8:15 by Mark.F
– Esmailian (Mar 24 at 8:26): Happy 1K! Comments must be at least 15 characters in length.

– Mark.F (Mar 24 at 10:15): Thank you very much :)

– Nagabhushan S N (Mar 24 at 11:54): Thanks. 1. What I've defined as generator_loss is the binary cross-entropy between the discriminator output and the desired output, which is 1 while training the generator. Now, if my generator is able to fool the discriminator, then the discriminator output should be close to 1, right? So the BCE value should decrease, right? Also, if you see the first graph, where I've used Adam instead of SGD, the loss didn't increase. In that case, the generated images are better. When using SGD, the generated images are noise. Since the generator accuracy is 0, the discriminator accuracy of 0.5 doesn't mean much.

– Nagabhushan S N (Mar 24 at 11:58): 2. I'll look into GAN objective functions. I was trying to implement the plain DCGAN paper. 3. I'm using the tanh function because the DC-GAN paper says so. Yes, even though tanh outputs in the range [-1,1], if you see the generate_images function in the Trainer.py file, I'm doing this: images = (self.dc_gan.generator_model.predict(noise) + 1) / 2. So this is a valid image, right? Also, as the first graph shows, even with this, using the Adam optimizer I'm getting good results. Using SGD causes this problem. The only change is SGD instead of Adam. So there must be some problem with this.

– Nagabhushan S N (Mar 24 at 12:02): I've added some generated images for reference. Please check them as well. Again, thanks a lot for your time and suggestions.
Your Answer
StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
);
);
, "mathjax-editing");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "557"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47875%2fwhy-is-my-generator-loss-function-increasing-with-iterations%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
I think that there are several issues with your model:
First of all - Your generator's loss is not the generator's loss. You have on binary cross-entropy loss function for the discriminator, and you have another binary cross-entropy loss function for the concatenated model whose output is again the discriminator's output (on generated images).
The "generator loss" you are showing is the discriminator's loss when dealing with generated images. You want this loss to go up, it means that your model successfully generates images that you discriminator fails to catch (as can be seen in the overall discriminator's accuracy which is at 0.5).
Another issue, is that you should add some generator regularization in the form of an actual generator loss ("generator objective function"). You can read about the different options in GAN Objective Functions: GANs and Their Variations.
A final issue that I see is that you are passing the generated images thru a final hyperbolic tangent activation function, and I don't really understand why? The generator in your case is supposed to generate a "believable" CIFAR10 image, which is a 32x32x3 tensor with values in the range [0,255] or [0,1]. Your generator's output has a potential range of [-1,1] (as you state in your code).
$endgroup$
$begingroup$
Happy 1K! Comments must be at least 15 characters in length.
$endgroup$
– Esmailian
Mar 24 at 8:26
1
$begingroup$
Thank you very much :)
$endgroup$
– Mark.F
Mar 24 at 10:15
$begingroup$
Thanks. 1. What I've defined as generator_loss, it is the binary cross entropy between the discriminator output and the desired output, which is 1 while training generator. Now, if my generator is able to fool the discriminator, then discriminator output should be close to 1, right?. So, the bce value should decrease. Right? Also, if you see the first graph where I've used Adam instead of SGD, the loss didn't increase. In that case, the generated images are better. When using SGD, the generated images are noise. Since generator accuracy is 0, the discriminator accuracy of 0.5 doesn't mean much
$endgroup$
– Nagabhushan S N
Mar 24 at 11:54
$begingroup$
2. I'll look into GAN objective functions. I was trying to implement plain DCGAN paper. 3. I'm using tanh function because DC-GAN paper says so. Yes, even though tanh outputs in the range [-1,1], if you see the generate_images function in Trainer.py file, I'm doing this:images = (self.dc_gan.generator_model.predict(noise) + 1) / 2
So, this is a valid Image right? Also, like the first graph showed, even with this, using Adam Optimizer, I'm getting good results. Using SGD is causes this problem. The only change is SGD is used instead of Adam. So, there must be some problem with this.
$endgroup$
– Nagabhushan S N
Mar 24 at 11:58
$begingroup$
I've added some generated images for reference. Please check them as well. Again, thanks a lot for your time and suggestions
$endgroup$
– Nagabhushan S N
Mar 24 at 12:02
add a comment |
$begingroup$
I think that there are several issues with your model:
First of all - Your generator's loss is not the generator's loss. You have on binary cross-entropy loss function for the discriminator, and you have another binary cross-entropy loss function for the concatenated model whose output is again the discriminator's output (on generated images).
The "generator loss" you are showing is the discriminator's loss when dealing with generated images. You want this loss to go up, it means that your model successfully generates images that you discriminator fails to catch (as can be seen in the overall discriminator's accuracy which is at 0.5).
Another issue, is that you should add some generator regularization in the form of an actual generator loss ("generator objective function"). You can read about the different options in GAN Objective Functions: GANs and Their Variations.
A final issue that I see is that you are passing the generated images thru a final hyperbolic tangent activation function, and I don't really understand why? The generator in your case is supposed to generate a "believable" CIFAR10 image, which is a 32x32x3 tensor with values in the range [0,255] or [0,1]. Your generator's output has a potential range of [-1,1] (as you state in your code).
$endgroup$
$begingroup$
Happy 1K! Comments must be at least 15 characters in length.
$endgroup$
– Esmailian
Mar 24 at 8:26
1
$begingroup$
Thank you very much :)
$endgroup$
– Mark.F
Mar 24 at 10:15
$begingroup$
Thanks. 1. What I've defined as generator_loss, it is the binary cross entropy between the discriminator output and the desired output, which is 1 while training generator. Now, if my generator is able to fool the discriminator, then discriminator output should be close to 1, right?. So, the bce value should decrease. Right? Also, if you see the first graph where I've used Adam instead of SGD, the loss didn't increase. In that case, the generated images are better. When using SGD, the generated images are noise. Since generator accuracy is 0, the discriminator accuracy of 0.5 doesn't mean much
$endgroup$
– Nagabhushan S N
Mar 24 at 11:54
$begingroup$
2. I'll look into GAN objective functions. I was trying to implement plain DCGAN paper. 3. I'm using tanh function because DC-GAN paper says so. Yes, even though tanh outputs in the range [-1,1], if you see the generate_images function in Trainer.py file, I'm doing this:images = (self.dc_gan.generator_model.predict(noise) + 1) / 2
So, this is a valid Image right? Also, like the first graph showed, even with this, using Adam Optimizer, I'm getting good results. Using SGD is causes this problem. The only change is SGD is used instead of Adam. So, there must be some problem with this.
$endgroup$
– Nagabhushan S N
Mar 24 at 11:58
$begingroup$
I've added some generated images for reference. Please check them as well. Again, thanks a lot for your time and suggestions
$endgroup$
– Nagabhushan S N
Mar 24 at 12:02