Why is my generator loss function increasing with iterations?





I'm trying to train a DC-GAN on the CIFAR-10 dataset. I'm using binary cross-entropy as the loss function for both the discriminator and the generator (the generator is trained through a stacked model with a non-trainable discriminator appended to it). If I train using the Adam optimizer, the GAN trains fine. But if I replace the optimizer with SGD, training goes haywire. The generator accuracy starts at some higher point and, with iterations, goes to 0 and stays there. The discriminator accuracy starts at some lower point and reaches somewhere around 0.5 (expected, right?). The peculiar thing is that the generator loss increases with iterations. I thought maybe the step size was too high, so I tried changing it, and I also tried using momentum with SGD. In all these cases the generator loss may or may not decrease in the beginning, but then it definitely increases. So I think there is something inherently wrong with my model. I know training deep models is difficult, and GANs even more so, but there has to be some reason/heuristic as to why this is happening. Any input is appreciated. I'm new to neural networks and deep learning, and hence new to GANs as well.
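
For concreteness, the optimizer variants described above map to Keras 2.x constructors roughly as follows. The exact learning rates and momentum values tried are not all listed in the question, so the numbers here are illustrative only:

from keras.optimizers import Adam, SGD

# Baseline that reportedly trains fine: Adam with beta_1 = 0.5, as in the usual DCGAN setup.
adam_optimizer = Adam(lr=0.0002, beta_1=0.5)

# Variants that reportedly diverge: plain SGD at a few step sizes, and SGD with momentum.
# The momentum value below is illustrative; the question does not state which one was used.
sgd_optimizer = SGD(lr=0.0002)
sgd_with_momentum = SGD(lr=0.0002, momentum=0.9, nesterov=True)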



Here is my code:
Cifar10Models.py



from keras import Sequential
from keras.initializers import TruncatedNormal
from keras.layers import Activation, BatchNormalization, Conv2D, Conv2DTranspose, Dense, Flatten, LeakyReLU, Reshape
from keras.optimizers import SGD


class DcGan:
    def __init__(self, print_model_summary: bool = False):
        self.generator_model = None
        self.discriminator_model = None
        self.concatenated_model = None
        self.print_model_summary = print_model_summary

    def build_generator_model(self):
        if self.generator_model:
            return self.generator_model

        self.generator_model = Sequential()
        self.generator_model.add(Dense(4 * 4 * 512, input_dim=100,
                                       kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))
        self.generator_model.add(Reshape((4, 4, 512)))

        self.generator_model.add(Conv2DTranspose(256, 3, strides=2, padding='same',
                                                 kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))

        self.generator_model.add(Conv2DTranspose(128, 3, strides=2, padding='same',
                                                 kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))

        self.generator_model.add(Conv2DTranspose(64, 3, strides=2, padding='same',
                                                 kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.generator_model.add(Activation('relu'))

        self.generator_model.add(Conv2D(3, 3, padding='same',
                                        kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(Activation('tanh'))

        if self.print_model_summary:
            self.generator_model.summary()

        return self.generator_model

    def build_discriminator_model(self):
        if self.discriminator_model:
            return self.discriminator_model

        self.discriminator_model = Sequential()
        self.discriminator_model.add(Conv2D(128, 3, strides=2, input_shape=(32, 32, 3), padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))

        self.discriminator_model.add(Conv2D(256, 3, strides=2, padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))

        self.discriminator_model.add(Conv2D(512, 3, strides=2, padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))

        self.discriminator_model.add(Conv2D(1024, 3, strides=2, padding='same',
                                            kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(LeakyReLU(alpha=0.2))

        self.discriminator_model.add(Flatten())
        self.discriminator_model.add(Dense(1, kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.02)))
        self.generator_model.add(BatchNormalization(momentum=0.5))
        self.discriminator_model.add(Activation('sigmoid'))

        if self.print_model_summary:
            self.discriminator_model.summary()

        return self.discriminator_model

    def build_concatenated_model(self):
        if self.concatenated_model:
            return self.concatenated_model

        self.concatenated_model = Sequential()
        self.concatenated_model.add(self.generator_model)
        self.concatenated_model.add(self.discriminator_model)

        if self.print_model_summary:
            self.concatenated_model.summary()

        return self.concatenated_model

    def build_dc_gan(self):
        self.build_generator_model()
        self.build_discriminator_model()
        self.build_concatenated_model()

        self.discriminator_model.trainable = True
        optimizer = SGD(lr=0.0002)
        self.discriminator_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
        self.discriminator_model.trainable = False
        optimizer = SGD(lr=0.0001)
        self.concatenated_model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
        self.discriminator_model.trainable = True
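
For readers unfamiliar with the trainable-flag pattern used in build_dc_gan above, here is a minimal usage sketch, assuming Keras 2.x semantics where the set of trainable weights of a model is fixed at compile time:

from Cifar10Models import DcGan

dc_gan = DcGan(print_model_summary=True)
dc_gan.build_dc_gan()
# dc_gan.discriminator_model was compiled while trainable, so training it directly
# updates the discriminator weights. dc_gan.concatenated_model was compiled while the
# discriminator was frozen, so training it only updates the generator weights.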


Cifar10Trainer.py:



# Based on https://towardsdatascience.com/gan-by-example-using-keras-on-tensorflow-backend-1a6d515a60d0

import os

import datetime
import numpy
import time
from keras.datasets import cifar10
from keras.utils import np_utils
from matplotlib import pyplot as plt

import Cifar10Models

log_file_name = 'logs.csv'


class Cifar10Trainer:
    def __init__(self):
        self.x_train, self.y_train = self.get_train_and_test_data()
        self.dc_gan = Cifar10Models.DcGan()
        self.dc_gan.build_dc_gan()

    @staticmethod
    def get_train_and_test_data():
        (x_train, y_train), _ = cifar10.load_data()
        x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 3)
        # Generator output has tanh activation whose range is [-1,1]
        x_train = (x_train.astype('float32') * 2 / 255) - 1
        y_train = np_utils.to_categorical(y_train, 10)
        return x_train, y_train

    def train(self, train_steps=10000, batch_size=128, log_interval=10, save_interval=100,
              output_folder_path='./Trained_Models/'):
        self.initialize_log(output_folder_path)
        self.sample_real_images(output_folder_path)
        for i in range(train_steps):
            # Get real (Database) Images
            images_real = self.x_train[numpy.random.randint(0, self.x_train.shape[0], size=batch_size), :, :, :]

            # Generate Fake Images
            noise = numpy.random.uniform(-1.0, 1.0, size=[batch_size, 100])
            images_fake = self.dc_gan.generator_model.predict(noise)

            # Train discriminator on both real and fake images
            x = numpy.concatenate((images_real, images_fake), axis=0)
            y = numpy.ones([2 * batch_size, 1])
            y[batch_size:, :] = 0
            d_loss = self.dc_gan.discriminator_model.train_on_batch(x, y)

            # Train generator i.e. concatenated model
            noise = numpy.random.uniform(-1.0, 1.0, size=[batch_size, 100])
            y = numpy.ones([batch_size, 1])
            g_loss = self.dc_gan.concatenated_model.train_on_batch(noise, y)

            # Print Logs, Save Models, generate sample images
            if (i + 1) % log_interval == 0:
                self.log_progress(output_folder_path, i + 1, g_loss, d_loss)
            if (i + 1) % save_interval == 0:
                self.save_models(output_folder_path, i + 1)
                self.generate_images(output_folder_path, i + 1)

    @staticmethod
    def initialize_log(output_folder_path):
        log_line = 'Iteration No, Generator Loss, Generator Accuracy, Discriminator Loss, Discriminator Accuracy, Time\n'
        with open(os.path.join(output_folder_path, log_file_name), 'w') as log_file:
            log_file.write(log_line)

    @staticmethod
    def log_progress(output_folder_path, iteration_no, g_loss, d_loss):
        log_line = '{0:05},{1:2.4f},{2:0.4f},{3:2.4f},{4:0.4f},{5}\n'.format(
            iteration_no, g_loss[0], g_loss[1], d_loss[0], d_loss[1],
            datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
        with open(os.path.join(output_folder_path, log_file_name), 'a') as log_file:
            log_file.write(log_line)
        print(log_line)

    def save_models(self, output_folder_path, iteration_no):
        self.dc_gan.generator_model.save(
            os.path.join(output_folder_path, 'generator_model_{0}.h5'.format(iteration_no)))
        self.dc_gan.discriminator_model.save(
            os.path.join(output_folder_path, 'discriminator_model_{0}.h5'.format(iteration_no)))
        self.dc_gan.concatenated_model.save(
            os.path.join(output_folder_path, 'concatenated_model_{0}.h5'.format(iteration_no)))

    def sample_real_images(self, output_folder_path):
        filepath = os.path.join(output_folder_path, 'CIFAR10_Sample_Real_Images.png')
        i = numpy.random.randint(0, self.x_train.shape[0], 16)
        images = self.x_train[i, :, :, :]
        plt.figure(figsize=(10, 10))
        for i in range(16):
            plt.subplot(4, 4, i + 1)
            image = images[i, :, :, :]
            image = numpy.reshape(image, [32, 32, 3])
            plt.imshow(image)
            plt.axis('off')
        plt.tight_layout()
        plt.savefig(filepath)
        plt.close('all')

    def generate_images(self, output_folder_path, iteration_no, noise=None):
        filepath = os.path.join(output_folder_path, 'CIFAR10_Gen_Image{0}.png'.format(iteration_no))
        if noise is None:
            noise = numpy.random.uniform(-1, 1, size=[16, 100])
        # Generator output has tanh activation whose range is [-1,1]
        images = (self.dc_gan.generator_model.predict(noise) + 1) / 2
        plt.figure(figsize=(10, 10))
        for i in range(16):
            plt.subplot(4, 4, i + 1)
            image = images[i, :, :, :]
            image = numpy.reshape(image, [32, 32, 3])
            plt.imshow(image)
            plt.axis('off')
        plt.tight_layout()
        plt.savefig(filepath)
        plt.close('all')


def main():
    cifar10_trainer = Cifar10Trainer()
    cifar10_trainer.train(train_steps=10000, log_interval=10, save_interval=100)
    del cifar10_trainer.dc_gan
    return


if __name__ == '__main__':
    start_time = time.time()
    print('Program Started at {0}'.format(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(start_time))))
    main()
    end_time = time.time()
    print('Program Ended at {0}'.format(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(end_time))))
    print('Total Execution Time: {0}s'.format(datetime.timedelta(seconds=end_time - start_time)))


Some of the graphs are as below:



  1. Discriminator Optimizer: Adam(lr=0.0001, beta1=0.5)
     Generator Optimizer: Adam(lr=0.0001, beta1=0.5)
     [loss/accuracy plot]

  2. Discriminator Optimizer: SGD(lr=0.0001)
     Generator Optimizer: SGD(lr=0.0001)
     [loss/accuracy plot]

  3. Discriminator Optimizer: SGD(lr=0.0001)
     Generator Optimizer: SGD(lr=0.001)
     [loss/accuracy plot]

  4. Discriminator Optimizer: SGD(lr=0.0001)
     Generator Optimizer: SGD(lr=0.0005)
     [loss/accuracy plot]


Note:

This question was originally asked on Stack Overflow and then re-asked here following suggestions there.



Edit1:

Adding some generated images for reference



  1. Images generated by the Adam optimizer
     [image grid]

  2. Images generated by the SGD optimizer
     [image grid]










Tags: python, deep-learning, keras, optimization, gan

asked Mar 24 at 5:18 by Nagabhushan S N (edited Mar 24 at 12:01)
          1 Answer
          I think that there are several issues with your model:



First of all, what you call the generator's loss is not really a generator loss. You have one binary cross-entropy loss function for the discriminator, and another binary cross-entropy loss function for the concatenated model, whose output is again the discriminator's output (on generated images).
The "generator loss" you are showing is the discriminator's loss when dealing with generated images. You want this loss to go up: it means that your model successfully generates images that your discriminator fails to catch (as can be seen in the overall discriminator accuracy, which sits at 0.5).
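
As a side note grounded in the training loop above: the concatenated model is trained with binary cross-entropy against a target of 1 on generated samples, so the plotted quantity is effectively the batch average of $-\log D(G(z))$ (up to Keras's internal clipping). If it helps to make the monitoring unambiguous, one could also log the raw discriminator score on a fresh batch of fakes. A minimal sketch using the models from the question (the helper name and default batch size are mine):

import numpy

def mean_discriminator_score_on_fakes(dc_gan, batch_size=128):
    # Average D(G(z)) over a fresh batch of noise: values near 1 mean the generator
    # currently fools the discriminator, values near 0 mean it does not.
    noise = numpy.random.uniform(-1.0, 1.0, size=[batch_size, 100])
    fake_images = dc_gan.generator_model.predict(noise)
    scores = dc_gan.discriminator_model.predict(fake_images)
    return float(scores.mean())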



Another issue is that you should add some generator regularization in the form of an actual generator loss ("generator objective function"). You can read about the different options in GAN Objective Functions: GANs and Their Variations.
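
To make this concrete, one variation covered by such surveys is a least-squares (LSGAN-style) objective. A rough sketch of swapping it into the question's setup follows; dc_gan is the DcGan instance built by build_dc_gan() in the question's code, and a faithful LSGAN would also replace the discriminator's final sigmoid with a linear output, so treat this purely as an illustration rather than a recommended fix:

from keras.optimizers import SGD

# Recompile only the stacked (generator + frozen discriminator) model with a
# least-squares objective; the training labels stay at 1 for generated samples.
dc_gan.discriminator_model.trainable = False
dc_gan.concatenated_model.compile(loss='mse', optimizer=SGD(lr=0.0001), metrics=['accuracy'])
dc_gan.discriminator_model.trainable = True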



A final issue that I see is that you are passing the generated images through a final hyperbolic tangent activation function, and I don't really understand why. The generator in your case is supposed to generate a "believable" CIFAR-10 image, which is a 32x32x3 tensor with values in the range [0,255] or [0,1], whereas your generator's output has a potential range of [-1,1] (as you state in your code).
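
For completeness, the scaling convention that usually pairs with a tanh output, and which the question's trainer already applies, looks roughly like this (a sketch; the commented line refers to the dc_gan and noise objects from the question's code):

import numpy
from keras.datasets import cifar10

(x_train, _), _ = cifar10.load_data()

# Real images: [0, 255] uint8 -> [-1, 1] float32, matching the generator's tanh range.
x_train = (x_train.astype('float32') * 2 / 255) - 1

# Generated images: map tanh output from [-1, 1] back to [0, 1] before plotting or saving,
# as the question's generate_images method already does:
# images_for_display = (dc_gan.generator_model.predict(noise) + 1) / 2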






• Happy 1K! Comments must be at least 15 characters in length. – Esmailian, Mar 24 at 8:26

• Thank you very much :) – Mark.F, Mar 24 at 10:15

• Thanks. 1. What I've defined as generator_loss is the binary cross-entropy between the discriminator output and the desired output, which is 1 while training the generator. Now, if my generator is able to fool the discriminator, the discriminator output should be close to 1, right? So the BCE value should decrease, right? Also, if you look at the first graph, where I used Adam instead of SGD, the loss didn't increase, and in that case the generated images are better. When using SGD, the generated images are noise. Since the generator accuracy is 0, the discriminator accuracy of 0.5 doesn't mean much. – Nagabhushan S N, Mar 24 at 11:54

• 2. I'll look into GAN objective functions; I was trying to implement the plain DCGAN paper. 3. I'm using the tanh function because the DC-GAN paper says so. Yes, even though tanh outputs are in the range [-1,1], if you look at the generate_images function in the Trainer.py file, I'm doing images = (self.dc_gan.generator_model.predict(noise) + 1) / 2, so this is a valid image, right? Also, as the first graph showed, even with this, using the Adam optimizer I'm getting good results. Using SGD causes this problem; the only change is that SGD is used instead of Adam, so there must be some problem with that. – Nagabhushan S N, Mar 24 at 11:58

• I've added some generated images for reference. Please check them as well. Again, thanks a lot for your time and suggestions. – Nagabhushan S N, Mar 24 at 12:02











          Your Answer





          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "557"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47875%2fwhy-is-my-generator-loss-function-increasing-with-iterations%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1












I think that there are several issues with your model:

First of all, your generator's loss is not really a generator loss. You have one binary cross-entropy loss function for the discriminator, and another binary cross-entropy loss function for the concatenated model, whose output is again the discriminator's output (this time on generated images). The "generator loss" you are showing is therefore the discriminator's loss when dealing with generated images. You want this loss to go up: it means that your model successfully generates images that your discriminator fails to catch (as can be seen in the overall discriminator accuracy, which sits at 0.5).



Another issue is that you should add some generator regularization in the form of an actual generator loss (a "generator objective function"). You can read about the different options in GAN Objective Functions: GANs and Their Variations.
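For reference (my addition, not a summary of the linked survey), the original GAN minimax objective and the non-saturating generator loss that most DCGAN implementations actually optimize are:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_\text{data}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$

$$L_G = -\,\mathbb{E}_{z \sim p_z}[\log D(G(z))]$$

Minimizing $L_G$ is what training the concatenated model with binary cross-entropy against a target of 1 effectively does; the survey above covers alternatives such as least-squares and Wasserstein objectives.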



A final issue that I see is that you are passing the generated images through a final hyperbolic tangent activation function, and I don't really understand why. The generator in your case is supposed to generate a "believable" CIFAR-10 image, which is a 32x32x3 tensor with values in the range [0, 255] or [0, 1], while your generator's output has a potential range of [-1, 1] (as you state in your code).
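For completeness, a common way to reconcile a tanh output layer with CIFAR-10 pixel values (and, per the comments below, what the question's generate_images function reportedly already does on the output side) is sketched here; the scaling constants are the usual convention, not taken from the question's code:

    import numpy as np
    from tensorflow.keras.datasets import cifar10

    # Train on images scaled to [-1, 1] so that real data matches the tanh range.
    (x_train, _), _ = cifar10.load_data()              # uint8 values in [0, 255]
    x_train = x_train.astype("float32") / 127.5 - 1.0  # now in [-1, 1]

    def to_displayable(generated):
        # Map tanh-range generator output back to [0, 1] for plotting/saving.
        return (generated + 1.0) / 2.0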






answered Mar 24 at 8:15 by Mark.F
          • Happy 1K! Comments must be at least 15 characters in length.
            – Esmailian, Mar 24 at 8:26






          • Thank you very much :)
            – Mark.F, Mar 24 at 10:15










          • Thanks. 1. What I've defined as generator_loss is the binary cross-entropy between the discriminator output and the desired output, which is 1 while training the generator. Now, if my generator is able to fool the discriminator, the discriminator output should be close to 1, right? So the BCE value should decrease, right? Also, if you look at the first graph, where I used Adam instead of SGD, the loss didn't increase, and in that case the generated images are better. When using SGD, the generated images are noise. Since the generator accuracy is 0, the discriminator accuracy of 0.5 doesn't mean much.
            – Nagabhushan S N, Mar 24 at 11:54










          • 2. I'll look into GAN objective functions; I was trying to implement the plain DCGAN paper. 3. I'm using the tanh function because the DCGAN paper says so. Yes, even though tanh outputs in the range [-1, 1], if you look at the generate_images function in the Trainer.py file, I'm doing this: images = (self.dc_gan.generator_model.predict(noise) + 1) / 2, so this is a valid image, right? Also, as the first graph showed, even with this I'm getting good results using the Adam optimizer; using SGD causes this problem. The only change is that SGD is used instead of Adam, so there must be some problem with that (the optimizer swap is sketched below, after this comment thread).
            – Nagabhushan S N, Mar 24 at 11:58










          • I've added some generated images for reference. Please check them as well. Again, thanks a lot for your time and suggestions.
            – Nagabhushan S N, Mar 24 at 12:02
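Since the comments above pin the difference entirely on the optimizer, here is a hedged sketch of that swap. The Adam settings are the ones the DCGAN paper recommends; the SGD learning rate and momentum are illustrative values to tune, and none of them come from the question's Trainer.py:

    from tensorflow.keras.optimizers import Adam, SGD

    adam_opt = Adam(learning_rate=2e-4, beta_1=0.5)  # DCGAN paper settings
    sgd_opt = SGD(learning_rate=1e-3, momentum=0.9)  # assumed values, tune per run

    # e.g., for the combined (generator -> frozen discriminator) model above:
    # combined.compile(optimizer=sgd_opt, loss="binary_crossentropy")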














