I have designed a rudimentary network to evaluate draughts (checkers) positions. Is the convergence too slow or is my model too simple?




The data is in the format of a board with 'my'/'your' pieces, and it is 'my' turn to move. Specifically, the board is 8x4 and there are 4 piece types, so a position is represented as a 4x8x4 one-hot tensor. Then there is a layer of 2**13 hidden neurons, each connected to every input of the 4x8x4 tensor, with ReLU activation. Then there is an output layer of 3 neurons (L/D/W), each connected to every hidden neuron, with no activation function. The cost of a prediction is (L-is_lost)**2 + (D-is_drawn)**2 + (W-is_won)**2.



I have implemented this in numpy. I am new to machine learning and I wanted to start at a low level of abstraction. This is the code:



import numpy as np

def relu(x):
    if x < 0:
        return 0
    return x

def relu_prime(x):
    if x < 0:
        return 0
    return 1

class NN():
    def __init__(self, hidden_weights, hidden_biases, output_weights, output_biases):
        self.hidden_weights = hidden_weights
        self.hidden_biases = hidden_biases
        self.output_weights = output_weights
        self.output_biases = output_biases

    def forward(self, tensor):
        hidden_neuron_count = 2**13  # not a parameter

        # Pre-activation of each hidden neuron: dot product of its weight block
        # with the 4x8x4 board tensor, plus its bias.
        hidden_neuron_activations = []
        for f in range(hidden_neuron_count):
            activation = np.sum(self.hidden_weights[f] * tensor) + self.hidden_biases[f]
            hidden_neuron_activations.append(activation)

        hidden_neuron_activations = np.array(hidden_neuron_activations)
        # Each output neuron is a linear combination of the ReLU'd hidden activations.
        output_activations = [np.sum(list(map(relu, hidden_neuron_activations)) * self.output_weights[:, result_output]) + self.output_biases[result_output] for result_output in range(3)]

        return (hidden_neuron_activations, output_activations)

    def train_epoch(self, tensors, labels, lr_weights, lr_biases):
        hidden_neuron_count = 2**13  # not a parameter

        # Gradient accumulators for the whole batch.
        d_output_weights = np.zeros((hidden_neuron_count, 3))
        d_output_biases = np.zeros(3)
        d_hidden_weights = np.zeros((hidden_neuron_count, 4, 8, 4))
        d_hidden_biases = np.zeros(hidden_neuron_count)

        batch_size = len(tensors)

        for tensor, label in zip(tensors, labels):
            results = self.forward(tensor)
            hidden_neuron_activations = results[0]
            output_activations = results[1]

            # Derivative of the squared-error cost w.r.t. each output activation,
            # averaged over the batch.
            Eo = np.array([2.0 / batch_size * (output_activations[result_output] - float(int(result_output - 1) == int(label))) for result_output in range(3)])

            d_output_biases += Eo
            d_output_weights += np.array([np.multiply(Eo, relu(hidden_neuron_activations[neuron])) for neuron in range(hidden_neuron_count)])

            d_hidden_biases += np.array([sum(np.multiply(Eo * self.output_weights[neuron], relu_prime(hidden_neuron_activations[neuron]))) for neuron in range(hidden_neuron_count)])
            d_hidden_weights += np.array([sum([np.multiply(tensor, Eo[j] * self.output_weights[neuron][j] * relu_prime(hidden_neuron_activations[neuron])) for j in range(3)]) for neuron in range(hidden_neuron_count)])

        # Apply the accumulated gradients once per epoch.
        self.output_weights -= lr_weights * d_output_weights
        self.output_biases -= lr_biases * d_output_biases
        self.hidden_weights -= lr_weights * d_hidden_weights
        self.hidden_biases -= lr_biases * d_hidden_biases

    def show(self, tensors, labels):
        total_cost = 0
        batch_size = len(tensors)

        for tensor, label in zip(tensors, labels):
            activations = self.forward(tensor)
            output_activations = activations[1]
            cost = sum([(float(int(result_output - 1) == int(label)) - output_activations[result_output])**2 for result_output in range(3)])
            total_cost += cost
            print([output_activations[result_output] for result_output in range(3)], cost, label)

        total_cost /= batch_size
        print('Average-cost: {}'.format(total_cost))
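
As an aside, I realize the per-neuron Python loops above are slow. The forward pass can be collapsed into two matrix products; this is only a rough sketch (the name forward_vectorized is my own, and it assumes the shapes used above: hidden_weights is (2**13, 4, 8, 4) and output_weights is (2**13, 3)):

import numpy as np

def forward_vectorized(hidden_weights, hidden_biases, output_weights, output_biases, tensor):
    # Flatten the 4x8x4 board and the per-neuron weight blocks so the hidden
    # layer becomes a single matrix-vector product.
    x = tensor.reshape(-1)                                    # (128,)
    W1 = hidden_weights.reshape(hidden_weights.shape[0], -1)  # (8192, 128)
    hidden = W1 @ x + hidden_biases                           # hidden pre-activations, (8192,)
    hidden_relu = np.maximum(hidden, 0)                       # ReLU
    outputs = hidden_relu @ output_weights + output_biases    # L/D/W scores, (3,)
    return hidden, outputs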


I am feeding it endgame positions; I've generated ~8,000 positions to train on. The error has decreased over 40 epochs (each epoch feeds it about 1/15 of the data) from 6,000 to about 1,500. At this point I am doubtful that tuning the learning rate (I use double the rate for the biases) or giving it more data will improve this score. I also believe I would have more success with sigmoid activations on both layers. Basically, should I press on, or is this problem intractable for a network this small? I have a GPU that supports an old CUDA version, so I will soon try something more ambitious in PyTorch.
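
For the PyTorch attempt, my plan is simply the same two-layer architecture. The following is only a sketch under the assumptions stated above (4x8x4 one-hot input, 2**13 hidden ReLU units, 3 linear outputs, squared-error cost), and the class name CheckersNet is made up:

import torch
import torch.nn as nn

class CheckersNet(nn.Module):
    def __init__(self, hidden=2**13):
        super().__init__()
        self.hidden = nn.Linear(4 * 8 * 4, hidden)  # fully connected to the flattened 4x8x4 board
        self.output = nn.Linear(hidden, 3)          # L/D/W scores, no activation

    def forward(self, x):
        x = x.view(x.size(0), -1)                   # flatten each 4x8x4 board
        return self.output(torch.relu(self.hidden(x)))

# Usage sketch with the same squared-error cost and plain SGD:
# model = CheckersNet()
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
# loss = ((model(boards) - targets) ** 2).sum(dim=1).mean()
# loss.backward(); optimizer.step(); optimizer.zero_grad()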









