K.gradients gives type error where both arguments are tensors
On lines 142 and 143 of https://github.com/nyck33/openai_spinup_my_implements/blob/master/continuous/mountaincar/my_ddpg_ac.py I have:

    self.get_action_gradients = K.function(
        inputs=[self.model.input[0], self.model.input[1], K.learning_phase()],
        outputs=[action_gradients])

which gives me:

    line 143, in build_model
        K.learning_phase()], outputs=[action_gradients])
    TypeError: Can not convert a list into a Tensor or Operation.
action_gradients is calculated on line 140 via:

    action_gradients = K.gradients(Q_value, actions)

so I did not think that was the problem. But when I take the brackets off of the outputs argument of K.function, like so:

    self.get_action_gradients = K.function(
        inputs=[*self.model.input, K.learning_phase()],
        outputs=action_gradients)
I now get a slightly different error, mentioning NoneType rather than list:

    rning-copied/_1my_imps/continuous/mountaincar/my_ddpg_ac.py", line 143, in build_model
    TypeError: Can not convert a NoneType into a Tensor or Operation.
Printing out the Q-value and actions shows that they are both tensors:
Q_values Tensor("q_values/BiasAdd:0", shape=(?, 1), dtype=float32) actions Tensor("actions:0", shape=(?, 1), dtype=float32)
But printing out action_gradients and type(action_gradients) just confuses me more:

    action_gradients [None]
    action_gradients type <class 'list'>

Calling K.gradients() on two tensors should work, shouldn't it?
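(For context: K.gradients, which wraps tf.gradients, returns [None] for any target tensor that the source tensor does not actually depend on, so a [None] entry usually means the two tensors are not connected in the graph. A minimal sketch illustrating both cases, assuming a TF1-style Keras backend and not taken from the repo:)

    from keras import backend as K

    x = K.placeholder(shape=(None, 1))
    y = K.placeholder(shape=(None, 1))
    loss = K.sum(K.square(x))        # depends on x only, never touches y

    print(K.gradients(loss, x))      # [<tf.Tensor ...>]  -- connected graph
    print(K.gradients(loss, y))      # [None]             -- y does not feed into loss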
This DDPG code is originally from https://github.com/nyck33/autonomous_quadcopter and I am trying to adapt it for MountainCarContinuous-v0.
Tags: keras
asked Mar 29 at 17:30 by flexitarian33, edited Apr 4 at 12:39
1 Answer
I am very sorry, but this was a careless error. In the critic network I had separate pathways for state and action, joining them right before the Q-value output layer, but I actually used state as the input to both pathways rather than action for the action pathway. So of course, if action is not present anywhere in the network, there is no way to use the chain rule to get dQ(s,a)/da, the gradient of the output with respect to the actions.
Here is the code for clarification:
    def build_model(self):
        #lrelu = LeakyReLU(alpha=0.1)
        # Define input layers
        states = layers.Input(shape=(self.state_size,), name="states")
        actions = layers.Input(shape=(self.action_size,), name="actions")

        # Add hidden layers for the state pathway
        net_states = layers.Dense(units=32, use_bias=False)(states)
        net_states = layers.BatchNormalization()(net_states)
        net_states = layers.LeakyReLU(alpha=0.1)(net_states)
        net_states = layers.Dense(units=64)(net_states)
        net_states = layers.BatchNormalization()(net_states)
        net_states = layers.LeakyReLU(alpha=0.1)(net_states)

        # Hidden layers for the action pathway
        net_actions = layers.Dense(units=32)(actions)  # had (states) here instead
        net_actions = layers.BatchNormalization()(net_actions)
        net_actions = layers.LeakyReLU(alpha=0.1)(net_actions)
        net_actions = layers.Dense(units=64)(net_actions)
        net_actions = layers.BatchNormalization()(net_actions)
        net_actions = layers.LeakyReLU(alpha=0.1)(net_actions)
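        # ... the two pathways are then joined and followed by the Q-value output layer (omitted here)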
and the console output shows that the gradient is no longer None:
action_gradients [<tf.Tensor 'gradients_3/dense_13/MatMul_grad/MatMul:0' shape=(?, 1) dtype=float32>]
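With the graph actually connected, the original K.function call works. A minimal sketch of how the gradient function can then be built and called (assuming the TF1-style Keras backend and the variable names used above; not the exact repo code, and states_batch/actions_batch are placeholder arrays):

    from keras import backend as K

    # K.gradients returns a list with one tensor per target, here [dQ/da]
    action_gradients = K.gradients(Q_value, actions)

    # action_gradients is already a list, so it can be passed straight to outputs=
    get_action_gradients = K.function(
        inputs=[states, actions, K.learning_phase()],
        outputs=action_gradients)

    # Usage: learning phase 0 = test mode; returns a list with one array
    # of shape (batch_size, action_size)
    # grads = get_action_gradients([states_batch, actions_batch, 0])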
Edit: The code works for the MountainCarContinuous-v0 environment of OpenAI Gym with a modified reward function (incremental, based on the car's position, because the final reward is so sparse); a score over 90 is reached within about 11 episodes. I rendered the environment and actually saw the car make it to the top on the attempts scoring over 100 below.
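For illustration, a hypothetical sketch of the kind of incremental, position-based reward shaping described above, written as a Gym RewardWrapper (the exact shaping function used in the repo may differ):

    import gym

    class ShapedMountainCar(gym.RewardWrapper):
        """Adds a small bonus proportional to how far right the car has moved."""
        def reward(self, reward):
            position = self.env.unwrapped.state[0]   # car's x-position, roughly in [-1.2, 0.6]
            return reward + 0.1 * (position + 0.5)   # ~0 bonus near the valley bottom (-0.5)

    env = ShapedMountainCar(gym.make("MountainCarContinuous-v0"))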
Here are the results:
episode: 0 score: -41.59034754863541 mean: -41.59 std: 0.0
episode: 1 score: -75.87242108451743 mean: -58.73 std: 17.14
episode: 2 score: 32.01844640263133 mean: -28.48 std: 45.01
episode: 3 score: 132.904319261567 mean: 11.86 std: 80.02
episode: 4 score: 125.81198290946529 mean: 34.65 std: 84.85
episode: 5 score: 84.26163480413017 mean: 42.92 std: 79.63
episode: 6 score: 126.89684490110164 mean: 54.92 std: 79.37
episode: 7 score: 139.18190524840517 mean: 65.45 std: 79.3
episode: 8 score: 100.24481691450521 mean: 69.32 std: 75.56
episode: 9 score: 165.80286734425076 mean: 78.97 std: 77.31
episode: 10 score: 109.29507292352991 mean: 94.05 std: 66.24
episode: 11 score: 209.07900825070152 mean: 122.55 std: 44.84
Source code here: https://github.com/nyck33/openai_my_implements/tree/master/continuous/mountaincar
answered Apr 5 at 5:37 by flexitarian33, edited Apr 5 at 6:56