Using a discriminator to distinguish ground truth and predicted boxes for FRCNN
We have implemented an object detection framework in Keras based on the Faster R-CNN (FRCNN) model. We would now like to find a way to automatically identify the images on which the model performs exceptionally well. My idea was to create a discriminator model, as in a GAN, that distinguishes ground-truth (GT) boxes from predicted boxes.
The basic idea is to run the image through the backbone image classifier (we use DenseNet) and then pass either the ground-truth boxes or the predicted boxes for that image through the RoI pooling layer of FRCNN, choosing between GT and prediction at random. After RoI pooling we added a few fully connected (FC) layers, followed by a single neuron with sigmoid activation that makes the distinction.
The model now learns either "too well", meaning the binary accuracy is close to 100% after 10 images, or "too badly", meaning the binary accuracy stays close to 50% the whole time. Which of the two happens depends on whether we load the weights that the FRCNN model was trained with. If we load them, the activations for GT boxes after RoI pooling are probably easy to distinguish from the activations for predicted boxes. If we do not load them, the model is probably "too weak" to learn anything.
So our question: does anyone have experience or intuition on how to model and train such an approach? Is there something intrinsically wrong with our network?
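To make the sampling and the 50% figure concrete, here is a minimal sketch in plain Python (not Keras) of the labelling step described above. The helper names `sample_boxes` and `binary_accuracy`, the box coordinates, and the constant score of 0.9 are assumptions made for illustration; they are not taken from the actual implementation. The accuracy helper thresholds scores at 0.5, which is assumed to match how the binary accuracy in the question was computed.

```python
import random


def sample_boxes(gt_boxes, pred_boxes, rng):
    """Pick ground-truth or predicted boxes with equal probability.

    Returns (boxes, label), where label is 1 for ground truth
    and 0 for predictions.
    """
    if rng.random() < 0.5:
        return gt_boxes, 1
    return pred_boxes, 0


def binary_accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    if not labels:
        raise ValueError("labels must not be empty")
    hits = sum(int(score > threshold) == label
               for label, score in zip(labels, scores))
    return hits / len(labels)


# Hypothetical boxes for one image as (x1, y1, x2, y2) in pixels.
gt_boxes = [(10, 20, 110, 220)]
pred_boxes = [(12, 18, 108, 225)]

rng = random.Random(0)  # fixed seed so the run is repeatable
labels = [sample_boxes(gt_boxes, pred_boxes, rng)[1] for _ in range(10_000)]

# A discriminator that ignores its input and always outputs the same score.
constant_scores = [0.9] * len(labels)
baseline = binary_accuracy(labels, constant_scores)

print(f"share of GT samples: {sum(labels) / len(labels):.3f}")
print(f"accuracy of a constant discriminator: {baseline:.3f}")
```

Because the labels are balanced by construction, a discriminator that has learned nothing lands at roughly 50% binary accuracy. A value that stays near 50% is therefore consistent with chance-level output, although on its own it does not show why the network fails to learn.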
machine-learning deep-learning keras computer-vision object-detection
asked 2 mins ago
Richard