Detection model - Training with class-instance limit-awareness2019 Community Moderator ElectionSpecifying neural network output layout for object detectionWhat classifier is the best to determine if object was detected in the correct position?Data preprocessing: Should we normalise images pixel-wise?Several fundamental questions about CNNTraining of Region Proposal Network (RPN)Bounding Boxes in YOLO ModelYOLO algorithm - understanding training dataOdd Loss Curves for Object Detection TaskDetecting address labels using Tensorflow Object Detection API

aging parents with no investments

Uplifted animals have parts of their "brain" in various locations of their body. Where?

Can a planet have a different gravitational pull depending on its location in orbit around its sun?

extract characters between two commas?

How to deal with fear of taking dependencies

Is domain driven design an anti-SQL pattern?

Where to refill my bottle in India?

Does the average primeness of natural numbers tend to zero?

Are cabin dividers used to "hide" the flex of the airplane?

Creating a loop after a break using Markov Chain in Tikz

Why do UK politicians seemingly ignore opinion polls on Brexit?

How did the USSR manage to innovate in an environment characterized by government censorship and high bureaucracy?

Is it legal to have the "// (c) 2019 John Smith" header in all files when there are hundreds of contributors?

LWC and complex parameters

I’m planning on buying a laser printer but concerned about the life cycle of toner in the machine

New order #4: World

Why is the design of haulage companies so “special”?

Why do we use polarized capacitors?

Prime joint compound before latex paint?

Doomsday-clock for my fantasy planet

Is Social Media Science Fiction?

Can the Produce Flame cantrip be used to grapple, or as an unarmed strike, in the right circumstances?

Is this relativistic mass?

What do you call something that goes against the spirit of the law, but is legal when interpreting the law to the letter?

Detection model - Training with class-instance limit-awareness

2019 Community Moderator ElectionSpecifying neural network output layout for object detectionWhat classifier is the best to determine if object was detected in the correct position?Data preprocessing: Should we normalise images pixel-wise?Several fundamental questions about CNNTraining of Region Proposal Network (RPN)Bounding Boxes in YOLO ModelYOLO algorithm - understanding training dataOdd Loss Curves for Object Detection TaskDetecting address labels using Tensorflow Object Detection API

Is there a way to make a detection model aware of the maximal number of possible objects of a given class within a single image?

For example in a toy case with 2 classes. If I know that in every single image, class A can have no more than 5 instances and class B no more than 1. Is there a way to incorporate it into the training process?

To make it clear, I'm not talking about an additional algorithm which runs on top of the trained model (such as non-maximum suppression which is used to select a single bounding box for an object). I specifically ask about the actual model and its training process.

asked Mar 29 at 8:33

Mark.F

1,0841521

add a comment |

Is there a way to make a detection model aware of the maximal number of possible objects of a given class within a single image?

asked Mar 29 at 8:33

Mark.F

1,0841521

add a comment |

Is there a way to make a detection model aware of the maximal number of possible objects of a given class within a single image?

asked Mar 29 at 8:33

Mark.F

1,0841521

Is there a way to make a detection model aware of the maximal number of possible objects of a given class within a single image?

machine-learning deep-learning object-detection

asked Mar 29 at 8:33

Mark.F

1,0841521

asked Mar 29 at 8:33

Mark.F

1,0841521

asked Mar 29 at 8:33

Mark.F

1,0841521

asked Mar 29 at 8:33

Mark.F

1,0841521

asked Mar 29 at 8:33

Mark.F

1,0841521

add a comment |

2 Answers
2

active

oldest

votes

I can think of the following approach:

Let's say that you have two classes, A and B. Additionally, you now that for class A there is at maximum 5 instances (so 0, 1, 2, 3, 4 or 5) and for B 1 instance (0 or 1).

For this purpose, you can have 6 outputs for class A and 2 outputs for class B.
Between those 6 outputs for class A, only one should be active; same for B - only one of two should be active.

For example, if on some image there are 3 objects of class A and 0 for B, the outputs would be: [0, 0, 0, 1, 0, 0] for A, and [1, 0] for B (or something very close to 0s and 1s, right?)

With these outputs, you can also combine other outputs which are needed for detection.

answered Mar 29 at 9:58

Antonio Jurić

741111

add a comment |

This is a very interesting supervision but hard to achieve!

Why we need this supervision?

The need for this supervision comes from the fact that model may wrongly detect more objects than it should, thus, must be punished (taught) for this violation, otherwise no supervision would be required since model is acting accordingly.

How to implement this supervision?

To this end, we need to fork some layers from the model to output the number of detected objects per class $c$ for input image $i$, namely $n'_c,i$, then supervise this output with the true number of objects in image $i$, namely $n_c, i$, or merely with an upper limit $N_c$ per class as you have suggested. Then, add a term like $(n_c, i - n'_c, i)^2$ or $textmax(0, n'_c, i - N_c)$ to the loss function to punish the model for detecting wrong or more number of objects than it should respectively. Then proceed to train the model.

What may go wrong?

But here is the problem, model can learn to lie about the number of detected objects through modifying the forked layers (weights)! Since it is easier for model to fabricate a valid $n'_c, i$ than to actually detect fewer objects which is more complex. Also, if we use a constant, unfabricatable unit (e.g. a constant neural net) that counts the number of detected objects, there would be no gradient to punish (teach) the model!

This is why this type of supervision is hard to achieve.

edited Mar 29 at 20:16

answered Mar 29 at 11:07

Esmailian

2,805318

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "557"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f48197%2fdetection-model-training-with-class-instance-limit-awareness%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

I can think of the following approach:

Let's say that you have two classes, A and B. Additionally, you now that for class A there is at maximum 5 instances (so 0, 1, 2, 3, 4 or 5) and for B 1 instance (0 or 1).

For this purpose, you can have 6 outputs for class A and 2 outputs for class B.
Between those 6 outputs for class A, only one should be active; same for B - only one of two should be active.

For example, if on some image there are 3 objects of class A and 0 for B, the outputs would be: [0, 0, 0, 1, 0, 0] for A, and [1, 0] for B (or something very close to 0s and 1s, right?)

With these outputs, you can also combine other outputs which are needed for detection.

answered Mar 29 at 9:58

Antonio Jurić

741111

add a comment |

I can think of the following approach:

Let's say that you have two classes, A and B. Additionally, you now that for class A there is at maximum 5 instances (so 0, 1, 2, 3, 4 or 5) and for B 1 instance (0 or 1).

For this purpose, you can have 6 outputs for class A and 2 outputs for class B.
Between those 6 outputs for class A, only one should be active; same for B - only one of two should be active.

For example, if on some image there are 3 objects of class A and 0 for B, the outputs would be: [0, 0, 0, 1, 0, 0] for A, and [1, 0] for B (or something very close to 0s and 1s, right?)

With these outputs, you can also combine other outputs which are needed for detection.

answered Mar 29 at 9:58

Antonio Jurić

741111

add a comment |

I can think of the following approach:

Let's say that you have two classes, A and B. Additionally, you now that for class A there is at maximum 5 instances (so 0, 1, 2, 3, 4 or 5) and for B 1 instance (0 or 1).

For this purpose, you can have 6 outputs for class A and 2 outputs for class B.
Between those 6 outputs for class A, only one should be active; same for B - only one of two should be active.

For example, if on some image there are 3 objects of class A and 0 for B, the outputs would be: [0, 0, 0, 1, 0, 0] for A, and [1, 0] for B (or something very close to 0s and 1s, right?)

With these outputs, you can also combine other outputs which are needed for detection.

answered Mar 29 at 9:58

Antonio Jurić

741111

I can think of the following approach:

Let's say that you have two classes, A and B. Additionally, you now that for class A there is at maximum 5 instances (so 0, 1, 2, 3, 4 or 5) and for B 1 instance (0 or 1).

For this purpose, you can have 6 outputs for class A and 2 outputs for class B.
Between those 6 outputs for class A, only one should be active; same for B - only one of two should be active.

For example, if on some image there are 3 objects of class A and 0 for B, the outputs would be: [0, 0, 0, 1, 0, 0] for A, and [1, 0] for B (or something very close to 0s and 1s, right?)

With these outputs, you can also combine other outputs which are needed for detection.

answered Mar 29 at 9:58

Antonio Jurić

741111

answered Mar 29 at 9:58

Antonio Jurić

741111

answered Mar 29 at 9:58

Antonio Jurić

741111

answered Mar 29 at 9:58

Antonio Jurić

741111

add a comment |

This is a very interesting supervision but hard to achieve!

Why we need this supervision?

How to implement this supervision?

What may go wrong?

This is why this type of supervision is hard to achieve.

edited Mar 29 at 20:16

answered Mar 29 at 11:07

Esmailian

2,805318

add a comment |

This is a very interesting supervision but hard to achieve!

Why we need this supervision?

How to implement this supervision?

What may go wrong?

This is why this type of supervision is hard to achieve.

edited Mar 29 at 20:16

answered Mar 29 at 11:07

Esmailian

2,805318

add a comment |

This is a very interesting supervision but hard to achieve!

Why we need this supervision?

How to implement this supervision?

What may go wrong?

This is why this type of supervision is hard to achieve.

edited Mar 29 at 20:16

answered Mar 29 at 11:07

Esmailian

2,805318

This is a very interesting supervision but hard to achieve!

Why we need this supervision?

How to implement this supervision?

What may go wrong?

This is why this type of supervision is hard to achieve.

edited Mar 29 at 20:16

answered Mar 29 at 11:07

Esmailian

2,805318

edited Mar 29 at 20:16

answered Mar 29 at 11:07

Esmailian

2,805318

answered Mar 29 at 11:07

Esmailian

2,805318

answered Mar 29 at 11:07

Esmailian

2,805318

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Data Science Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

oqJTAi SBiatBACNLCNWwm9OIlwk1L2bseODEkIUNsC HU Yk pO eHqB9ZhZFJDc ANo9EgWKO QenMHwfL,zv66l0yB,ItxB6YB 17WH HR

搜尋此網誌

Trjtdtk

2 Answers
2

Your Answer

Post as a guest

2 Answers
2

2 Answers
2

Post as a guest

Popular posts from this blog

Tähtien Talli Jäsenet | Lähteet | NavigointivalikkoSuomen Hippos – Tähtien Talli

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Tähtien Talli Jäsenet | Lähteet | NavigointivalikkoSuomen Hippos – Tähtien Talli

2 Answers
2

2 Answers
2

2 Answers
2