Machine learning model to predict the best candidate
Problem: I would like to build a machine learning model that can predict the best candidate from any given set. What could be a good architecture for such a model?
Given: I have several training examples, each of which consists of:
- a set of candidates.
- a descriptor for the set as a whole.
- a label that tells which one of those candidates is the best in that set.
Details:
- I will have around 10K such sets.
- The number of candidates may differ between sets (roughly from 10 to 100).
- Every set is unordered.
- The descriptor of each set is currently a fixed-length one-hot vector, though I'm open to adding more features to it.
- Each candidate is represented by a fixed-length feature vector. (In the future, however, the number of features may also differ between candidates.)
What I tried but didn't work:
One approach I tried was a simple MLP that takes one candidate as input and outputs whether or not that candidate is the best. But since this MLP doesn't know which set the candidate belongs to, it fails when a candidate is the best in one set but not in another.
To get into some more specifics: in my current problem, each candidate is a 2D polygon with a fixed number of line segments, described by an array of (x, y) coordinates. The training examples are labelled manually by picking the best-looking polygon in each set.
One problem I face is that a polygon has no natural starting point from which to begin its array of (x, y) coordinates. Currently I choose the vertex with the minimum value of x + y as the starting point and go counterclockwise from there.
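For reference, the canonicalization heuristic described above (start at the vertex with minimum x + y, orient counterclockwise) can be sketched as follows. This is an illustrative implementation, not the code actually used in the project:

```python
import numpy as np

def canonicalize(points):
    """Rotate the vertex list of a simple polygon so it starts at the
    vertex with minimum x + y and proceeds counterclockwise.
    `points` is an (N, 2) array of (x, y) vertices in boundary order."""
    pts = np.asarray(points, dtype=float)
    # Signed area via the shoelace formula: positive means counterclockwise.
    x, y = pts[:, 0], pts[:, 1]
    signed_area = 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)
    if signed_area < 0:                      # clockwise -> reverse order
        pts = pts[::-1]
    start = int(np.argmin(pts.sum(axis=1)))  # vertex with minimum x + y
    return np.roll(pts, -start, axis=0)
```

Note that this canonicalization is still sensitive to noise: two near-identical polygons can pick different starting vertices if two vertices have nearly equal x + y.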
Currently every 2D polygon has the same number of segments, but I will soon need to support polygons with varying numbers of segments.
In the future I would like to extend this model to 3D polyhedra too, but I don't yet know how to build a feature vector for a 3D polyhedron. I guess that's a problem for another day.
machine-learning neural-network prediction machine-learning-model
Do you have labeled comparisons between polygons across sets?
– jonnor
Apr 7 at 12:34
Nope. I do not have any comparisons across sets.
– mak
Apr 7 at 13:52
That makes it a bit hard. The key here is to formulate the problem as a standard type of ML problem. You could look at ranking via pairwise comparisons for some inspiration, but I'm not sure it fits entirely...
– jonnor
Apr 7 at 14:55
Thanks a lot for your suggestions! I had also considered pairwise comparisons, and I guess they might work, but the performance would be $O(n^2)$. I also considered RNNs, but they are meant for ordered sequences, not unordered sets.
– mak
Apr 7 at 15:08
How many polygons in each set, and how many sets?
– jonnor
Apr 7 at 15:16
edited Apr 9 at 8:08
asked Apr 7 at 5:50
mak
1 Answer
I think it would make more sense to train a model to grade each candidate (regression); then, among the candidates of a particular set, you can pick the best candidate by its grade.
Also, you should try changing the representation from a raw cloud of points to more meaningful geometric features:
- Number of vertices/segments
- Segment length mean and variance
- Skewness
- Size and direction of the major and minor axes
- Center position
- Moments of area (first, second, third, ...)
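As a sketch (not a prescription), most of these features can be computed directly from the vertex array with NumPy. Skewness is omitted for brevity, and the eigendecomposition of the vertices' second moments stands in for the major/minor axes:

```python
import numpy as np

def polygon_features(points):
    """Sketch of a fixed-length geometric descriptor for a 2D polygon.
    Assumes `points` is an (N, 2) array of vertices in counterclockwise order."""
    pts = np.asarray(points, dtype=float)
    edges = np.roll(pts, -1, axis=0) - pts
    seg_len = np.linalg.norm(edges, axis=1)   # per-segment lengths
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # Second moments (covariance of the vertex cloud); its eigenvectors give
    # the major/minor axis directions, its eigenvalues their sizes.
    cov = centered.T @ centered / len(pts)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Signed area via the shoelace formula.
    x, y = pts[:, 0], pts[:, 1]
    area = 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)
    return np.array([
        len(pts),              # number of vertices/segments
        seg_len.mean(),        # mean segment length
        seg_len.var(),         # segment length variance
        *centroid,             # center position (x, y)
        area,                  # area (first moment)
        eigvals[1],            # major axis size
        eigvals[0],            # minor axis size
        np.arctan2(eigvecs[1, 1], eigvecs[0, 1]),  # major axis direction
    ])
```

Such a descriptor has the added benefit of being fixed-length regardless of the number of segments, which addresses the variable-segment-count concern in the question.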
Update
To apply a regression model you will need to generate grades for the training set, and that might be a challenge. Here are a few heuristics to generate them:
Since you have about 10k sets, you could assign to every candidate the probability of being the best candidate in any set (for example, if it is the best candidate in 10 sets, you can give it a grade of $\frac{10}{10{,}000}$).
You could try clustering the samples and assigning a grade to every cluster as the probability of a candidate in that cluster being a best candidate, $\frac{N_{best}}{N_{cluster}}$, where $N_{best}$ is the number of best candidates in that cluster and $N_{cluster}$ is the total number of candidates in that cluster.
You can assign each candidate a grade of $1$ if it is the best candidate in every set it appears in, and $1-e^{-\alpha N}$ for the $N$ times it appears in a set without being the best candidate. You will have to tune the decay rate $\alpha$ like any other hyperparameter.
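The second (cluster-based) heuristic can be sketched as follows, assuming cluster labels obtained from any clustering of the candidate feature vectors (e.g. k-means); the function name and interface here are illustrative:

```python
import numpy as np

def cluster_grades(labels, is_best):
    """Grade each candidate by its cluster's empirical probability of
    producing a best candidate: N_best / N_cluster per cluster.
    `labels[i]` is the cluster id of candidate i; `is_best[i]` is whether
    candidate i was labelled best in its set."""
    labels = np.asarray(labels)
    is_best = np.asarray(is_best, dtype=float)
    grades = np.empty(len(labels))
    for c in np.unique(labels):
        mask = labels == c                       # members of cluster c
        grades[mask] = is_best[mask].sum() / mask.sum()
    return grades
```

For example, with two clusters where cluster 0 holds 2 candidates (1 best) and cluster 1 holds 3 candidates (1 best), every member of cluster 0 receives grade 0.5 and every member of cluster 1 receives 1/3.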
Thanks @Pedro for your useful answer! However, in order to create a regression model, I would need to train it with grades for each candidate in the training examples. How do you suggest I generate those grades?
– mak
Apr 8 at 6:23
True, I proposed a few heuristics for that and updated the answer. Sorry for forgetting that crucial point lol
– Pedro Henrique Monforte
Apr 8 at 12:37
Could you get back to us with a small report on the success of any of my tips? I am a computer vision researcher, and geometry-related models are really useful in my field.
– Pedro Henrique Monforte
Apr 8 at 14:59
Thanks @Pedro, all of your ideas are very useful. In my case though, your first idea (individual probabilities) and your third idea ($1-e^{-\alpha N}$) might be a bit difficult to implement, because I don't have ready information about which candidates appear in multiple sets. I can search for multiple occurrences, but that too is tricky, because two candidates may be very similar yet not exactly identical due to numerical noise. Your second idea (probabilities associated with clusters) seems like it might work for me. I'll try it out and let you know. Thank you for your wonderful ideas!
– mak
Apr 9 at 8:01
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
I think it would make more sense to train a model to grade (regression) each candidate, them from candidates of a particular set you can use the best candidate from its "grade".
Also, you should try changing the information from raw cloud of points to more meaningful geometric form irfomation:
- Number of vertices/segments
- Segments length mean and variance
- Skewness
- Size and direction of major and minor axis
- Center position
- Moments of area (first,second,third...)
Update
To apply a regression model you will need to generate grades for the training set and that might be a challenge. For that I will propose a few heuristics to generate this:
Since you have about 10k samples, you could assign to every candidate the probability of been the best candidate in any set (for example, if he is the best candidate in 10 sets you can give him a grade $frac1010,000$.
You could try clustering the samples and assigning a grade to every cluster as the probability of a candidate in that cluster been the best candidate by $fracN_bestN_cluster$, where $N_best$ is the number of best candidates in any set in that cluster and $N_cluster$ is the number of candidates in that cluster
You can assign to each best candidate a grade like $1$ if he is best candidate in every set it appears and $1-e^-alpha N$ for every $N$ times it appears in a set without been the best candidate. You will have to tune the decay rate $alpha$ like any hyperparameter.
$endgroup$
$begingroup$
Thanks @Pedro for your useful answer! However, in order to create a regression model, I would need to train it with grades for each candidate in the training examples. How do you suggest I generate those grades?
$endgroup$
– mak
Apr 8 at 6:23
$begingroup$
True, I proposed a few heuristics to that and updates the answer. Sorry for forgetting that crucial point lol
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 12:37
$begingroup$
Could you return to us with a small report of the success of any of my tips? I am a computer vision researcher and geometry-related models are really useful to my field.
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 14:59
$begingroup$
Thanks @Pedro, all of your ideas are very useful. In my case though, your first idea (individual probability based) and your third idea ($1-e^-alpha N$) might be a bit difficult to implement because I don't have ready information about which candidates appear in multiple sets. I can search for multiple occurrences, but that too is tricky because two candidates may be very similar but not exactly identical due to numerical noise. Your second idea (probability associated with clusters) seems like it might work for me. I'll try out and let you know. Thank you for your wonderful ideas!
$endgroup$
– mak
Apr 9 at 8:01
add a comment |
$begingroup$
I think it would make more sense to train a model to grade (regression) each candidate, them from candidates of a particular set you can use the best candidate from its "grade".
Also, you should try changing the information from raw cloud of points to more meaningful geometric form irfomation:
- Number of vertices/segments
- Segments length mean and variance
- Skewness
- Size and direction of major and minor axis
- Center position
- Moments of area (first,second,third...)
Update
To apply a regression model you will need to generate grades for the training set and that might be a challenge. For that I will propose a few heuristics to generate this:
Since you have about 10k samples, you could assign to every candidate the probability of been the best candidate in any set (for example, if he is the best candidate in 10 sets you can give him a grade $frac1010,000$.
You could try clustering the samples and assigning a grade to every cluster as the probability of a candidate in that cluster been the best candidate by $fracN_bestN_cluster$, where $N_best$ is the number of best candidates in any set in that cluster and $N_cluster$ is the number of candidates in that cluster
You can assign to each best candidate a grade like $1$ if he is best candidate in every set it appears and $1-e^-alpha N$ for every $N$ times it appears in a set without been the best candidate. You will have to tune the decay rate $alpha$ like any hyperparameter.
$endgroup$
$begingroup$
Thanks @Pedro for your useful answer! However, in order to create a regression model, I would need to train it with grades for each candidate in the training examples. How do you suggest I generate those grades?
$endgroup$
– mak
Apr 8 at 6:23
$begingroup$
True, I proposed a few heuristics to that and updates the answer. Sorry for forgetting that crucial point lol
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 12:37
$begingroup$
Could you return to us with a small report of the success of any of my tips? I am a computer vision researcher and geometry-related models are really useful to my field.
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 14:59
$begingroup$
Thanks @Pedro, all of your ideas are very useful. In my case though, your first idea (individual probability based) and your third idea ($1-e^-alpha N$) might be a bit difficult to implement because I don't have ready information about which candidates appear in multiple sets. I can search for multiple occurrences, but that too is tricky because two candidates may be very similar but not exactly identical due to numerical noise. Your second idea (probability associated with clusters) seems like it might work for me. I'll try out and let you know. Thank you for your wonderful ideas!
$endgroup$
– mak
Apr 9 at 8:01
add a comment |
$begingroup$
I think it would make more sense to train a model to grade (regression) each candidate, them from candidates of a particular set you can use the best candidate from its "grade".
Also, you should try changing the information from raw cloud of points to more meaningful geometric form irfomation:
- Number of vertices/segments
- Segments length mean and variance
- Skewness
- Size and direction of major and minor axis
- Center position
- Moments of area (first,second,third...)
Update
To apply a regression model you will need to generate grades for the training set and that might be a challenge. For that I will propose a few heuristics to generate this:
Since you have about 10k samples, you could assign to every candidate the probability of been the best candidate in any set (for example, if he is the best candidate in 10 sets you can give him a grade $frac1010,000$.
You could try clustering the samples and assigning a grade to every cluster as the probability of a candidate in that cluster been the best candidate by $fracN_bestN_cluster$, where $N_best$ is the number of best candidates in any set in that cluster and $N_cluster$ is the number of candidates in that cluster
You can assign to each best candidate a grade like $1$ if he is best candidate in every set it appears and $1-e^-alpha N$ for every $N$ times it appears in a set without been the best candidate. You will have to tune the decay rate $alpha$ like any hyperparameter.
$endgroup$
I think it would make more sense to train a model to grade (regression) each candidate, them from candidates of a particular set you can use the best candidate from its "grade".
Also, you should try changing the information from raw cloud of points to more meaningful geometric form irfomation:
- Number of vertices/segments
- Segments length mean and variance
- Skewness
- Size and direction of major and minor axis
- Center position
- Moments of area (first,second,third...)
Update
To apply a regression model you will need to generate grades for the training set and that might be a challenge. For that I will propose a few heuristics to generate this:
Since you have about 10k samples, you could assign to every candidate the probability of been the best candidate in any set (for example, if he is the best candidate in 10 sets you can give him a grade $frac1010,000$.
You could try clustering the samples and assigning a grade to every cluster as the probability of a candidate in that cluster been the best candidate by $fracN_bestN_cluster$, where $N_best$ is the number of best candidates in any set in that cluster and $N_cluster$ is the number of candidates in that cluster
You can assign to each best candidate a grade like $1$ if he is best candidate in every set it appears and $1-e^-alpha N$ for every $N$ times it appears in a set without been the best candidate. You will have to tune the decay rate $alpha$ like any hyperparameter.
edited Apr 8 at 12:36
answered Apr 7 at 22:06
Pedro Henrique MonfortePedro Henrique Monforte
569219
569219
$begingroup$
Thanks @Pedro for your useful answer! However, in order to create a regression model, I would need to train it with grades for each candidate in the training examples. How do you suggest I generate those grades?
$endgroup$
– mak
Apr 8 at 6:23
$begingroup$
True, I proposed a few heuristics to that and updates the answer. Sorry for forgetting that crucial point lol
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 12:37
$begingroup$
Could you return to us with a small report of the success of any of my tips? I am a computer vision researcher and geometry-related models are really useful to my field.
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 14:59
$begingroup$
Thanks @Pedro, all of your ideas are very useful. In my case though, your first idea (individual probability based) and your third idea ($1-e^-alpha N$) might be a bit difficult to implement because I don't have ready information about which candidates appear in multiple sets. I can search for multiple occurrences, but that too is tricky because two candidates may be very similar but not exactly identical due to numerical noise. Your second idea (probability associated with clusters) seems like it might work for me. I'll try out and let you know. Thank you for your wonderful ideas!
$endgroup$
– mak
Apr 9 at 8:01
add a comment |
$begingroup$
Thanks @Pedro for your useful answer! However, in order to create a regression model, I would need to train it with grades for each candidate in the training examples. How do you suggest I generate those grades?
$endgroup$
– mak
Apr 8 at 6:23
$begingroup$
True, I proposed a few heuristics to that and updates the answer. Sorry for forgetting that crucial point lol
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 12:37
$begingroup$
Could you return to us with a small report of the success of any of my tips? I am a computer vision researcher and geometry-related models are really useful to my field.
$endgroup$
– Pedro Henrique Monforte
Apr 8 at 14:59
$begingroup$
Do you have labeled comparisons between polygons across sets?
$endgroup$
– jonnor
Apr 7 at 12:34
$begingroup$
Nope. I do not have any comparisons across sets.
$endgroup$
– mak
Apr 7 at 13:52
$begingroup$
That makes it a bit hard. The key here is to be able to formulate the problem as a standard type of ML problem. You can have a look at Ranking via Pairwise Comparisons for some inspiration, but I'm not sure if it fits entirely...
$endgroup$
– jonnor
Apr 7 at 14:55
$begingroup$
Thanks a lot for your suggestions! I had also considered pairwise comparisons, and I guess they might work, but the cost would grow as $O(n^2)$. I also considered RNNs, but they are meant for ordered sequences, not for unordered sets.
$endgroup$
– mak
Apr 7 at 15:08
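To make the pairwise-comparison idea concrete: one common reduction (RankNet-style) turns per-candidate grades into binary classification examples on feature differences. This is a minimal sketch assuming such grades exist; the function name and example features are hypothetical, not from the original discussion:

```python
from itertools import combinations

def make_pairwise_examples(candidates, grades):
    """Turn per-candidate grades into pairwise training examples:
    feature = difference of the two candidates' feature vectors,
    label = 1 if the first is graded higher, else 0.
    Note the quadratic blow-up: n candidates yield n*(n-1)/2 pairs."""
    examples = []
    for i, j in combinations(range(len(candidates)), 2):
        if grades[i] == grades[j]:
            continue  # ties carry no ranking signal
        diff = [a - b for a, b in zip(candidates[i], candidates[j])]
        examples.append((diff, 1 if grades[i] > grades[j] else 0))
    return examples

pairs = make_pairwise_examples([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
                               grades=[2, 1, 3])
print(len(pairs))  # 3 candidates -> 3 pairs
```

Any binary classifier can then be trained on these difference vectors; at inference time, candidates are ranked by how often (or how confidently) they win their pairwise comparisons.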
$begingroup$
How many polygons in each set, and how many sets?
$endgroup$
– jonnor
Apr 7 at 15:16