Machine Learning Validation Set
I have read that the validation set is used for hyperparameter tuning and for comparing models. But what if my algorithm/model does not have any hyperparameters? Should I use a validation set at all? After all, comparing models can also be done using the test set.

machine-learning data-science-model

asked Mar 3 at 16:35 by Rishab Bamrara
Does your model progress in a loop, similar to a neural network? In that case you have a different model after each iteration, and the validation set can be used to keep the best model (the one from a specific iteration). Otherwise, you have only one model and a validation set has no use.
– Esmailian, Mar 3 at 17:15
What do you mean by "the algorithm does not have any hyperparameters"? Can you please elaborate on your problem?
– thanatoz, Apr 3 at 10:34
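The first comment describes an iterative training setting in which the validation set is used to keep the best model seen so far. Here is a minimal sketch of that idea; the synthetic dataset, the SGD classifier, and the epoch count are all assumptions for illustration, not part of the original posts.

import numpy as np
from copy import deepcopy
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic data and split sizes are illustrative assumptions.
X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

model = SGDClassifier(random_state=0)
classes = np.unique(y_train)
best_score, best_model = -np.inf, None

for epoch in range(20):
    # Each pass over the training data yields a "different" model.
    model.partial_fit(X_train, y_train, classes=classes)
    score = model.score(X_val, y_val)  # validation accuracy after this pass
    if score > best_score:
        # Keep the best model from a specific iteration, as the comment suggests.
        best_score, best_model = score, deepcopy(model)

print(f"best validation accuracy: {best_score:.3f}")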
2 Answers
The validation set is there to stop you from using the test set until you are done tuning your model. When you are done tuning, you would like a realistic view of how the model will perform on unseen data, which is where the test set comes into play.

But tuning a model is not only about hyperparameters. It also involves things like feature selection, feature engineering, and the choice of algorithm. Even though it seems like you have already decided on a model, you should consider alternatives, as it might not be the optimal choice.

answered Mar 3 at 17:24 by user10283726
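As a concrete illustration of this answer, here is a minimal sketch in which the validation set drives a tuning decision that is not a classical hyperparameter of the final model (how many features to keep), while the test set is consulted only once at the end. The dataset, the feature-selection step, and all numbers are assumptions for illustration, not taken from the answer.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Illustrative synthetic data, split 60/20/20 into train, validation, test.
X, y = make_classification(n_samples=3000, n_features=30, n_informative=10, random_state=0)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# "Tuning" here means choosing how many features to keep -- a decision that
# still needs validation data even though the model itself is fixed.
best_k, best_score, best_pipe = None, -1.0, None
for k in (5, 10, 20, 30):
    pipe = make_pipeline(SelectKBest(f_classif, k=k), LogisticRegression(max_iter=1000))
    pipe.fit(X_train, y_train)
    score = pipe.score(X_val, y_val)
    if score > best_score:
        best_k, best_score, best_pipe = k, score, pipe

# Only the chosen pipeline ever sees the test set, giving a more realistic estimate.
print(f"chose k={best_k}; test accuracy: {best_pipe.score(X_test, y_test):.3f}")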
Comparing models cannot (or should not) be done using a test set alone. You should always have a final set of data held out to estimate your generalization error. Let's say you compare 100 different algorithms: one will eventually perform well on the test set just due to the nature of that particular data. You need the final holdout set to get a less biased estimate.

Comparing models can be looked at the same way as tuning hyperparameters. Think of it this way: when you are tuning hyperparameters, you are comparing models. In terms of requirements, comparing a random forest with 200 trees vs. a random forest with 500 trees is no different than comparing a random forest to a neural net.

answered Mar 4 at 1:19 by astel
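A hedged sketch of the comparison this answer describes, using made-up data: each candidate (random forests with 200 and 500 trees, plus a small neural net, mirroring the answer's example) is scored on the validation set, and only the selected winner is evaluated on the held-out test set.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Illustrative synthetic data, split into train, validation, and a final test holdout.
X, y = make_classification(n_samples=3000, random_state=0)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

candidates = {
    "rf_200": RandomForestClassifier(n_estimators=200, random_state=0),
    "rf_500": RandomForestClassifier(n_estimators=500, random_state=0),
    "mlp": MLPClassifier(max_iter=500, random_state=0),
}

# The validation set decides between candidates; the test set is never consulted here.
val_scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    val_scores[name] = model.score(X_val, y_val)

winner = max(val_scores, key=val_scores.get)
print("validation scores:", val_scores)
# Reporting the winner's test score gives the less biased estimate the answer refers to.
print(f"winner: {winner}; test accuracy: {candidates[winner].score(X_test, y_test):.3f}")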