



Machine Learning Validation Set





I have read that the validation set is used for hyper-parameter tuning and for comparing models. But what if my algorithm/model does not have any hyperparameters? Should I use a validation set at all? After all, comparing models can also be done on the test set.
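For reference, the usual setup splits the data three ways. A minimal sketch with scikit-learn (the synthetic dataset and the 60/20/20 proportions are illustrative assumptions, not part of the question):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)

    # Carve off the test set first, then split the rest into train and validation.
    X_trainval, X_test, y_trainval, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(
        X_trainval, y_trainval, test_size=0.25, random_state=0)  # 0.25 of 80% = 20%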








































Tags: machine-learning, data-science-model
















Asked Mar 3 at 16:35 by Rishab Bamrara











  • Does your model progress in a loop, as a neural network does? In that case you have a different model after each iteration, and the validation set can be used to keep the best model (the one from a specific iteration; see the sketch after these comments). Otherwise, you have only one model, and a validation set has no use. – Esmailian, Mar 3 at 17:15











  • What do you mean by "algorithm does not have any hyperparameter"? Can you please elaborate on your problem? – thanatoz, Apr 3 at 10:34
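Esmailian's first comment describes using the validation set to checkpoint an iterative learner: every epoch yields a different model, and validation performance decides which one to keep. A minimal sketch of that idea, assuming scikit-learn's SGDClassifier as the iterative learner and reusing the split from the sketch above:

    import copy
    import numpy as np
    from sklearn.linear_model import SGDClassifier

    model = SGDClassifier(random_state=0)
    best_score, best_model = -np.inf, None
    for epoch in range(20):
        # Each partial_fit call is one pass over the training data,
        # so each iteration of the loop produces a different model.
        model.partial_fit(X_train, y_train, classes=np.unique(y_train))
        score = model.score(X_val, y_val)  # scored on validation, never on test
        if score > best_score:
            best_score, best_model = score, copy.deepcopy(model)
    # best_model is the checkpoint kept from the best-scoring iteration.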


























2 Answers

























The validation set is there to stop you from using the test set until you are done tuning your model. When you are done tuning, you would like to have a realistic view of how the model will perform on unseen data, which is where the test set comes into play.



But tuning the model is not only about hyperparameters. It involves things like feature selection, feature engineering, and also the choice of algorithm. Even though it seems you have already decided on a model, you should consider alternatives, as it might not be the optimal choice.
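As a sketch of that workflow (the two candidate models are illustrative assumptions, reusing the split from the question): every choice of algorithm, features, or hyperparameters is scored on the validation set, and the test set is touched exactly once at the end.

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression

    # All modelling choices compete on the validation set.
    candidates = {
        "logreg": LogisticRegression(max_iter=1000),
        "forest": RandomForestClassifier(random_state=0),
    }
    val_scores = {name: est.fit(X_train, y_train).score(X_val, y_val)
                  for name, est in candidates.items()}
    best = max(val_scores, key=val_scores.get)

    # Only now, with every decision frozen, estimate generalization on the test set.
    test_estimate = candidates[best].score(X_test, y_test)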






answered Mar 3 at 17:24 by user10283726





















Comparing models cannot (or should not) be done using the test set alone. You should always have a final set of data held out to estimate your generalization error. Say you compare 100 different algorithms: one will eventually perform well on the test set purely by chance, because of the idiosyncrasies of that particular sample. You need the final held-out set to get a less biased estimate.



Comparing models can be looked at the same way as tuning hyperparameters. Think of it this way: when you are tuning hyperparameters, you are comparing models. In terms of requirements, comparing a random forest with 200 trees vs. a random forest with 500 trees is no different than comparing a random forest to a neural net.
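A minimal sketch of exactly that comparison (the estimator settings are illustrative, reusing the split from the question): the two forests and the neural net all compete on the validation set, and only the winner is scored on the held-out test set.

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier

    # 200 trees vs. 500 trees vs. a neural net: the same selection problem.
    models = {
        "rf_200": RandomForestClassifier(n_estimators=200, random_state=0),
        "rf_500": RandomForestClassifier(n_estimators=500, random_state=0),
        "mlp": MLPClassifier(max_iter=500, random_state=0),
    }
    val_scores = {name: m.fit(X_train, y_train).score(X_val, y_val)
                  for name, m in models.items()}
    winner = max(val_scores, key=val_scores.get)
    print(winner, models[winner].score(X_test, y_test))  # less biased final estimate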






answered Mar 4 at 1:19 by astel












