How important is the input data for a ML model?


For the last 4-6 weeks I have been learning and working with ML for the first time: reading blogs, articles, documentation, etc., and practising. I have asked a lot of questions here and on Stack Overflow as well.



While I have gained some hands-on experience, I still have a very basic doubt:
when I train on an input data set of 1,000 records, the model's prediction accuracy is, say, 75%; when I use 50,000 records, the accuracy drops to 65%.



1) Does that mean the model's behaviour depends entirely on the input data being fed into it?



2) If #1 is true, then in the real world, where we don't have control over the input data, how will the model work?



For example, when suggesting products to a customer, the input data to the model would be the customers' past purchases. As the quantity of input data increases, will the prediction accuracy increase or decrease?



Please let me know if I need to add further details to my question.



Thanks.



Edit 1: added the frequency distribution of my input data:



[Image: frequency distribution of the input data]



Edit 2: added the confusion matrix and classification report:



[Images: confusion matrix and classification report]
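For reference, a confusion matrix and classification report like the ones in Edit 2 can be produced with scikit-learn. The sketch below is self-contained and deliberately uses a synthetic, imbalanced dataset rather than the asker's data, so it only shows what an imbalance signature looks like in those two outputs:

    # Illustration only: synthetic imbalanced data stands in for the asker's dataset.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report, confusion_matrix
    from sklearn.model_selection import train_test_split

    X, y = make_classification(
        n_samples=5000, n_classes=3, n_informative=5,
        weights=[0.80, 0.15, 0.05],   # deliberately imbalanced classes
        random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    # Rows are true classes, columns are predictions; a near-empty column for
    # a rare class is the classic signature of class imbalance.
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred))

If the rare classes show recall at or near zero in the report, that would support the class-imbalance suspicion raised in the comments.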










Tags: machine-learning, predictive-modeling, machine-learning-model

asked Mar 18 at 22:31, edited 2 days ago – ranit.b (547)

  • It looks like your model overfits; did you try to do a train/test split? – Robin Nicole, Mar 18 at 23:00

  • Thanks Robin. Yes, I have a 75/25 split. Just out of curiosity, may I ask what hint made you think that the model overfits? P.S. I have added the frequency distribution of my input data to the question. – ranit.b, Mar 18 at 23:07

  • So I guess it is your test accuracy that decreases. If your training accuracy keeps increasing but your test accuracy decreases, it means your model is overfitting. – Robin Nicole, Mar 18 at 23:15

  • Would you please consider moving this as a comment to the original question? It works much better that way than as a standalone answer. – I_Play_With_Data, 2 days ago

  • Just added: confusion matrix and classification report. Do you think this is a case of class imbalance (where the output classes with less training data are never predicted)? – ranit.b, 2 days ago

















2 Answers

To answer your first question: the accuracy of the model depends heavily on the "quality" of the input data. Basically, your training data should represent the same scenario as the environment in which the final model will be deployed.



There are two possible reasons why the behaviour you describe is happening:



  1. When you added more data, maybe there is no good relationship between the input features and the label for the new examples. It is often said that a small, clean dataset is better than a large, messy one.



  2. If the 49,000 records added afterwards come from the same distribution as the first 1,000 (i.e. they also have a good relationship between label and features), there are again two possibilities:



    A. If accuracy is low on the training set as well as on the test set (e.g. training accuracy 70% and test accuracy 65%), you are underfitting: either the model is too simple for the problem, or the dataset is too small (in number of examples) for the model to learn the underlying pattern.



    B. If your training accuracy is near 100% and your test accuracy is 65%, you are overfitting: the model is too complex, so you should go with a simpler algorithm.



    Note: since you haven't mentioned the training accuracy, it is difficult to say which of the two is happening. (A quick way to check is sketched below.)
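A minimal, self-contained sketch of the check described in cases A and B (illustrative only: a synthetic dataset and a random forest stand in for the asker's unspecified pipeline), comparing the same model's accuracy on the training set and on the held-out test set:

    # Illustration only: synthetic data, not the asker's records.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=50000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)   # the 75/25 split from the comments

    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)

    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    print(f"train accuracy: {train_acc:.3f}  test accuracy: {test_acc:.3f}")
    # Both low                 -> underfitting (case A)
    # Train near 1.0, test low -> overfitting  (case B)

On the asker's real data, the same two score calls on the training and test sets are enough to tell the two cases apart.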



Now, coming to your second question about real-world deployment: there is a phenomenon called model staleness, which is the problem of a model's accuracy degrading over time. There is an article by a product manager at Google explaining the staleness problem and how it can be solved; it should answer your second question.



Let me know if something is not clear.






answered 2 days ago by Sagar Shelke (new contributor)

  • Many thanks, Sagar. Great piece of advice. Let me reconsider my model from your viewpoint, and I will go through the article you posted a bit later tonight. – ranit.b, 2 days ago


















There is a false myth that more data automatically means better classification. The model also needs to grow in complexity to make use of it; otherwise the model just ends up overfitting the data.



Taking only a few random samples from the data, rather than feeding in every bit of data we can find, is often the better strategy for training models.
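A hedged sketch of that idea (synthetic data and a plain logistic regression, chosen purely for illustration): train on progressively larger random samples and check whether held-out accuracy actually keeps improving before committing to the full dataset.

    # Illustration only: synthetic data stands in for the real records.
    import pandas as pd
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=50000, n_features=20, random_state=0)
    df = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])
    df["label"] = y

    for n in (1000, 5000, 20000, len(df)):
        sample = df.sample(n=n, random_state=0)        # random subset of the data
        Xs, ys = sample.drop(columns="label"), sample["label"]
        X_tr, X_te, y_tr, y_te = train_test_split(Xs, ys, test_size=0.25, random_state=0)
        acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
        print(f"{n:>6} records -> test accuracy {acc:.3f}")

If accuracy plateaus well before the full dataset, the extra records are not adding usable signal, which is the point being made above.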






answered 2 days ago by abunickabhi

  • Thanks for sharing your thoughts. I guess I need to understand the overfitting concept more thoroughly. – ranit.b, 2 days ago










