Confusion in applying k-fold cross validation to dataset



I have a data set that is already divided into 10 folds, with each fold containing its own training, validation, and test sets. I am not able to work out how to apply 10-fold cross-validation to this data set.



In general, if we want to apply k-fold cross-validation to a data set, the procedure is as follows:



[Figure: diagram of the standard k-fold cross-validation procedure]



In my case, the data set is already divided into 10 folds, and each fold contains validation and test sets in addition to the training set. It would be helpful if someone could explain how to perform 10-fold cross-validation on this kind of data set.










machine-learning






asked Mar 27 at 16:00 by Kalyan Katikapalli
  • Welcome to this site! If you want to do k-fold CV on these k folds, ignore the inner training/validation/test separations, do the CV, then report the test score. Otherwise, ask yourself why you cannot ignore the inner separations and merge them; the answer to that question is key and depends on your specific case.
    – Esmailian, Mar 27 at 16:11
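As a concrete illustration of this suggestion (an editorial sketch, not part of the original post): merge the inner train/validation/test splits of every pre-made fold and then run an ordinary 10-fold cross-validation with scikit-learn. Here, load_fold is a hypothetical helper that stands in for however the folds are actually stored.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def load_fold(i):
        """Hypothetical loader: returns fold i as (X, y) with its inner
        train, validation and test parts already concatenated."""
        raise NotImplementedError

    # Ignore the inner separations: merge the 10 pre-made folds into one dataset.
    X_parts, y_parts = zip(*(load_fold(i) for i in range(10)))
    X, y = np.concatenate(X_parts), np.concatenate(y_parts)

    # Plain 10-fold CV on the merged data; report the mean test score.
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
    print("mean 10-fold test score:", scores.mean())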











1 Answer

In 10-fold cross-validation, you split your dataset into 10 sections; in each iteration, 9 of them are used for training and one is used as the test set (there is no validation set). For example, if your dataset has 100 samples, then inside a loop, in the first fold (first iteration) the model trains on 90 samples and the remaining 10 are used to test it; the loop continues until all of the data has been used for both training and testing.



For more details, see here.



In Python, you can implement 10-fold cross-validation using the scikit-learn library (see here).
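For illustration (an editorial sketch, not from the original answer), here is a minimal scikit-learn example of the procedure just described, on a synthetic 100-sample dataset: each of the 10 iterations trains on 90 samples and tests on the remaining 10.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import KFold

    # Synthetic 100-sample dataset, purely for illustration.
    X, y = make_classification(n_samples=100, n_features=5, random_state=0)

    kf = KFold(n_splits=10, shuffle=True, random_state=0)
    scores = []
    for train_idx, test_idx in kf.split(X):
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])                  # 90 training samples
        scores.append(model.score(X[test_idx], y[test_idx]))   # 10 test samples

    print(sum(scores) / len(scores))  # average accuracy over the 10 folds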



Now, because your dataset is already split into 10 folds, you have two choices:



1. The easiest way is to merge your dataset into one set and then use a library to do the 10-fold cross-validation for you.



2. Write the loop yourself over your 10 pre-made folds: in the first iteration, use the first fold for testing and the remaining 9 for training; in the second iteration, use the second fold for testing and the other 9 for training; and so on, so that after 10 iterations every fold has been used once for testing (a sketch of this loop is given below).
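A minimal sketch of this manual loop (an editorial addition, assuming the 10 folds are available as a list of (X, y) pairs whose inner train/validation/test parts have already been merged) might look like this:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def manual_10_fold_cv(folds):
        """folds: list of ten (X_i, y_i) pairs, inner splits already merged."""
        scores = []
        for i, (X_test, y_test) in enumerate(folds):
            # Training data = every fold except fold i, concatenated.
            X_train = np.concatenate([X for j, (X, _) in enumerate(folds) if j != i])
            y_train = np.concatenate([y for j, (_, y) in enumerate(folds) if j != i])
            model = LogisticRegression(max_iter=1000)
            model.fit(X_train, y_train)
            scores.append(model.score(X_test, y_test))  # accuracy on the held-out fold
        return float(np.mean(scores))

With the data set from the question, folds would have length 10 and the returned value would be the average test accuracy over the 10 rounds.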



This is the idea behind 10-fold cross-validation; if it is not applicable to your dataset, then 10-fold cross-validation is probably not a good choice for your case.






answered Mar 27 at 16:35 by honar.cs (edited Mar 28 at 13:02)












  • The data set is already split into 10 folds, with each fold internally split into train, test, and validation sets. In this case, how do I apply 10-fold cross-validation? @honar.cs
    – Kalyan Katikapalli, Mar 27 at 16:38











  • The answer you gave applies to the case where each fold does not have an internal split into train, test, and validation sets.
    – Kalyan Katikapalli, Mar 27 at 16:40










  • I can ignore the internal split and apply CV. The question here is: is there any other strategy for handling these kinds of datasets?
    – Kalyan Katikapalli, Mar 28 at 2:11










  • The answer has been updated; see if it helps you.
    – honar.cs, Mar 28 at 13:06










