Is filtering a dataset still a good option if the dataset is very small?

Suppose I have a dataset as follows:

var1 var2 ... varN test1 test2
 x x ... x good v.good
 x x ... x good bad
 x x ... x meh bad
 x x ... x good good
 x x ... x v.bad bad
 x x ... x bad bad
 x x ... x meh good
 x x ... x good good
 x x ... x v.bad good
 x x ... x good bad

test2 is a more sophisticated version of test1. I want to know what makes test2 come out bad when test1 has the value good. To study this, I filtered the data to keep only the rows where test1 is good, and added a binary target Y (1 when test2 is good or v.good, 0 when it is bad).
The dataset becomes:

var1 var2 ... varN test1 test2 Y
 x x ... x good v.good 1
 x x ... x good bad 0
 x x ... x good good 1
 x x ... x good good 1
 x x ... x good bad 0

I did this because it lets me see exactly which changes in var1, ..., varN make the test go from good to bad, using logistic regression or some heuristic approach.

My question is: does this approach still hold if we have, say, a dataset of 100 observations and the filtering cuts it in half?
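
For concreteness, here is a minimal sketch of the filtering and labelling step in plain Python. It only reproduces the test1 and test2 columns of the ten example rows (var1, ..., varN are left out because their values are not given), and the dictionary keys, the `POSITIVE` set and the `filter_and_label` helper are hypothetical names chosen for illustration, not part of any library.

```python
# Only test1/test2 from the example table; var1..varN are omitted
# because the question does not give their values.
rows = [
    {"test1": "good", "test2": "v.good"},
    {"test1": "good", "test2": "bad"},
    {"test1": "meh", "test2": "bad"},
    {"test1": "good", "test2": "good"},
    {"test1": "v.bad", "test2": "bad"},
    {"test1": "bad", "test2": "bad"},
    {"test1": "meh", "test2": "good"},
    {"test1": "good", "test2": "good"},
    {"test1": "v.bad", "test2": "good"},
    {"test1": "good", "test2": "bad"},
]

# Assumed mapping: these test2 values are treated as Y = 1, anything else as Y = 0.
POSITIVE = {"good", "v.good"}


def filter_and_label(rows):
    """Keep rows where test1 == 'good' and attach a binary target Y."""
    return [
        {**row, "Y": 1 if row["test2"] in POSITIVE else 0}
        for row in rows
        if row["test1"] == "good"
    ]


subset = filter_and_label(rows)
n_total, n_kept = len(rows), len(subset)
n_bad = sum(1 for row in subset if row["Y"] == 0)
print(f"kept {n_kept} of {n_total} rows; {n_bad} have Y = 0")
```

Printing the number of rows kept and the number with Y = 0 shows how much data any later model would actually be fitted on once the filter has been applied.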

asked Mar 26 at 17:08

Mohamed Nidabdella

dataset feature-engineering feature-construction
