Is filtering a dataset still a good option if the dataset is very small?
Suppose I have a dataset as follows:

var1  var2  ...  varN  test1  test2
x     x     ...  x     good   v.good
x     x     ...  x     good   bad
x     x     ...  x     meh    bad
x     x     ...  x     good   good
x     x     ...  x     v.bad  bad
x     x     ...  x     bad    bad
x     x     ...  x     meh    good
x     x     ...  x     good   good
x     x     ...  x     v.bad  good
x     x     ...  x     good   bad

test2 is a more sophisticated version of test1. I want to know what makes test2 bad when test1 has the value good. To do that, I filtered the data to include only the rows where test1 has the value good, and added a binary column Y (1 when test2 is good or v.good, 0 when it is bad).

My dataset becomes:

var1  var2  ...  varN  test1  test2   Y
x     x     ...  x     good   v.good  1
x     x     ...  x     good   bad     0
x     x     ...  x     good   bad     0
x     x     ...  x     good   good    1
x     x     ...  x     good   good    1
x     x     ...  x     good   bad     0

I did this because it should let me see exactly which changes in var1, ..., varN make the test go from good to bad, using logistic regression or some heuristic approach.

My question is: does this still hold if we have, say, a dataset of 100 observations and the filtering cuts it in half?

dataset feature-engineering feature-construction
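For concreteness, the filtering and labelling step described in the question might look like the following in plain Python. This is only a sketch: the rows copy the test1/test2 columns of the first table, var1, ..., varN are left out, and the names `POSITIVE` and `filter_and_label` are hypothetical. Treating good and v.good as Y = 1 and any other test2 value as Y = 0 is an assumption read off the Y column of the filtered table.

```python
# Toy rows copying the test1/test2 columns of the first table.
# var1..varN are omitted because the question gives no values for them.
rows = [
    {"test1": "good", "test2": "v.good"},
    {"test1": "good", "test2": "bad"},
    {"test1": "meh", "test2": "bad"},
    {"test1": "good", "test2": "good"},
    {"test1": "v.bad", "test2": "bad"},
    {"test1": "bad", "test2": "bad"},
    {"test1": "meh", "test2": "good"},
    {"test1": "good", "test2": "good"},
    {"test1": "v.bad", "test2": "good"},
    {"test1": "good", "test2": "bad"},
]

# Assumption: these test2 values count as Y = 1, everything else as Y = 0.
POSITIVE = {"good", "v.good"}


def filter_and_label(rows):
    """Keep rows where test1 == 'good' and add a binary target Y from test2."""
    kept = []
    for row in rows:
        if row["test1"] == "good":
            kept.append({**row, "Y": 1 if row["test2"] in POSITIVE else 0})
    return kept


filtered = filter_and_label(rows)
n_pos = sum(r["Y"] for r in filtered)
n_neg = len(filtered) - n_pos

# How many rows survive the filter, and how they split between the classes.
print(f"kept {len(filtered)} of {len(rows)} rows: {n_pos} with Y=1, {n_neg} with Y=0")
```

Printing the surviving row count and the class split makes it easy to see how much data is actually left for a model once the filter has been applied.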
asked Mar 26 at 17:08
Mohamed Nidabdella