How to Normalise features for small datasets?

I am working with a small dataset (N = 50) and would like to normalise my input features.
I am facing the following issues:

Because the dataset is so small, the range of the training input features may differ from the range of the test input features.

The input features have no theoretical upper bound.

Can you suggest normalisation techniques suitable for this task? Any paper suggestions would also be appreciated.

asked Feb 16 at 7:35

Pranav Garg

machine-learning preprocessing

2 Answers

You can fit a MinMaxScaler on your training set, which scales each training feature into [0, 1]. The same fitted scaler can then transform the test set. If the test set contains values outside the range seen in training, the scaler still works: it returns values greater than 1 (or less than 0 for values below the training minimum). Essentially, your test set is normalized with the statistics of the training set.
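A minimal sketch of this approach, assuming scikit-learn and NumPy are installed. The toy arrays are made up for illustration and are not the asker's data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: one feature, 5 training rows and 3 test rows.
X_train = np.array([[2.0], [4.0], [6.0], [8.0], [10.0]])
X_test = np.array([[1.0], [6.0], [14.0]])

# Default feature_range is (0, 1); by default the output is not clipped.
scaler = MinMaxScaler()

# Learn the minimum and maximum from the training set only.
scaler.fit(X_train)

# Apply the same training-set min/max to both sets.
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("train:", X_train_scaled.ravel())
print("test: ", X_test_scaled.ravel())
```

Test values below the training minimum come out negative and values above the training maximum come out greater than 1, so the model must be able to cope with inputs slightly outside [0, 1].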

answered Mar 18 at 8:56

pcko1

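You should normalise your dataset after the split, fitting the scaler on the training set only. You could try standard scaling (subtract the mean and divide by the standard deviation), since that way the transformation does not depend on the minimum and maximum values.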
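As a rough sketch of "split first, then standardise", the snippet below uses only the Python standard library. The data, the 40/10 split and the helper name `standardise` are assumptions for illustration. scikit-learn's `StandardScaler` performs the same computation (it also uses the population standard deviation).

```python
import random
import statistics

random.seed(0)

# Hypothetical single feature with N = 50 samples and no upper bound.
values = [random.expovariate(1.0) for _ in range(50)]

# 1. Split first, so the test samples never influence the statistics.
random.shuffle(values)
train, test = values[:40], values[40:]

# 2. Estimate mean and standard deviation on the training set only.
train_mean = statistics.fmean(train)
train_std = statistics.pstdev(train)  # population standard deviation


def standardise(xs):
    """Scale values with the training-set mean and standard deviation."""
    return [(x - train_mean) / train_std for x in xs]


# 3. Apply the same training-set statistics to both sets.
train_scaled = standardise(train)
test_scaled = standardise(test)

print("train mean/std after scaling:",
      statistics.fmean(train_scaled), statistics.pstdev(train_scaled))
print("test mean/std after scaling: ",
      statistics.fmean(test_scaled), statistics.pstdev(test_scaled))
```

With only 40 training samples the estimated mean and standard deviation are noisy, so the scaled test set will generally not have exactly zero mean and unit variance.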

Having two different ranges in the train and test sets is not a good sign. In this case you could try some resampling, such as SMOTE.

Let me know

edited 2 days ago

answered Feb 16 at 7:55

3nomis

Normalizing before splitting is a bad practice, because you incorporate knowledge from the test set into your preprocessing.
– pcko1, Mar 18 at 8:53
