How to Normalise features for small datasets?


I am working with a small dataset (N = 50) and would like to normalise my input features. I am facing the following issues:

  1. Because the dataset is small, the range of the training input features is likely to differ from the range of the test input features.

  2. The input features do not have a theoretical upper bound.

Can you suggest normalisation techniques suitable for this task? Any paper suggestions would also be appreciated.
















Tags: machine-learning, preprocessing

asked Feb 16 at 7:35 by Pranav Garg




















2 Answers
You can fit a MinMaxScaler on your training set, which normalizes the training features to [0, 1]. The same fitted scaler can then transform the test set. If the test set contains values greater than the training maximum, the scaler simply returns values greater than 1 (and values below 0 for anything under the training minimum), so the test set is scaled consistently with the training set even when it falls outside the training range.
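A minimal sketch of this behaviour, assuming scikit-learn and NumPy are installed. The toy arrays `X_train` and `X_test` are made up for illustration and are not from the question; the point is only that the scaler is fitted on the training data and then reused unchanged on the test data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: a single feature whose training range is [2, 10].
X_train = np.array([[2.0], [4.0], [6.0], [10.0]])
# Test values fall below, inside, and above the training range.
X_test = np.array([[1.0], [6.0], [12.0]])

scaler = MinMaxScaler()  # default feature_range is (0, 1)
scaler.fit(X_train)      # learns the minimum and maximum from the training set only

X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuses the training minimum and maximum

print(X_train_scaled.ravel())  # stays inside [0, 1]
print(X_test_scaled.ravel())   # may fall outside [0, 1]
```

If out-of-range test values are a problem for the downstream model, recent scikit-learn versions also accept `MinMaxScaler(clip=True)`, which clips transformed values to the feature range. Whether clipping is appropriate depends on the model and the data.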







answered Mar 18 at 8:56 by pcko1





















You should normalise your dataset after the train/test split, fitting the scaler on the training set only and then applying it to both sets. You could try standard scaling (subtracting the mean and dividing by the standard deviation), which does not depend on the minimum and maximum values.

Having different ranges in the train and test sets is not ideal. In that case you could try some resampling, such as SMOTE.

Let me know.
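The arithmetic behind "split first, then standard-scale" can be sketched in plain Python. This is an illustration under assumptions, not the answerer's code: the log-normal toy feature, the 40/10 split, and the helper name `standardise` are all hypothetical. scikit-learn's `StandardScaler` performs the same computation using the population standard deviation.

```python
import random
import statistics

random.seed(0)

# Hypothetical small dataset (N = 50) with one skewed feature that has no upper bound.
values = [random.lognormvariate(0.0, 1.0) for _ in range(50)]

# Split first: 40 training samples and 10 test samples.
random.shuffle(values)
train, test = values[:40], values[40:]

# Estimate the statistics on the training split only.
mean = statistics.fmean(train)
std = statistics.pstdev(train)  # population standard deviation


def standardise(xs):
    """Scale values with the mean and standard deviation of the training split."""
    return [(x - mean) / std for x in xs]


train_scaled = standardise(train)
test_scaled = standardise(test)

# The training split has mean 0 and standard deviation 1 after scaling;
# the test split is usually close to that, but not exactly.
print(statistics.fmean(train_scaled), statistics.pstdev(train_scaled))
print(statistics.fmean(test_scaled), statistics.pstdev(test_scaled))
```

Because the mean and standard deviation are far less sensitive to a single extreme value than the minimum and maximum, a test value outside the training range simply maps to a somewhat larger z-score.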







edited 2 days ago

answered Feb 16 at 7:55 by 3nomis







Comment: Normalizing before splitting is a bad practice, because you incorporate knowledge from the test set into your preprocessing. – pcko1, Mar 18 at 8:53
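One common way to avoid this kind of leakage on a small dataset is to put the scaler inside a pipeline, so that it is refitted on the training part of every cross-validation fold. The sketch below assumes scikit-learn and NumPy. The synthetic data, the logistic-regression classifier, and the 5-fold setting are illustrative choices, not something the question or answers specify.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: N = 50 samples, 3 unbounded positive features, binary target.
X = rng.lognormal(mean=0.0, sigma=1.0, size=(50, 3))
y = (X[:, 0] > np.median(X[:, 0])).astype(int)

# The scaler is a pipeline step, so each fold fits it on that fold's
# training part only and never sees the held-out samples.
model = make_pipeline(StandardScaler(), LogisticRegression())

scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())
```

Scaling the full array before calling `cross_val_score` would instead let every fold's held-out samples influence the scaling statistics, which is the problem the comment describes.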
















