Pros and cons of using the zscore of a dataset before normalizing it during feature engineering?





Normalization is a common feature engineering technique. However, this post standardizes the dataset (z-score) before normalizing it.

I think that would result in losing some of the information in the data.

What are the pros and cons of doing this?







      machine-learning feature-engineering














asked Apr 4 at 2:38 by fu DL
edited Apr 4 at 3:40 by Ethan




















          2 Answers

































































Z-score normalisation is a way to compare results from a test to a "normal" population by bringing them onto the same, comparable scale:

$$ z = \frac{x - \bar{x}}{\sigma} $$

where $\bar{x}$ is the mean and $\sigma$ is the standard deviation of the raw scores.

Z-score normalisation has the following advantages:

1. Z-scores can be used to compare raw scores taken from different tests.

2. Z-scores take into account both the mean and the variability of a set of raw scores.

And the disadvantages are:

1. Reading a z-score as a percentile or probability assumes the data are roughly normally distributed; the transformation itself does not make them so.

2. If the data are skewed, they stay skewed after standardizing: the distribution to the left and right of zero is not symmetric.
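
A minimal sketch of the formula above, assuming NumPy and a synthetic right-skewed feature (the data and the helper names `zscore` and `skewness` are illustrative, not taken from the linked post). It suggests that standardizing fixes the mean and the spread but leaves the skew untouched:

```python
import numpy as np

def zscore(x):
    """Standardize a 1-D array: subtract the mean, divide by the standard deviation.

    Uses the population standard deviation (NumPy's default, ddof=0).
    """
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def skewness(x):
    """Sample skewness: the mean of the cubed standardized values."""
    return float(np.mean(zscore(x) ** 3))

rng = np.random.default_rng(0)                  # fixed seed, toy data only
raw = rng.exponential(scale=2.0, size=10_000)   # right-skewed illustrative feature
z = zscore(raw)

print(z.mean(), z.std())            # approximately 0 and 1
print(skewness(raw), skewness(z))   # the two values agree up to rounding
```

Because the z-score is only a shift and a rescale, the shape of the distribution, including its asymmetry, carries over to the standardized feature.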
















answered Apr 4 at 5:47 by thanatoz
























Normalizing an already normalized dataset should not change anything, unless the second step uses a different normalization scheme.
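
For the case in the question, here is a small sketch, assuming NumPy, assuming "normalizing" means min-max scaling to [0, 1], and assuming both steps use statistics computed on the same data (the helper names are hypothetical). Both transforms are affine, so chaining them should match min-max scaling alone, and standardizing twice should be a no-op:

```python
import numpy as np

def zscore(x):
    """Standardize: zero mean, unit (population) standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def minmax(x):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

x = np.array([3.0, 7.0, 8.0, 20.0, 41.0])   # toy feature values

direct = minmax(x)             # min-max scaling only
chained = minmax(zscore(x))    # z-score first, then min-max scaling
twice = zscore(zscore(x))      # z-score applied twice

print(np.allclose(direct, chained))    # True
print(np.allclose(zscore(x), twice))   # True
```

Under these assumptions the extra z-score step neither loses information nor adds any; the result could differ if something non-linear, such as clipping outliers, happens between the two steps.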

















answered Apr 4 at 7:36 by serali


























