What's the difference between feature importance from Random Forest and Pearson correlation coefficient?

I have the following business domain: a product with three outputs (labels). The outputs are affected by 1000 procedures, each of which is digitized and measured. The customer wants to know which procedures are the most influential on the outputs.

1. The Pearson correlation coefficient describes the relationship between two variables: 1 means a perfect positive linear relationship, -1 a perfect negative one, and 0 no linear relationship. So I could pick the procedures with the largest absolute Pearson correlation coefficients as the most influential ones.

2. A Random Forest gives me feature importances, so I could also identify the most influential procedures from the top-ranked features.

Which one is better?







      random-forest






asked Mar 21 at 6:21 by user84592

          2 Answers
Pearson correlation captures linear relationships between an input and the target variable. It therefore only makes sense for continuous inputs with a continuous target, not for continuous inputs with a binary/categorical output. A correlation essentially measures the positive or negative change in one variable as you increase or decrease the other.
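For reference, the sample Pearson coefficient between a feature $x$ and a target $y$ over $n$ observations is usually written as follows (the notation is an assumption of this sketch, not taken from the question):

$$
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The numerator only rewards deviations from the means that move together linearly, which is why an arbitrary numeric coding of a categorical label tends to give a value that is hard to interpret.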



So it doesn't make much sense to measure the relationship between your input features and the categorical outputs this way. You may as well calculate the mean of each feature for each label and compare the differences between those means. I found this answer on Cross-Validated, which explains this much better than I can.
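The per-label mean comparison could be sketched roughly like this. It is only an illustration: the function name `class_mean_gaps` and the toy data are hypothetical, and in practice the features would probably need to be standardized first so that the gaps are comparable across procedures.

```python
from collections import defaultdict
from statistics import mean

def class_mean_gaps(X, y):
    """For each feature, return the spread (max - min) of its per-label means."""
    rows_by_label = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_label[label].append(row)
    gaps = []
    for j in range(len(X[0])):
        means = [mean(row[j] for row in rows) for rows in rows_by_label.values()]
        gaps.append(max(means) - min(means))
    return gaps

# Hypothetical toy data: two measured procedures, one label with two classes.
X = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0], [3.0, 5.0], [3.2, 4.0], [2.8, 6.0]]
y = ["ok", "ok", "ok", "defect", "defect", "defect"]

gaps = class_mean_gaps(X, y)
print(gaps)  # the first procedure separates the labels, the second does not
```

A large gap suggests the procedure's average value differs between labels; a gap near zero suggests it does not, at least not in the mean.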



Feature importance in tree-based models is more likely to identify which features are most influential in differentiating your classes, provided that the model performs well. How this feature importance is calculated depends on the implementation; this article gives a good overview of how different tree-based models calculate feature importance.
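As one illustration of an impurity-based calculation, here is a minimal sketch of the Gini impurity decrease achieved by a single threshold split on one feature. This is a deliberate simplification: a real random forest accumulates such decreases over many splits and many trees, and other implementations use different measures (permutation importance, for example). The helper names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_impurity_decrease(values, labels):
    """Largest Gini decrease achievable by one threshold split on a feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: samples with value <= t go left, the rest go right.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best

# Hypothetical toy data: one feature that separates the classes, one that does not.
labels = [0, 0, 0, 1, 1, 1]
informative = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
uninformative = [5.0, 4.0, 6.0, 5.0, 4.0, 6.0]

print(best_impurity_decrease(informative, labels))
print(best_impurity_decrease(uninformative, labels))
```

The feature that separates the classes gets the larger decrease, which is the intuition behind ranking features by how much they reduce impurity.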







answered Mar 21 at 14:16 by Dan Carter (edited Mar 21 at 15:47)







• This beautiful picture is for continuous-continuous variables. The continuous-categorical (feature-label) case is different, since a "linear" relation has no meaning there.
  – Esmailian, Mar 21 at 14:43






• Ah, well noticed. I hadn't spotted that this question was asking about categorical labels; I'll edit my answer :)
  – Dan Carter, Mar 21 at 15:12












                I would say it depends a bit on what you want to achieve.



                A few things to keep in mind:



Pearson gives you a linear correlation, but what if the information is in the absolute value? A random forest (RF) has a much better chance of recognizing this.
Example data where there is a clear relationship, but only through the absolute value:



    a = [1, 1, 1, 0, 0, 0, -1, -1, -1]
    b = [abs(x) for x in a]  # b depends on a only through its absolute value
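To make the point concrete, here is a small sketch that computes Pearson's r for this data by hand. The helper `pearson_r` and the two-threshold rule are illustrative assumptions, not part of the original example. The coefficient comes out at (numerically) zero even though `b` is fully determined by `a`, while a rule with two thresholds, the kind of structure a tree can learn, reproduces `b` exactly.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)

a = [1, 1, 1, 0, 0, 0, -1, -1, -1]
b = [abs(x) for x in a]

r = pearson_r(a, b)  # essentially zero: no linear trend between a and b

# Hypothetical two-threshold rule, similar to what two tree splits on `a` encode.
rule = [1 if (x < -0.5 or x > 0.5) else 0 for x in a]

print(abs(r) < 1e-9, rule == b)
```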


On the other hand, RF importance is only meaningful when the model predicts well, whatever "well" means for you. Pearson's r has a very specific meaning that always holds: it measures the strength of the linear correlation between the two variables.







answered Mar 21 at 11:11 by El Burro



















