LightGBM - Why Exclusive Feature Bundling (EFB)?


I'm currently studying GBDT and have started reading LightGBM's research paper.

In Section 4 the authors explain the Exclusive Feature Bundling (EFB) algorithm, which aims to reduce the number of features by grouping mutually exclusive features (features that rarely take non-zero values at the same time) into bundles and treating each bundle as a single feature. The authors emphasize that one must be able to retrieve the original feature values from the bundle.

Question: If we have a categorical feature that has been one-hot encoded, won't this algorithm simply reverse the one-hot encoding back into a numeric encoding, thereby cancelling the benefits of the previous encoding (removal of an artificial ordering between categories, etc.)?
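To make the concern concrete, here is a rough sketch in plain NumPy of the merge step the paper describes (each feature in a bundle gets an offset so its non-zero values land in their own range, and the shifted columns are summed into one column). This is not LightGBM's actual code, and merge_exclusive_features is a made-up name; applied to one-hot columns, the bundle does end up looking like an ordinal re-encoding of the category, which is exactly the worry above.

    import numpy as np

    # Rough sketch of the "merge exclusive features" idea from the EFB section of
    # the paper -- NOT LightGBM's actual code. Each feature gets an offset so its
    # non-zero bin values fall in their own range, then the shifted columns are
    # summed into a single bundled column.
    def merge_exclusive_features(features, n_bins_per_feature):
        offsets = np.cumsum([0] + n_bins_per_feature[:-1])
        bundle = np.zeros(features.shape[0], dtype=int)
        for j, offset in enumerate(offsets):
            nonzero = features[:, j] != 0
            bundle[nonzero] = features[nonzero, j] + offset
        return bundle

    # Three one-hot columns for a categorical feature with levels A, B, C.
    one_hot = np.array([[1, 0, 0],   # A
                        [0, 1, 0],   # B
                        [0, 0, 1],   # C
                        [0, 1, 0]])  # B

    # Each one-hot column has a single non-zero bin, so n_bins_per_feature = [1, 1, 1].
    print(merge_exclusive_features(one_hot, [1, 1, 1]))
    # -> [1 2 3 2]: the bundle looks like an ordinal encoding of the category,
    #    but the original columns are still recoverable from the offsets.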










      feature-selection decision-trees xgboost machine-learning-model gbm






asked Nov 30 '18 at 14:36 by T. Morvan, edited Apr 10 at 13:07 by ebrahimi




















          2 Answers



















I've read that paper many times. What I can say is that it does not describe explicitly what the framework does; it only gives the intuition behind bundling features efficiently. In particular, it does not say anything about a 'reversal of one-hot encoding', which is what your question is about.

I tried feeding categorical inputs to LightGBM both directly and one-hot encoded, and compared the training time. There was a significant difference: on several datasets, passing the categorical features directly was consistently faster than passing them one-hot encoded (a sketch of that kind of comparison appears below).

Possibilities:

1) LightGBM may detect from the sparsity that the features are one-hot encoded, and may not apply EFB to them at all.

2) LightGBM may apply EFB to one-hot encoded features, but with worse results than applying EFB to the raw categorical inputs. (I would bet on this one.)

Still, I do not think EFB 'reverses' one-hot encoding, since EFB is described as its own way of treating features; more likely it simply 'bundles the unbundled features' when given one-hot encoded inputs.

I have hedged with 'possible' and 'probably' so often because the paper is not explicit. My advice is to e-mail one of the authors; I doubt they would refuse to explain it. Or, if you are brave, go through the LightGBM GitHub repository and check the code yourself.
I hope this gives you some insight. If you find a definitive answer, please let me know, and do not hesitate to discuss further; I'll be around. Good luck, have fun!
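A sketch of the kind of timing comparison described above, on synthetic data (the sizes and parameters here are arbitrary, and this is not the original experiment; actual numbers will depend on the dataset):

    import time
    import numpy as np
    import pandas as pd
    import lightgbm as lgb

    # Synthetic data: one categorical column with many levels and a noisy label.
    rng = np.random.default_rng(0)
    n, n_levels = 100_000, 200
    cat = rng.integers(0, n_levels, size=n)
    y = ((cat % 7 == 0) ^ (rng.random(n) < 0.1)).astype(int)

    params = {"objective": "binary", "verbose": -1}

    # 1) Pass the raw categorical column and declare it as categorical.
    X_cat = pd.DataFrame({"cat": pd.Categorical(cat)})
    t0 = time.time()
    lgb.train(params, lgb.Dataset(X_cat, label=y, categorical_feature=["cat"]),
              num_boost_round=100)
    print("raw categorical:", round(time.time() - t0, 2), "s")

    # 2) Pass the same information as n_levels one-hot (0/1) columns.
    X_ohe = pd.get_dummies(pd.Series(cat).astype("category"), prefix="cat", dtype=float)
    t0 = time.time()
    lgb.train(params, lgb.Dataset(X_ohe, label=y), num_boost_round=100)
    print("one-hot encoded:", round(time.time() - t0, 2), "s")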






answered Dec 2 '18 at 21:03 by Ugur MULUK





















From what the paper describes, EFB speeds training up by reducing the number of features. I don't think it claims there are no other effects; whether those other 'effects' are a real concern is of course another question.

Also, EFB does not only deal with one-hot encoded features; it can bundle continuous features as well.

I also think it would not bundle all of the one-hot encoded features together if doing so risked overflowing the bundled value range (see the sketch of the bundling step below).
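For context, the paper's greedy bundling step only puts two features in the same bundle while the number of conflicts (rows where both are non-zero) stays within a small budget, which is one reason it would not lump every column together. A rough, non-authoritative sketch of that criterion (greedy_bundling is a made-up name, not LightGBM's code):

    import numpy as np

    # Rough sketch of the greedy bundling criterion described in the paper;
    # K is the maximum number of tolerated conflicts per bundle. This only
    # illustrates the idea and is not LightGBM's implementation.
    def greedy_bundling(X, K):
        nonzero = X != 0                              # boolean mask, shape (n_rows, n_features)
        order = np.argsort(-nonzero.sum(axis=0))      # consider denser features first
        bundles, bundle_masks = [], []
        for j in order:
            placed = False
            for b, mask in enumerate(bundle_masks):
                conflicts = np.count_nonzero(mask & nonzero[:, j])
                if conflicts <= K:                    # feature fits into this bundle
                    bundles[b].append(j)
                    bundle_masks[b] = mask | nonzero[:, j]
                    placed = True
                    break
            if not placed:                            # otherwise start a new bundle
                bundles.append([j])
                bundle_masks.append(nonzero[:, j].copy())
        return bundles

    # Four sparse features forming two mutually exclusive pairs: (0, 1) and (2, 3).
    X = np.array([[1, 0, 2, 0],
                  [0, 3, 0, 1],
                  [1, 0, 0, 2],
                  [0, 2, 1, 0]])
    print(greedy_bundling(X, K=0))  # -> two bundles pairing the exclusive columns, e.g. [[0, 1], [2, 3]]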






answered Apr 10 at 10:51 by Raymond Kwok












