Lasso implementation Drawback



I have been trying to implement the lasso myself in R, without using the "glmnet" package. Based on an article by Tibshirani, I wrote a basic implementation of the coordinate descent method, and it does successfully shrink many coefficients to zero. However, when I compare my results with glmnet on the well-known built-in R data set mtcars, using several different values of lambda, most coefficients look reasonable, but one coefficient that should be insignificant remains quite large. I have been trying to figure out why. Specifically, I have two questions:

  1. How should the intercept be handled? When the data set is scaled, it seems that $\beta_0$ can simply be set to the mean of the response variable. But even glmnet produces a different intercept when lambda changes.

  2. What could cause the problem described above?

I hope someone can lend a hand, thanks! If more details are needed, I can provide them.

Here's my source code in Git. I've also described my problem in detail in README.md.
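To make the intercept question concrete, here is a minimal sketch of lasso by coordinate descent. It is written in plain Python rather than R, and it is not the code from the linked repository. It assumes the objective $\frac{1}{2n}\mathrm{RSS} + \lambda \sum_j |\beta_j|$, predictors standardized with the $1/n$ variance, an unpenalized intercept, and no constant columns; the names `soft_threshold` and `lasso_cd` are made up for illustration. Under those assumptions the intercept equals the mean of the response only on the standardized scale. After mapping the coefficients back to the original scale it becomes $\bar{y} - \sum_j \hat\beta_j \bar{x}_j$, so it moves whenever lambda changes the coefficients.

```python
# Illustrative sketch only (assumptions: objective (1/(2n))*RSS + lam*sum|b_j|,
# 1/n standardization, unpenalized intercept, no constant columns).

def soft_threshold(z, g):
    """Soft-thresholding operator S(z, g) = sign(z) * max(|z| - g, 0)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def lasso_cd(X, y, lam, n_iter=1000, tol=1e-10):
    """Return (intercept, coefficients) on the ORIGINAL scale of X and y."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5
           for j in range(p)]
    y_mean = sum(y) / n
    # Standardize the predictors and center the response.
    Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]
    r = [yi - y_mean for yi in y]  # residuals; all coefficients start at 0
    b = [0.0] * p                  # coefficients on the standardized scale
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # (1/n) * <z_j, partial residual>; "+ b[j]" adds back column j's
            # own contribution because (1/n) * sum(z_ij^2) == 1.
            rho = sum(Z[i][j] * r[i] for i in range(n)) / n + b[j]
            new = soft_threshold(rho, lam)
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= delta * Z[i][j]
                b[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    # Undo the standardization.
    beta = [b[j] / sds[j] for j in range(p)]
    intercept = y_mean - sum(beta[j] * means[j] for j in range(p))
    return intercept, beta

if __name__ == "__main__":
    # Toy data (made up): y = 1 + 2 * x1, x2 is irrelevant.
    X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
    y = [3.0, 5.0, 7.0, 9.0, 11.0]
    for lam in (0.0, 1.0, 100.0):
        print(lam, lasso_cd(X, y, lam))
```

Running the demo shows the intercept drifting from the least-squares value toward the mean of the response as lambda grows. A mismatch in any of the assumed conventions (the $1/n$ versus $1/(n-1)$ variance, the $\frac{1}{2n}$ factor on the loss, or forgetting to divide by the column standard deviation at the end) would be one plausible way for a hand-written implementation to disagree with a library on a single badly scaled column.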











  • There is no need to normalise the intercept term. There are several reasons; one of them is that it is just a single coefficient, but there are many others.
    – Vaalizaadeh
    Apr 3 at 18:01










  • Are you trying to implement just elastic net, i.e. regularized linear regression?
    – Matthew Drury
    Apr 3 at 20:25






  • Interesting question, but it is difficult to answer, since it could be just a bug in your code or something completely different. One thing to consider is that the algorithm is usually described for the case where the data already have mean zero and unit variance. You need to adjust the formulas accordingly if that's not true for mtcars.
    – oW_
    Apr 3 at 23:24










  • @oW_ But what does glmnet do in that case? It seems it didn't adjust the final coefficients accordingly, because many of them remain pretty close to mine. If they had been adjusted, they should have become very small, since the corresponding original data are quite large.
    – DiaryofNewton
    Apr 4 at 2:01










  • @MatthewDrury Elastic net. The penalty term is a mixture of the l1 norm and the l2 norm. What does this suggest?
    – DiaryofNewton
    Apr 4 at 2:02
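
For reference on the elastic-net point raised in the comments, the coordinate-wise update can be written down explicitly. This is a sketch under stated assumptions, not a description of the code in the question: predictors are standardized so that $\frac{1}{n}\sum_i x_{ij}^2 = 1$, the response is centered, and the penalty is parameterized as $\lambda\left[\alpha\lVert\beta\rVert_1 + \tfrac{1-\alpha}{2}\lVert\beta\rVert_2^2\right]$ on top of the loss $\frac{1}{2n}\mathrm{RSS}$. Here $r_i^{(j)}$ denotes the partial residual of observation $i$ with predictor $j$ left out.

$$
\tilde\beta_j \leftarrow \frac{S\!\left(\frac{1}{n}\sum_{i=1}^{n} x_{ij}\, r_i^{(j)},\ \lambda\alpha\right)}{1 + \lambda(1-\alpha)},
\qquad
S(z,\gamma) = \operatorname{sign}(z)\,\max(|z| - \gamma,\ 0),
\qquad
r_i^{(j)} = y_i - \sum_{k \neq j} x_{ik}\tilde\beta_k .
$$

With $\alpha = 1$ the denominator is 1 and this reduces to the plain lasso update. With $\alpha < 1$ every coefficient gets an extra ridge-style shrinkage, so results fitted with a mixed penalty are not expected to match a pure lasso fit at the same $\lambda$.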















machine-learning














asked Apr 3 at 17:06 by DiaryofNewton, edited Apr 6 at 11:41




























