Lasso implementation Drawback
Recently I've been trying to implement Lasso myself in R, without using the "glmnet" package. Based on an article by Tibshirani, I wrote a basic implementation of the coordinate descent method, and it does successfully shrink many coefficients to zero. But when I compare my results with those of glmnet on a well-known built-in R data set (mtcars) across different lambdas, I find that while the other coefficients are reasonable, one coefficient that should be insignificant remains quite large. I've been trying to figure out the reason. Specifically, I have two questions:
1. How should the intercept be handled in the formula? When the data set is scaled, it seems that $\beta_0$ can just be set to the mean of the response variable. But even glmnet produces a different intercept when lambda is changed in the model.
2. What could be the reason for the problem described above?
Hope someone could lend a hand, thanks! If any more details are needed, I can provide them.
Here's my source code in Git. I've also described my problem in detail in README.md.
machine-learning
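As a point of reference for the first question, here is a short derivation sketch. It assumes the Gaussian lasso objective in the form given in glmnet's documentation, with an unpenalized intercept; the symbols $z_{ij}$, $s_j$ and $\hat\gamma_j$ are notation introduced here and do not come from the linked code.

$$\min_{\beta_0,\,\beta}\ \frac{1}{2n}\sum_{i=1}^{n}\Bigl(y_i-\beta_0-\sum_{j=1}^{p}x_{ij}\beta_j\Bigr)^2+\lambda\sum_{j=1}^{p}|\beta_j|$$

Because $\beta_0$ is not penalized, setting the derivative with respect to $\beta_0$ to zero gives

$$\hat\beta_0=\bar y-\sum_{j=1}^{p}\bar x_j\,\hat\beta_j .$$

If the fit is done on standardized predictors $z_{ij}=(x_{ij}-\bar x_j)/s_j$ with coefficients $\hat\gamma_j$, every column has mean zero, so the intercept on that scale is $\bar y$ for every $\lambda$. Converting back to the original scale gives

$$\hat\beta_j=\frac{\hat\gamma_j}{s_j},\qquad \hat\beta_0=\bar y-\sum_{j=1}^{p}\frac{\bar x_j\,\hat\gamma_j}{s_j}.$$

According to its documentation, glmnet standardizes internally by default (standardize = TRUE) and returns coefficients on the original scale, so the reported intercept depends on $\hat\gamma$ and therefore moves with $\lambda$ whenever some $\bar x_j\neq 0$. Whether this accounts for the discrepancy described above cannot be determined without the code.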
There is no need to normalise the intercept term. There are multifarious reasons; one of them is that it is just a single coefficient, but there are many others. – Vaalizaadeh, Apr 3 at 18:01
Are you trying to implement just elastic net, i.e. regularized linear regression? – Matthew Drury, Apr 3 at 20:25
Interesting question, but it is difficult to answer since it could be just a bug in your code or something completely different. One thing to consider is that the algorithm is usually described for the case where the data already has mean zero and unit variance. You need to adjust the formulas accordingly if that's not true for mtcars. – oW_♦, Apr 3 at 23:24
@oW_ But what about glmnet? It doesn't seem to adjust the final coefficients accordingly, because many of them remain pretty close to mine; if they were adjusted, they should become very small, since the corresponding original data are quite large. – DiaryofNewton, Apr 4 at 2:01
@MatthewDrury Elastic net. The penalty term is a mixture of the l1 norm and the l2 norm. What does this suggest? – DiaryofNewton, Apr 4 at 2:02
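To make the standardization point in the comments concrete, here is a minimal coordinate descent sketch in Python (standard library only). It is not the asker's R code and is not claimed to reproduce glmnet's numbers; the function names `soft_threshold` and `lasso_cd` are hypothetical, and it assumes the objective (1/(2n))·RSS + lam·‖beta‖₁, predictors scaled with the divide-by-n standard deviation, and no constant columns.

```python
from math import sqrt
from statistics import fmean


def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=1000, tol=1e-10):
    """Lasso by cyclic coordinate descent on standardized predictors.

    Returns (beta0, beta) on the ORIGINAL scale of X.
    Assumption: no column of X is constant (its sd would be zero).
    """
    n, p = len(X), len(X[0])
    means = [fmean(row[j] for row in X) for j in range(p)]
    # Divide-by-n standard deviation, so that (1/n) * sum(z_ij^2) == 1.
    sds = [sqrt(fmean((row[j] - means[j]) ** 2 for row in X)) for j in range(p)]
    Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]

    y_bar = fmean(y)
    resid = [yi - y_bar for yi in y]  # residuals when all coefficients are zero
    gamma = [0.0] * p                 # coefficients on the standardized scale

    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(Z[i][j] * resid[i] for i in range(n)) / n + gamma[j]
            new = soft_threshold(rho, lam)
            delta = new - gamma[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][j] * delta
                gamma[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break

    # Back-transform: slopes are divided by the sd, the intercept absorbs the means.
    beta = [gamma[j] / sds[j] for j in range(p)]
    beta0 = y_bar - sum(means[j] * beta[j] for j in range(p))
    return beta0, beta


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
    y = [3.0, 5.0, 7.0, 9.0, 11.0]  # exactly y = 1 + 2x
    for lam in (0.0, 1.0, 10.0):
        print(lam, lasso_cd(X, y, lam))
```

On this toy data the intercept is 1 at lam = 0 and moves toward the mean of y (7) as lam grows, even though the intercept on the standardized scale stays at the mean of y throughout. The back-transform at the end is the step that makes the reported intercept depend on lam.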
asked Apr 3 at 17:06 by DiaryofNewton, edited Apr 6 at 11:41