Is it OK to try to find the best PCA k parameter as we do with other hyperparameters?
Principal Component Analysis (PCA) is used to reduce $n$-dimensional data to $k$-dimensional data, often to speed things up in machine learning. After PCA is applied, one can check how much of the variance of the original dataset remains in the reduced dataset; a common goal is to retain between 90% and 99% of it.
My question is: is it considered good practice to try different values of the parameter $k$ (the dimensionality of the reduced dataset) and check the resulting models against a cross-validation set, just as we do to pick good values of other hyperparameters such as regularization lambdas and thresholds?
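For concreteness, here is a minimal sketch of the procedure being asked about, written with scikit-learn; the dataset, pipeline, and grid values are illustrative assumptions, not part of the original question:

```python
# Sketch: treat PCA's k (n_components) as one more hyperparameter and
# tune it by cross-validation alongside a regularization strength.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)

pipe = Pipeline([
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=2000)),
])

param_grid = {
    "pca__n_components": [10, 20, 30, 40, 50],  # the k values to try
    "clf__C": [0.1, 1.0, 10.0],                 # regularization strength
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```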
machine-learning pca hyperparameter
asked Mar 27 at 18:58
J. Doe
361
1 Answer
Your emphasis on using a validation set rather than the training set for selecting $k$ is good practice and should be followed. However, we can do even better!
The parameter $k$ in $\text{PCA}$ is more special than a typical hyperparameter, because the solution to $\text{PCA}(k)$ is already contained in $\text{PCA}(K)$ for any $K > k$: it consists of the first $k$ eigenvectors of $\text{PCA}(K)$ (those corresponding to the $k$ largest eigenvalues). Therefore, instead of running $\text{PCA}(1)$, $\text{PCA}(2)$, ..., $\text{PCA}(K)$ separately on the training data, as we would for a hyperparameter in general, we only need to run $\text{PCA}(K)$ once to obtain the solution for every $k \in \{1, \dots, K\}$.
As a result, the process would be as follows (a minimal sketch follows the list):
- Run $\text{PCA}$ for the largest acceptable $K$ on the training set,
- Plot, or tabulate, ($k$, retained variance) on the validation set,
- Select the smallest $k$ that reaches the minimum acceptable variance, e.g. 90% or 99%.
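Here is one way this single-fit procedure might look in scikit-learn; this is a sketch, and the array names `X_train`/`X_val`, the value of $K$, and the 90% threshold are illustrative assumptions:

```python
# Fit PCA once with the largest acceptable K; the first k columns of the
# loading matrix give the PCA(k) solution for every k in 1..K.
import numpy as np
from sklearn.decomposition import PCA

K = 50  # largest acceptable number of components
pca = PCA(n_components=K).fit(X_train)

# Center the validation data with the training mean, as PCA would.
Xc = X_val - pca.mean_
total_var = np.sum(Xc ** 2)

# Fraction of the validation set's variance retained by the first k
# components, for every k at once.
scores = Xc @ pca.components_.T  # project onto all K components
retained = np.cumsum(np.sum(scores ** 2, axis=0)) / total_var

# Smallest k reaching the minimum acceptable variance, e.g. 90%.
# (If the threshold is never reached, k will be K + 1, signalling
# that a larger K is needed.)
k = int(np.searchsorted(retained, 0.90) + 1)
print(k, retained[k - 1])
```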
And $N$-fold cross-validation would be as follows (a sketch follows the list):
- Run $\text{PCA}$ for the largest acceptable $K$ on each of the $N$ training splits,
- Plot, or tabulate, ($k$, average of the $N$ retained variances) on the held-out folds,
- Select the smallest $k$ that reaches the minimum acceptable average variance, e.g. 90% or 99%.
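A sketch of the $N$-fold variant under the same assumptions (a NumPy array `X` and the values of $K$ and the threshold are illustrative): refit $\text{PCA}(K)$ on each set of training folds and average the variance retained on the held-out fold.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import KFold

K = 50
curves = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    pca = PCA(n_components=K).fit(X[train_idx])
    Xc = X[val_idx] - pca.mean_            # center with the training mean
    scores = Xc @ pca.components_.T        # project onto all K components
    curves.append(np.cumsum(np.sum(scores ** 2, axis=0)) / np.sum(Xc ** 2))

avg_retained = np.mean(curves, axis=0)     # the (k, average variance) curve
k = int(np.searchsorted(avg_retained, 0.90) + 1)
```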
Also, here is a related post that asks "why do we choose principal components based on maximum variance explained?".
edited Mar 30 at 22:06
answered Mar 27 at 20:01
Esmailian
2,621
Is K-PCA the correct name for this? It sounds a bit confusing and reminds me of Kernel Principal Component Analysis (KPCA), which is a non-linear version of PCA.
– Pedro Henrique Monforte
Mar 28 at 2:36
@PedroHenriqueMonforte Thanks! Notation updated.
– Esmailian
Mar 28 at 11:05