Having difficulty interpreting the eigenvectors for a simple 3x2 matrix
I calculated the eigenvectors and eigenvalues of a covariance matrix computed from a data matrix with 3 columns and 2 rows.
I am trying to interpret the results, but I don't understand how to do so.
Create a 2x3 matrix:
import numpy as np
# Create a 2x3 matrix of random whole numbers between 0 and 100
data = np.around(np.random.uniform(size=(2, 3)) * 100)
The data looks as follows:
[
[ 4., 65., 77.],
[68., 12., 89.]
]
# Here each row represents one data point
# and columns represent the features in the data set
# So there are 3 features and 2 data points
Calculate the mean of each feature in the data set.
mean = np.mean(data, axis=0)
Center the data around the origin by subtracting the mean from the data set.
difference = np.subtract(data, mean)
Now, calculate the covariance matrix:
cov = np.dot(difference.T, difference)
The cov matrix looks as follows:
[
[ 2048. , -1696. , 384. ],
[-1696. , 1404.5, -318. ],
[ 384. , -318. , 72. ]
]
As I understand it, the covariance matrix describes the covariance between every pair of features. Since there are 3 features, it is a 3x3 matrix covering all possible pairs.
Finally, calculate the eigenvectors and eigenvalues:
val, vec = np.linalg.eigh(cov)
The vec matrix looks as follows:
[
[ 0.60999981, 0.21639063, 0.76228297],
[ 0.77451164, 0.040441 , -0.63126559],
[ 0.16742745, -0.97546892, 0.14292806]
]
How do I interpret the vec matrix? I understand what eigenvectors are physically: they do not change direction when a transformation is applied; they are only scaled by their eigenvalues.
What are some possible ways I could use this vec matrix?
python pca numpy matrix
asked Mar 23 at 10:06
Suhail Gupta
1 Answer
The eigenvectors can be used to transform your data into a coordinate system in which no covariance is there. Assume we have a $p$-dimensional multivariate normal distribution with zero mean and density
$$\mathcal{N}(\boldsymbol{x}|\boldsymbol{\mu}=\boldsymbol{0},\boldsymbol{\Sigma})=\dfrac{1}{\sqrt{(2\pi)^p\det\boldsymbol{\Sigma}}}\exp\left[-\dfrac{1}{2}\boldsymbol{x}^T\boldsymbol{\Sigma}^{-1}\boldsymbol{x}\right].\quad (*)$$
As the covariance matrix is real and symmetric, we know it is diagonalizable and that its eigenvectors can be chosen to form an orthonormal basis. The eigenvalue equation for the covariance matrix is given by
$$\boldsymbol{\Sigma}\boldsymbol{v}_i=\lambda_i\boldsymbol{v}_i \quad \forall i=1,\ldots,p.$$
We can combine all equations into a single matrix equation
$$\boldsymbol{\Sigma}[\boldsymbol{v}_1,\ldots,\boldsymbol{v}_p]=[\boldsymbol{v}_1,\ldots,\boldsymbol{v}_p]\,\text{diag}\left[\lambda_1,\ldots,\lambda_p\right].$$
Let $\boldsymbol{V}=[\boldsymbol{v}_1,\ldots,\boldsymbol{v}_p]$ and $\boldsymbol{\Lambda}=\text{diag}\left[\lambda_1,\ldots,\lambda_p\right]$. With these definitions in hand we can write the eigenvalue equation as
$$\boldsymbol{\Sigma}\boldsymbol{V}=\boldsymbol{V}\boldsymbol{\Lambda}$$
$$\implies \boldsymbol{\Sigma}=\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}^{-1}.$$
As $\boldsymbol{V}$ is orthogonal (its columns are orthonormal vectors), we can write $\boldsymbol{V}^{-1}=\boldsymbol{V}^T$. This implies
$$\boldsymbol{\Sigma}=\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}^T,$$
and, since the density $(*)$ requires $\boldsymbol{\Sigma}$ to be invertible, $\boldsymbol{\Sigma}^{-1}=\boldsymbol{V}\boldsymbol{\Lambda}^{-1}\boldsymbol{V}^T$.
We plug this into $(*)$ and introduce the new variable $\boldsymbol{z}=\boldsymbol{V}^T\boldsymbol{x}$.
$$\mathcal{N}(\boldsymbol{x}|\boldsymbol{\mu}=\boldsymbol{0},\boldsymbol{\Sigma})=\dfrac{1}{\sqrt{(2\pi)^p\det\left(\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}^T\right)}}\exp\left[-\dfrac{1}{2}\boldsymbol{x}^T\boldsymbol{V}\boldsymbol{\Lambda}^{-1}\boldsymbol{V}^T\boldsymbol{x}\right]$$
$$=\dfrac{1}{\sqrt{(2\pi)^p\det\left(\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}^T\right)}}\exp\left[-\dfrac{1}{2}\boldsymbol{z}^T\boldsymbol{\Lambda}^{-1}\boldsymbol{z}\right]$$
$$\implies \mathcal{N}(\boldsymbol{z}|\boldsymbol{\mu}=\boldsymbol{0},\boldsymbol{\Lambda})=\dfrac{1}{\sqrt{(2\pi)^p\det\boldsymbol{\Lambda}}}\exp\left[-\dfrac{1}{2}\boldsymbol{z}^T\boldsymbol{\Lambda}^{-1}\boldsymbol{z}\right].$$
In the last step I used $\det(\boldsymbol{A}\boldsymbol{B}\boldsymbol{C})=\det\boldsymbol{A}\det\boldsymbol{B}\det\boldsymbol{C}$, $\det\boldsymbol{A}=\det\boldsymbol{A}^T$, and $|\det\boldsymbol{A}|=1$ for orthogonal matrices. Hence, we proved that we can use the eigenvectors to linearly transform our variables into a new coordinate system whose covariance matrix $\boldsymbol{\Lambda}$ is diagonal. This implies that we do not have any covariance anymore.
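As a rough numerical check of the derivation above, here is a minimal sketch that assumes NumPy and reuses the 2x3 data from the question; the variable names (`centered`, `reconstructed`, `cov_z`) are only illustrative. It verifies that $\boldsymbol{\Sigma}=\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}^T$ and that the transformed variables $\boldsymbol{z}=\boldsymbol{V}^T\boldsymbol{x}$ have a diagonal covariance matrix.

```python
import numpy as np

# Data from the question: 2 data points (rows), 3 features (columns)
data = np.array([[4., 65., 77.],
                 [68., 12., 89.]])

# Center the data and form the sample covariance matrix
centered = data - data.mean(axis=0)
cov = centered.T @ centered / (data.shape[0] - 1)

# eigh returns eigenvalues in ascending order;
# column vec[:, i] is the eigenvector belonging to val[i]
val, vec = np.linalg.eigh(cov)

# Sigma = V Lambda V^T
reconstructed = vec @ np.diag(val) @ vec.T

# z = V^T x for every data point (each row of `centered` is one x^T)
z = centered @ vec
cov_z = z.T @ z / (data.shape[0] - 1)

print(np.round(val, 6))    # eigenvalues (the variances along the new axes)
print(np.round(cov_z, 6))  # off-diagonal entries vanish up to rounding
```

With only two data points the covariance matrix has rank 1, so two of the three eigenvalues are numerically zero and the density $(*)$ itself does not exist for this data; the diagonalization still goes through.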
Remark 1: This diagonalization is also what principal component analysis (PCA) does under the hood.
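To make the PCA connection concrete, the following is a sketch under stated assumptions: it uses NumPy only, the synthetic data set (200 points, two strongly correlated features, a fixed seed) is made up for illustration, and names such as `scores` and `explained_ratio` are not from the question. It sorts the eigenvectors by decreasing eigenvalue, projects the centered data onto them, and cross-checks the eigenvalues against a singular value decomposition of the centered data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Hypothetical data: 200 points, 3 features, the first two strongly correlated
base = rng.normal(size=200)
data = np.column_stack([base,
                        2.0 * base + 0.1 * rng.normal(size=200),
                        rng.normal(size=200)])

centered = data - data.mean(axis=0)
cov = centered.T @ centered / (data.shape[0] - 1)

val, vec = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(val)[::-1]    # reorder so the largest variance comes first
val, vec = val[order], vec[:, order]

# Principal component scores: coordinates of each point in the eigenvector basis
scores = centered @ vec

# Fraction of the total variance captured along each eigenvector
explained_ratio = val / val.sum()

# Cross-check: squared singular values of the centered data, divided by n - 1,
# equal the eigenvalues of the sample covariance matrix
_, s, _ = np.linalg.svd(centered, full_matrices=False)
val_from_svd = s**2 / (data.shape[0] - 1)

print(np.round(explained_ratio, 3))
```

The first column of `scores` is the projection onto the eigenvector with the largest eigenvalue, and its sample variance equals that eigenvalue.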
Remark 2: One part of your code is suboptimal: np.dot(difference.T, difference) is not normalized by the number of samples. You should explicitly determine sample_size = data.shape[0] and then calculate cov = 1 / (sample_size - 1) * np.dot(difference.T, difference).
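A short sketch of Remark 2, assuming NumPy and the data from the question (the name `scatter` for the unnormalized product is my own), compares the normalized result with NumPy's built-in `np.cov`:

```python
import numpy as np

# Data from the question: rows are data points, columns are features
data = np.array([[4., 65., 77.],
                 [68., 12., 89.]])

sample_size = data.shape[0]
difference = data - np.mean(data, axis=0)

# Unnormalized product, as computed in the question
scatter = np.dot(difference.T, difference)

# Sample covariance with the 1 / (n - 1) normalization
cov = scatter / (sample_size - 1)

# np.cov treats rows as variables by default, so rowvar=False is needed here
cov_numpy = np.cov(data, rowvar=False)

print(np.allclose(cov, cov_numpy))
```

The normalization rescales the eigenvalues by 1 / (n - 1) but leaves the eigenvectors unchanged. For this particular data set n - 1 = 1, so the two matrices happen to coincide.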
answered Mar 23 at 10:43
MachineLearner
"... transform your data into a coordinate system in which no covariance is there." Could you please elaborate on this? A real-world example would help.
– Suhail Gupta
Mar 23 at 10:48
The covariance matrix has the covariance components on the off-diagonal. As $\boldsymbol{\Lambda}$ is a diagonal matrix, we only have variance terms but no covariance terms.
– MachineLearner
Mar 23 at 11:19
Yeah, but how do I interpret them physically? How are they useful?
– Suhail Gupta
Mar 23 at 11:22
They are very useful for highly correlated variables. For example, for images these directions are the eigenfaces (see: en.wikipedia.org/wiki/Eigenface). The eigenvector with the largest eigenvalue captures the largest possible amount of variance along a single linear direction.
– MachineLearner
Mar 23 at 11:24
What exactly do you mean by "single linear direction"?
– Suhail Gupta
Mar 23 at 12:38
