What is an eigenvector of a covariance matrix?

One of the most intuitive explanations of eigenvectors of a covariance matrix is that they are the directions in which the data varies the most

(More precisely, the first eigenvector is the direction in which the data varies the most, the second eigenvector is the direction of greatest variance among those that are orthogonal (perpendicular) to the first eigenvector, the third eigenvector is the direction of greatest variance among those orthogonal to the first two, and so on.)

Here is an example in 2 dimensions [1]:

Each data sample is a 2-dimensional point with coordinates x, y. The eigenvectors of the covariance matrix of these data samples are the vectors u and v; u, the longer arrow, is the first eigenvector and v, the shorter arrow, is the second. (The eigenvalues correspond to the lengths of the arrows.) As you can see, the first eigenvector points (from the mean of the data) in the direction in which the data varies the most in Euclidean space, and the second eigenvector is orthogonal (perpendicular) to the first.
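To make the 2-D picture concrete without the original figure, here is a minimal numerical sketch (assuming NumPy; the sample distribution is made up for illustration) showing that the leading eigenvector of the sample covariance matrix is the direction of greatest variance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D samples (x, y); the covariance used here is only illustrative.
data = rng.multivariate_normal(mean=[0, 0], cov=[[3.0, 1.5], [1.5, 1.0]], size=2000)

C = np.cov(data, rowvar=False)            # 2x2 sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns eigenvalues in ascending order

u = eigvecs[:, -1]   # first eigenvector (largest eigenvalue) -> the longer arrow
v = eigvecs[:, 0]    # second eigenvector -> the shorter arrow, orthogonal to u

# The variance of the data projected onto each eigenvector equals its eigenvalue.
print(np.var(data @ u, ddof=1), eigvals[-1])
print(np.var(data @ v, ddof=1), eigvals[0])
print(np.dot(u, v))  # ~0: the two directions are orthogonal
```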

It's a little trickier to visualize in 3 dimensions, but here's an attempt [2]:


In this case, imagine that all of the data points lie within the ellipsoid. v1, the direction in which the data varies the most, is the first eigenvector (lambda1 is the corresponding eigenvalue). v2 is the direction in which the data varies the most among those directions that are orthogonal to v1. And v3 is the direction of greatest variance among those directions that are orthogonal to v1 and v2 (though there is only one such orthogonal direction). 

[1] Image taken from Duncan Gillies's lecture on Principal Component Analysis
[2] Image taken from Fiber Crossing in Human Brain Depicted with Diffusion Tensor MR Imaging
  
 
Given a set of random variables $\{x_1, \dots, x_n\}$, the covariance matrix $A$ is defined so that $A_{i,j} = \mathrm{Cov}(x_i, x_j)$. We can represent a linear combination $\sum_i b_i x_i$ as a vector $x = (b_1, \dots, b_n)$.

It turns out that the covariance of two such vectors $x$ and $y$ can be written as $\mathrm{Cov}(x, y) = x^T A y$. In particular, $\mathrm{Var}(x) = x^T A x$. This means that covariance is a bilinear form.
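A quick numerical check of this bilinear-form identity (a sketch assuming NumPy; the coefficient vectors b and c are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 500))   # 3 random variables x_1, x_2, x_3 with 500 samples each
A = np.cov(X)                       # covariance matrix, A[i, j] = Cov(x_i, x_j)

b = np.array([1.0, -2.0, 0.5])      # coefficients of one linear combination
c = np.array([0.0, 1.0, 3.0])       # coefficients of another

lhs = np.cov(b @ X, c @ X)[0, 1]    # sample covariance of the two linear combinations
rhs = b @ A @ c                     # the bilinear form b^T A c
print(lhs, rhs)                     # agree up to floating-point error
```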

Now, since $A$ is a real symmetric matrix, there is an orthonormal basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$. Orthonormal here means that each vector has norm 1 and the vectors are mutually orthogonal; because they are eigenvectors of $A$, they are also orthogonal with respect to $A$, that is $v_1^T A v_2 = 0$, or equivalently $\mathrm{Cov}(v_1, v_2) = 0$.

Next, suppose $v$ is a unit eigenvector of $A$ with eigenvalue $\lambda$. Then $\mathrm{Var}(v) = v^T A v = \lambda v^T v = \lambda$.

There are a couple of interesting conclusions we can draw from this. First, since the eigenvectors form a basis $\{v_1, \dots, v_n\}$, every linear combination of the original random variables can be represented as a linear combination of the uncorrelated random variables $v_i$. Second, every unit vector's variance is a weighted average of the eigenvalues. This means that the leading eigenvector is the direction of greatest variance, the next eigenvector has the greatest variance in the orthogonal subspace, and so on.
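Both conclusions can be checked numerically. In this sketch (assuming NumPy; the covariance matrix is an arbitrary example), projecting the data onto the eigenvectors yields uncorrelated variables whose variances are the eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
cov_true = np.array([[4.0, 2.0, 0.5],
                     [2.0, 3.0, 1.0],
                     [0.5, 1.0, 2.0]])
X = rng.multivariate_normal(np.zeros(3), cov_true, size=5000)   # rows are samples

A = np.cov(X, rowvar=False)
eigvals, V = np.linalg.eigh(A)        # columns of V form an orthonormal eigenbasis

Z = X @ V                             # new variables z_i = v_i^T x
print(np.round(np.cov(Z, rowvar=False), 3))   # ~diagonal: the z_i are uncorrelated
print(np.round(eigvals, 3))                   # the diagonal entries equal the eigenvalues
```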

So, to sum up, the eigenvectors correspond to uncorrelated linear combinations of the original set of random variables.

The primary application of this is Principal Component Analysis. If you have n features, you can find the eigenvectors of the covariance matrix of the features. This allows you to represent the data with uncorrelated features. Moreover, the eigenvalues tell you the amount of variance in each new feature, allowing you to choose a subset of the new features (principal components) that retains the most information about your data.
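As a sketch of how this is used in practice (assuming NumPy; the data and the 95% threshold are made up for illustration), one can keep only the leading eigenvectors that explain most of the variance:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # 200 samples, 5 correlated features

Xc = X - X.mean(axis=0)                        # centre the features
A = np.cov(Xc, rowvar=False)                   # 5x5 covariance matrix
eigvals, V = np.linalg.eigh(A)
order = np.argsort(eigvals)[::-1]              # sort components by decreasing variance
eigvals, V = eigvals[order], V[:, order]

explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.95)) + 1  # smallest k explaining >= 95% of the variance
X_reduced = Xc @ V[:, :k]                      # data re-expressed in k uncorrelated features
print(k, X_reduced.shape)
```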
  
 
The largest eigenvector of a covariance matrix (the one with the largest eigenvalue) points in the direction of the largest variance. All other eigenvectors are orthogonal to the largest one.

Now, if this direction of the largest variance is axis-aligned (the off-diagonal covariances are zero), then the eigenvalues simply correspond to the variances of the data:


It becomes a little more complicated if the covariance matrix is not diagonal, so that the covariances are not zero. In this case, the principal components (directions of largest variance) do not coincide with the axes, and the data is rotated. The eigenvalues still correspond to the spread of the data in the directions of largest variance, whereas the variance components on the diagonal of the covariance matrix still define the spread of the data along the axes:
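A small numerical sketch of this point (assuming NumPy; the variances 9 and 1 and the 30-degree rotation are arbitrary): rotating the data changes the diagonal of the covariance matrix, but not its eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(4)
# Axis-aligned data: diagonal covariance with variances 9 and 1.
X = rng.multivariate_normal([0, 0], [[9.0, 0.0], [0.0, 1.0]], size=10000)

theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = X @ R.T                                    # rotate every sample by 30 degrees

C_axis = np.cov(X, rowvar=False)
C_rot = np.cov(Y, rowvar=False)
print(np.round(np.diag(C_axis), 2))                  # ~[9, 1]: eigenvalues equal the axis variances
print(np.round(np.diag(C_rot), 2))                   # the spread along the axes has changed
print(np.round(np.linalg.eigvalsh(C_rot)[::-1], 2))  # the eigenvalues are still ~[9, 1]
```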


An in-depth discussion of how the covariance matrix can be interpreted from a geometric point of view (and the source of the above images) can be found at: A geometric interpretation of the covariance matrix
  
Shreyas Ghuge
 
Finding the directions of maximum and minimum variance is the same as looking for the orthogonal least-squares best-fit line and plane of the data. The sums of squares for that line and plane can be written in terms of the covariance matrix. The connection between them can be worked out to obtain the eigenvectors of this covariance matrix.
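Here is a rough numerical sketch of that connection in 2-D (assuming NumPy; the helper orthogonal_sse is hypothetical): no direction does better than the leading eigenvector at minimising the sum of squared orthogonal distances to a line through the mean:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=1000)
Xc = X - X.mean(axis=0)                        # centre the data

def orthogonal_sse(direction):
    """Sum of squared orthogonal distances from the centred points to the line."""
    d = direction / np.linalg.norm(direction)
    residuals = Xc - np.outer(Xc @ d, d)       # components orthogonal to the line
    return np.sum(residuals ** 2)

eigvals, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
leading = V[:, -1]                             # eigenvector with the largest eigenvalue

# Brute force over many candidate directions: none does better (up to grid resolution).
angles = np.linspace(0, np.pi, 360, endpoint=False)
sses = [orthogonal_sse(np.array([np.cos(a), np.sin(a)])) for a in angles]
print(orthogonal_sse(leading), min(sses))
```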
  
Julius Bier Kirkegaard
 
Finding the eigenvectors of a covariance matrix is exactly the technique of Principal Component Analysis (PCA).

The eigenvectors give the new variables (linear combinations of the original ones) that are linearly uncorrelated.
  
 
 