A must-know concept for Machine Learning
Some concepts live at the heart of data science. Eigenvectors and eigenvalues are among those concepts. This article aims to explain what eigenvectors and eigenvalues are, how they are calculated and how we can use them.
Eigenvalues and eigenvectors are foundational concepts in linear algebra and numerical computing
Eigenvector: Every vector (a list of numbers) has a direction when it is plotted on an XY chart. The eigenvectors of a matrix are the non-zero vectors whose direction does not change when the linear transformation described by that matrix is applied to them; they are only stretched, shrunk or flipped. This property makes eigenvectors very valuable, as I will explain in this article.
Eigenvalue: The scalar by which an eigenvector is stretched or shrunk under that transformation.
Eigenvectors and eigenvalues are used to reduce noise in data. They can help us improve efficiency in computationally intensive tasks. Techniques built on them can also combine strongly correlated features into fewer dimensions, which can help reduce over-fitting.
- Firstly I will give a brief introduction of eigenvectors and eigenvalues.
- Then I will explain how eigenvectors and eigenvalues are calculated, starting from the foundations of matrix addition and multiplication so that we can all understand them thoroughly.
- I will then present a working example and we will calculate the eigenvectors and eigenvalues together
- Lastly I will outline how we can compute the eigenvectors and eigenvalues in Python.
It will build our confidence in dimension reduction techniques which are crucial to understand in data science
You are probably wondering why eigenvalues and eigenvectors matter.
Introducing Eigenvalues and Eigenvectors — Where are they used?
When we build forecasting models that are trained on images, sound and/or textual content, the input feature sets can end up having a large number of features. For example, we often use one-hot encoding to transform the values of textual features into separate numerical columns, which can take up a large amount of space on disk. It is also difficult to understand and visualize data with more than 3 dimensions.
Eigenvalues and Eigenvectors are the key tools to use in those scenarios
Eigenvalues and Eigenvectors have their importance in linear differential equations where you want to find a rate of change or when you want to maintain relationships between two variables.
Think of eigenvalues and eigenvectors as providing a summary of a large matrix
We can represent a large set of information in a matrix. A small number of eigenvalues and eigenvectors can often capture most of the key information that is stored in a large matrix. Performing computations on a large matrix can be very slow, so one of the key ways to improve efficiency in computationally intensive tasks is to reduce the number of dimensions while making sure that most of the key information is retained.
Principal component analysis (PCA) is one of the key strategies used to reduce the dimension space without losing much valuable information. The core of PCA is built on the concept of eigenvalues and eigenvectors.
Additionally, eigenvectors and eigenvalues are used in facial recognition techniques such as EigenFaces.
It is now apparent that eigenvalues and eigenvectors are among the core concepts to understand in data science. Hence this article is dedicated to them.
Practical use cases of Eigenvectors and Eigenvalues:
They are used to reduce the dimension space. If you want to forecast a financial variable, e.g. interest rates, then you will gather data for the variables that interest rates depend on. Then you will join the variables to create a matrix. You might also be loading textual information and converting it to vectors.
At times, this can increase your dimension space to 100+ columns.
Eigenvectors and eigenvalues can then be used to compress the data.
Many algorithms, such as PCA, rely on eigenvalues and eigenvectors to reduce the dimensions.
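As a rough illustration of how PCA leans on an eigendecomposition, the sketch below builds a small synthetic dataset, computes its covariance matrix and keeps the eigenvectors with the largest eigenvalues. The data, the choice of keeping two components and all variable names are assumptions made for this example; it is a sketch of the idea rather than a replacement for a library implementation.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 200 samples, 5 features,
# where 2 hidden factors drive most of the variation.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + 0.05 * rng.normal(size=(200, 5))

# Centre the data and compute the 5 x 5 covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort from largest to smallest eigenvalue and keep the top k directions.
order = np.argsort(eigenvalues)[::-1]
k = 2
components = eigenvectors[:, order[:k]]

# Project the 5-dimensional data onto the k principal directions.
X_reduced = X_centered @ components

# Share of the total variance that the kept directions account for.
explained = eigenvalues[order[:k]].sum() / eigenvalues.sum()

print(X_reduced.shape)
print(explained)
```

Each eigenvalue of the covariance matrix measures how much variance lies along its eigenvector, which is why the directions with the largest eigenvalues are the ones worth keeping.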
More use cases of Eigenvalues and Eigenvectors
Occasionally we gather data that contains a large amount of noise, and finding important or meaningful patterns within it can be extremely difficult. Spectral clustering, which is built on eigenvectors and eigenvalues, is one technique that can help in such cases.
We can also use eigenvectors to rank items in a dataset.
Lastly, in non-linear motion dynamics, eigenvalues and eigenvectors can help us understand the data better, as they can be used to transform and represent data in more manageable sets.
What are Eigenvalues and Eigenvectors?
Eigenvectors are used to make linear transformations understandable. Think of an eigenvector as a line on an X-Y chart that is only stretched or compressed by the transformation, without changing its direction.
Eigenvectors and eigenvalues revolve around the concept of matrices.
Matrices are used in machine learning problems to represent a large set of information. Eigenvalues and eigenvectors are about summarising a large matrix with a few vectors and their associated values. Sounds very useful, right?
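To make the "direction does not change" idea concrete, here is a small sketch with a made-up 2 by 2 matrix (it is not a matrix from this article). One vector keeps its direction when the matrix is applied to it, and the other does not.

```python
import numpy as np

# Hypothetical matrix chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigen_candidate = np.array([1.0, 1.0])  # lies along an eigenvector of A
other_vector = np.array([1.0, 0.0])     # not an eigenvector of A

print(A @ eigen_candidate)  # [3. 3.] -> same direction, stretched by 3
print(A @ other_vector)     # [2. 1.] -> the direction has changed
```

The stretch factor 3 is the eigenvalue that goes with the eigenvector direction (1, 1).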
Let’s quickly recap and refresh how matrix multiplication and addition works before we take a deep dive
Matrix addition is achieved by adding the corresponding elements of two matrices of the same shape.
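A quick NumPy sketch of matrix addition; the two small matrices are made up for illustration.

```python
import numpy as np

# Two made-up matrices of the same shape.
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

# Entry (i, j) of the result is A[i, j] + B[i, j].
total = A + B
print(total)
```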
Multiplying Scalar With A Matrix
Multiplying a matrix by a scalar is as straightforward as multiplying each element by the scalar.
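The same idea in NumPy, again with a made-up matrix and scalar:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
scalar = 3

# Every element of A is multiplied by the scalar.
scaled = scalar * A
print(scaled)
```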
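Matrix multiplication is achieved by multiplying and then summing the matching members of the rows of the first matrix and the columns of the second. For example, multiplying a 3 by 3 matrix by a 3 by 1 matrix gives a 3 by 1 result, where each entry is the sum of the products of one row of the first matrix with the column of the second.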
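A minimal sketch of a 3 by 3 times 3 by 1 product. The numbers are invented for the example, and the first entry is also computed by hand to show the "multiply then sum" rule.

```python
import numpy as np

# Made-up 3 x 3 matrix and 3 x 1 column vector.
M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
v = np.array([[1],
              [0],
              [2]])

# The @ operator performs matrix multiplication; the result is 3 x 1.
product = M @ v

# First entry by hand: first row of M times the column v, then summed.
first_entry = M[0, 0] * v[0, 0] + M[0, 1] * v[1, 0] + M[0, 2] * v[2, 0]

print(product)
print(first_entry)
```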
That’s all the Maths which we need to know for the moment
Let’s utilise the knowledge above to understand eigenvectors and eigenvalues:
Although we don’t have to calculate the eigenvalues and eigenvectors by hand every time, it is important to understand the inner workings to be able to confidently use the algorithms.
Key Concepts: Let’s go over the following bullet points before we calculate Eigenvalues and Eigenvectors
Eigenvalues and Eigenvectors have the following components:
- A matrix has a size of X rows and Y columns.
- A square matrix is one which has a size of n, meaning that X and Y are both equal to n.
- A square matrix is represented as A. For example, the Python section of this article uses the 2 by 2 matrix with rows (2, -1) and (4, 3).
- An eigenvector is an array with n entries, where n is the number of rows (or columns) of the square matrix. An eigenvector is represented as x. Key note: the direction of an eigenvector does not change when the linear transformation A is applied to it.
- An eigenvector must be a non-zero vector, because the zero vector satisfies the equation below for every matrix and tells us nothing.
- Now eigenvalues: we are required to find the values, known as eigenvalues, such that
A * x - Lambda * x = 0
Eigenvalues are represented as Lambda.
The above equation states that applying the linear transformation A to the eigenvector x gives the same result as multiplying x by the scalar Lambda (the eigenvalue).
Key note: for a non-zero x to exist, the matrix (A - Lambda * I), where I is the identity matrix described in the last bullet, must not be invertible.
- The determinant of a matrix is a number that is computed from a square matrix. For a 2 by 2 matrix it is basic arithmetic: the product of the two diagonal elements minus the product of the two off-diagonal elements. As (A - Lambda * I) must not be invertible, we need to ensure that its determinant is 0.
- Last component: the identity matrix. A square matrix which has 1 on the diagonal and 0 everywhere else is known as an identity matrix. The identity matrix is represented as I.
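The last two components are easy to try out in NumPy. In the sketch below the 2 by 2 matrix is a made-up example, and its determinant is computed both by hand and with `np.linalg.det`.

```python
import numpy as np

# 3 x 3 identity matrix: 1 on the diagonal, 0 everywhere else.
I = np.eye(3)
print(I)

# Made-up 2 x 2 matrix for the determinant.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Product of the diagonal minus product of the off-diagonal: 4*3 - 1*2.
det_by_hand = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
det_numpy = np.linalg.det(A)

print(det_by_hand, det_numpy)
```

The determinant here is not 0, so this particular matrix is invertible.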
We can rewrite
A * x - Lambda * x = 0
as
(A - Lambda * I) * x = 0
which has a non-zero solution x only when
Determinant(A - Lambda * I) = 0
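The determinant condition can be checked numerically. The sketch below uses a hypothetical 2 by 2 matrix whose eigenvalues happen to be 1 and 3; the helper `char_det` is a name introduced here for illustration only.

```python
import numpy as np

# Hypothetical matrix; its eigenvalues are 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
I = np.eye(2)


def char_det(lam):
    """Return Determinant(A - lam * I) for a candidate value lam."""
    return np.linalg.det(A - lam * I)


# The determinant vanishes (up to rounding) at an eigenvalue...
print(np.isclose(char_det(3.0), 0.0))
# ...and does not vanish at a value that is not an eigenvalue.
print(np.isclose(char_det(2.0), 0.0))

# For Lambda = 3, the vector (1, 1) solves (A - Lambda * I) * x = 0.
x = np.array([1.0, 1.0])
print((A - 3.0 * I) @ x)
```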
How do I calculate Eigenvalue?
For a square matrix A of size n, there are n eigenvalues to find (some may be repeated, and some may be complex numbers).
The aim is to find: Eigenvector and Eigenvalues of A such that:
A * Eigenvector - Eigenvalue * Eigenvector = 0
Find Lambda such that Determinant(A - Lambda * I) = 0
Based on the concepts learnt above:
1. Lambda * I is:
If A is:
2. Then A - Lambda * I is:
3. Finally calculate the determinant of (A-Lambda*I) as:
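For reference, the three steps can be written out for a generic 2 by 2 matrix. The entries a, b, c and d are placeholder symbols assumed here for illustration; they are not taken from the article's own example.

```latex
\lambda I = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix},
\qquad
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}

A - \lambda I = \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix}

\det(A - \lambda I) = (a - \lambda)(d - \lambda) - bc
                    = \lambda^2 - (a + d)\lambda + (ad - bc) = 0
```

For a 2 by 2 matrix this is a quadratic equation in Lambda, so it has two solutions (possibly repeated or complex).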
Once we solve the equation above, we will get the values of Lambda. These values are the Eigenvalues.
I will present a working example below to ensure we understand the concepts thoroughly.
How do I calculate Eigenvector?
Once we have calculated eigenvalues, we can calculate the Eigenvectors of the matrix A by using Gaussian Elimination. Gaussian elimination is about converting the matrix to row echelon form. Finally it is about solving the linear system by back substitution.
Detailed explanation of gaussian elimination is out of scope of this article so that we can concentrate on Eigenvalues and Eigenvectors.
Once we have the eigenvalues, we can find an eigenvector for each of them. We substitute the eigenvalue for Lambda in the following equation and solve for x to obtain an eigenvector:
(A - Lambda * I) * x = 0
Therefore, if a square matrix has a size n then we will get n eigenvalues (counting repeats) and, when the eigenvalues are all distinct, n corresponding eigenvectors, one per eigenvalue.
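Here is a minimal numerical sketch of this step. Note that it swaps Gaussian elimination for a singular value decomposition (SVD) to find the non-zero solution of (A - Lambda * I) * x = 0, simply because that is short to write in NumPy. The matrix and the eigenvalue 5 are made up for the example.

```python
import numpy as np

# Hypothetical matrix; its eigenvalues are 2 and 5.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam = 5.0  # the eigenvalue we want an eigenvector for

# Eigenvectors for lam are the non-zero solutions of (A - lam * I) x = 0.
shifted = A - lam * np.eye(2)

# SVD returns singular values in descending order. The last row of vh is the
# right singular vector for the smallest singular value, which is (near) zero
# here, so that row spans the solutions of shifted @ x = 0.
_, singular_values, vh = np.linalg.svd(shifted)
x = vh[-1]

print(singular_values[-1])          # close to zero
print(x)                            # a unit-length eigenvector
print(np.allclose(A @ x, lam * x))  # A x equals lam x
```

Any non-zero multiple of `x` is also an eigenvector for the same eigenvalue, so the sign and scale of the result carry no meaning.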
Now that we have the key concepts, it is time to compute the eigenvalues and eigenvectors together
Let’s calculate Eigenvalue and Eigenvector together
If there are any doubts then do inform me.
Let’s find the eigenvalues of the following matrix:
First multiply lambda by an identity matrix and then subtract the result from the original matrix
We will then be left with a matrix whose determinant we need to compute:
Find the determinant of the following matrix:
Once we solve the quadratic equation above, we will be left with two Eigenvalues:
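The same recipe can be followed in code. The sketch below uses a hypothetical 2 by 2 matrix (not necessarily the one in the worked example) and solves the quadratic equation Lambda^2 - trace(A) * Lambda + Determinant(A) = 0 with the quadratic formula.

```python
import numpy as np

# Hypothetical matrix chosen for illustration.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# For a 2 x 2 matrix, Determinant(A - lam * I) expands to
# lam^2 - trace(A) * lam + Determinant(A).
trace = A[0, 0] + A[1, 1]
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]

# Quadratic formula. Casting to complex keeps the square root valid
# when the discriminant is negative (complex eigenvalues).
root = np.sqrt(complex(trace**2 - 4 * det))
lam1 = (trace + root) / 2
lam2 = (trace - root) / 2

print(lam1, lam2)
```

For this matrix the two solutions are real, 5 and 2, and they agree with what `np.linalg.eigvals(A)` returns.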
Now that we have computed Eigenvalues, let’s calculate Eigenvectors together
Take the first eigenvalue (Lambda) and substitute it into the following equation:
(A - Lambda * I) * x = 0
For the first eigenvalue, we will get the following eigenvector:
This eigenvector captures one direction of matrix A: along it, A acts as a simple scaling by the first eigenvalue.
I want you to find the second Eigenvector.
Message me your values and workings.
Calculate Eigenvalues and Eigenvectors In Python
Although we don’t have to calculate the eigenvalues and eigenvectors by hand, it is important to understand the inner workings to be able to confidently use the algorithms. Furthermore, it is very straightforward to calculate eigenvalues and eigenvectors in Python.
We can use the numpy.linalg.eig function. It takes a square matrix as the input and returns the eigenvalues and eigenvectors. It raises a LinAlgError if the eigenvalue computation does not converge.
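import numpy as np
from numpy import linalg as LA
A = np.array([[2, -1], [4, 3]])  # square input matrix
# w holds the eigenvalues; column v[:, i] is the eigenvector paired with w[i]
w, v = LA.eig(A)
print(w, v)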
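It helps to know how to read the two return values. The sketch below reuses the matrix from the snippet above and checks that each column of `v` satisfies A * x = Lambda * x for the matching entry of `w`. For this particular matrix the characteristic equation is Lambda^2 - 5 * Lambda + 10 = 0, which has a negative discriminant, so NumPy returns a pair of complex eigenvalues.

```python
import numpy as np

A = np.array([[2, -1],
              [4, 3]])
w, v = np.linalg.eig(A)

# The eigenvalues of this matrix are a complex-conjugate pair.
print(w)

# Column v[:, i] is the eigenvector that belongs to eigenvalue w[i].
for i in range(len(w)):
    print(np.allclose(A @ v[:, i], w[i] * v[:, i]))

# NumPy normalises each eigenvector (column) to unit length.
print(np.linalg.norm(v, axis=0))
```

Reading the eigenvectors by row instead of by column is a common mistake, so a check like the loop above is a cheap safeguard.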
This article explained one of the key areas of machine learning. It started by giving a brief introduction of eigenvectors and eigenvalues.
Then it explained how eigenvectors and eigenvalues are calculated from the foundations of matrix multiplication and addition so that we can understand the key components thoroughly.
It then presented a working example. Lastly it outlined how we can compute the eigenvectors and eigenvalues in Python.
Hope it helps. Please let me know if you have any questions or feedback.