Vectors | Geometrical Interpretation | Inner Products | Matrices | Matrix Multiplication | Hyperplanes | Eigenvectors and Eigenvalues
This material provides the bare definitions, and some simple examples. As you are using these notes to revise your linear algebra, you will hopefully recall that there are important geometric interpretations for these concepts, as well.
A vector over the real numbers may be viewed as a sequence of real numbers. The dimension of the vector is the number of real numbers in the sequence. So (3.2, 4.7, –2.4, 0.0) is a vector of dimension 4. The 4 numbers in the sequence are referred to as the components of the vector.
An ordered number-sequence like this is also referred to as a tuple. If the dimension (number of components) of the sequence is n, then the tuple is an n-tuple. 2-tuples are usually called ordered pairs, and 3-tuples are called ordered triples.
Vectors can be added together by adding the corresponding components. Thus (2, 7, 4, 0) + (1, 3, 5, 9) = (2+1, 7+3, 4+5, 0+9) = (3, 10, 9, 9). A vector can also be multiplied by a real number (sometimes called a scalar) by multiplying each component of the vector by the real number. So 3×(2, 7, 4, 0) = (3×2, 3×7, 3×4, 3×0) = (6, 21, 12, 0).
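As a minimal sketch of these two operations, vectors can be modelled as Python tuples. The function names add and scale are illustrative choices made here, not part of these notes.

```python
def add(u, v):
    """Add two vectors by adding corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return tuple(a + b for a, b in zip(u, v))


def scale(k, v):
    """Multiply each component of the vector v by the scalar k."""
    return tuple(k * a for a in v)


print(add((2, 7, 4, 0), (1, 3, 5, 9)))  # (3, 10, 9, 9)
print(scale(3, (2, 7, 4, 0)))           # (6, 21, 12, 0)
```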
There is no particular reason why the components have to be real numbers. In some applications (but probably not in the neural networks course for which these revision notes were written) the components might be complex numbers, or members of a finite field such as the whole numbers with arithmetic done modulo p, where p is a prime number.
It is often useful to work with vectors whose components are symbols representing numbers (pro-numerals). Thus (x1, x2, x3, x4) is a vector of symbols. Sometimes it is convenient to use a symbol to refer to a whole vector - a pro-vector, if you like. Often bold face symbols are used for this purpose. So we might write x = (x1, x2, x3, x4), or y = (y1, y2, y3, y4).
With vectors that have 2 or 3 components, it is fairly easy to see how to interpret the vectors as points in space. For example, the ordered pair (–1, 1) is interpreted as a point in 2-dimensional space (xy-space) with x– and y–coordinates –1 and 1, respectively, as shown in the diagram below. The vector is thought of as the line joining the origin (the point (0, 0)) to the point (–1, 1). Similarly, 3-tuples can be drawn as points in 3-dimensional space (xyz-space).
All of this works in dimensions ≥ 4, but obviously is harder to visualise.
The inner product (sometimes also called the dot product) of two vectors with the same number of components is obtained by multiplying together the corresponding components and adding up the products. The inner product of vectors x and y is written x • y, hence the alternative name dot-product.
So (1, 2, 3, 4) • (5, 6, 7, 8) = 1×5 + 2×6 + 3×7 + 4×8
= 5 + 12 + 21 + 32 = 70.
As a symbolic example, x • y = (x1, x2, x3, x4) • (y1, y2, y3, y4) = x1y1 + x2y2 + x3y3 + x4y4.
Two vectors are said to be orthogonal if their inner product is 0. For example
(1, 1, 1, 1) • (1, –1, 1, –1) = 1 + –1 + 1 + –1 = 0.
Vectors in two and three dimensions are orthogonal, in the sense just defined, if they are at right angles to each other.
For example, in the figure below, the two vectors shown geometrically, (1, 1) and (–1, 1), are at right angles, and have an inner product of 0.
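The inner product and the orthogonality test can be sketched in a few lines of Python. The names dot and is_orthogonal are assumptions made for illustration; the exact comparison with 0 suits integer components, and floating-point components would normally need a small tolerance instead.

```python
def dot(u, v):
    """Inner product: multiply corresponding components and add up."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(u, v):
    """Two vectors are orthogonal if their inner product is 0."""
    return dot(u, v) == 0


print(dot((1, 2, 3, 4), (5, 6, 7, 8)))               # 70
print(is_orthogonal((1, 1, 1, 1), (1, -1, 1, -1)))   # True
print(is_orthogonal((1, 1), (-1, 1)))                # True
```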
The length of a vector x is defined to be the square root of its inner product with itself: √(x • x). The length of x is sometimes written as ||x||. As an example, ||(1, –1, 1, –1)|| = √(1×1 + (–1×–1) + 1×1 + (–1×–1)) = √4 = 2.
In the diagram above, the length of (1, 1) is √(1² + 1²) = √2. This is the same as the length of the line from (0, 0) to (1, 1), calculated using Pythagoras's theorem.
A vector is said to be normal if it has length 1. So (0.5, –0.5, 0.5, –0.5) is normal. If x is any vector, then x/||x|| is normal, and the process of converting x into x/||x|| is called normalisation.
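Length and normalisation follow directly from the definitions. The sketch below uses illustrative names (length, normalise) and assumes the zero vector should be rejected, since it has no direction to preserve.

```python
import math


def length(v):
    """Length of v: the square root of its inner product with itself."""
    return math.sqrt(sum(a * a for a in v))


def normalise(v):
    """Return v / ||v||, a vector of length 1 in the same direction."""
    n = length(v)
    if n == 0:
        raise ValueError("the zero vector cannot be normalised")
    return tuple(a / n for a in v)


print(length((1, -1, 1, -1)))     # 2.0
print(normalise((1, -1, 1, -1)))  # (0.5, -0.5, 0.5, -0.5)
```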
A matrix can be thought of as a vector of vectors. Alternatively, you can think of a matrix as a rectangular array of numbers. Again, they don't have to be numbers, but for our purposes they will be. Here is an example of a matrix:
[ 1 2 3 4 ]
[ 5 4 3 2 ]
[ 0 1 0 –1 ]
Matrices with the same number of rows and columns can be added by adding corresponding entries:
[ 1 2 3 4 ]     [ 7 4 3 0 ]     [ 8 6 6 4 ]
[ 5 4 3 2 ]  +  [ 1 1 1 1 ]  =  [ 6 5 4 3 ]
[ 0 1 0 –1 ]    [ 1 0 –1 0 ]    [ 1 1 –1 –1 ]
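Adding two matrices entry by entry can be sketched as follows, treating a matrix as a list of rows (the "vector of vectors" view). The name mat_add is an illustrative assumption.

```python
def mat_add(A, B):
    """Add two matrices of the same shape, entry by entry."""
    same_shape = len(A) == len(B) and all(
        len(ra) == len(rb) for ra, rb in zip(A, B)
    )
    if not same_shape:
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 3, 4], [5, 4, 3, 2], [0, 1, 0, -1]]
B = [[7, 4, 3, 0], [1, 1, 1, 1], [1, 0, -1, 0]]
print(mat_add(A, B))  # [[8, 6, 6, 4], [6, 5, 4, 3], [1, 1, -1, -1]]
```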
When matrices are written in symbolic form, they look like this:
[ a11 a12 a13 a14 ]
[ a21 a22 a23 a24 ]
[ a31 a32 a33 a34 ]
Each entry, aij, has two subscripts - the first one, i in aij, indicates which row the entry belongs to, and the second one, j in aij, indicates which column the entry belongs to. For example, a32 belongs to the third row and second column of the matrix.
Matrices as a whole are often symbolised using a capital letter, like A. So aij is the prototypical entry in the matrix A. If A is an m×n matrix, then we say that m is the row dimension of the matrix (or just the number of rows) and n is the column dimension (or number of columns).
The transpose Aᵀ of a matrix A is obtained by interchanging its rows and columns. So the transpose of the 3×4 matrix
[ 1 2 3 4 ]
[ 5 4 3 2 ]
[ 0 1 0 –1 ]
is the 4×3 matrix
[ 1 5 0 ]
[ 2 4 1 ]
[ 3 3 0 ]
[ 4 2 –1 ]
The transpose of an m×n matrix is an n×m matrix.
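With a matrix stored as a list of rows, the transpose can be sketched with the built-in zip, which regroups the entries by column. The name transpose is an illustrative choice.

```python
def transpose(A):
    """Interchange rows and columns: an m×n matrix becomes n×m."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 3, 4], [5, 4, 3, 2], [0, 1, 0, -1]]
print(transpose(A))  # [[1, 5, 0], [2, 4, 1], [3, 3, 0], [4, 2, -1]]
```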
Notice that a vector with n components can be viewed as a 1×n matrix. The vectors we used above were row vectors. If you view them as matrices and transpose them, you get column vectors. Thus (1, 2, 3, 4) is a row vector, whose transpose is the column vector
1
2
3
4
It is sometimes possible to multiply two matrices together. This is done, not by combining corresponding entries as in matrix addition, but by computing an inner product for each entry of the product matrix: the entry in row i and column j of the product is the inner product of row i of the first matrix with column j of the second. In general, the product of an m×n matrix with an n×p matrix is an m×p matrix. Note that for the product of two matrices to exist, the column dimension of the first matrix (n) must equal the row dimension of the second matrix. Example (with first matrix 2×3 and second matrix 3×2, so that the product matrix is 2×2):
[ 1 2 3 ]     [ 3 0 ]
[ 5 4 3 ]  ×  [ 1 1 ]
              [ 1 0 ]

= [ (1, 2, 3) • (3, 1, 1)   (1, 2, 3) • (0, 1, 0) ]
  [ (5, 4, 3) • (3, 1, 1)   (5, 4, 3) • (0, 1, 0) ]

= [ 8 2 ]
  [ 22 4 ]
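The row-by-column rule can be sketched in Python as below. This is a plain illustration of the definition, with mat_mul as an assumed name, rather than an efficient implementation.

```python
def mat_mul(A, B):
    """Product of an m×n matrix A and an n×p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    columns = list(zip(*B))  # the columns of B, each as a tuple
    # Entry (i, j) is the inner product of row i of A with column j of B.
    return [
        [sum(a * b for a, b in zip(row, col)) for col in columns]
        for row in A
    ]


print(mat_mul([[1, 2, 3], [5, 4, 3]], [[3, 0], [1, 1], [1, 0]]))
# [[8, 2], [22, 4]]
```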
Notice that it is sometimes possible to multiply a matrix by a vector (viewed as a matrix). If the matrix is m×n and the vector has n components, then you can view the vector as a column vector (n×1 matrix) and multiply them, to get an m×1 result - a column vector of dimension m:
[ 1 2 3 ]     [ 1 ]
[ 3 4 1 ]  ×  [ 2 ]
[ 0 1 2 ]     [ 1 ]

= [ (1, 2, 3) • (1, 2, 1) ]
  [ (3, 4, 1) • (1, 2, 1) ]
  [ (0, 1, 2) • (1, 2, 1) ]

= [ 8 ]
  [ 12 ]
  [ 4 ]
You could call this post-multiplication by the vector, as the vector appears after (post) the matrix. Pre-multiplication by a row vector is also sometimes possible - an m-dimensional row vector (i.e. a 1×m matrix) can be multiplied by an m×n matrix to give a 1×n result - another row vector, but of dimension n.
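Both kinds of matrix-vector product reduce to inner products, as this sketch shows. The names mat_vec and vec_mat are illustrative assumptions, and the 2×3 matrix used for pre-multiplication is just a convenient example.

```python
def mat_vec(A, x):
    """Post-multiply the m×n matrix A by the column vector x (n components)."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def vec_mat(y, A):
    """Pre-multiply the m×n matrix A by the row vector y (m components)."""
    if len(y) != len(A):
        raise ValueError("y must have one component per row of A")
    return [sum(a * b for a, b in zip(y, col)) for col in zip(*A)]


# First entry: 1×1 + 2×2 + 3×1 = 8
print(mat_vec([[1, 2, 3], [3, 4, 1], [0, 1, 2]], [1, 2, 1]))  # [8, 12, 4]
# A 2-component row vector times a 2×3 matrix gives 3 components.
print(vec_mat([1, 2], [[1, 2, 3], [5, 4, 3]]))                # [11, 10, 9]
```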
A hyperplane is a higher-dimensional analogue of a plane in 3-dimensional space. The general equation of a plane in 3-space is
ax + by + cz = d
where x, y, and z are variables and a, b, c and d are constants. That is, the plane is the set of all points (x, y, z) such that ax + by + cz = d holds.
Thinking in a vector and inner-product sort of way, you can rewrite the equation of a plane in 3-space as
(a, b, c) • (x, y, z) = d
If we write a for (a, b, c) and x for (x, y, z), the equation becomes
a • x = d.
Note that any of a, b, and c can be zero, provided they are not all zero, and d can be any constant, including zero. So examples of equations of planes are:
a) 2x + 3y + 4z = 5
b) x + y = 3
c) y + z = 7
d) x + 3z = 4
e) x = 2
f) y = 1
g) z = 43.2
So, a plane is a hyperplane in 3-space. Similarly, a line is a hyperplane in 2-space, and it has a general equation of the form a • x = d, where this time, a = (a, b), and x = (x, y). Notice that equations (b)-(g), above, are equations of planes in 3-space, but each can also be read as the equation of a line in a 2-space whose coordinates include the variables it mentions (xy-space for (b), (e) and (f), for instance). How can this be? Consider (b): x + y = 3. In 2-space, this is the set of all points (x, y) such that the equation holds. In 3-space, it is the set of all points (x, y, z) such that the equation holds - i.e. x + y = 3, and z can have any value at all. Similarly, with (e), in 2-space, this is the line consisting of all points (x, y) such that x = 2, and y can have any value at all, and in 3-space, it is the plane consisting of all points (x, y, z) such that x = 2, and y and z can have any value at all.
In the case of a line in 2-space, the line is one-dimensional, and 2-space is of course 2-dimensional. For a plane in 3-space, the plane is 2-dimensional and 3-space is 3-dimensional. In general, in an n-dimensional space (i.e. a space of vectors with n components), a hyperplane is an (n–1)-dimensional object. As before, its general equation is of the form:
a • x = d
but this time a and x are n-dimensional vectors. So a = (a1, a2, ..., an) and x = (x1, x2, ..., xn).
In the case of a line in 2-space (ax + by = d), it is easy to check that, for a point (x, y) not on the line, the side of the line on which the point lies is determined by whether ax + by – d is positive or negative. For example, with the line x + y = 1, the point (x, y) = (0, 0) is to the left of the line, and x + y – 1 = –1 is negative, while the point (x, y) = (1, 1) is to the right of the line and x + y – 1 = +1 is positive.
You can check this for planes in 3-space, too, with a little more difficulty. It holds in general: for a point x in n-space, the point lies on one "side" of the hyperplane if a • x – d is negative, and on the other side if a • x – d is positive.
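The sign test works the same way in any dimension, which a short sketch makes concrete. The function name side and its return convention (-1, 0 or +1) are assumptions made here for illustration; a result of 0 means the point lies on the hyperplane itself.

```python
def side(a, d, x):
    """Sign of a • x – d for the hyperplane a • x = d and the point x."""
    if len(a) != len(x):
        raise ValueError("a and x must have the same dimension")
    s = sum(ai * xi for ai, xi in zip(a, x)) - d
    return (s > 0) - (s < 0)  # -1, 0 or +1


# The line x + y = 1 in 2-space
print(side((1, 1), 1, (0, 0)))  # -1
print(side((1, 1), 1, (1, 1)))  # 1
# The plane 2x + 3y + 4z = 5 in 3-space
print(side((2, 3, 4), 5, (1, 1, 1)))  # 1
```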
When you pre-multiply a column vector x by a matrix A:
x → Ax,
you transform the vector to another vector. Let us consider a particular matrix A. For some vectors x, Ax will be a multiple of x. For example, if x = (0, 1, 0)ᵀ, and A =
[ 2 0 0 ]
[ 0 7 0 ]
[ 0 0 3 ]
then it is easy to check that Ax = 7x.
In such a case we say that x is an eigenvector of A, and the multiple, in this case 7, is an eigenvalue of A. Thus if Ax = λx for a non-zero vector x, then x is an eigenvector of A, and λ is the corresponding eigenvalue.
A matrix can have several eigenvalues (and hence several eigenvectors). For example, our example matrix A has three eigenvalues: 2, 7, and 3.
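The defining condition Ax = λx can be checked directly for the example matrix. This sketch only verifies a proposed eigenvector and eigenvalue, it does not compute them; the names mat_vec and is_eigenpair are illustrative, and the exact equality test assumes integer entries (floating-point entries would need a tolerance).

```python
def mat_vec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def is_eigenpair(A, x, lam):
    """True if x is non-zero and Ax equals lam times x."""
    if all(c == 0 for c in x):
        return False  # the zero vector is not counted as an eigenvector
    return mat_vec(A, x) == [lam * c for c in x]


A = [[2, 0, 0], [0, 7, 0], [0, 0, 3]]
print(is_eigenpair(A, [0, 1, 0], 7))  # True
print(is_eigenpair(A, [1, 0, 0], 2))  # True
print(is_eigenpair(A, [0, 0, 1], 3))  # True
print(is_eigenpair(A, [1, 1, 0], 7))  # False
```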
Bill Wilson's contact info