Chapter 1: Solving Linear Systems and Matrix Operations
Overview
This chapter is the backbone of everything that follows in linear algebra. At its heart is one deceptively simple question: how do you find all values of unknowns that simultaneously satisfy a collection of linear equations? We start from scratch — building up from two-variable systems — and by the end of the chapter we have powerful, systematic tools: Gaussian elimination, matrix arithmetic, matrix inverses, elementary matrices, and block partitioning. Every technique here reappears throughout the course.
1.1 Systems of Linear Equations
What Is a Linear System?
A linear equation in $n$ unknowns has the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$, where the coefficients $a_1, \ldots, a_n$ and the right-hand side $b$ are real numbers.
A linear system is a collection of such equations in the same unknowns.
Solution: An ordered $n$-tuple $(x_1, x_2, \ldots, x_n)$ of real numbers that satisfies every equation in the system simultaneously.
Solution Set: The set of all solutions of a linear system.
Consistent System: A linear system that has at least one solution (its solution set is non-empty).
Inconsistent System: A linear system that has no solution (its solution set is empty).
How Many Solutions Can a System Have?
Geometrically, each linear equation in two unknowns represents a line in the plane. A $2 \times 2$ system therefore corresponds to a pair of lines, and exactly one of three things can happen:
- The lines intersect at exactly one point → unique solution
- The lines are parallel → no solution (inconsistent)
- The lines are identical → infinitely many solutions
This geometric intuition extends to larger systems. In general, an $m \times n$ linear system has either no solution, exactly one solution, or infinitely many solutions — never any other number.
Equivalent Systems and Elementary Operations
Equivalent Systems: Two systems involving the same variables are equivalent if they have exactly the same solution set.
To solve a system, we transform it into an equivalent but simpler one. Three operations on a system always produce an equivalent system:
- Interchange the order of any two equations.
- Scale any equation by multiplying through by a non-zero real number.
- Add a multiple of one equation to another equation.
These three operations are the foundation of every systematic method in this chapter.
Strict Triangular Form and Back Substitution
Strict Triangular Form: A system of $n$ equations in $n$ unknowns where, in the $k$-th equation, the coefficients of the first $k-1$ unknowns are all zero and the coefficient of $x_k$ is non-zero.
A strict triangular system is easy to solve by back substitution: solve the last equation for $x_n$, substitute that value into the next-to-last equation to find $x_{n-1}$, and continue upward until every unknown is determined.
Example intuition: Think of a strict triangular system as a staircase — each step reveals one new variable. You start at the bottom step and climb upward, finding each variable in sequence.
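The back-substitution procedure described above can be sketched in a few lines of Python (a minimal illustration of my own; the function name and example system are not from the text):

```python
def back_substitute(U, b):
    """Solve Ux = b for a strictly triangular system: U is upper
    triangular with non-zero diagonal entries, given as a list of rows."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # start with the last equation
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]         # U[i][i] != 0 by assumption
    return x

# 2x1 + x2 = 5, 3x2 = 6  ->  x2 = 2, x1 = 1.5
print(back_substitute([[2, 1], [0, 3]], [5, 6]))  # [1.5, 2.0]
```

Each pass of the loop climbs one step of the staircase, using the variables already found.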
The Augmented Matrix
Rather than carrying the full symbolic notation of a system, we strip away the variable names and work with numbers only. Given a system $A\mathbf{x} = \mathbf{b}$, its coefficient matrix $A$ holds just the coefficients of the unknowns. Attaching the right-hand side constants as an extra column produces the augmented matrix $(A \mid \mathbf{b})$.
Coefficient Matrix: The rectangular array of coefficients of the unknowns in a linear system.
Augmented Matrix: The coefficient matrix with the right-hand side constants appended as an additional column.
The three elementary operations on equations correspond directly to elementary row operations on the augmented matrix:
- Row interchange: swap any two rows.
- Row scaling: multiply a row by any non-zero scalar.
- Row replacement: replace a row by itself plus a scalar multiple of another row.
The beauty of this is that we work with numbers alone — the variables are placeholders reintroduced only at the end.
📝 Section Recap: A linear system is consistent (at least one solution), inconsistent (no solution), or has infinitely many solutions. We never get exactly two solutions. The three elementary row operations on an augmented matrix always yield an equivalent system, and a strict triangular form can be solved by back substitution.
1.2 Row Echelon Form
When Strict Triangular Form Fails
The reduction to strict triangular form can break down: if at some step all the candidate pivot entries in a column are zero, we cannot continue as before. The right response is not to give up — it is to move to the next column and keep going. This leads to a more general shape called echelon form.
Row Echelon Form: A matrix is in row echelon form if (i) the first non-zero entry in each non-zero row is 1 (called the leading 1), (ii) in each successive non-zero row the number of leading zeros strictly increases (the leading 1's form a staircase stepping down and to the right), and (iii) all zero rows appear at the bottom.
Gaussian Elimination: The process of applying elementary row operations to transform a linear system's augmented matrix into row echelon form.
Lead Variables and Free Variables
After reaching row echelon form:
Lead Variables: The unknowns corresponding to columns that contain a leading 1. There is one lead variable per non-zero row.
Free Variables: The remaining unknowns — those corresponding to columns skipped during elimination. Free variables can be assigned any real number, and each such assignment determines a unique set of values for the lead variables.
If a system is consistent and has free variables, it has infinitely many solutions. A consistent underdetermined system (fewer equations than unknowns) always has free variables and therefore always has infinitely many solutions.
Detecting Inconsistency
If any row of the row echelon form has the pattern $(0 \ \ 0 \ \cdots \ 0 \mid 1)$ — all zeros in the coefficient part with a non-zero right-hand side — the system is inconsistent: that row says $0 = 1$, which no choice of the unknowns can satisfy.
Reduced Row Echelon Form and Gauss–Jordan Reduction
Once we have row echelon form, we can continue eliminating entries above each leading 1 as well as below. This fully simplified form is called reduced row echelon form (RREF).
Reduced Row Echelon Form: A matrix in row echelon form where the leading 1 in each row is the only non-zero entry in its column.
Gauss–Jordan Reduction: The process of applying elementary row operations to transform a matrix all the way to reduced row echelon form.
RREF is particularly useful when there are free variables — it directly reads off the values of every lead variable in terms of the free variables, without any need for back substitution.
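Gauss–Jordan reduction can be sketched directly from the three row operations (a minimal implementation of my own, using exact rational arithmetic via the standard-library `fractions` module to avoid rounding; the function name and example are assumptions):

```python
from fractions import Fraction

def rref(M):
    """Reduce a matrix (list of rows) to reduced row echelon form."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # find a non-zero pivot in this column; if none, move right
        pr = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pr is None:
            continue
        A[pivot_row], A[pr] = A[pr], A[pivot_row]       # row interchange
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]    # scale to a leading 1
        for r in range(rows):                           # clear the column
            if r != pivot_row and A[r][col] != 0:
                m = A[r][col]
                A[r] = [a - m * b for a, b in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

# x + y = 3, 2x + 4y = 8  ->  x = 2, y = 1
print(rref([[1, 1, 3], [2, 4, 8]]))
```

The result has a leading 1 in each pivot column, so the lead variables can be read off directly — no back substitution needed.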
Overdetermined and Underdetermined Systems
Overdetermined System: More equations than unknowns ($m > n$). Usually (but not always) inconsistent — the extra equations create conflicting constraints.
Underdetermined System: Fewer equations than unknowns ($m < n$). Cannot have a unique solution. If consistent, it always has infinitely many solutions, because there must be at least $n - m$ free variables.
Homogeneous Systems
Homogeneous System: A linear system $A\mathbf{x} = \mathbf{0}$ where all right-hand sides are zero. Always consistent — the trivial solution $\mathbf{x} = \mathbf{0}$ always works.
Theorem: An $m \times n$ homogeneous system with $m < n$ (more unknowns than equations) always has a non-trivial solution.
Real-World Applications
This section illustrates how linear systems arise naturally:
- Traffic flow: At each intersection, cars in = cars out, giving one equation per intersection. Free variables arise because flow on one road determines all others.
- Electrical networks (Kirchhoff's Laws): At every node, incoming current = outgoing current (Law 1). Around every closed loop, voltage gains = voltage drops (Law 2). Together these form a linear system for the unknown currents.
- Chemical equation balancing: Counting atoms of each element on both sides gives a homogeneous system; non-trivial solutions give the correct coefficients.
- Leontief Input-Output Model: In a simple economy with several sectors, the prices at which each sector should trade can be found as the solution to a homogeneous system. Non-trivial solutions give the fair relative prices.
📝 Section Recap: When strict triangular reduction stalls, the correct response is row echelon form. Gaussian elimination reaches row echelon form; Gauss–Jordan reduction goes further to reduced row echelon form. Lead variables are determined by the pivot columns; free variables can take any value. Homogeneous systems always have at least the trivial solution and, whenever there are more unknowns than equations, they have non-trivial solutions too.
1.3 Matrix Arithmetic
Why Matrices?
Individual systems of equations are one thing. But in science and engineering you often solve many related systems, or need to represent data, transformations, and relationships compactly. Matrices are the tool for all of this. A matrix is simply a rectangular array of numbers arranged in rows and columns.
Matrix: An $m \times n$ matrix $A$ is a rectangular array with $m$ rows and $n$ columns. The entry in row $i$ and column $j$ is denoted $a_{ij}$, called the $(i, j)$ entry of $A$. We often write $A = (a_{ij})$.
Square Matrix: A matrix with the same number of rows and columns ($m = n$).
Scalar: An individual real (or complex) number — the entries of a matrix.
Vectors
Vector: An $n$-tuple of real numbers. Represented as an $n \times 1$ matrix (column vector) or a $1 \times n$ matrix (row vector). The set of all column vectors of length $n$ is called Euclidean $n$-space, denoted $\mathbb{R}^n$.
Row vectors are written with a horizontal arrow to distinguish them from column vectors. In most of linear algebra, "vector" means column vector.
Matrix Equality, Scalar Multiplication, and Addition
Matrix Equality: Two $m \times n$ matrices $A$ and $B$ are equal if $a_{ij} = b_{ij}$ for every $i$ and $j$.
- Scalar multiplication: $\alpha A$ multiplies every entry of $A$ by the scalar $\alpha$: $(\alpha A)_{ij} = \alpha a_{ij}$.
- Matrix addition: $(A + B)_{ij} = a_{ij} + b_{ij}$. Only defined when $A$ and $B$ have the same dimensions.
- Zero matrix $O$: The matrix of all zeros; serves as the additive identity.
- Additive inverse: $-A = (-1)A$; satisfies $A + (-A) = O$.
Matrix–Vector Multiplication and Linear Systems
The key insight: the system
$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i, \qquad i = 1, \ldots, m,$$
can be written as the single matrix equation $A\mathbf{x} = \mathbf{b}$. Equivalently,
$$x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{b},$$
where $\mathbf{a}_j$ denotes the $j$-th column of $A$.
Linear Combination: A sum $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$, where the $c_i$ are scalars and the $\mathbf{v}_i$ are vectors.
Consistency Theorem: The system $A\mathbf{x} = \mathbf{b}$ is consistent if and only if $\mathbf{b}$ can be written as a linear combination of the columns of $A$.
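The column-combination view of $A\mathbf{x}$ is easy to verify numerically (a small sketch of my own, using NumPy; the specific matrix and vector are arbitrary choices):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([10.0, -1.0])

# A @ x and the column combination x1*a1 + x2*a2 are the same vector
col_combo = x[0] * A[:, 0] + x[1] * A[:, 1]
print(A @ x)          # [ 8. 26. 44.]
print(col_combo)      # identical
```

So $A\mathbf{x} = \mathbf{b}$ asks precisely whether $\mathbf{b}$ can be reached by weighting the columns of $A$.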
Matrix Multiplication
When we multiply two matrices $A$ ($m \times n$) and $B$ ($n \times r$), the product $C = AB$ is the $m \times r$ matrix whose entries are
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$
— the $(i, j)$ entry is the $i$-th row of $A$ times the $j$-th column of $B$.
Watch out: the number of columns of $A$ must match the number of rows of $B$, or the product is undefined. Even when both $AB$ and $BA$ are defined, they need not be equal.
The Transpose
Transpose of $A$: The matrix $A^T$ obtained by swapping rows and columns — the $(j, i)$ entry of $A^T$ equals the $(i, j)$ entry of $A$. If $A$ is $m \times n$, then $A^T$ is $n \times m$.
Symmetric Matrix: A square matrix $A$ with $A^T = A$. All entries are mirrored across the main diagonal.
Key rule for transposes of products: $(AB)^T = B^T A^T$ — the order reverses.
Applications of Matrix Multiplication
- Production costs: Multiplying a cost-per-item matrix by a quantities-produced matrix gives total costs by category and quarter in one operation.
- Analytic Hierarchy Process (AHP): Decision-making with multiple criteria. Each candidate gets weights per criterion, assembled into a matrix. Multiplying by a weight vector for the criteria gives overall rankings.
- Information retrieval / database search: Documents are columns of a matrix; keywords are rows. Multiplying the transposed database matrix by a search vector counts keyword matches per document, enabling ranking.
📝 Section Recap: Matrix arithmetic generalises scalar arithmetic, but with one crucial difference: multiplication is not commutative. A linear system $A\mathbf{x} = \mathbf{b}$ is consistent precisely when $\mathbf{b}$ is a linear combination of the columns of $A$. The transpose swaps rows and columns, and $(AB)^T = B^T A^T$ — order reverses.
1.4 Matrix Algebra
Algebraic Rules That Do and Don't Transfer
Many familiar rules from scalar algebra carry over to matrices, but several important ones do not.
Rules that hold (from Theorem 1.4.1, for appropriately dimensioned matrices and scalars $\alpha$, $\beta$):
- $A + B = B + A$ (addition is commutative)
- $(A + B) + C = A + (B + C)$ (associativity of addition)
- $(AB)C = A(BC)$ (associativity of multiplication)
- $A(B + C) = AB + AC$ (left distributivity)
- $(A + B)C = AC + BC$ (right distributivity)
Critical rules that do NOT generally hold:
- $AB = BA$ fails in general (non-commutativity)
- $(A + B)^2 = A^2 + 2AB + B^2$ fails (because $AB \ne BA$)
- $AB = O$ does not imply $A = O$ or $B = O$
- $AB = AC$ does not imply $B = C$
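Small concrete counterexamples make these failures tangible (the matrices below are my own choices, picked only to exhibit the traps):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 0]])
B = np.array([[1, 0],
              [1, 0]])

print(A @ B)   # [[2 0] [0 0]]
print(B @ A)   # [[1 1] [1 1]]  -- so AB != BA

# zero-product trap: CD = O even though C != O and D != O
C = np.array([[0, 1],
              [0, 0]])
D = np.array([[1, 0],
              [0, 0]])
print(C @ D)   # [[0 0] [0 0]]
```

One counterexample is all it takes: none of these scalar-algebra habits survive for matrices.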
The Identity Matrix
Identity Matrix $I$: The $n \times n$ matrix with 1's on the main diagonal and 0's everywhere else. It satisfies $IA = AI = A$ for any $n \times n$ matrix $A$.
The columns of $I$ are the standard basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$; multiplying $A$ by $\mathbf{e}_j$ picks out the $j$-th column of $A$.
Matrix Inverses
Nonsingular (Invertible) Matrix: An $n \times n$ matrix $A$ is nonsingular if there exists a matrix $B$ such that $AB = BA = I$. The matrix $B$ is unique and called the inverse of $A$, denoted $A^{-1}$.
Singular Matrix: An $n \times n$ matrix that does not have a multiplicative inverse.
Only square matrices can be nonsingular. A matrix can have at most one inverse (uniqueness is proven by showing any two candidates must be equal).
Key theorem: If $A$ and $B$ are nonsingular $n \times n$ matrices, then $AB$ is nonsingular and $(AB)^{-1} = B^{-1}A^{-1}$.
Again, note the reversal of order. More generally, $(A_1 A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1} A_1^{-1}$.
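The order reversal can be checked numerically (a sketch of my own with two arbitrary nonsingular matrices; `np.linalg.inv` computes the inverse):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.array([[1.0, 3.0],
              [0.0, 1.0]])

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # reversed order: correct
wrong = np.linalg.inv(A) @ np.linalg.inv(B)  # same order: generally wrong

print(np.allclose(lhs, rhs))    # True
print(np.allclose(lhs, wrong))  # False
```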
Algebraic Rules for Transposes
- $(A^T)^T = A$
- $(\alpha A)^T = \alpha A^T$
- $(A + B)^T = A^T + B^T$
- $(AB)^T = B^T A^T$
Applications
- Markov chain (marital status model): A transition matrix $A$ encodes what fraction of married and single women change status each year. After $n$ years, the population vector is $\mathbf{x}_n = A^n \mathbf{x}_0$.
- Leslie population models (loggerhead sea turtle): A matrix $L$ encodes stage-to-stage survival and reproduction rates. $L^n \mathbf{x}_0$ gives the population at each life stage after $n$ years. This model predicts a roughly 95% decline in breeding-age turtles over 100 years.
- Graph theory / Networks: An adjacency matrix $A$ for a graph has $a_{ij} = 1$ if there is an edge between vertices $v_i$ and $v_j$, and $a_{ij} = 0$ otherwise. The $(i, j)$ entry of $A^k$ counts the number of walks of length $k$ from $v_i$ to $v_j$. Adjacency matrices of undirected graphs are always symmetric.
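The walk-counting property is easy to see on a tiny graph (my own example: the path graph $v_1 - v_2 - v_3$):

```python
import numpy as np

# adjacency matrix of the path graph v1 - v2 - v3
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = A @ A
print(A2[0][2])  # 1: the single walk of length 2 from v1 to v3 (via v2)
print(A2[0][0])  # 1: the walk v1 -> v2 -> v1
A3 = A2 @ A
print(A3[0][1])  # 2: v1->v2->v1->v2 and v1->v2->v3->v2
```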
📝 Section Recap: Matrix algebra largely mirrors scalar algebra, but commutativity of multiplication fails and there are no cancellation laws. The identity matrix $I$ plays the role of 1. A nonsingular matrix has a unique inverse, and the inverse of a product reverses the order of the factors. Powers of adjacency matrices count walks in graphs.
1.5 Elementary Matrices
From Row Operations to Matrix Multiplication
Every elementary row operation can be represented as multiplication by a special matrix. This reframing turns the row-reduction algorithm into a sequence of matrix multiplications, which is powerful for both theory and computation.
Elementary Matrix: An $n \times n$ matrix obtained by performing exactly one elementary row operation on the identity matrix $I$.
There are three types:
- Type I: Interchange two rows of $I$ — left-multiplying by it swaps the corresponding rows of any matrix.
- Type II: Multiply a single row of $I$ by a non-zero scalar — left-multiplying scales the corresponding row.
- Type III: Add a multiple of one row of $I$ to another — left-multiplying adds that multiple of one row to the other.
Theorem: Every elementary matrix is nonsingular, and its inverse is an elementary matrix of the same type. Specifically:
- A Type I matrix (row swap) is its own inverse.
- A Type II matrix that multiplies row $i$ by $\alpha \ne 0$ has an inverse that multiplies row $i$ by $1/\alpha$.
- A Type III matrix that adds $m$ times row $i$ to row $j$ has an inverse that subtracts $m$ times row $i$ from row $j$.
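A Type III elementary matrix and its inverse can be built and checked in a few lines (`type_iii` is a hypothetical helper of my own, not from the text):

```python
import numpy as np

def type_iii(n, m, i, j):
    """Hypothetical helper: the n x n elementary matrix that adds
    m times row i to row j (0-indexed) when it left-multiplies."""
    E = np.eye(n)
    E[j, i] = m
    return E

A = np.array([[2.0, 4.0],
              [1.0, 3.0]])

E = type_iii(2, -0.5, 0, 1)      # subtract 1/2 of row 0 from row 1
print(E @ A)                      # [[2. 4.] [0. 1.]] -- one elimination step

E_inv = type_iii(2, 0.5, 0, 1)   # the inverse adds it back
print(np.allclose(E @ E_inv, np.eye(2)))  # True
```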
Equivalent Systems via Matrix Multiplication
If $E$ is an elementary matrix, the systems $A\mathbf{x} = \mathbf{b}$ and $EA\mathbf{x} = E\mathbf{b}$ are equivalent, because $E$ is invertible. Applying a whole sequence $E_k \cdots E_2 E_1$ therefore preserves the solution set at every step of the reduction.
Row Equivalence: Matrix $B$ is row equivalent to $A$ if $B = E_k \cdots E_2 E_1 A$ for some sequence of elementary matrices — i.e., $B$ can be obtained from $A$ by finitely many row operations.
The Nonsingularity Theorem
Theorem (Equivalent Conditions for Nonsingularity): For an $n \times n$ matrix $A$, the following are equivalent:
- $A$ is nonsingular.
- $A\mathbf{x} = \mathbf{0}$ has only the trivial solution $\mathbf{x} = \mathbf{0}$.
- $A$ is row equivalent to $I$.
This theorem is a workhorse. It means: to check whether $A$ is nonsingular, row-reduce it — if you reach $I$, it is nonsingular; if you produce a zero row, it is singular.
Corollary: The system $A\mathbf{x} = \mathbf{b}$ has a unique solution, namely $\mathbf{x} = A^{-1}\mathbf{b}$, if and only if $A$ is nonsingular.
Computing the Inverse
The equivalence above yields a practical algorithm: the same sequence of row operations that reduces $A$ to $I$ transforms $I$ into $A^{-1}$.
Method: Form the augmented matrix $(A \mid I)$ and row-reduce. If the left half becomes $I$, the right half is $A^{-1}$; if a zero row appears on the left, $A$ is singular.
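The $(A \mid I)$ method can be sketched with NumPy (my own minimal implementation; partial pivoting is added for numerical safety even though the text's method allows any pivot choice):

```python
import numpy as np

def inverse_via_gauss_jordan(A):
    """Row-reduce (A | I); when the left half becomes I,
    the right half is A^-1."""
    n = len(A)
    M = np.hstack([np.array(A, dtype=float), np.eye(n)])
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # partial pivoting
        if abs(M[pivot, col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[[col, pivot]] = M[[pivot, col]]              # row interchange
        M[col] /= M[col, col]                          # scale to a leading 1
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]             # clear the column
    return M[:, n:]

A = [[4.0, 7.0],
     [2.0, 6.0]]
print(inverse_via_gauss_jordan(A))   # ~ [[ 0.6 -0.7] [-0.2  0.4]]
```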
Triangular and Diagonal Matrices
Upper Triangular: A matrix where all entries below the main diagonal are zero.
Lower Triangular: A matrix where all entries above the main diagonal are zero.
Diagonal Matrix: A matrix that is both upper and lower triangular — only entries on the main diagonal can be non-zero.
An upper triangular matrix with all non-zero diagonal entries defines a strict triangular system and is therefore nonsingular.
LU Factorisation
When an $n \times n$ matrix $A$ can be reduced to upper triangular form using only Type III operations (no row interchanges or scalings), the elimination itself can be recorded as a factorisation.
LU Factorisation: $A = LU$, where $L$ is a unit lower triangular matrix (lower triangular with 1's on the diagonal) and $U$ is the upper triangular result of elimination. The entries below the diagonal of $L$ are exactly the multipliers used during elimination: $l_{ij}$ is the multiple of row $j$ subtracted from row $i$.
The LU factorisation is extremely useful for solving multiple systems with the same coefficient matrix but different right-hand sides: the work of factorising $A$ is done once, and each new right-hand side then costs only a forward substitution ($L\mathbf{y} = \mathbf{b}$) followed by a back substitution ($U\mathbf{x} = \mathbf{y}$).
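A bare-bones version of this bookkeeping (a sketch of my own, assuming — as the text does — that no row interchanges are needed):

```python
import numpy as np

def lu_factor(A):
    """Elimination with Type III operations only: A = L U,
    with L unit lower triangular. Assumes no zero pivots arise."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L, U = np.eye(n), A.copy()
    for col in range(n - 1):
        for row in range(col + 1, n):
            m = U[row, col] / U[col, col]   # elimination multiplier
            L[row, col] = m                 # record it below L's diagonal
            U[row] -= m * U[col]            # Type III row operation
    return L, U

A = [[2.0, 1.0],
     [6.0, 4.0]]
L, U = lu_factor(A)
print(L)        # [[1. 0.] [3. 1.]]  -- multiplier 3 recorded
print(U)        # [[2. 1.] [0. 1.]]
print(L @ U)    # reproduces A
```

Once `L` and `U` are stored, each new right-hand side needs only the two triangular solves described above.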
📝 Section Recap: Every row operation corresponds to left-multiplication by an elementary matrix. A square matrix is nonsingular if and only if it row-reduces to the identity. The inverse can be computed by augmenting with $I$ and row-reducing. When elimination uses only row-replacement operations, the result is an LU factorisation $A = LU$, where the multipliers fill in the lower triangular part of $L$.
1.6 Partitioned Matrices
Breaking Matrices into Blocks
Large matrices are often easier to analyse and compute with when split into smaller submatrices (also called blocks). A matrix can be partitioned by drawing horizontal lines between rows and vertical lines between columns.
Partitioned Matrix (Block Matrix): A matrix viewed as an array of smaller submatrix blocks, created by partitioning both rows and columns.
A common and useful special case is partitioning a matrix into its individual columns or rows: $A = (\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n)$.
Block Multiplication
The great power of partitioning is that block multiplication works like ordinary matrix multiplication, as long as the block dimensions are compatible. If $A$ is $m \times n$ and $B$ is $n \times r$ with columns $\mathbf{b}_1, \ldots, \mathbf{b}_r$, then
$$AB = A(\mathbf{b}_1, \ldots, \mathbf{b}_r) = (A\mathbf{b}_1, \ldots, A\mathbf{b}_r),$$
where each $A\mathbf{b}_j$ is the $j$-th column of the product.
More generally, for a partitioned product:
$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}$$
This is exactly how matrix multiplication works entry-by-entry, but at the block level. The requirement is that the block dimensions must be compatible at each interface.
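Block compatibility can be verified on random matrices (my own sketch: a column partition of $A$ matched with a row partition of $B$):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
B = rng.standard_normal((6, 3))

# partition A into column blocks and B into matching row blocks
A1, A2 = A[:, :2], A[:, 2:]       # 4x2 and 4x4
B1, B2 = B[:2, :], B[2:, :]       # 2x3 and 4x3

blockwise = A1 @ B1 + A2 @ B2     # block dimensions match at the interface
print(np.allclose(blockwise, A @ B))   # True
```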
Block Diagonal Matrices and Invertibility
A block diagonal matrix of the form
$$\begin{pmatrix} A_{11} & O \\ O & A_{22} \end{pmatrix}$$
is nonsingular if and only if both $A_{11}$ and $A_{22}$ are nonsingular, in which case its inverse is the block diagonal matrix with diagonal blocks $A_{11}^{-1}$ and $A_{22}^{-1}$.
This principle extends: a block diagonal matrix is invertible if and only if every diagonal block is invertible.
Inner Products and Outer Products
When we have vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, two quite different products arise.
Inner Product (Scalar Product): $\mathbf{x}^T\mathbf{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n$, a scalar. This is a $1 \times n$ row vector times an $n \times 1$ column vector — the result is $1 \times 1$.
Outer Product: $\mathbf{x}\mathbf{y}^T$, an $n \times n$ matrix. This is an $n \times 1$ column vector times a $1 \times n$ row vector. Every row of the result is a multiple of $\mathbf{y}^T$; every column is a multiple of $\mathbf{x}$.
The outer product idea generalises: given an $m \times n$ matrix $X$ with columns $\mathbf{x}_k$ and a $p \times n$ matrix $Y$ with columns $\mathbf{y}_k$,
$$XY^T = \mathbf{x}_1\mathbf{y}_1^T + \mathbf{x}_2\mathbf{y}_2^T + \cdots + \mathbf{x}_n\mathbf{y}_n^T,$$
where each term is an $m \times p$ outer product. This is the outer product expansion of a matrix product.
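The outer product expansion is one line in NumPy (`np.outer` forms $\mathbf{x}\mathbf{y}^T$; the matrices below are my own small example):

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # 3x2, columns x1, x2
Y = np.array([[1.0, 0.0],
              [2.0, 1.0]])   # 2x2, columns y1, y2

# X Y^T as a sum of outer products of corresponding columns
expansion = np.outer(X[:, 0], Y[:, 0]) + np.outer(X[:, 1], Y[:, 1])
print(np.allclose(expansion, X @ Y.T))   # True
```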
📝 Section Recap: Partitioned matrices let us work with large matrices by treating blocks as if they were individual entries. Block multiplication follows the same rule as ordinary matrix multiplication, as long as block dimensions are compatible. Block diagonal matrices are invertible iff each diagonal block is. The outer product $\mathbf{x}\mathbf{y}^T$ produces an $n \times n$ matrix from two $n$-dimensional vectors, and any matrix product can be written as a sum of outer products.
Key Theorems at a Glance
| Theorem | Statement |
|---|---|
| Homogeneous Non-Trivial Solutions | An $m \times n$ homogeneous system with $m < n$ has a non-trivial solution. |
| Consistency Theorem | $A\mathbf{x} = \mathbf{b}$ is consistent iff $\mathbf{b}$ is a linear combination of the columns of $A$. |
| Product Inverse | If $A$ and $B$ are nonsingular, then $AB$ is nonsingular and $(AB)^{-1} = B^{-1}A^{-1}$. |
| Transpose of Product | $(AB)^T = B^T A^T$. |
| Equivalent Conditions for Nonsingularity | $A$ is nonsingular $\Leftrightarrow$ $A\mathbf{x} = \mathbf{0}$ has only the trivial solution $\Leftrightarrow$ $A$ is row equivalent to $I$. |
| Unique Solution | $A\mathbf{x} = \mathbf{b}$ has a unique solution iff $A$ is nonsingular. |
| Walk Counting | The $(i, j)$ entry of $A^k$ counts the walks of length $k$ from $v_i$ to $v_j$. |
| Elementary Matrices are Nonsingular | Every elementary matrix is invertible, and its inverse is an elementary matrix of the same type. |
Common Pitfalls to Watch Out For
- Commutativity trap: Never assume $AB = BA$. This is wrong in general, and it breaks all the "nice" binomial-like expansions from scalar algebra.
- Cancellation trap: $AB = AC$ does not imply $B = C$ unless $A$ is nonsingular.
- Zero-product trap: $AB = O$ does not mean $A = O$ or $B = O$.
- Inverse of a product: $(AB)^{-1} = B^{-1}A^{-1}$, not $A^{-1}B^{-1}$.
- Singular vs. non-square: Only square matrices can be singular or nonsingular. For non-square matrices, those terms do not apply.
- Free variables = infinitely many solutions: Whenever a consistent system has free variables, it has infinitely many solutions (you can set each free variable to any real number).
- Pivot selection: If all candidate pivots in a column are zero during elimination, skip that column and move right — do not stop.