
Fundamentals of Vectors and Linear Operations


Chapter 1: Solving Linear Systems and Matrix Operations


Overview

This chapter is the backbone of everything that follows in linear algebra. At its heart is one deceptively simple question: how do you find all values of unknowns that simultaneously satisfy a collection of linear equations? We start from scratch — building up from two-variable systems — and by the end of the chapter we have powerful, systematic tools: Gaussian elimination, matrix arithmetic, matrix inverses, elementary matrices, and block partitioning. Every technique here reappears throughout the course.


1.1 Systems of Linear Equations

What Is a Linear System?

A linear equation in n unknowns has the form

a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b

where the coefficients a₁, …, aₙ and the constant b are real numbers, and x₁, …, xₙ are the unknowns. The key word is linear — no products of unknowns, no powers, no trigonometric functions. A linear system (or system of linear equations) is just a collection of m such equations in n unknowns, all to be satisfied at the same time.

Solution: An ordered n-tuple (x₁, x₂, …, xₙ) of real numbers that satisfies every equation in the system simultaneously.

Solution Set: The set of all solutions of a linear system.

Consistent System: A linear system that has at least one solution (its solution set is non-empty).

Inconsistent System: A linear system that has no solution (its solution set is empty).

How Many Solutions Can a System Have?

Geometrically, each linear equation in two unknowns represents a line in the plane. A system of two such equations asks where two lines meet. There are exactly three possibilities:

  • The lines intersect at exactly one point → unique solution
  • The lines are parallel → no solution (inconsistent)
  • The lines are identical → infinitely many solutions

This geometric intuition extends to larger systems. In general, an m × n system is either inconsistent, has exactly one solution, or has infinitely many solutions. There is no other possibility.

Equivalent Systems and Elementary Operations

Equivalent Systems: Two systems involving the same variables are equivalent if they have exactly the same solution set.

To solve a system, we transform it into an equivalent but simpler one. Three operations on a system always produce an equivalent system:

  1. Interchange the order of any two equations.
  2. Scale any equation by multiplying through by a non-zero real number.
  3. Add a multiple of one equation to another equation.

These three operations are the foundation of every systematic method in this chapter.

Strict Triangular Form and Back Substitution

Strict Triangular Form: A system of n equations in n unknowns where, in the k-th equation, the coefficients of the first k − 1 unknowns are all zero and the coefficient of xₖ is non-zero.

A strict triangular system is easy to solve by back substitution: solve the last equation for xₙ, substitute that value into the second-to-last equation to get xₙ₋₁, and continue working upward.

Example intuition: Think of a strict triangular system as a staircase — each step reveals one new variable. You start at the bottom step and climb upward, finding each variable in sequence.
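The back-substitution loop is short enough to sketch directly. Below is a minimal pure-Python version (the function name `back_substitute` and the list-of-rows representation are illustrative choices, not from the text):

```python
def back_substitute(U, b):
    """Solve Ux = b for a strict triangular system.

    U is an n x n list of rows with U[i][i] != 0 and
    U[i][j] == 0 for j < i; b is the right-hand-side list.
    """
    n = len(U)
    x = [0.0] * n
    # Work upward from the last equation to the first.
    for i in range(n - 1, -1, -1):
        # Subtract the already-known terms, then divide by the pivot.
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x
```

For example, the system 2x₁ + x₂ + x₃ = 5, 3x₂ + x₃ = 4, 2x₃ = 2 yields x = (1.5, 1, 1).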

The Augmented Matrix

Rather than carrying the full symbolic notation of a system, we strip away the variable names and work with numbers only. Given a system, its coefficient matrix A holds just the coefficients of the unknowns. Attaching the right-hand side constants as an extra column produces the augmented matrix (A | b).

Coefficient Matrix: The rectangular array of coefficients of the unknowns in a linear system.

Augmented Matrix: The coefficient matrix with the right-hand side constants appended as an additional column.

The three elementary operations on equations correspond directly to elementary row operations on the augmented matrix:

  • Row interchange: swap any two rows.
  • Row scaling: multiply a row by any non-zero scalar.
  • Row replacement: replace a row by itself plus a scalar multiple of another row.

The beauty of this is that we work with numbers alone — the variables are placeholders reintroduced only at the end.

📝 Section Recap: A linear system is consistent (at least one solution), inconsistent (no solution), or has infinitely many solutions. We never get exactly two solutions. The three elementary row operations on an augmented matrix always yield an equivalent system, and a strict triangular form can be solved by back substitution.


1.2 Row Echelon Form

When Strict Triangular Form Fails

The reduction to strict triangular form can break down: if at some step all the candidate pivot entries in a column are zero, we cannot continue as before. The right response is not to give up — it is to move to the next column and keep going. This leads to a more general shape called echelon form.

Row Echelon Form: A matrix is in row echelon form if (i) the first non-zero entry in each non-zero row is 1 (called the leading 1), (ii) in each successive non-zero row the number of leading zeros strictly increases (the leading 1's form a staircase stepping down and to the right), and (iii) all zero rows appear at the bottom.

Gaussian Elimination: The process of applying elementary row operations to transform a linear system's augmented matrix into row echelon form.

Lead Variables and Free Variables

After reaching row echelon form:

Lead Variables: The unknowns corresponding to columns that contain a leading 1. There is one lead variable per non-zero row.

Free Variables: The remaining unknowns — those corresponding to columns skipped during elimination. Free variables can be assigned any real number, and each such assignment determines a unique set of values for the lead variables.

If a system is consistent and has free variables, it has infinitely many solutions. A consistent underdetermined system (fewer equations than unknowns) always has free variables and therefore always has infinitely many solutions.

Detecting Inconsistency

If any row of the row echelon form has the pattern

(0 0 ⋯ 0 | 1)

the system is inconsistent — that row says 0 = 1, which is impossible.

Reduced Row Echelon Form and Gauss–Jordan Reduction

Once we have row echelon form, we can continue eliminating entries above each leading 1 as well as below. This fully simplified form is called reduced row echelon form (RREF).

Reduced Row Echelon Form: A matrix in row echelon form where the leading 1 in each row is the only non-zero entry in its column.

Gauss–Jordan Reduction: The process of applying elementary row operations to transform a matrix all the way to reduced row echelon form.

RREF is particularly useful when there are free variables — it directly reads off the values of every lead variable in terms of the free variables, without any need for back substitution.
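Gauss–Jordan reduction can be sketched in a few lines of pure Python. This is a teaching sketch operating on a list-of-rows augmented matrix (exact arithmetic on small integer examples; it is not a numerically robust routine, and the name `rref` is our own):

```python
def rref(M):
    """Reduce matrix M (a list of rows) to reduced row echelon form."""
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue                                    # all zeros: move right
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]  # Type I: interchange
        p = M[pivot_row][col]
        M[pivot_row] = [v / p for v in M[pivot_row]]     # Type II: leading 1
        for r in range(rows):                            # Type III: clear column
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M
```

Running it on the augmented matrix of x + 2y = 3, 4x + 5y = 6 produces (1 0 | −1; 0 1 | 2), so the solution x = −1, y = 2 can be read off with no back substitution.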

Overdetermined and Underdetermined Systems

Overdetermined System: More equations than unknowns (m > n). Usually (but not always) inconsistent — the extra equations create conflicting constraints.

Underdetermined System: Fewer equations than unknowns (m < n). Cannot have a unique solution. If consistent, always has infinitely many solutions, because there must be at least n − m free variables.

Homogeneous Systems

Homogeneous System: A linear system where all right-hand sides are zero. Always consistent — the trivial solution x₁ = x₂ = ⋯ = xₙ = 0 always works.

Theorem: An m × n homogeneous system has a non-trivial solution if m < n (more unknowns than equations). This is because the row echelon form can have at most m non-zero rows, leaving at least n − m free variables.

Real-World Applications

This section illustrates how linear systems arise naturally:

  • Traffic flow: At each intersection, cars in = cars out, giving one equation per intersection. Free variables arise because flow on one road determines all others.
  • Electrical networks (Kirchhoff's Laws): At every node, incoming current = outgoing current (Law 1). Around every closed loop, voltage gains = voltage drops (Law 2). Together these form a linear system for the unknown currents.
  • Chemical equation balancing: Counting atoms of each element on both sides gives a homogeneous system; non-trivial solutions give the correct coefficients.
  • Leontief Input-Output Model: In a simple economy with several sectors, the prices at which each sector should trade can be found as the solution to a homogeneous system. Non-trivial solutions give the fair relative prices.

📝 Section Recap: When strict triangular reduction stalls, the correct response is row echelon form. Gaussian elimination reaches row echelon form; Gauss–Jordan reduction goes further to reduced row echelon form. Lead variables are determined by the pivot columns; free variables can take any value. Homogeneous systems always have at least the trivial solution and, whenever there are more unknowns than equations, they have non-trivial solutions too.


1.3 Matrix Arithmetic

Why Matrices?

Individual systems of equations are one thing. But in science and engineering you often solve many related systems, or need to represent data, transformations, and relationships compactly. Matrices are the tool for all of this. A matrix is simply a rectangular array of numbers arranged in rows and columns.

Matrix: An m × n matrix A has m rows and n columns. The entry in row i and column j is denoted aᵢⱼ, called the (i, j) entry of A. We often write A = (aᵢⱼ).

Square Matrix: A matrix with the same number of rows and columns (m = n).

Scalar: An individual real (or complex) number — the entries of a matrix.

Vectors

Vector: An n-tuple of real numbers. Represented as an n × 1 column matrix (column vector) or a 1 × n row matrix (row vector). The set of all column vectors with n entries is called Euclidean n-space, denoted ℝⁿ.

Row vectors are written with a horizontal arrow to distinguish them from column vectors. In most of linear algebra, "vector" means column vector.

Matrix Equality, Scalar Multiplication, and Addition

Matrix Equality: Two matrices A = (aᵢⱼ) and B = (bᵢⱼ) are equal if they have the same dimensions and aᵢⱼ = bᵢⱼ for every i and j.

  • Scalar multiplication: αA multiplies every entry of A by the scalar α.
  • Matrix addition: (A + B)ᵢⱼ = aᵢⱼ + bᵢⱼ. Only defined when A and B have the same dimensions.
  • Zero matrix O: The matrix of all zeros; serves as the additive identity.
  • Additive inverse: −A = (−1)A; satisfies A + (−A) = O.

Matrix–Vector Multiplication and Linear Systems

The key insight: the system

a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = b₂
  ⋮
aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ = bₘ

can be written as the single matrix equation Ax = b, where A is the coefficient matrix, x is the column vector of unknowns, and b is the column vector of right-hand sides. The product Ax is defined so that its i-th entry is

(Ax)ᵢ = aᵢ₁x₁ + aᵢ₂x₂ + ⋯ + aᵢₙxₙ

Equivalently, Ax is a linear combination of the columns of A:

Ax = x₁a₁ + x₂a₂ + ⋯ + xₙaₙ

where aⱼ is the j-th column of A.

Linear Combination: A sum c₁v₁ + c₂v₂ + ⋯ + cₖvₖ where the cᵢ are scalars and the vᵢ are vectors.

Consistency Theorem: The system Ax = b is consistent if and only if b can be written as a linear combination of the column vectors of A.
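The column-combination view of Ax is easy to check in code. A minimal pure-Python sketch (the function name `matvec` is ours) that literally accumulates xⱼ times column j:

```python
def matvec(A, x):
    """Compute Ax by summing x[j] times column j of A."""
    m, n = len(A), len(A[0])
    result = [0.0] * m
    for j in range(n):          # for each column a_j of A ...
        for i in range(m):      # ... add x_j * a_j into the running total
            result[i] += x[j] * A[i][j]
    return result
```

For A = (1 2; 3 4) and x = (1, 1), the result is 1·(1, 3) + 1·(2, 4) = (3, 7).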

Matrix Multiplication

When we multiply two matrices A (size m × n) and B (size n × r), the result AB is an m × r matrix whose columns are Ab₁, Ab₂, …, Abᵣ. The (i, j) entry is the scalar product of the i-th row of A and the j-th column of B:

(AB)ᵢⱼ = aᵢ₁b₁ⱼ + aᵢ₂b₂ⱼ + ⋯ + aᵢₙbₙⱼ

Watch out: the number of columns of A must match the number of rows of B, otherwise multiplication is not defined. Also, matrix multiplication is not commutative — in general, AB ≠ BA. This is one of the most important ways matrices differ from real numbers.
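The entry formula translates directly into a triple loop, and a tiny example shows non-commutativity concretely (pure-Python sketch; the matrices are made up):

```python
def matmul(A, B):
    """Multiply A (m x n) by B (n x r); entry (i, j) is
    the scalar product of row i of A and column j of B."""
    m, n, r = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "columns of A must match rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(m)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB = matmul(A, B)   # [[2, 1], [1, 1]]
BA = matmul(B, A)   # [[1, 1], [1, 2]]  -- AB and BA differ
```

Even these two simple 2 × 2 matrices fail to commute, which is why none of the "binomial" identities from scalar algebra survive unchanged.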

The Transpose

Transpose of A: The matrix Aᵀ obtained by swapping rows and columns — the (i, j) entry of Aᵀ equals the (j, i) entry of A. If A is m × n, then Aᵀ is n × m.

Symmetric Matrix: A square matrix with A = Aᵀ. All entries are mirrored across the main diagonal.

Key rule for transposes of products: (AB)ᵀ = BᵀAᵀ. Note the reversal of order — this is analogous to the "socks and shoes" rule for inverses.
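The order reversal can be verified entry-for-entry on a small example (pure-Python sketch with local helpers `transpose` and `matmul`; the matrices are made up):

```python
def transpose(A):
    """The (i, j) entry of the result is A[j][i]."""
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
# (AB)^T equals B^T A^T -- note the reversed order on the right.
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```

Trying `matmul(transpose(A), transpose(B))` instead would fail the check, which is the whole point of the rule.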

Applications of Matrix Multiplication

  • Production costs: Multiplying a cost-per-item matrix by a quantities-produced matrix gives total costs by category and quarter in one operation.
  • Analytic Hierarchy Process (AHP): Decision-making with multiple criteria. Each candidate gets weights per criterion, assembled into a matrix. Multiplying by a weight vector for the criteria gives overall rankings.
  • Information retrieval / database search: Documents are columns of a matrix; keywords are rows. Multiplying the transposed database matrix by a search vector counts keyword matches per document, enabling ranking.

📝 Section Recap: Matrix arithmetic generalises scalar arithmetic, but with one crucial difference: multiplication is not commutative. A linear system Ax = b is consistent precisely when b is a linear combination of the columns of A. The transpose swaps rows and columns, and (AB)ᵀ = BᵀAᵀ — order reverses.


1.4 Matrix Algebra

Algebraic Rules That Do and Don't Transfer

Many familiar rules from scalar algebra carry over to matrices, but several important ones do not.

Rules that hold (from Theorem 1.4.1, for appropriately dimensioned matrices A, B, C and scalars):

  1. A + B = B + A (addition is commutative)
  2. (A + B) + C = A + (B + C) (associativity of addition)
  3. (AB)C = A(BC) (associativity of multiplication)
  4. A(B + C) = AB + AC (left distributivity)
  5. (A + B)C = AC + BC (right distributivity)

Critical rules that do NOT generally hold:

  • AB ≠ BA in general (non-commutativity)
  • (A + B)² ≠ A² + 2AB + B² (because AB ≠ BA)
  • AB = O does not imply A = O or B = O
  • AB = AC and A ≠ O does not imply B = C

The Identity Matrix

Identity Matrix I: The n × n matrix with 1's on the main diagonal and 0's everywhere else. It satisfies IA = A and AI = A for any matrix A of compatible dimensions.

The columns of I are the standard basis vectors e₁, e₂, …, eₙ of ℝⁿ.

Matrix Inverses

Nonsingular (Invertible) Matrix: An n × n matrix A is nonsingular if there exists a matrix B such that AB = BA = I. The matrix B is unique and called the inverse of A, denoted A⁻¹.

Singular Matrix: An n × n matrix that does not have a multiplicative inverse.

Only square matrices can be nonsingular. A matrix can have at most one inverse (uniqueness is proven by showing any two candidates must be equal).

Key theorem: If A and B are nonsingular n × n matrices, then AB is also nonsingular and

(AB)⁻¹ = B⁻¹A⁻¹

Again, note the reversal of order. More generally, (A₁A₂⋯Aₖ)⁻¹ = Aₖ⁻¹⋯A₂⁻¹A₁⁻¹.

Algebraic Rules for Transposes

For appropriately dimensioned matrices and any scalar α:

  1. (Aᵀ)ᵀ = A
  2. (αA)ᵀ = αAᵀ
  3. (A + B)ᵀ = Aᵀ + Bᵀ
  4. (AB)ᵀ = BᵀAᵀ

Applications

  • Markov chain (marital status model): A transition matrix A encodes what fraction of married and single women change status each year. After n years, the population vector is xₙ = Aⁿx₀.
  • Leslie population models (loggerhead sea turtle): A matrix L encodes stage-to-stage survival and reproduction rates. Lⁿx₀ gives the population at each life stage after n years. This model predicts a roughly 95% decline in breeding-age turtles over 100 years.
  • Graph theory / Networks: An adjacency matrix A for a graph has aᵢⱼ = 1 if there is an edge between vertices Vᵢ and Vⱼ, and aᵢⱼ = 0 otherwise. The (i, j) entry of Aᵏ counts the number of walks of length k from Vᵢ to Vⱼ. Adjacency matrices of undirected graphs are always symmetric.
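The walk-counting property can be confirmed on a tiny graph. Below is a pure-Python sketch for the path graph on three vertices 0–1–2 (a made-up example, not one from the text):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Adjacency matrix of the path graph on vertices 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

# (i, j) entry of A^2 counts walks of length 2 from i to j.
A2 = matmul(A, A)
```

Here A2[1][1] is 2, reflecting the two length-2 walks from the middle vertex back to itself (1→0→1 and 1→2→1), while A2[0][2] is 1 (the single walk 0→1→2).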

📝 Section Recap: Matrix algebra largely mirrors scalar algebra, but commutativity of multiplication fails and there are no cancellation laws. The identity matrix plays the role of 1. A nonsingular matrix has a unique inverse, and the inverse of a product reverses the order of the factors. Powers of adjacency matrices count walks in graphs.


1.5 Elementary Matrices

From Row Operations to Matrix Multiplication

Every elementary row operation can be represented as multiplication by a special matrix. This reframing turns the row-reduction algorithm into a sequence of matrix multiplications, which is powerful for both theory and computation.

Elementary Matrix: An n × n matrix obtained by performing exactly one elementary row operation on the identity matrix I.

There are three types:

  • Type I: Interchange two rows of I — this swaps the corresponding rows of any matrix it left-multiplies.
  • Type II: Multiply a single row of I by a non-zero scalar — scales the corresponding row.
  • Type III: Add a multiple of one row of I to another — adds that multiple of one row to another.

Theorem: Every elementary matrix is nonsingular, and its inverse is an elementary matrix of the same type. Specifically:

  • A Type I matrix (row swap) is its own inverse.
  • A Type II matrix that multiplies row i by α ≠ 0 has an inverse that multiplies row i by 1/α.
  • A Type III matrix that adds m times row i to row j has an inverse that subtracts m times row i from row j.
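A Type III elementary matrix and its inverse can be built and checked directly (pure-Python sketch; `identity`, `matmul`, and `type3` are our own helper names):

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def type3(n, target, source, m):
    """Elementary matrix that adds m times row `source` to row `target`."""
    E = identity(n)
    E[target][source] = m
    return E

E = type3(3, 2, 0, 5)        # add 5 * row 0 to row 2
E_inv = type3(3, 2, 0, -5)   # inverse: subtract 5 * row 0 from row 2
assert matmul(E, E_inv) == identity(3)
```

Left-multiplying any 3-row matrix by E performs exactly the row operation that created E from I, which is the content of the theorem above.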

Equivalent Systems via Matrix Multiplication

If E₁, E₂, …, Eₖ are elementary matrices and M = EₖEₖ₋₁⋯E₁, then M is nonsingular and the system Ax = b is equivalent to MAx = Mb.

Row Equivalence: Matrix B is row equivalent to A if B = EₖEₖ₋₁⋯E₁A for some sequence of elementary matrices — i.e., B can be obtained from A by finitely many row operations.

The Nonsingularity Theorem

Theorem (Equivalent Conditions for Nonsingularity): For an n × n matrix A, the following are equivalent:

  1. A is nonsingular.
  2. Ax = 0 has only the trivial solution x = 0.
  3. A is row equivalent to I.

This theorem is a workhorse. It means: to check if is invertible, just row-reduce it. If you reach , it's invertible; otherwise it's singular.

Corollary: The system Ax = b of n equations in n unknowns has a unique solution if and only if A is nonsingular. When A is nonsingular, the unique solution is x = A⁻¹b.

Computing the Inverse

The equivalence A ∼ I (via row operations) means we can compute A⁻¹ by performing the same row operations on I. Concretely:

Method: Form the augmented matrix (A | I) and row-reduce the entire matrix. If A is nonsingular, the result is (I | A⁻¹). The left half reaches I while the right half, having undergone the same operations, becomes A⁻¹.
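The (A | I) method can be sketched in pure Python (a teaching sketch, not a numerically robust inversion routine; the name `inverse` is ours):

```python
def inverse(A):
    """Invert A by row-reducing the augmented matrix (A | I)."""
    n = len(A)
    # Build (A | I) as one wide matrix of floats.
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a pivot at or below the diagonal; if none exists, A is singular.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]          # leading 1 in this row
        for r in range(n):                         # clear the rest of the column
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    # The left half is now I; the right half is A^(-1).
    return [row[n:] for row in M]
```

For A = (1 2; 3 4) this returns (−2 1; 1.5 −0.5), and multiplying back gives I.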

Triangular and Diagonal Matrices

Upper Triangular: A matrix where all entries below the main diagonal are zero.

Lower Triangular: A matrix where all entries above the main diagonal are zero.

Diagonal Matrix: A matrix that is both upper and lower triangular — only entries on the main diagonal can be non-zero.

An upper triangular matrix with all non-zero diagonal entries defines a strict triangular system and is therefore nonsingular.

LU Factorisation

When an matrix can be reduced to strict upper triangular form using only Type III row operations (no row swaps, no scaling), the elimination process can be encoded as a matrix factorisation.

LU Factorisation: A = LU, where L is a unit lower triangular matrix (lower triangular with 1's on the diagonal) and U is the upper triangular result of elimination. The entries below the diagonal of L are exactly the multipliers used during elimination: lᵢⱼ is the multiple of row j subtracted from row i.

The LU factorisation is extremely useful for solving multiple systems with the same coefficient matrix but different right-hand sides, since the work of factorising is done only once.
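The elimination-with-recorded-multipliers idea fits in a few lines. A pure-Python sketch under the stated assumption (every pivot is non-zero, so no row interchanges are needed; the function name `lu` is ours):

```python
def lu(A):
    """Factor A = LU using only Type III row operations.

    The multiplier used to eliminate entry (i, k) is stored at L[i][k]."""
    n = len(A)
    U = [list(map(float, row)) for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]                      # elimination multiplier
            L[i][k] = m                                # record it in L
            U[i] = [a - m * b for a, b in zip(U[i], U[k])]
    return L, U
```

For A = (2 1; 4 5) the multiplier is 2, giving L = (1 0; 2 1) and U = (2 1; 0 3); multiplying LU recovers A, and each new right-hand side b can then be solved with one forward and one back substitution.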

📝 Section Recap: Every row operation corresponds to left-multiplication by an elementary matrix. A square matrix is nonsingular if and only if it row-reduces to the identity. The inverse can be computed by augmenting with I and row-reducing (A | I) to (I | A⁻¹). When elimination uses only row-replacement operations, the result is an LU factorisation A = LU, where the multipliers fill in the lower triangular part of L.


1.6 Partitioned Matrices

Breaking Matrices into Blocks

Large matrices are often easier to analyse and compute with when split into smaller submatrices (also called blocks). A matrix can be partitioned by drawing horizontal lines between rows and vertical lines between columns.

Partitioned Matrix (Block Matrix): A matrix viewed as an array of smaller submatrix blocks, created by partitioning both rows and columns.

A common and useful special case is partitioning a matrix into its individual columns, A = (a₁ a₂ ⋯ aₙ), or into its individual rows.

Block Multiplication

The great power of partitioning is that block multiplication works like ordinary matrix multiplication, as long as the block dimensions are compatible. If

AB = A₁B₁ + A₂B₂

where A = (A₁ A₂) is partitioned into column blocks and B is partitioned into row blocks B₁, B₂ (with the number of columns of each Aᵢ matching the number of rows of the corresponding Bᵢ), then the product can be computed block by block.

More generally, for a partitioned product, the (i, j) block of AB is

Cᵢⱼ = Aᵢ₁B₁ⱼ + Aᵢ₂B₂ⱼ + ⋯ + AᵢₛBₛⱼ

This is exactly how matrix multiplication works entry-by-entry, but at the block level. The requirement is that the block dimensions must be compatible at each interface.
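A quick numerical check of column-by-row block multiplication (pure-Python sketch with local helpers; the matrices and the split point are made up):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def madd(A, B):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A, B)]

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [1, 1]]

# Split A after its first column and B after its first row:
# the column counts of the A-blocks match the row counts of the B-blocks.
A1 = [row[:1] for row in A]; A2 = [row[1:] for row in A]
B1 = B[:1];                  B2 = B[1:]
assert madd(matmul(A1, B1), matmul(A2, B2)) == matmul(A, B)
```

The same identity holds for any compatible split, which is what makes block algorithms (and cache-friendly tiled multiplication) work.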

Block Diagonal Matrices and Invertibility

A block diagonal matrix of the form

M = ( A  O )
    ( O  B )

is nonsingular if and only if both A and B are nonsingular. In that case,

M⁻¹ = ( A⁻¹  O  )
      ( O   B⁻¹ )

Inner Products and Outer Products

When we have vectors x and y in ℝⁿ:

Inner Product (Scalar Product): xᵀy, a scalar. This is a 1 × n row vector times an n × 1 column vector — the result is 1 × 1.

Outer Product: xyᵀ, an n × n matrix. This is an n × 1 column vector times a 1 × n row vector. Every row of the result is a multiple of yᵀ; every column is a multiple of x.

The outer product idea generalises: given an m × n matrix X and a k × n matrix Y, the product XYᵀ can be written as an outer product expansion:

XYᵀ = x₁y₁ᵀ + x₂y₂ᵀ + ⋯ + xₙyₙᵀ

where x₁, …, xₙ are the columns of X and y₁, …, yₙ are the columns of Y. Each term is an outer product (a rank-1 matrix). This decomposition plays an important role in applications like digital imaging and information retrieval (the singular value decomposition revisits it in Chapter 6).
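The outer product expansion can be verified on small matrices (pure-Python sketch; X and Y are made-up 2 × 2 examples and the helper names are ours):

```python
def outer(x, y):
    """Outer product x y^T: a rank-1 matrix."""
    return [[xi * yj for yj in y] for xi in x]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

X = [[1, 2], [3, 4]]
Y = [[5, 6], [7, 8]]

# Sum the outer products of matching columns of X and Y ...
total = [[0, 0], [0, 0]]
for xj, yj in zip(transpose(X), transpose(Y)):
    o = outer(xj, yj)
    total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, o)]

# ... and the sum equals the ordinary product X Y^T.
assert total == matmul(X, transpose(Y))
```

Each `outer(xj, yj)` term is a rank-1 matrix; truncating the sum to the largest terms is the idea behind low-rank approximation in the SVD.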

📝 Section Recap: Partitioned matrices let us work with large matrices by treating blocks as if they were individual entries. Block multiplication follows the same rule as ordinary matrix multiplication, as long as block dimensions are compatible. Block diagonal matrices are invertible iff each diagonal block is. The outer product xyᵀ produces an n × n matrix from two vectors in ℝⁿ, and the matrix product XYᵀ can be written as a sum of outer products.


Key Theorems at a Glance

Theorem — Statement
Homogeneous Non-Trivial Solutions — An m × n homogeneous system has a non-trivial solution whenever m < n.
Consistency Theorem — Ax = b is consistent iff b is a linear combination of the columns of A.
Product Inverse — If A and B are nonsingular, (AB)⁻¹ = B⁻¹A⁻¹.
Transpose of Product — (AB)ᵀ = BᵀAᵀ.
Equivalent Conditions for Nonsingularity — A nonsingular ⟺ Ax = 0 has only the trivial solution ⟺ A is row equivalent to I.
Unique Solution — Ax = b (n equations, n unknowns) has a unique solution iff A is nonsingular; that solution is x = A⁻¹b.
Walk Counting — The (i, j) entry of Aᵏ (where A is an adjacency matrix) equals the number of walks of length k from Vᵢ to Vⱼ.
Elementary Matrices are Nonsingular — Every elementary matrix is invertible, and its inverse is an elementary matrix of the same type.

Common Pitfalls to Watch Out For

  • Commutativity trap: Never assume (A + B)² = A² + 2AB + B². This is wrong in general, and it breaks all the "nice" binomial-like expansions from scalar algebra.
  • Cancellation trap: AB = AC does not imply B = C unless A is nonsingular.
  • Zero-product trap: AB = O does not mean A = O or B = O.
  • Inverse of a product: (AB)⁻¹ = B⁻¹A⁻¹, not A⁻¹B⁻¹.
  • Singular vs. non-square: Only square matrices can be singular or nonsingular. For non-square matrices, those terms do not apply.
  • Free variables = infinitely many solutions: Whenever a consistent system has free variables, it has infinitely many solutions (you can set each free variable to any real number).
  • Pivot selection: If all candidate pivots in a column are zero during elimination, skip that column and move right — do not stop.