Ch.06

Linear Independence and Rank: Redundancy and Effective Dimension


Linear Independence and Rank: How Many Real Dimensions?

Independent means the directions don’t overlap. Rank counts non-redundant directions (here 1 or 2 in this demo).

When the orange arrow lies on the dashed line (the span of the first direction), it adds no new axis: linear dependence, and the demo reads rank 1.
When the orange arrow leaves the line, the two directions differ: linear independence, and the demo reads rank 2.
Imagine a startup with 100 employees on paper. In practice, 20 people drive new ideas while 80 mostly copy the same approvals with different names. Is the “real workload dimension” 100 or 20?
Last chapter: matrices reshape space. Here we learn to spot fake vs real arrows in data: linear independence (a direction nobody else can replace) vs dependence (a free rider that is just a combination). After stripping redundant shadows, rank counts the true backbone of information—without being fooled by raw column counts.

Linear Independence and Rank: How Many Real Dimensions?

1. Linear independence — “RGB primaries”
When mixing paint or light, red, green, and blue are fundamental: you cannot synthesize any one of them from the others alone. Vectors are linearly independent when no vector is a combination of the rest: $c_1\mathbf{v}_1+\cdots+c_k\mathbf{v}_k=\mathbf{0}$ forces every $c_i=0$. Each new independent vector opens a new axis of information.
2. Linear dependence — echoes and “free riders”
If red and green lights already exist, adding a “yellow” lamp (red + green) does not widen the color gamut: it is redundant. If $\mathbf{v}_3=2\mathbf{v}_1+3\mathbf{v}_2$, the third vector is a linear combination of the others — dependence. It may look like more data, but it is an echo, not new signal. A quick numeric check is sketched below.
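A minimal NumPy sketch of this echo effect; the vectors below are made up for illustration. Three columns sit in the matrix, but `numpy.linalg.matrix_rank` reports only two truly new directions.

```python
import numpy as np

# Two independent directions (illustrative values).
v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
# A "free rider": an exact combination of the first two.
v3 = 2 * v1 + 3 * v2

A = np.column_stack([v1, v2, v3])
print(A.shape[1])                   # 3 columns on paper ...
print(np.linalg.matrix_rank(A))     # ... but rank 2: v3 adds no new axis
```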
3. Rank — “information purity” after defoaming
$\mathrm{rank}(A)$ is the maximum number of linearly independent columns, no matter whether you have 100 or 1000 columns. If 100 arrows all lie in one plane, the rank is still 2: rank is the true effective dimension of the data.
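To see how far column count and rank can drift apart, here is a hedged sketch: 100 random vectors that all live in one 2D plane inside a 5-dimensional space still give rank 2. The plane, dimensions, and seed are chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two fixed directions span a plane inside R^5.
plane = rng.standard_normal((5, 2))
# 100 arrows, every one a mixture of those two directions.
coeffs = rng.standard_normal((2, 100))
X = plane @ coeffs                       # shape (5, 100)

print(X.shape[1])                        # 100 columns on paper
print(np.linalg.matrix_rank(X))          # effective dimension: 2
```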
4. Basis — minimal steel frame
A basis is a smallest independent set that still spans the whole subspace—like the steel frame that fixes a building’s shape even if many bricks fill the walls. The number of basis vectors is the dimension.
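One concrete way to pull a basis out of a redundant set is to row-reduce and keep the pivot columns. This is a sketch using SymPy's `rref`; the matrix entries are arbitrary examples in which the last two columns are combinations of the first two.

```python
import sympy as sp

# Four columns, but col 2 = 2*col 0 + 3*col 1 and col 3 = col 0 + col 1.
A = sp.Matrix([[1, 0, 2, 1],
               [0, 1, 3, 1],
               [1, 1, 5, 2]])

_, pivot_cols = A.rref()          # row-reduce; pivots mark independent columns
basis = [A.col(j) for j in pivot_cols]
print(pivot_cols)                 # (0, 1): the first two columns form a basis
print(len(basis))                 # dimension of the column space = 2
```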
5. Link to Ch.05 — what $\det(A)$ means, and rank
The determinant $\det(A)$ is one number that tells how an $n\times n$ linear map scales unit $n$-dimensional volume (in 2D: the area of the unit square). If $\det(A)=0$, the map squashes space so volume collapses and no inverse exists; if $\det(A)\neq 0$, you can undo the map with $A^{-1}$ (Ch.05).
If $\mathrm{rank}(A)=n$, the columns are independent and space is not fully flattened, so $\det(A)\neq 0$ and $A^{-1}$ exists. Lower rank lets the space collapse, so $\det(A)=0$ and inversion fails.
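A quick numeric check of this Ch.05 link, using two made-up 2×2 matrices: the full-rank one has a nonzero determinant and an inverse, while the rank-1 matrix (the same one that appears in Example 3 below) has determinant 0 and no inverse.

```python
import numpy as np

full = np.array([[1.0, 2.0],
                 [3.0, 4.0]])      # independent columns
flat = np.array([[1.0, 2.0],
                 [2.0, 4.0]])      # second column = 2 * first column

print(np.linalg.matrix_rank(full), np.linalg.det(full))  # 2, -2.0 -> invertible
print(np.linalg.matrix_rank(flat), np.linalg.det(flat))  # 1, ~0.0 -> area collapses

np.linalg.inv(full)                # fine
try:
    np.linalg.inv(flat)            # raises LinAlgError: singular matrix
except np.linalg.LinAlgError as e:
    print("no inverse:", e)
```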
One line: independence = irreplaceable directions; dependence = mixtures; rank = true dimension after removing foam.
Five witnesses sound great—unless they all watched from the same window (dependence): you hear one clue five times (rank 1). Three witnesses from a street, a rooftop, and CCTV (independence, rank 3) carry far more real information.
In ML, feeding both “area in m²” and “area in pyeong” points the same direction: multicollinearity. The model may not “notice” the duplication and can return unstable or nonsense weights.
Rank asks: *how many nutrient-dense directions are really here?* Stripping redundant mixtures is core prep for stable training and faster, clearer computation.
1. Saving linear regression (ridge)
Least squares needs $(X^{\mathsf T}X)^{-1}$. Nearly duplicate columns make $X^{\mathsf T}X$ singular or nearly so. Ridge adds a tiny diagonal “shim”—like slipping a toothpick into a crushed sandwich—to restore numerical volume so an inverse can be computed.
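A hedged sketch of the “toothpick” idea. The two features are deliberate near-duplicates (area in m² and the same area in pyeong), so the plain normal equations are badly conditioned, while adding $\lambda I$ makes the system safely solvable. The data, target, and λ value are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
area_m2 = rng.uniform(20, 120, n)
area_pyeong = area_m2 / 3.3058           # same direction, different units
X = np.column_stack([area_m2, area_pyeong])
y = 3.0 * area_m2 + rng.normal(0, 1, n)  # synthetic target

XtX = X.T @ X
print(np.linalg.cond(XtX))               # enormous (or inf): (nearly) singular

lam = 1.0                                # ridge strength (illustrative)
w_ridge = np.linalg.solve(XtX + lam * np.eye(X.shape[1]), X.T @ y)
print(w_ridge)                           # finite, stable weights
```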
2. Bottlenecks in deep nets
Think of a 100-lane highway through linear layers. If a layer has effective rank 10, the road suddenly narrows: information bottleneck—most lanes of detail are destroyed. Designers watch rank-like behavior when choosing widths.
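One rough proxy for this rank-like behavior (not a specific method from the chapter) is to look at the singular values of a layer's weight matrix and count how many rise above a small tolerance. The weight matrix below is synthetic — a 100×100 layer secretly built from only 10 underlying directions — and the tolerance mirrors NumPy's default heuristic rather than a universal rule.

```python
import numpy as np

rng = np.random.default_rng(2)
# A 100x100 weight matrix that secretly factors through 10 dimensions.
W = rng.standard_normal((100, 10)) @ rng.standard_normal((10, 100))

s = np.linalg.svd(W, compute_uv=False)   # singular values, largest first
tol = s.max() * max(W.shape) * np.finfo(W.dtype).eps
effective_rank = int((s > tol).sum())
print(effective_rank)                    # 10: a narrow bottleneck inside a 100-lane layer
```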
The table lists symbols and their meanings. Worked patterns walk through representative practice types—definitions, true/false, numeric rank, rank–nullity, rank identities, short scenarios—in a compact question / solution format.
| Symbol | Meaning |
| --- | --- |
| Independent | Only the trivial combination gives zero |
| Dependent | One column is a combination of the others |
| $\mathrm{rank}(A)$ | Dimension of the column space |
| Basis | Independent spanning set |
| $\mathrm{rank}(AB)$ | $\le \min\{\mathrm{rank}\,A,\ \mathrm{rank}\,B\}$ (checked numerically below) |
| $\det(A)$ | Volume/area scaling factor for the map (Ch.05); $\det(A)=0$ ⇒ no inverse |
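A small numeric spot check of the table's $\mathrm{rank}(AB)$ row, using random matrices with controlled ranks; the shapes, ranks, and seed are chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(3)
# A has rank 3 (5x4 built from 3 directions); B has rank 2 (4x6 built from 2).
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 6))

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)
print(rA, rB, rAB)                       # 3 2 2
assert rAB <= min(rA, rB)                # rank(AB) <= min(rank A, rank B)
```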

Worked patterns

Example 1 — Picking a definition
Question: Which statement defines $\mathrm{rank}(A)$?
Solution: Choose dimension of the column space.

Example 2 — True / false
Question: Are two distinct vectors in $\mathbb{R}^2$ always linearly independent?
Solution: No in general: collinear vectors are dependent.

Example 3 — Numeric rank
Question: What is the rank of $\begin{pmatrix}1 & 2 \\ 2 & 4\end{pmatrix}$?
Solution: Columns are proportional → rank 1. If unclear, row-reduce and count pivots.

Example 4 — Rank–nullity
Question: If $\dim\{\mathbf{x} : A\mathbf{x} = \mathbf{0}\} = k$ and $A$ has $n$ columns, what is $\mathrm{rank}(A)$?
Solution: $\mathrm{rank}(A) = n - k$.
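Rank–nullity can also be checked numerically: `scipy.linalg.null_space` returns an orthonormal basis of $\{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}$, and its column count $k$ plus $\mathrm{rank}(A)$ should equal the number of columns $n$. The matrix is an arbitrary example.

```python
import numpy as np
from scipy.linalg import null_space

# 3x4 example: col 2 = col 0 + col 1, col 3 = 2 * col 0.
A = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 2.0, 2.0]])

n = A.shape[1]
k = null_space(A).shape[1]            # dimension of the null space
r = np.linalg.matrix_rank(A)
print(r, k, n)                        # 2 2 4
assert r + k == n                     # rank-nullity: rank(A) = n - k
```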

Example 5 — Rank identities
Question: For invertible $P, Q$, what is $\mathrm{rank}(PAQ)$?
Solution: $\mathrm{rank}(PAQ) = \mathrm{rank}(A)$.
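This identity can be spot-checked as well: multiply a rank-deficient $A$ by random invertible $P$ and $Q$ and confirm the rank does not change. The matrices are illustrative; a random square Gaussian matrix is invertible with probability 1.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))  # 4x5, rank 2
P = rng.standard_normal((4, 4))   # invertible almost surely
Q = rng.standard_normal((5, 5))   # invertible almost surely

print(np.linalg.matrix_rank(A))          # 2
print(np.linalg.matrix_rank(P @ A @ Q))  # still 2: invertible maps preserve rank
```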

Example 6 — Short scenario
Question: If $\mathbf{a}_3 = 2\mathbf{a}_1 - \mathbf{a}_2$, what can you say about the rank of the three columns?
Solution: They are dependent, so $\mathrm{rank} \le 2$.

Practice

10 random questions are drawn from a bank of 60.

Question 1 of 10: How does $\mathrm{rank}(A^{\mathsf T})$ compare to $\mathrm{rank}(A)$?