21-326: Markov Chains: Theory, Simulation and Applications

Fall 2025

What is this course about?

Let’s start with a few questions whose statements should be accessible to you:

Can you decode this message?
LAVLCRQPTVIWLRVTKQCLCRQLMCLETDEJCLARQLDVKEYSLMLREGQLCQTFVKLRQEPFLRMKLKQYAMVYLRQPLHYFQPLEYJLVARQPLYEKQSLMYLRMCLQJQCLCRQLQITMBCQCLEYFLBPQFVKMYEAQCLARQLDRVTQLVNLRQPLCQUSLMALDECLYVALAREALRQLNQTALEYJLQKVAMVYLEWMYLAVLTVGQLNVPLMPQYQLEFTQPSLETTLQKVAMVYCLEYFLAREALVYQLBEPAMIHTEPTJLDQPQLEXRVPPQYALAVLRMCLIVTFLBPQIMCQLXHALEFKMPEXTJLXETEYIQFLKMYFSLRQLDECLMLAEWQLMALARQLKVCALBQPNQIALPQECVYMYOLEYFLVXCQPGMYOLKEIRMY

The message above was obtained by applying a permutation of the alphabet (e.g. $A \mapsto K$, $B \mapsto Z$, etc.) to the intended message. A brute-force search would take longer than the age of the universe on modern computers, but an MCMC-based algorithm will crack it in seconds. (You will learn how to do this on your homework.)
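
To get a feel for the numbers: there are $26!$ possible permutations of the alphabet. A quick back-of-the-envelope check (the rate of $10^9$ keys per second is an illustrative assumption, not a benchmark):

```python
import math

# Number of permutations of the 26-letter alphabet.
n_keys = math.factorial(26)

# Assume, optimistically, that we can test 1e9 keys per second.
seconds = n_keys / 1e9
years = seconds / (60 * 60 * 24 * 365)

print(f"{n_keys:.3e} keys, roughly {years:.1e} years at 1e9 keys/sec")
```

Even at a billion keys per second, exhausting the keyspace takes on the order of $10^{10}$ years, comparable to the age of the universe.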

Do randomly packed disks in a box form a hexagonal-like arrangement?
[Figures: random disk packings at densities $0.72$ and $0.50$]

Randomly pack (non-overlapping) disks into a box. When packed at high density, the disks arrange themselves into a hexagonal structure. How do you reliably test this numerically? Simulating all configurations is not feasible, but you will learn how to do this reliably and efficiently. (Questions like this arise in the study of crystal structures and statistical physics.)

In a (large) interconnected network, how do you determine the relative importance of nodes?
[Figure: example of a (small) connected network]

You use this every day when you search online and see results ranked by “relevance”. There is a simple, elegant algorithm (now known as the PageRank algorithm), which we will analyze and implement.
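
As a preview of the idea, here is a minimal sketch of PageRank via power iteration: rank nodes by the stationary distribution of a “random surfer” who follows links most of the time and occasionally teleports to a uniformly random node. The tiny 4-node graph and the damping factor $0.85$ are illustrative choices only:

```python
import numpy as np

# A small hypothetical directed graph: node -> list of out-links.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n, d = 4, 0.85  # number of nodes, damping factor

# Column-stochastic link matrix: P[j, i] = probability of stepping i -> j.
P = np.zeros((n, n))
for i, outs in links.items():
    for j in outs:
        P[j, i] = 1 / len(outs)

# Power iteration on the random-surfer chain:
# follow a link with probability d, teleport uniformly with probability 1 - d.
r = np.full(n, 1 / n)
for _ in range(100):
    r = d * (P @ r) + (1 - d) / n

print(r)  # node 2, which has the most in-links, ranks highest
```

The vector `r` remains a probability distribution throughout, and the iteration converges to the stationary distribution of the chain; we will see why when we study convergence of Markov chains.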

Do random walks return to their starting point infinitely often?

Suppose a drunk hiker explores their surroundings by randomly walking a unit distance north, south, east or west. Will they return to their starting point infinitely often?

Suppose instead of a drunk hiker (whose movement is confined to the plane), we had a drunk spaceship captain who does the same thing, but can additionally move up or down. Will they return to their starting point infinitely often?

It turns out the drunk hiker will return infinitely often, but the drunk spaceship captain won’t. Why? This is related to the recurrence of Markov chains and the gambler’s ruin problem, which we will study in detail.
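
Although a finite simulation cannot prove recurrence, it already shows the contrast between two and three dimensions. The walk lengths and counts below are arbitrary illustrative choices:

```python
import random

def return_fraction(dim, n_walks=500, n_steps=2000, seed=0):
    """Fraction of simple random walks on Z^dim that revisit the origin
    within n_steps steps."""
    rng = random.Random(seed)
    returned = 0
    for _ in range(n_walks):
        pos = [0] * dim
        for _ in range(n_steps):
            axis = rng.randrange(dim)         # pick a coordinate direction
            pos[axis] += rng.choice((-1, 1))  # step one unit along it
            if all(c == 0 for c in pos):
                returned += 1
                break
    return returned / n_walks

print(return_fraction(2))  # noticeably larger ...
print(return_fraction(3))  # ... than this
```

Pushing `n_steps` higher drives the two-dimensional fraction towards $1$, while the three-dimensional fraction stalls well below $1$; the theory of recurrence explains why.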

The underlying theoretical foundation of each of the above examples is the theory of Markov chains, which is accessible to undergraduate students and yet has enough mathematical depth to be an active area of current research.

This course aims to study the theoretical foundations (via an honest theorem, lemma, proof approach) and to analyze various applications. A few applications will be implemented on a computer using scientific / numerical Python. (No prior knowledge of Python is required, but some experience with coding is helpful.)

Syllabus

Topics covered

  1. Markov chains: basic notions, stationary distribution, mixing times, techniques to estimate mixing times.
  2. Sampling: random number generation, importance sampling, rejection sampling.
  3. Markov Chain Monte Carlo: Metropolis-Hastings, Gibbs sampling, tempering, simulated annealing.
  4. Applications: PageRank, Monte Carlo integration, hard-core model, traveling salesman, Ising model, graph coloring, gerrymandering.
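
As a small taste of the sampling unit, here is a minimal rejection-sampling sketch. The target density (a semicircle law on $[-1, 1]$) and the sample size are illustrative choices:

```python
import math
import random

def sample_semicircle(rng, n):
    """Draw n samples from f(x) = (2/pi) * sqrt(1 - x^2) on [-1, 1]
    by rejection sampling with Uniform[-1, 1] proposals under a flat
    envelope of height 2/pi (the maximum of f)."""
    samples = []
    while len(samples) < n:
        x = rng.uniform(-1, 1)           # propose a point
        u = rng.uniform(0, 2 / math.pi)  # uniform height under the envelope
        if u <= (2 / math.pi) * math.sqrt(1 - x * x):
            samples.append(x)            # accept; otherwise reject and retry
    return samples

rng = random.Random(1)
xs = sample_semicircle(rng, 10_000)
print(sum(xs) / len(xs))  # sample mean, close to 0 by symmetry
```

Accepted points are distributed exactly according to the target density; the acceptance rate is the ratio of the area under $f$ (which is $1$) to the area of the envelope box ($2 \cdot 2/\pi = 4/\pi$), i.e. $\pi/4$.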


Learning Objectives

  1. Understand the theoretical foundations of Markov Chains, and be proficient with the proofs of basic results in the field.
  2. Learn about MCMC and study several applications in detail.
  3. Understand the basics of simulation, and code a few applications using numerical Python.

Pre-requisites

  • A rigorous and thorough first course in probability (e.g. 21-325 or 21-721).
  • Linear algebra (e.g. 21-240, 21-241 or 21-242).
  • Some familiarity with coding (not necessarily Python).

Frequently Asked Questions

Would this fulfill any of the requirements for math majors?

Yes, for the Computational and Applied concentration, it will count as a depth elective. For all other concentrations, it will count as a standard 300-level elective.

I haven’t taken pre-requisite X, can I still enroll?

If you’ve seen the material at an appropriate level somewhere else, then most probably yes. We will, however, use probability and certain concepts from linear algebra crucially throughout the semester, so be sure you are well versed in those concepts before asking for pre-requisites to be waived.

What is the balance between applications and theory?

Each of the applications listed above (and a few more) will appear in your homework assignments, so they will certainly be covered. There is, in fact, a simple recipe for each of the applications listed here, which can be black-boxed and covered in the span of one lecture. This, however, is not the aim of this course, and we will not spend time listing black-box recipes for such applications.

This is a proof-based course that will emphasize the theory behind the applications, and cover the underlying conceptual principles rigorously (theorems, lemmas, proofs). As a rule of thumb, only advanced theorems that require too much machinery will be black-boxed. Everything else will be proved.

After learning the theory, it takes a little insight to realize how it can be applied to various situations. It also takes a little computational knowledge to implement it correctly and avoid common (but not obvious) errors. We aim to cover this, focusing on the mathematical / conceptual aspects, without getting caught up in application-specific nuances / implementation details.

How does this compare to Course X?

  • 21-387/15-327: This course will cover all the material from the first half of 21-387/15-327 (Fall 2024), treat a few topics in more depth, and cover additional topics. For instance, we will study techniques to estimate mixing times (e.g. coupling, spectral methods) and the recurrence of Markov chains, which were beyond the scope of 21-387. (Also, 21-387/15-327 is not being offered in Fall 2025.)

  • 15-259/15-359: We will not cover several of the CS applications (queuing / hashing) done in 15-259/15-359. We also focus on the foundational theory of Markov chains much more than 15-259/15-359 does. For instance, 15-359 seems to cover general convergence theorems only towards the end of the class, in chapters 25/26 of the book. We will start with these, and then spend time studying the rate of convergence.