Information Theory and Applications
This course introduces the foundations and applications of information theory from a principled, probabilistic perspective. Its aim is to clarify what quantities such as entropy, relative entropy, and mutual information actually measure, and why they arise naturally in problems of uncertainty and inference and in the study of physical systems. The course emphasises conceptual understanding and mathematical structure, while illustrating how information-theoretic ideas connect to neuroscience, physics, statistics, and machine learning. The goal is not only to present the formalism but also to develop a clear, operational understanding of how information-theoretic tools can be meaningfully applied across disciplines.
- To demonstrate a solid understanding of the fundamental concepts of information theory
- To develop the ability to construct rigorous mathematical arguments and proofs within the framework of probability theory and information theory
- To prove and apply the fundamental theorems of information theory, such as the source coding and channel coding theorems
- To apply the principles and techniques of information theory to model and characterise problems of a probabilistic nature: large deviation theory, hypothesis testing, Bayesian inference, etc.
- To acquire the language to describe impossibility results in settings such as estimation and hypothesis testing
The course begins with a review of probability and discrete random sources, develops entropy as a quantitative measure of uncertainty, and establishes the fundamental limits of lossless data compression. We then introduce relative entropy (Kullback–Leibler divergence) and show how it governs hypothesis testing and statistical distinguishability. These ideas are extended to information transmission, including mutual information, channel capacity, and the Gaussian channel model.
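For orientation, and in the notation of Cover and Thomas, the central quantities of this first part are

$$
H(X) = -\sum_{x} p(x)\log p(x), \qquad
D(P\,\|\,Q) = \sum_{x} P(x)\log\frac{P(x)}{Q(x)}, \qquad
I(X;Y) = D(P_{XY}\,\|\,P_X P_Y),
$$

culminating in the capacity of the power-constrained Gaussian channel, $C = \tfrac{1}{2}\log\bigl(1 + P/N\bigr)$ for signal power $P$ and noise variance $N$.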
In the second part of the course, information-theoretic methods are applied to problems in statistical inference and learning. Topics include parameter estimation and the Cramér–Rao bound, Fano’s inequality, exploration bias in data analysis, and information-theoretic perspectives on generalisation error. Throughout, the emphasis is on operational meaning, structural results, and fundamental limits rather than computational techniques.
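Two results anchor this part, stated here in their standard forms: for any unbiased estimator $\hat\theta$ of a parameter $\theta$ with Fisher information $I(\theta)$, and for any estimate $\hat{X}$ of $X \in \mathcal{X}$ computed from $Y$, with error probability $P_e = \Pr(\hat{X} \neq X)$ and binary entropy function $h_b$,

$$
\operatorname{Var}(\hat\theta) \ge \frac{1}{I(\theta)}
\qquad \text{and} \qquad
H(X \mid Y) \le h_b(P_e) + P_e \log\bigl(\lvert\mathcal{X}\rvert - 1\bigr).
$$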
Depending on available time and the background of the students, additional applications and selected modern developments may be discussed, particularly finite-sample and non-asymptotic perspectives relevant to contemporary research in statistics, neuroscience, physics, and machine learning.
1. Review of probability theory
2. Data compression: coding theorem for a discrete memoryless source
3. Entropy
4. Lossless coding
5. Kullback–Leibler divergence
6. Hypothesis testing
7. Channel coding: information transmission theorem and mutual information
8. Continuous random variables and information-theoretic quantities; the Gaussian channel
9. Parameter estimation, the Cramér–Rao bound, and Fano's inequality
10. Exploration bias in data science
11. Generalisation error of learning algorithms
Final exam: 50%; oral presentation: 30%; in-term tests/quizzes: 20%
Students are expected to have:
- Some knowledge of probability theory (random variables, expectation, conditioning, Bayes’ rule)
- Basic mathematical maturity (comfort with proofs, inequalities, and abstract reasoning)
- Familiarity with calculus and linear algebra
No prior knowledge of information theory is assumed.
Programming experience is not required.
Elements of Information Theory, Thomas M. Cover and Joy A. Thomas
Information Theory: From Coding to Learning, Yury Polyanskiy and Yihong Wu
Information Theory: Coding Theorems for Discrete Memoryless Systems, Imre Csiszár and János Körner