If you are an undergraduate in Cambridge, please go away. No peeking at last year's notes!
Downloading Options: choose http/ftp, choose one-page or two-page
Copyright issues: you are welcome to download and print the above documents, and to give them to your friends. Please do not make multiple copies of them (e.g., more than ten) without consulting me (mackay@mrao.cam.ac.uk). Thanks.
History:
Draft 1.1.1 - March 14 1997.
Draft 1.2.1 - April 4 1997.
Draft 1.2.3 - April 9 1997.
Draft 1.2.4 - April 10 1997. Margins altered to print better on North American paper.
There is no substantially new material on this web site compared with the lecture notes I handed out. I have not been able to write up any more worked solutions, for example.
I will be away in Canada from April 12th to May 1st, so I will only be reachable by email.
Information Theory, Pattern Recognition and Neural Networks
Minor Option
Lecturer: David MacKay
Introduction to information theory {1}
The possibility of perfect communication over noisy channels.
Entropy and data compression {3}
Entropy, conditional entropy, mutual information. Shannon's
source coding theorem: entropy as a measure of information
content. Codes for data compression. Uniquely decodeable codes
and the Kraft-McMillan inequality. Huffman codes. Arithmetic
coding. Lempel-Ziv coding.
Communication over noisy channels {3}
Definition of channel capacity. Capacity of binary symmetric
channel; of binary erasure channel; of Z channel. Shannon's
noisy channel coding theorem. Practical error-correcting
codes.
Statistical inference, data modelling and pattern recognition {3}
The likelihood function and Bayes' theorem.
Inference of discrete and continuous parameters.
Curve fitting. Classification. Density estimation.
Neural networks as information storage devices {2}
Capacity of a single neuron. Hopfield
network and its relationship to spin glasses.
Boltzmann machine and maximum entropy.
Data modelling with neural networks {2}
Interpolation and classification using multilayer perceptrons.
Backpropagation algorithm.
Unsupervised neural networks {2}
Principal component analysis. Vector quantization.
Density modelling with neural networks.
Kohonen network. Helmholtz machine.
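
As a small illustration of some quantities named in the first two parts of the syllabus, here is a minimal Python sketch. It is not taken from the lecture notes; the function names (entropy, bsc_capacity, bec_capacity, kraft_sum) and the example distribution are assumptions made for this sketch. It uses the standard formulas: Shannon entropy in bits, capacity 1 - H2(f) for a binary symmetric channel with flip probability f, capacity 1 - e for a binary erasure channel with erasure probability e, and the Kraft sum of a set of binary codeword lengths.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def binary_entropy(f):
    """Entropy, in bits, of a binary variable that equals 1 with probability f."""
    return entropy([f, 1.0 - f])


def bsc_capacity(f):
    """Capacity (bits per use) of a binary symmetric channel with flip probability f."""
    return 1.0 - binary_entropy(f)


def bec_capacity(e):
    """Capacity (bits per use) of a binary erasure channel with erasure probability e."""
    return 1.0 - e


def kraft_sum(lengths):
    """Sum of 2^(-l) over codeword lengths; at most 1 for any uniquely decodeable binary code."""
    return sum(2.0 ** -l for l in lengths)


# Hypothetical example: a four-symbol source and the codeword lengths of a
# prefix code such as {0, 10, 110, 111}.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]
expected_length = sum(p * l for p, l in zip(probs, lengths))

print(entropy(probs))      # 1.75 bits per symbol
print(expected_length)     # 1.75 bits per symbol, equal to the entropy here
print(kraft_sum(lengths))  # 1.0, so the Kraft inequality holds with equality
print(bsc_capacity(0.5))   # 0.0: a channel that flips half the bits carries nothing
```

In this example the expected codeword length equals the entropy because every probability is a power of 1/2; for other distributions the expected length of a symbol code exceeds the entropy.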
-----------------
Required courses: 1B Mathematics
----------------
References
----------
Berger, J. (1985)
Statistical Decision Theory and Bayesian Analysis. Springer.
Bishop, C.M. (1995)
Neural Networks for Pattern Recognition. Oxford University Press.
Blahut, R.E. (1987)
Principles and Practice of Information Theory. New York: Addison-Wesley.
Box, G.E.P. & Tiao, G.C. (1973)
Bayesian Inference in Statistical Analysis. Addison-Wesley.
Bretthorst, G. (1988)
Bayesian Spectrum Analysis and Parameter Estimation. Springer.
Cover, T.M. & Thomas, J.A. (1991)
Elements of Information Theory. New York: Wiley.
Duda, R. & Hart, P. (1973)
Pattern Classification and Scene Analysis. Wiley.
Hertz, J., Krogh, A. & Palmer, R.G. (1991)
Introduction to the Theory of Neural Computation. Addison-Wesley.
Jeffreys, H. (1939)
Theory of Probability. Oxford Univ. Press.
McEliece, R.J. (1977)
The Theory of Information and Coding: A Mathematical Framework
for Communication. Reading, Mass.: Addison-Wesley.
Rosenkrantz, R.D. (1983)
E.T. Jaynes: Papers on Probability, Statistics and Statistical
Physics. Kluwer.
Witten, I.H., Neal, R.M. & Cleary, J.G. (1987)
Arithmetic coding for data compression.
Communications of the ACM 30 (6):520-540.
My old 8-lecture course on Information Theory.