Segev BenZvi

Department of Physics and Astronomy, University of Rochester


Spring 2016

PHY 403: Modern Statistics and the Exploration of Large Datasets

This is my second time teaching this course on probability and statistics for graduate students and advanced undergraduates. The class includes a significant data analysis and numerical methods component centered around the Python programming language. (Mathematica, R, and ROOT are also allowed.)

Time and Location

MW 10:25 - 11:40 am
Bausch and Lomb 208


Data Analysis: A Bayesian Tutorial
D.S. Sivia and John Skilling
ISBN-10: 0198568320

Statistical Data Analysis
Glen Cowan
ISBN-10: 0198501552

The following books are not required but are great references. You may find them on reserve at POA or available as an online electronic reference on the River Campus Library website:



Class Participation: 10%
Final Project: 30%

Homework is assigned every two weeks and will have a significant programming component. Each assignment is due on Friday at 5 pm, two weeks after it is assigned. You can use any programming language you like (including Mathematica and R), but support from the TA and instructor is limited to Python and ROOT.

You may discuss the problems informally with your classmates but you must complete the homework on your own. Printouts of source code and plots are required to receive full credit.

The final project can be a data analysis project of your choice, reflecting either your current work or your analysis of a previous result. You will present your results in a 20-minute presentation at the end of the semester (April 20-29).

Lecture Notes

1. Course Intro
Different interpretations of probability: propensity, frequency, and degree of belief.
Reading: Sivia Ch. 1; Cowan Ch. 1.1, 1.2
2. Programming Primer
Basics of programming in Python: Arithmetic operators, variables, conditionals and loops, functions, and importing modules. Intro to NumPy and Matplotlib.
3. Basics of Probability and Summary Statistics
Rules of probability: Sum Rule, Product Rule, Bayes' Theorem, Law of Total Probability.
PDFs and summary statistics: mean, mode, median; variance, covariance, correlation; histograms.
Reading: Sivia Ch. 1, Cowan Ch. 1.1-1.5
4. Common Probability Distributions I
Binomial, negative binomial, multinomial, Gaussian, Poisson, Gamma, Exponential, Chi-square, Cauchy, Landau.
Reading: Cowan Ch. 2, Numerical Recipes Ch. 7
5. Common Probability Distributions II
PDFs and probability mass functions (PMFs); more about the chi-square test and p-values; transformation rules for PDFs; probability generating functions for discrete random variables.
Reading: Cowan Ch. 2, Sivia Ch. 3.6
6. Monte Carlo Methods
Simulation and random number generation. Pseudo-random uniform number generators. Sampling from arbitrary PDFs using transformation/inversion and acceptance/rejection methods.
Reading: Cowan Ch. 3
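As an illustration of the two sampling methods named in this lecture, here is a minimal sketch, not taken from the lecture notes. The function names, the seed, and the two target PDFs (an exponential for inversion, a parabola on [-1, 1] for acceptance/rejection) are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(403)  # arbitrary seed, for reproducibility only


def sample_exponential(rate, size):
    """Inversion: if u ~ Uniform(0, 1), then -ln(1 - u) / rate ~ Exponential(rate)."""
    u = rng.random(size)
    return -np.log1p(-u) / rate


def sample_rejection(pdf, lo, hi, pdf_max, size):
    """Acceptance/rejection with a flat envelope of height pdf_max on [lo, hi]."""
    samples = []
    while len(samples) < size:
        x = rng.uniform(lo, hi, size)        # candidate positions
        y = rng.uniform(0.0, pdf_max, size)  # uniform heights under the envelope
        samples.extend(x[y < pdf(x)])        # keep points that fall under the PDF
    return np.array(samples[:size])


# Illustrative targets (assumptions, not course data):
exp_draws = sample_exponential(rate=2.0, size=100_000)
# p(x) = (3/4)(1 - x^2) on [-1, 1], whose maximum value is 3/4.
parab_draws = sample_rejection(lambda x: 0.75 * (1 - x**2), -1.0, 1.0, 0.75, 100_000)
```

Inversion needs a closed-form inverse CDF, while acceptance/rejection only needs the PDF and an upper bound on it, at the cost of discarding some of the candidates.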
7. Model Selection and Parameter Estimation
Parameter estimation in the Bayesian framework: posterior distributions; the role and effect of priors; marginalization of "nuisance" parameters; model comparison using posterior odds ratios; a quantitative version of Ockham's Razor.
Reading: Sivia Ch. 2, 3
8. Parameter Estimation
Choosing priors: the Principle of Indifference; uniform and Jeffreys priors. Estimators of parameters: maximizing the posterior; reliability of estimators; bias and mean squared error; consistency and efficiency of estimators. Case studies: Gaussian and binomial estimators.
Reading: Sivia Ch. 2; Cowan Ch. 5
9. Parameter Estimation, Correlation, and Error Bars
Correlations between parameters: quadratic approximation in 2D, the Hessian matrix, the covariance matrix. The Student's t-distribution and the chi-square distribution.
Reading: Sivia Ch. 3.2, 3.3
10. Minimization Techniques: Maximum Likelihood and Least Squares I
Function minimization: common issues; grid search and steepest descent; Newton's Method; simplex method; simulated annealing. The method of maximum likelihood and its connection to least squares regression.
Reading: Numerical Recipes Ch. 10
11. Maximum Likelihood and Least Squares II
Properties of ML estimators. Variances and the Minimum Variance Bound. The chi-square statistic and goodness of fit.
Reading: Sivia Ch. 3; Cowan Ch. 6; Numerical Recipes Ch. 15
12. Propagation of Uncertainties
The classic error propagation formula and its limitations. Using the covariance matrix. Asymmetric error bars. A fully Bayesian approach with the complete PDF.
Reading: Sivia Ch. 3.6; Cowan Ch. 7.6
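For reference, the first-order propagation formula with a covariance matrix V is usually written as var(f) ≈ gᵀ V g, where g is the gradient of f evaluated at the measured values. The sketch below is an illustration only: it applies that formula to a hypothetical ratio f = x / y of two correlated measurements and compares it with a Monte Carlo estimate. The numbers, the seed, and the helper name `propagate` are all assumptions.

```python
import numpy as np


def propagate(grad, cov):
    """First-order (linearized) variance of f: grad^T V grad."""
    grad = np.asarray(grad, dtype=float)
    cov = np.asarray(cov, dtype=float)
    return float(grad @ cov @ grad)


# Hypothetical measurements: x = 10 +/- 0.2, y = 5 +/- 0.1, correlation 0.5.
mu = np.array([10.0, 5.0])
sx, sy, rho = 0.2, 0.1, 0.5
cov = np.array([[sx**2, rho * sx * sy],
                [rho * sx * sy, sy**2]])

# f = x / y has gradient (1/y, -x/y^2), evaluated here at the central values.
grad = np.array([1.0 / mu[1], -mu[0] / mu[1] ** 2])
sigma_linear = np.sqrt(propagate(grad, cov))

# Cross-check: sample the inputs and look at the spread of f directly.
rng = np.random.default_rng(403)  # arbitrary seed
xs, ys = rng.multivariate_normal(mu, cov, size=200_000).T
sigma_mc = np.std(xs / ys)

print(f"linear: {sigma_linear:.4f}   Monte Carlo: {sigma_mc:.4f}")
```

With relative uncertainties of a few percent the two estimates should agree closely. The linearized formula becomes unreliable when the uncertainties are large compared with the scale over which f curves, which is one of the limitations this lecture refers to.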
13. Systematic Uncertainties
Systematic uncertainties vs. mistakes ("errors") in data taking. Systematics and experimental design. How and when to assign systematic uncertainties.
Reading: see papers by Roger Barlow referenced in the slides.
14. Bayesian Model Selection and Hypothesis Testing
Hypothesis testing; posterior odds; the Ockham Factor revisited. Comparing several models with free parameters. Hypothesis testing vs. parameter estimation.
Reading: Sivia Ch. 4.1-4.2; Cowan Ch. 4.1-4.4
15. Classical Hypothesis Testing: The Likelihood Ratio Test
Type I and Type II errors. Statistical significance and power in model selection. The Neyman-Pearson lemma. Using p-values and the Neyman-Pearson test in model selection. The likelihood ratio test and Wilks' Theorem.
Reading: Cowan Ch. 4
16. Credible Intervals and Confidence Intervals
Summarizing the range of values of a parameter. Bayesian credible intervals. Classical confidence intervals: Neyman intervals and confidence belts; central intervals and lower/upper limits; frequentist coverage and the "flip-flopping" problem. Feldman-Cousins frequentist intervals. Confusing confidence intervals with posterior probabilities.
Reading: Cowan Ch. 9
17. Instrument Response and Unfolding
Accounting for instrumental efficiency and resolution. Forward folding and unfolding. Regularization techniques for unfolding. Balancing of variance and bias: figures of merit based on MSE, log-likelihood, and chi-square statistics.
Reading: Cowan Ch. 11
18. Sampling from PDFs: Markov Chain Monte Carlo
The Metropolis-Hastings algorithm. Sampling from multi-dimensional PDFs with MCMC. The Principle of Detailed Balance. Practical details: burn-in and efficiency. Parallel tempering.
Reading: Information Theory, Inference, and Learning Algorithms, Ch. 29
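A minimal sketch of the idea, assuming a symmetric Gaussian random-walk proposal (the Metropolis special case of Metropolis-Hastings, in which the proposal ratio cancels). The target, a 2D correlated Gaussian, together with the step size, chain length, burn-in, and seed, are illustrative assumptions and are not taken from the lecture.

```python
import numpy as np


def metropolis(log_pdf, x0, n_steps, step_size, rng):
    """Random-walk Metropolis sampler; returns the chain and the acceptance rate."""
    x = np.asarray(x0, dtype=float)
    logp = log_pdf(x)
    chain = np.empty((n_steps, x.size))
    n_accept = 0
    for i in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.size)
        logp_new = log_pdf(proposal)
        # Symmetric proposal: accept with probability min(1, p_new / p_old).
        if np.log(rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
            n_accept += 1
        chain[i] = x  # a rejected step repeats the current point
    return chain, n_accept / n_steps


def log_gauss2d(x, rho=0.8):
    """Unnormalized log-density of a 2D Gaussian with unit variances and correlation rho."""
    return -0.5 * (x[0] ** 2 - 2 * rho * x[0] * x[1] + x[1] ** 2) / (1 - rho**2)


rng = np.random.default_rng(403)  # arbitrary seed
chain, acceptance = metropolis(log_gauss2d, [3.0, -3.0], 50_000, 1.0, rng)
samples = chain[5_000:]  # discard an assumed burn-in period
```

Because only the ratio of densities enters the acceptance test, the target does not need to be normalized. The chain is started away from the mode on purpose, so that the first steps show why a burn-in period is discarded.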
19. Sampling from PDFs: Nested Sampling
Evaluating full posterior distributions. Likelihood ordering and Lebesgue integration. Sampling from strongly multimodal PDFs.
Reading: Sivia Ch. 9
20. Spectral Analysis
Analysis of signals in the time domain: signal sampling and the Nyquist-Shannon Sampling Theorem. Analysis of signals in the frequency domain: Fourier analysis and power spectral density. Windowing and apodization. Bayesian insight into the power spectrum: Schuster and Lomb-Scargle periodograms.
Reading: Numerical Recipes in C Ch. 13
21. The Principle of Maximum Entropy
Revisiting the Principle of Indifference. Choosing maximally non-committal PDFs in the presence of missing information. The Shannon-Jaynes Entropy and the derivation of common statistical distributions using the Principle of Maximum Entropy.
Reading: Sivia Ch. 5, 6.2
22. Measurement and Bias
Bandwagon effects in experimental results. Confirmation bias: data selection and stopping criteria. Blind analyses.

The homework assignments are available at

Additional Bibliography

In addition to the course texts and books on reserve I also used online materials as resources for these lectures, including lecture notes from similar courses. In the interest of giving credit where it's due, here are some of the best resources out there:


Anyone who comes across this material and wishes to use it for their own courses is free to do so without requesting my permission. However, please cite S. BenZvi, Dept. of Physics and Astronomy, University of Rochester, 2016.