Algebraic & Geometric Methods in Statistics
Outline and some illustrative examples in nonlinear statistics
Goals
After the course, you can:
- list topics in algebraic statistics
- recognize problems in statistics that are answerable by algebraic methods
- assess which algebraic methods are suitable for solving a problem
- apply basic algebraic tools to solve a problem
Tentative course outline:
What is algebraic statistics? An invitation / introduction / overview
1. Exponential families
   1.1. Statistical foundations
   1.2. Underlying algebra
2. Conditional independence and graphical models
   2.1. Statistical foundations
   2.2. Underlying algebra
3. Goodness-of-fit testing of models for discrete data
   3.1. Overview
   3.2. Chromosome clusters in cancer cells
   3.3. Network data
   3.4. Challenges of large, sparse data sets
4. Parameter identifiability
   4.1. Overview
   4.2. Graphical models
   4.3. Phylogenetics and evolutionary biology
   4.4. Model selection: learning a causal graph
5. Maximum likelihood estimation
   5.1. Introduction
   5.2. Deciding existence of ML estimators
   5.3. Algorithms for MLE: convex and non-convex optimization
Materials
Books and resources
Main textbook: Seth Sullivant, “Algebraic Statistics”. It is available in the bookstore. (I will check with the library for an e-copy.)
General course syllabus is here
Homework and grade
Approximately 6-7 assignments; expect a typical weekly workload.
Project
Read a paper, work on a small research project, or apply algebraic methods to a data set, and write a report on it. The timeline will be determined soon; the project will take place during the second half of the semester. Groups of up to 2 students.
Participate (as a team) in the Eric and Wendy Schmidt Center’s cancer immunotherapy data science challenge: https://go.topcoder.com/schmidtcentercancerchallenge/
Communication
We will not use Blackboard. Email is not efficient. So.. ??
- Campuswire
- GitHub
- Your input please as I decide. Decision will be made THIS WEEK.
Saving this information:
Course homepage will be created here: sonjapetrovicstats.com/teaching, and the syllabus will be posted there.
Student input time!
Motivating example 1: Discrete Markov chain
Section 1.1. of the textbook.
Lecture on board.
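For reference after the board lecture, here is a minimal sketch of the kind of computation this example leads to. It is an illustration under stated assumptions, not a reproduction of the textbook's presentation: it assumes a chain X1 -> X2 -> X3 of binary variables, and the function names and the numerical values of the initial distribution and transition matrices are made up for the example. The joint distribution of such a chain satisfies X1 independent of X3 given X2, which is equivalent to every slice p[:, j, :] having rank at most one, that is, to the vanishing of its 2x2 minors.

```python
import itertools

import numpy as np


def markov_chain_joint(initial, transition_12, transition_23):
    """Joint distribution p[i, j, k] = P(X1=i, X2=j, X3=k) of a chain X1 -> X2 -> X3."""
    initial = np.asarray(initial, dtype=float)
    t12 = np.asarray(transition_12, dtype=float)
    t23 = np.asarray(transition_23, dtype=float)
    # p[i, j, k] = initial[i] * t12[i, j] * t23[j, k]
    return np.einsum("i,ij,jk->ijk", initial, t12, t23)


def conditional_independence_minors(p):
    """All 2x2 minors of the slices p[:, j, :].

    They all vanish exactly when X1 is independent of X3 given X2.
    """
    n1, n2, n3 = p.shape
    minors = []
    for j in range(n2):
        for i1, i2 in itertools.combinations(range(n1), 2):
            for k1, k2 in itertools.combinations(range(n3), 2):
                minors.append(
                    p[i1, j, k1] * p[i2, j, k2] - p[i1, j, k2] * p[i2, j, k1]
                )
    return np.array(minors)


# Hypothetical parameter values, chosen only for illustration.
initial = [0.3, 0.7]                      # distribution of X1
transition_12 = [[0.9, 0.1], [0.4, 0.6]]  # rows: P(X2 = . | X1 = i)
transition_23 = [[0.2, 0.8], [0.5, 0.5]]  # rows: P(X3 = . | X2 = j)

p = markov_chain_joint(initial, transition_12, transition_23)
minors = conditional_independence_minors(p)
print(minors)  # two entries in the binary case, zero up to floating-point rounding
```

In the binary case the two minors are p000·p101 − p001·p100 and p010·p111 − p011·p110: polynomial equations in the joint probabilities, which is the sense in which a statistical model becomes an algebraic object.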
Motivating example 2: Graphical models
What is algebraic statistics?
Probability / statistics
- Probability distribution
- Statistical model
- (Discrete) exponential family
- Conditional inference
- Maximum likelihood estimation
- Model selection
- Multivariate Gaussian model
- Phylogenetic model
- MAP estimates
Algebra/geometry
- Point
- (Semi)algebraic set
- Toric variety / ideal
- Lattice points in polytopes
- Polynomial optimization
- Geometry of singularities
- Spectrahedral geometry
- Tensor networks
- Tropical geometry
Lecture plan
We will continue now with the following topics:
- Probability Primer (Chapter 2) and
- Conditional Independence (Chapter 4)
Appendix
The following is a 3-slide “intro” to algebraic geometry; these slides were given by S. Sullivant at a colloquium a long time ago. They are meant only to give you a glimpse of the vocabulary, not to be digested immediately.
Introduction to algebraic geometry
Example: Hardy-Weinberg Equilibrium
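The slide content is not reproduced here. As a reminder, one standard way to state the example is sketched below; the notation (allele frequency \theta, genotype probabilities p_{AA}, p_{Aa}, p_{aa}) is an assumption and may differ from the original slides. The model is a curve inside the probability simplex, described either by a parametrization or by an implicit polynomial equation.

```latex
% Parametric description: one gene with alleles A and a, allele frequency \theta
(p_{AA},\; p_{Aa},\; p_{aa}) \;=\; \bigl(\theta^{2},\; 2\theta(1-\theta),\; (1-\theta)^{2}\bigr),
\qquad \theta \in [0,1].

% Implicit description: the same curve cut out by polynomial equations
p_{Aa}^{2} - 4\, p_{AA}\, p_{aa} = 0,
\qquad p_{AA} + p_{Aa} + p_{aa} = 1.
```

Substituting the parametrization into the first equation gives (2\theta(1-\theta))^2 - 4\theta^2(1-\theta)^2 = 0, so every point of the parametrized model satisfies the implicit equation.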
License
This document is created for Math/Stat 561, Spring 2023, at Illinois Tech.
While the course materials are generally not to be distributed outside the course without permission of the instructor, all materials posted on this page are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.