Algebraic & Geometric Methods in Statistics

Outline and some illustrative examples in nonlinear statistics

Goals

After the course, you can:

  • list topics in algebraic statistics
  • recognize problems in statistics that are answerable by algebraic methods
  • assess which algebraic methods are suitable for solving a problem
  • apply basic algebraic tools to solve a problem

Tentative course outline:

  1. What is algebraic statistics? An invitation / introduction / overview

  2. Exponential families 2.1. Statistical foundations 2.2. Underlying algebra

  3. Conditional independence and graphical models 3.1. Statistical foundations 3.2. Underlying algebra

  4. Goodness-of-fit testing of models for discrete data 4.1. Overview 4.2. Chromosome clusters in cancer cells 4.3. Network data 4.4. Challenges of large, sparse data sets

  5. Parameter identifiability 5.1. Overview 5.2. Graphical models 5.3. Phylogenetics and evolutionary biology 5.4. Model selection: learning a causal graph

  6. Maximum likelihood estimation 6.1. Introduction 6.2. Deciding existence of ML estimators 6.3. Algorithms for MLE: convex and non-convex optimization

Materials

Books and resources

Main textbook: Seth Sullivant, “Algebraic Statistics”. It is available in the bookstore. (I will check with the library for an e-copy.)

The general course syllabus is here.

Homework and grade

Approximately 6–7 assignments; expect a usual weekly workload.

Project

Option A: Design a minisymposium that has a connection to the topics in this course (other related topics require instructor approval).

Option B: Read a paper, work on a small research project, or apply algebraic methods to a data set, and write a report on it. The timeline will be determined soon; the project will take place during the second half of the semester. Groups of up to 2 students.


Communication

Feel free to discuss things on the Canvas discussion boards!

Saving this information:

I am going to link to this page on Canvas!

Motivating example 1: Discrete Markov chain

Section 1.1 of the textbook. Lecture on the board.

Action item: derive two polynomial equations for the Markov chain model.

Let \(X_1, X_2, X_3\) be a sequence of random variables taking values in \(\Sigma = \{0,1\}\).

  • There are eight joint probabilities \(p_{ijk} = P(X_1 = i,\, X_2 = j,\, X_3 = k)\) for \(i,j,k \in \{0,1\}\).

  • A probability distribution associated to \((X_1, X_2, X_3)\) corresponds to a point in \(\mathbb{R}^8\).

  • The sequence is a Markov chain if \(P(X_3 = x_3 \mid X_1 = x_1, X_2 = x_2) = P(X_3 = x_3 \mid X_2 = x_2)\).

Question: When is a point in \(\mathbb{R}^8\) the probability distribution associated to a Markov chain?


Conditional Probabilities in Terms of Joint Probabilities

For \(i,j,k \in \{0,1\}\), \(P(X_3 = k \mid X_1 = i, X_2 = j) = \dfrac{p_{ijk}}{p_{ij+}}\) where \(p_{ij+} = \sum_{k \in \{0,1\}} p_{ijk}\).

Markov chain condition.

Using the above and comparing the conditional distributions for different values of \(X_1\), we have \(\dfrac{p_{ijk}}{p_{ij+}} = \dfrac{p_{i'jk}}{p_{i'j+}}\) for \(i \neq i'\), which implies \(p_{ijk}\, p_{i'j+} = p_{i'jk}\, p_{ij+}\).

Example

In the binary case (\(\Sigma = \{0,1\}\)), simplifying yields the two polynomial equations: \(p_{000}\, p_{101} - p_{001}\, p_{100} = 0\) and \(p_{010}\, p_{111} - p_{011}\, p_{110} = 0\).
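The simplification from the Markov chain condition to these two binomials can be checked symbolically. Below is a minimal sketch using sympy; the helper name `markov_relation` is ours, not from the textbook.

```python
import itertools
import sympy as sp

# Symbols for the eight joint probabilities p_{ijk}.
p = {(i, j, k): sp.Symbol(f"p{i}{j}{k}")
     for i, j, k in itertools.product([0, 1], repeat=3)}

def markov_relation(j, k):
    # p_{0jk} * p_{1j+} - p_{1jk} * p_{0j+}, the cleared-denominator form
    # of the Markov chain condition for a fixed (j, k).
    p0plus = p[0, j, 0] + p[0, j, 1]
    p1plus = p[1, j, 0] + p[1, j, 1]
    return sp.expand(p[0, j, k] * p1plus - p[1, j, k] * p0plus)

# After expansion, the j = 0 relation collapses to p000*p101 - p001*p100,
# and the j = 1 relation to p010*p111 - p011*p110 (k = 1 gives the same
# binomials up to sign).
eq_j0 = markov_relation(0, 0)
eq_j1 = markov_relation(1, 0)
print(eq_j0)
print(eq_j1)
```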


Characterization of the Model (Semialgebraic Set)

A point \(p \in \mathbb{R}^8\) is the probability distribution associated to a (binary, 3-step) Markov chain iff:

  • \(p_{ijk} \ge 0\) for all \(i,j,k \in \{0,1\}\),
  • \(\sum_{i,j,k \in \{0,1\}} p_{ijk} = 1\),
  • \(p_{000}\, p_{101} - p_{001}\, p_{100} = 0\),
  • \(p_{010}\, p_{111} - p_{011}\, p_{110} = 0\).

Therefore, the Markov chain model forms a semialgebraic set—the solution set of a system of polynomial equations and inequalities.
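The characterization can be sanity-checked numerically: any distribution built from an initial law for \(X_1\) and a transition matrix lies in the semialgebraic set above. A minimal sketch, with an arbitrarily chosen (hypothetical) initial distribution and transition matrix:

```python
import itertools

# A hypothetical binary Markov chain: initial distribution for X1 and one
# transition matrix used for both steps (values chosen arbitrarily).
init = [0.6, 0.4]                     # init[i]     = P(X1 = i)
trans = [[0.7, 0.3], [0.2, 0.8]]      # trans[a][b] = P(X_{t+1} = b | X_t = a)

# Joint probabilities p[(i, j, k)] = P(X1 = i, X2 = j, X3 = k).
p = {(i, j, k): init[i] * trans[i][j] * trans[j][k]
     for i, j, k in itertools.product([0, 1], repeat=3)}

# The point lies in the probability simplex ...
assert all(v >= 0 for v in p.values())
assert abs(sum(p.values()) - 1.0) < 1e-12

# ... and satisfies the two polynomial equations of the model.
eq1 = p[0, 0, 0] * p[1, 0, 1] - p[0, 0, 1] * p[1, 0, 0]
eq2 = p[0, 1, 0] * p[1, 1, 1] - p[0, 1, 1] * p[1, 1, 0]
print(eq1, eq2)  # both numerically zero
```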

Notes and Next Steps:

  • This is an example of a conditional independence model (future lecture)
  • Fitting the model to data: Assume there is a true, unknown distribution \(p\) in the model generating the data. What is \(p\)? (likelihood inference; future lecture)
  • Model assessment: How well does the model fit the data? (Fisher’s exact test; future lecture)

Motivating example 2: Graphical models

What is algebraic statistics?


Probability / statistics

  • Probability distribution
  • Statistical model
  • (Discrete) exponential family
  • Conditional inference
  • Maximum likelihood estimation
  • Model selection
  • Multivariate Gaussian model
  • Phylogenetic model
  • MAP estimates

Algebra/geometry

  • Point
  • (Semi)algebraic set
  • Toric variety / ideal
  • Lattice points in polytopes
  • Polynomial optimization
  • Geometry of singularities
  • Spectrahedral geometry
  • Tensor networks
  • Tropical geometry

Lecture plan

We will now continue with the following topics:

  • Probability Primer (Chapter 2) and
  • Conditional Independence (Chapter 4)

Appendix

The following is a 3-slide “intro” to algebraic geometry; the slides are by S. Sullivant, given at a colloquium long ago. They are meant to give you a glimpse of the vocabulary, not to be digested immediately.

Introduction to algebraic geometry



Example: Hardy-Weinberg Equilibrium
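Since the slides themselves are not reproduced here, the following is a minimal sketch of the classical Hardy-Weinberg model, in the same spirit as the Markov chain example: genotype frequencies \((p_{AA}, p_{Aa}, p_{aa}) = (\theta^2,\, 2\theta(1-\theta),\, (1-\theta)^2)\) lie on the curve \(4\,p_{AA}\,p_{aa} - p_{Aa}^2 = 0\). The allele frequency `theta` below is a hypothetical value chosen for illustration.

```python
# Hardy-Weinberg equilibrium: genotype frequencies from an allele frequency.
theta = 0.3  # hypothetical frequency of allele A

p_AA = theta ** 2
p_Aa = 2 * theta * (1 - theta)
p_aa = (1 - theta) ** 2

# The frequencies form a probability distribution ...
assert abs(p_AA + p_Aa + p_aa - 1.0) < 1e-12

# ... lying on the Hardy-Weinberg curve 4*p_AA*p_aa - p_Aa^2 = 0.
invariant = 4 * p_AA * p_aa - p_Aa ** 2
print(invariant)  # numerically zero
```

Varying `theta` over \([0, 1]\) traces out the whole model, a curve inside the probability simplex: another example of a semialgebraic set.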

License

This document is created for Math/Stat 561, Spring 2023, at Illinois Tech.

While the course materials are generally not to be distributed outside the course without permission of the instructor, all materials posted on this page are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.