algstat561

Course information

Math 561 is a graduate-level course on algebraic and geometric methods in statistics. The course homepage is here; it contains detailed information on the material, grading, etc.

This page keeps track of course notes, material covered, homework schedule, and other relevant links. Unlike the static course homepage, this one is updated weekly.

Week 1 - invitation to algebraic statistics

Homework 1 assigned. Due 1/24/2023 (extended from 1/18/2023).

Week 2 - limiting distributions, why worry about which model fits the data, and intro to conditional independence

(This was a one-lecture week due to MLK holiday.)

Week 3 - the algebra of conditional independence

The plan is to cover the remainder of section 4.1 in the book. We will talk about the algebra / geometry behind conditional independence models. It is important to be familiar with the independence concepts so that you can better understand the lecture.

Day 1: discrete random variables

Day 2: Gaussian random variables

For students, here are questions to verify:

Things to review:
Some of this material, which is still in section 4.1 of the textbook, relies on the normal distribution notation from Chapter 2. (These are standard concepts you have most likely seen in a statistics course before.) The specific results you need to review are stated in the slides at the beginning of the lecture.

Homework 2 assigned. Due 2/13/2023 (deadline moved from 2/10/2023). Here is the source .Rmd file.

Important note regarding code submissions: If you are solving the coding problem, you should submit: clean code that I can run, ensuring that proper documentation is provided (e.g. what does each function/method do?), and a printout of at least one example illustrating how the code works.
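As a rough illustration of the three items above (runnable code, documentation, and a printed example), here is a minimal sketch. It is written in Python purely for illustration; the homework files for this course are .Rmd, so adapt the language as needed. The function name `expected_counts` and the data are hypothetical and are not taken from any homework problem.

```python
def expected_counts(table):
    """Expected cell counts under independence for a two-way table.

    table: list of lists of nonnegative counts (rows x columns).
    Returns a list of lists whose (i, j) entry is
    row_total_i * col_total_j / n, where n is the grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Example illustrating how the code works (made-up data).
observed = [[10, 20], [30, 40]]
print(expected_counts(observed))  # prints [[12.0, 18.0], [28.0, 42.0]]
```

The docstring answers "what does this function do?", and the last two lines provide the printout of one example.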

Week 4: Maximum likelihood estimation and exponential families

Day 1: Parametric models (recap), statistics vs. parameters, parameter estimation problem

Day 2: Exponential families

Week 5: Exponential families, and wrap-up on likelihood inference. The sufficiency principle.

Day 1: How to find the vanishing ideal of a discrete exponential family. Log-linear models and binomial ideals.

Day 2: Likelihood inference & outlook: why do we even estimate parameters?

Homework 3 assigned. Due 2/24/2023 hw3.pdf, hw3.Rmd.

Week 6:

Day 1: in-class work. See Campuswire post #39 for announcements.

Note: Homework 2 is due 2/13/2023, extended from Friday week 5.

Please note the following special assignment:

Homework X assigned: hwX.pdf. Attend either of the following two talks: 1. Aida Maraj, Friday 3 March 11:30-12:30, or 2. Felix Almendra Hernandez, Friday 7 April 11:30-12:30. Write a one-page summary of the talk (one page typed in Markdown, for example). Details are in the PDF. If your write-up is satisfactory, this grade can contribute to your “Participation & worksheets: 10%” grade (see grading scale) in the course.

Day 2: Exact testing: motivation

Project topics are to be posted this week based on the input received from students; please stay tuned. Update: pushed to week 7 due to illness.

Week 7: GoF (goodness of fit) for log-linear models

Day 1: exact tests part 2

We worked through the main ingredients of an exact conditional goodness of fit test, starting with the motivation. Here are the main points, which I summarize for your review:

Announcement: Project topics and more information can be found on this page.

Day 2: exact tests part 3

Today’s lecture will finish the last definition in section 9.1 in the book, and continue covering section 9.2 in the book. The book has an example with another data table and code in R, which you are invited to try out. I will focus on explaining the method and the algorithm on the board. For your convenience, here is a quick summary of what we will cover first:

Homework 4 assigned. Update: this has not yet been assigned. It will be assigned 2/28, early in week 8.

Week 8: Bounds on cell entries or design of experiments [tentative]

Project topics selection period closes. All teams should have a topic to work on and develop a plan of action this week.

Day 1: exact test, part 3 continued

We spent a large part of week 7, day 2 on project topics, so today we will finish the topics from slide 7 of the lecture 12 outline PDF.

Homework 4 assigned. Due 3/13/2023 hw4.pdf, hw4.Rmd.

Day 2: how to compute Markov bases?

lecture 14 PDF
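For review, here is a minimal sketch of how a Markov basis is used once it is known. It assumes the simplest case, the independence model on a two-way table, where the basic moves (+1 on cells (i, j) and (k, l), -1 on cells (i, l) and (k, j)) form a Markov basis. The function names are hypothetical and not taken from the lecture. The sketch only performs the random walk on tables with fixed margins; a full exact test would also need a Metropolis-Hastings acceptance step and a test statistic, both omitted here.

```python
import random


def margins(table):
    """Return (row sums, column sums) of a two-way table given as a list of lists."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    return rows, cols


def random_basic_move(table, rng):
    """Apply one random basic move; stay put if a cell would become negative."""
    n_rows, n_cols = len(table), len(table[0])
    i, k = rng.sample(range(n_rows), 2)  # two distinct rows (ordered, so both signs occur)
    j, l = rng.sample(range(n_cols), 2)  # two distinct columns
    new = [row[:] for row in table]
    new[i][j] += 1
    new[k][l] += 1
    new[i][l] -= 1
    new[k][j] -= 1
    if new[i][l] < 0 or new[k][j] < 0:
        return table  # the move leaves the fiber, so reject it
    return new


def walk(table, steps, seed=0):
    """Run a random walk of the given length on tables with the same margins."""
    rng = random.Random(seed)
    current = [row[:] for row in table]
    for _ in range(steps):
        current = random_basic_move(current, rng)
    return current


example = [[3, 1, 2], [0, 4, 1]]  # made-up 2x3 table
end = walk(example, steps=200)
print(end, margins(end) == margins(example))
```

Every table visited has nonnegative entries and the same row and column sums as the starting table, which is exactly the property a Markov basis move must preserve.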

Week 9:

Day 1: Cell bounds on contingency tables

lecture 15 PDF

Day 2: Hidden variables

The lecture will be on the board and follow these notes, summarizing our book’s Chapter 14.

Week 10: Graphical models

Day 1: intro to graphical models

Lecture by Miles Bakenhus. Notes forthcoming.

Day 2: continuation.

Week 11: Graphical models [tentative]

Lecture 18 PDF.

Lecture 19 PDF.

Homework 5 assigned. Due 4/14/2023 hw5.pdf, hw5.Rmd.

Week 12:

Day 1: MLE for graphical models

Lecture 20 PDF

Day 2: Project check-ins.

Status updates and help on projects. In person, during class time, in office RE208.

See post #82 on Campuswire: Here are the steps to complete the project assignment:

  1. sign up your team for a day to present (see below for dates);
  2. prepare the slides (or notes to write on board if that’s what you prefer) for your presentation day;
  3. prepare the project report draft, and either: (a) post your project report draft here on Campuswire so that others can comment (anonymously) about it; or (b) print at least 6 copies so you can get feedback from other students in the class on the day of your presentation. Deadline for posting/sharing drafts is your presentation day.
  4. after your presentation, take the feedback you get from others to improve your final project report as necessary. Final report is due on Friday, May 5th, at 5pm. You are more than welcome to submit your final project report to me before this deadline, as soon as you are done.

Week 13: Hidden variables

Day 1: hidden variable graphical models

Lecture 21 will be written on the white board.

Material: section 14.2 from the textbook; it revolves around a running example of relating hidden variables to instrumental variables and mixed graphs.

Day 2: hidden variables, identifiability, and model usefulness

Lecture 22 will also be written on the white board.

Material: section 14.2, then continuing into a selection of examples and results from Chapter 16.

Homework 6 assigned. Due 4/27/2023 hw6.pdf.

Week 14: Project focus and hw6 work

Day 1: project check-ins

Monday, April 17, will again be used for project check-ins, this time over Zoom. The Meeting ID will be sent separately, and each team will sign up for 10 minutes (similar to this week's in-person check-ins). You can use the time to share a draft of your slides or report, raise any issues that have come up, etc.

Day 2: Final notes on identifiability.

Week 15: Overflow & wrap-up; final presentations, part 1

Day 1: on to causal discovery with graphical models!

Since this lecture needs to be virtual, we will make it asynchronous. In lieu of the live lecture, please watch this seminar lecture, which is aimed at a student audience and is 1 hour long. It is a nice wrap-up on causal discovery, relating directly to and building on our most recent lectures on graphical models and conditional independence.

There is both a YouTube video link and slides that you can look at.

Day 2: project presentations on Wednesday, April 26, 11:25-12:40pm

Confirmed team presentations

Finals week: final presentations, part 2

Thursday, May 4th, 2-3 pm, project presentations confirmed

Finals week: Thursday, May 4, 2-4pm, project presentations for 2-3 projects. Important: if nobody wants to present during finals week, then we need at least 2 projects to be presented on Wednesday, April 19, 11:25-12:40. If there are teams willing to do this, please just claim the date in the comments!


Useful Info

Sample files: how to create your first .Rmd file! All of your HW can be submitted using markdown and html/pdf. Here are some templates I created for another course, just so you know what to expect:

Types of files