Some illustrative examples in nonlinear statistics
Lecture plan
In the previous lecture we noted that we will continue now with the following topics:
- Probability Primer (Chapter 2) and
- Conditional Independence (Chapter 4)
Some probability and a bit of stats
Topics:
- discrete probability distributions: basic notation.
- The model of independence.
- Limiting distributions; see Sections 2.3 and 2.5 in the book.
Lecture on board.
Topics covered in lecture 2:
- discrete and continuous random variables
- definition of a statistical model, including the parameter space, and implicit vs. explicit (parametric) descriptions of the same model: two views of the same object.
- the probability simplex, and the dimension of the space we live in (the number of states of a discrete random variable dictates everything, doesn’t it? :) - see the note after this list.
- review of conditional probability, independent events, independent random variables
- Examples: the Bernoulli model, the independence model for two random variables, and the model for the count of heads in two coin tosses
- explicit computation of joint probabilities,
- a peek into implicit descriptions - a lead-in to Homework 1, Questions 2 & 3,
- a hint as to how this might generalize to other (higher-dimensional) models.
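To make the simplex item above concrete: a discrete random variable with \(k\) states corresponds to a point in the probability simplex
\[\Delta_{k-1} = \left\{ (p_1,\dots,p_k) \in \mathbb{R}^k : p_i \ge 0,\ p_1 + \dots + p_k = 1 \right\},\]
which has dimension \(k-1\). For instance, the joint distribution of two binary random variables is a point \((p_{00}, p_{01}, p_{10}, p_{11}) \in \Delta_3\), and the independence model can be described explicitly (parametrically) by \(p_{ij} = \alpha_i \beta_j\), or implicitly inside \(\Delta_3\) by the single equation \(p_{00}p_{11} - p_{01}p_{10} = 0\) - two views of the same object.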
Why do we care about distributions?
Who cares about model fitting and testing whether we have the correct model in the first place?
Why do I have to understand a model?
Simulation of a coin toss
Let \(X_i\) be the random variable recording the outcome of the \(i\)-th coin toss:
- \(X_i = 0\) if we see tails on the \(i\)-th trial (toss),
- \(X_i = 1\) if we see heads on the \(i\)-th trial.
Fix \(n=10000\).
Let \(Y\) be the number of heads, that is, \(Y = \sum_{i=1}^{n} X_i\).
- Is the number of heads supposed to be \(n/2\)? How far off is it? Does it vary? What does this mean?
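A minimal Python sketch of this simulation (not necessarily the code used in lecture; the NumPy generator and the seed are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(seed=561)  # seed chosen arbitrarily, for reproducibility

n = 10_000                       # number of tosses, as fixed above
x = rng.integers(0, 2, size=n)   # X_i is 0 (tails) or 1 (heads), fair coin
y = x.sum()                      # Y = number of heads

print(f"Y = {y} heads in n = {n} tosses (n/2 = {n // 2})")
```

Running this repeatedly (with different seeds) gives different values of \(Y\) near \(n/2\), which is exactly the variability the questions above are pointing at.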
The sampling distribution of \(Y\) appears to have a mean around the expected number of heads when a fair coin is tossed, which is about \(n/2\).
The more times we repeat the experiment of \(n\) coin tosses, the closer the average of the observed values of \(Y\) gets to its expected value; this can be measured by looking at both the mean and the variance of \(Y\), as in the table below.
| Means of \(Y\) | Variances of \(Y\) |
|---|---|
| 5007.900 | 2268.008 |
| 4998.540 | 2268.008 |
| 4999.927 | 2468.887 |
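A minimal sketch of the kind of simulation that could produce such a table; the repetition counts below are illustrative assumptions, not the values behind the numbers shown above:

```python
import numpy as np

rng = np.random.default_rng(seed=561)
n = 10_000  # tosses per experiment

# Repeat the n-toss experiment `reps` times, recording Y for each repetition;
# each Y is a Binomial(n, 1/2) draw, so we can sample it directly.
for reps in (100, 1_000, 10_000):
    ys = rng.binomial(n, 0.5, size=reps)
    print(f"reps = {reps:>6}: mean(Y) = {ys.mean():.3f}, var(Y) = {ys.var(ddof=1):.3f}")
```

Since \(Y \sim \text{Binomial}(n, 1/2)\) for a fair coin, \(E(Y) = n/2 = 5000\) and \(\operatorname{Var}(Y) = n/4 = 2500\); the simulated means and variances above are estimates of these values.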
Question:
Is it possible that something similar to this always happens?
- As we will see, the sampling distribution of \(Y\) is approximately normal with mean equal to the expected value of \(Y\), namely \(n/2\).
- In other words, the example above illustrates a known result: the Central Limit Theorem.
- You should already be familiar with it from your probability class.
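Specialized to our fair-coin setup, the statement reads
\[\frac{Y - n/2}{\sqrt{n/4}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty,\]
so for large \(n\) the distribution of \(Y\) is approximately \(N(n/2,\, n/4)\), consistent with the simulated means near \(5000\) and variances near \(2500\) above.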
Importance of sampling distributions
- they carry information about the model behind the data;
- they give a glimpse into how the data were generated.
Example
| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| -2.583 | 1.067 | 10.566 | 21.754 | 39.824 | 42.816 |
Hmm…
- Is it strange to see “two bumps” in the histogram instead of one, as usual?
- Maybe the sample size is too small and we need to simulate more data?
Aha moment!
What do you see?
- These data are not drawn from anything like a normal distribution.
- Consequently, knowing only the mean and the variance is not enough to understand the data, that is, the data-generating mechanism behind it.
… Wait, what was that?!
This was an example of a mixture of normal distributions. We will see mixture distributions again soon in the course. (See also the link on the “License” page at the end of these slides, which includes source information.)
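For a sense of how a “two-bump” sample like the one summarized above can arise, here is a minimal Python sketch of sampling from a two-component normal mixture; the weights, component means, and standard deviations are illustrative assumptions, not the parameters behind the example:

```python
import numpy as np

rng = np.random.default_rng(seed=561)
size = 500

# Pick a component (0 or 1) for each observation with equal probability,
# then draw from that component's normal distribution.
component = rng.integers(0, 2, size=size)
means = np.array([5.0, 35.0])   # assumed component means
sds = np.array([4.0, 3.0])      # assumed component standard deviations
sample = rng.normal(means[component], sds[component])

# A histogram of `sample` shows two bumps, one per component;
# its mean and variance alone would hide this structure.
print(np.quantile(sample, [0.0, 0.25, 0.5, 0.75, 1.0]))
```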
License
This document was created for Math/Stat 561, Spring 2023, at Illinois Tech.
While the course materials are generally not to be distributed outside the course without permission of the instructor, all materials posted on this page are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.