Suppose that \(X\sim\mathcal{N}(2, 10^2)\). We sample the variable \(X\) once (i.e., we obtain a sample from the distribution \(\mathcal{N}(2, 10^2)\)).

Write R code to obtain \(P(2.1 < X < 3.1)\). Use `pnorm`.

`pnorm(q = 3.1, mean = 2, sd = 10) - pnorm(q = 2.1, mean = 2, sd = 10)`

`## [1] 0.03980596`

*Learning goal*: compute probabilities of intervals

Write R code to obtain \(P(2.1 < X < 3.1)\). Use `pnorm(..., mean = 0, sd = 1)`.

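The idea here is that we can “shift” \(X\) by subtracting its mean, 2, and “shrink” it by dividing by its standard deviation, 10, so that \((X-2)/10 \sim \mathcal{N}(0, 1)\). The interval bounds 2.1 and 3.1 are transformed in the same way.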

`pnorm(q = (3.1-2)/10, mean = 0, sd = 1) - pnorm(q = (2.1-2)/10, mean = 0, sd = 1)`

`## [1] 0.03980596`

*Learning goal*: transform normal random variables to be \(\mathcal{N}(0, 1)\)
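As a quick sanity check, the sketch below standardizes the interval bounds and confirms that both approaches give the same probability. The variable names (`mu`, `sigma`, `z_lower`, `z_upper`, and so on) are chosen here for illustration and are not part of the original solution.

```r
# Parameters of X ~ N(2, 10^2) and the interval of interest
mu <- 2
sigma <- 10
lower <- 2.1
upper <- 3.1

# Standardize the bounds: z = (x - mu) / sigma
z_lower <- (lower - mu) / sigma
z_upper <- (upper - mu) / sigma

# Probability on the original scale
p_original <- pnorm(q = upper, mean = mu, sd = sigma) -
  pnorm(q = lower, mean = mu, sd = sigma)

# Probability on the standard normal scale
p_standard <- pnorm(q = z_upper, mean = 0, sd = 1) -
  pnorm(q = z_lower, mean = 0, sd = 1)

# Compare the two results up to numerical tolerance
all.equal(p_original, p_standard)
```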

Write R code to obtain \(P(2.1 < X < 3.1)\). Use `rnorm`. (And not `pnorm`.)

```
x <- rnorm(n = 100000, mean = 2, sd = 10)
mean((2.1 < x) & (x < 3.1))
```

`## [1] 0.03982`

*Learning goal*: compute probabilities via simulation. Understand the connection between samples from a distribution and the cumulative distribution function.

Write R code to obtain \(P(2.1 < X < 3.1)\). Use `rnorm(..., mean = 0, sd = 1)`
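One possible approach, sketched here by analogy with the other simulation answers on this page, is to draw standard normal samples and then undo the standardization by scaling by 10 and shifting by 2. The names `z` and `x` and the `set.seed` call are assumptions added for reproducibility and illustration.

```r
# Fix the seed so the simulation is reproducible (an addition for illustration)
set.seed(1)

# Draw from the standard normal distribution only
z <- rnorm(n = 100000, mean = 0, sd = 1)

# If Z ~ N(0, 1), then 2 + 10*Z ~ N(2, 10^2)
x <- 2 + 10 * z

# Fraction of simulated values that fall inside the interval
mean((2.1 < x) & (x < 3.1))
```

The estimate varies from run to run, but with 100,000 samples it should land close to the `pnorm` answer above.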

Suppose 65% of Princeton students like Wawa better than World Coffee. We selected a random sample of 100 students, and asked them which they prefer. What is the probability that more than 78 students said “Wawa”?

Answer the question using `pbinom`.

*Learning goal*: map a word problem to a cumulative probability computation, use the normal approximation to the binomial distribution.

`1 - pbinom(q = 78, size = 100, prob = .65)`

`## [1] 0.001686446`

Another option is to use the `lower.tail` argument, but that is not preferred right now:

`pbinom(q = 78, size = 100, prob = .65, lower.tail = FALSE)`

`## [1] 0.001686446`

*Learning goal*: map a word problem to a cumulative probability computation.
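To see why `q = 78` is the right cutoff for "more than 78", the following sketch adds up the binomial probabilities of the outcomes 79 through 100 with `dbinom` and compares the total with the `pbinom` answer. The names `p_cdf` and `p_pmf` are illustrative.

```r
# P(X > 78) via the cumulative distribution function: 1 - P(X <= 78)
p_cdf <- 1 - pbinom(q = 78, size = 100, prob = .65)

# The same probability as a direct sum over the outcomes 79, 80, ..., 100
p_pmf <- sum(dbinom(x = 79:100, size = 100, prob = .65))

# The two computations agree up to numerical tolerance
all.equal(p_cdf, p_pmf)
```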

Answer the question using `pnorm`. Use the normal approximation to the Binomial distribution (recall: the mean is \(n\times prob\) and the variance is \(n\times prob\times (1-prob)\)).

`1 - pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100))`

`## [1] 0.003209814`

(Note: we are not requiring a continuity correction. To match the answer to 2(b), we’d need `q = 78.9`.)

Another option (dispreferred):

`pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100), lower.tail = FALSE)`

`## [1] 0.003209814`

*Learning goal*: use the normal approximation to the binomial. Recognize the consequences of not using continuity correction.
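To illustrate the effect of a continuity correction, the sketch below compares the exact binomial probability with the normal approximation evaluated at `q = 78` and at `q = 78.5`. Using 78.5, the midpoint between 78 and 79, is the textbook form of the correction and is an assumption here, as are the variable names.

```r
n <- 100
p <- .65
mu <- n * p                      # mean of the binomial
sigma <- sqrt(n * p * (1 - p))   # standard deviation of the binomial

# Exact binomial probability of more than 78 successes
exact <- 1 - pbinom(q = 78, size = n, prob = p)

# Normal approximation without a continuity correction
approx_plain <- 1 - pnorm(q = 78, mean = mu, sd = sigma)

# Normal approximation with the cutoff moved to 78.5
approx_corrected <- 1 - pnorm(q = 78.5, mean = mu, sd = sigma)

# Show the three values side by side
c(exact = exact, plain = approx_plain, corrected = approx_corrected)
```

In this example the corrected approximation lands closer to the exact value than the uncorrected one, although it still does not match it exactly.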

Suppose 100 Princeton students were asked whether Harvard or Stanford is the worse online institution of higher learning. 60 students said that Stanford is worse. Compute the p-value for the null hypothesis that Princeton students think that Harvard and Stanford are equally bad, on average. What can you conclude?

The null hypothesis here is that \(P(\text{Stanford}) = 0.5\).

The p-value here is \(P(n_{\text{Stanford}} \geq 60 \text{ or } n_{\text{Stanford}} \leq 40)\). We can compute that using

`pbinom(q = 40, size = 100, prob = 0.5) + (1 - pbinom(q = 59, size = 100, prob = 0.5))`

`## [1] 0.05688793`

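Under the null hypothesis, we would see a value at least as extreme as what we’re seeing about 5.7% of the time. This is slightly above the conventional 5% significance level, so we would not reject the null hypothesis at that level: the data we have are consistent with Princeton students thinking that Harvard and Stanford are equally bad online institutions of higher learning.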
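As a cross-check, base R's `binom.test` computes an exact two-sided binomial p-value, which for a symmetric null of 0.5 should agree with the manual sum of the two tails. This is a sketch of one way to verify the computation, not part of the required answer; the names `p_manual` and `p_test` are illustrative.

```r
# Manual two-sided p-value: P(N <= 40) + P(N >= 60) under prob = 0.5
p_manual <- pbinom(q = 40, size = 100, prob = 0.5) +
  (1 - pbinom(q = 59, size = 100, prob = 0.5))

# Exact binomial test for 60 successes out of 100 trials
p_test <- binom.test(x = 60, n = 100, p = 0.5,
                     alternative = "two.sided")$p.value

# The two p-values agree up to numerical tolerance
all.equal(p_manual, p_test)
```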

Answer Problem 2 using only `rnorm(..., mean = 0, sd = 1)`

```
x <- rnorm(n = 100000, mean = 0, sd = 1)
# Now, 65 + x*sqrt(.65*.35*100) ~ N(65, sqrt(.65*.35*100)^2)
y <- 65 + x*sqrt(.65*.35*100)
mean(y > 78)
```

`## [1] 0.00317`

*Learning goal*: compute probability via simulation; flexibly apply variable transformation