### Problem 1

In this problem, you will reproduce Figure 1 from Yarkoni's Generalizability Crisis paper. You will use the `lme4` package.

#### Problem 1(a)

Generate “fake” data using the naive model:

$$y_{ij} = \beta_0 + \beta_1 X_{ij} + e_{ij}, \qquad e_{ij} \sim \mathcal{N}(0, \sigma_{e}^{2})$$

Use simulation to show that, for a reasonable sample size, you will usually not reject the null hypothesis if it is true, and you usually will reject the null hypothesis if it is false.

Decide on a reasonable effect size.

(Suggestion: write a function that generates a dataset, runs the regression, and computes the p-value. Then use `replicate` to run the experiment multiple times.)
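One way the suggestion above could be structured is sketched below. This is only a starting point, not a reference solution: the helper name `simulate_p_value`, the two-level predictor, the sample size (20 subjects with 10 observations each), the effect size of 0.5, and the 0.05 threshold are all assumptions chosen for illustration.

```r
# Hypothetical helper: simulate one dataset from the naive model and
# return the p-value for the slope. All default values are assumptions.
simulate_p_value <- function(n_subjects = 20, n_obs = 10,
                             beta0 = 0, beta1 = 0.5, sigma_e = 1) {
  n <- n_subjects * n_obs
  x <- rep(c(0, 1), length.out = n)          # assumed two-level predictor
  y <- beta0 + beta1 * x + rnorm(n, mean = 0, sd = sigma_e)
  fit <- lm(y ~ x)
  summary(fit)$coefficients["x", "Pr(>|t|)"]
}

set.seed(1)
n_sims <- 1000
p_null <- replicate(n_sims, simulate_p_value(beta1 = 0))    # null is true
p_alt  <- replicate(n_sims, simulate_p_value(beta1 = 0.5))  # null is false

reject_rate_null <- mean(p_null < 0.05)  # estimated false-positive rate
reject_rate_alt  <- mean(p_alt < 0.05)   # estimated power
c(null = reject_rate_null, alt = reject_rate_alt)
```

Changing `n_subjects`, `n_obs`, or `beta1` shows how the two rejection rates depend on sample size and effect size.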

#### Problem 1(b)

Generate data as in 1(a), but now model the subject effects as random:

$$y_{ij} = \beta_0 + \beta_1 X_{ij} + u_{0i} + u_{1i} X_{ij} + e_{ij}$$

$$u_{0i} \sim \mathcal{N}(0, \sigma_{u_0}^2), \qquad u_{1i} \sim \mathcal{N}(0, \sigma_{u_1}^2), \qquad e_{ij} \sim \mathcal{N}(0, \sigma_{e}^{2})$$

What is the effect on the probability of rejecting a true null hypothesis and on the probability of not rejecting a false null hypothesis, compared to what you saw in 1(a)?

Under what circumstances would it be inappropriate to use the random-effects model for data that looks like it was generated using the procedure in 1(a)? Under what circumstances would it be appropriate?
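As a rough illustration of the setup in 1(b), the sketch below generates data from the naive model with a subject label attached, then fits both the naive regression and a random-effects model with `lmer`. The helper name `simulate_naive`, the design, and the parameter values are assumptions. Note also that `(1 + x | subject)` estimates a correlation between the random intercept and slope, which the model as written above does not include.

```r
library(lme4)

# Hypothetical helper; design and parameter values are illustrative assumptions.
simulate_naive <- function(n_subjects = 20, n_obs = 10,
                           beta0 = 0, beta1 = 0.5, sigma_e = 1) {
  subject <- factor(rep(seq_len(n_subjects), each = n_obs))
  x <- rep(rep(c(0, 1), length.out = n_obs), times = n_subjects)
  y <- beta0 + beta1 * x + rnorm(length(x), 0, sigma_e)  # no subject effects
  data.frame(subject = subject, x = x, y = y)
}

set.seed(1)
d <- simulate_naive()

naive_fit <- lm(y ~ x, data = d)
# Random intercept and random slope per subject. Because the simulated data
# contain no subject effects, lme4 may report a singular fit here.
mixed_fit <- lmer(y ~ x + (1 + x | subject), data = d)

# Slope estimate and standard error under each model
slope_naive <- summary(naive_fit)$coefficients["x", c("Estimate", "Std. Error")]
slope_mixed <- summary(mixed_fit)$coefficients["x", c("Estimate", "Std. Error")]
rbind(naive = slope_naive, mixed = slope_mixed)
```

Wrapping the fitting step in a function and calling it through `replicate`, as in 1(a), gives rejection rates that can be compared across the two models.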

#### Problem 1(c)

Now generate the data from the random-effects model in 1(b). Produce a visualization similar to Figure 1 in the Yarkoni paper. (Note that Yarkoni displays HPD (highest posterior density) intervals rather than confidence intervals (CIs); you should visualize CIs.)
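A minimal sketch of one possible approach follows, assuming a two-level predictor, 20 subjects with 10 observations each, and arbitrary variance values. It draws data from the random-effects model, fits a naive and a mixed model, and plots 95% intervals for the slope with base graphics. The Wald interval for the `lmer` fit is one choice among several that `confint` offers; the figure in the paper compares more model specifications than the two shown here.

```r
library(lme4)

set.seed(2)
n_subjects <- 20; n_obs <- 10            # assumed design
subject <- rep(seq_len(n_subjects), each = n_obs)
x <- rep(rep(c(0, 1), length.out = n_obs), times = n_subjects)
u0 <- rnorm(n_subjects, 0, 1)            # random intercepts, assumed sigma_u0 = 1
u1 <- rnorm(n_subjects, 0, 0.5)          # random slopes, assumed sigma_u1 = 0.5
y <- 0.5 * x + u0[subject] + u1[subject] * x + rnorm(length(x), 0, 1)
d <- data.frame(subject = factor(subject), x = x, y = y)

naive_fit <- lm(y ~ x, data = d)
mixed_fit <- lmer(y ~ x + (1 + x | subject), data = d)

# 95% intervals for the slope: t-based for lm, Wald for lmer
ci_naive <- confint(naive_fit, parm = "x")[1, ]
ci_mixed <- confint(mixed_fit, parm = "beta_", method = "Wald")["x", ]
ci  <- rbind(naive = ci_naive, mixed = ci_mixed)
est <- c(naive = unname(coef(naive_fit)["x"]),
         mixed = unname(fixef(mixed_fit)["x"]))

# One row per model: point estimate with its interval
plot(est, 1:2, xlim = range(c(ci, 0)), ylim = c(0.5, 2.5), pch = 19,
     yaxt = "n", xlab = "Estimated slope (beta1)", ylab = "")
segments(ci[, 1], 1:2, ci[, 2], 1:2)
axis(2, at = 1:2, labels = rownames(ci), las = 1)
abline(v = 0, lty = 2)                   # reference line at no effect
```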

### Problem 2

In this problem, you will reanalyze Table 3 from a recent Proceedings of the National Academy of Sciences paper. Regenerate a data set with the same summary statistics as what you see in Table 3.

Note that the paper reports that the data were obtained from 149 students. Make reasonable assumptions about anything else that is unspecified. Feeling of Learning (FOL) and Test of Learning (TOL) are reported as normalized (i.e., their overall distribution is $\mathcal{N}(0, 1)$).
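The normalization step might be sketched as follows. This only shows how to force a simulated variable to have sample mean 0 and sample standard deviation 1; it does not reproduce the coefficients in Table 3. The helper `normalize`, the independent draws, and the one-row-per-student layout are assumptions.

```r
set.seed(3)
n_students <- 149                        # from the problem statement

# Rescale so the sample mean is exactly 0 and the sample SD is exactly 1.
normalize <- function(v) (v - mean(v)) / sd(v)

fol <- normalize(rnorm(n_students))      # placeholder for Feeling of Learning
tol <- normalize(rnorm(n_students))      # placeholder for Test of Learning
c(mean = mean(fol), sd = sd(fol))
```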

Re-analyze your generated data using linear regression to confirm that you get about the same coefficients and comparable p-values. Now, re-analyze the data using a random-effects model, assuming that the effect on the student TOL and FOL is random. (Note that this analysis is conservative, since we are generating the data from a fixed-effect model).