---
title: "SML201 Precept 7 Solutions, Spring 2020"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
```
### Problem 3
```{r}
titanic <- read.csv("http://guerzhoy.princeton.edu/201s20/titanic.csv")
```
*Learning goals*
* Understand calibration and FPR parity
* Be able to implement checks for calibration and quantities such as FPR/PPV/etc. parity
#### False Positive Parity
```{r}
get.FPR <- function(fit, data){
dat.neg <- data %>% filter(Survived == 0)
pred <- predict(fit, newdata = dat.neg, type = "response") > 0.5
mean(pred == 1)
}
fit <- glm(Survived ~ Sex + Age + Pclass, family = binomial, data = titanic)
FPR.m <- get.FPR(fit, titanic %>% filter(Sex == "male"))
FPR.f <- get.FPR(fit, titanic %>% filter(Sex == "female"))
data.frame(FPR.m = FPR.m, FPR.f = FPR.f)
```
The false positive rates vary quite dramatically between female and male passengers. There is no false positive parity.
#### Calibration
```{r}
get.PPV <- function(fit, data){
data$pred <- predict(fit, newdata = data, type = "response") > 0.5
data.predpos <- data %>% filter(pred == 1)
mean(data.predpos$Survived == 1)
}
get.NPV <- function(fit, data){
data$pred <- predict(fit, newdata = data, type = "response") > 0.5
data.predneg <- data %>% filter(pred == 0)
mean(data.predneg$Survived == 0)
}
fit <- glm(Survived ~ Sex + Age + Pclass, family = binomial, data = titanic)
PPV.m <- get.PPV(fit, titanic %>% filter(Sex == "male"))
PPV.f <- get.PPV(fit, titanic %>% filter(Sex == "female"))
NPV.m <- get.NPV(fit, titanic %>% filter(Sex == "male"))
NPV.f <- get.NPV(fit, titanic %>% filter(Sex == "female"))
data.frame(sex = c("m", "f"), PPV = c(PPV.m, PPV.f), NPV = c(NPV.m, NPV.f))
```
Calibration is also not satisfied.
#### Extra material
We can try to see whether calibration is approximately satisfied by looking at the "decile scores" that we can obtain for the Titanic data:
```{r}
fit <- glm(Survived ~ Sex + Age + Pclass, family = binomial, data = titanic)
titanic$decile <- round(predict(fit, newdata = titanic, type = "response")*10)
titanic.surv <- titanic %>% group_by(decile, Sex) %>%
summarize(Surv.Rate = mean(Survived), Surv.Total = sum(Survived), n = n()) %>%
rowwise() %>%
mutate(surv.min = binom.test(Surv.Total, n, p = Surv.Rate)$conf.int[1],
surv.max = binom.test(Surv.Total, n, p = Surv.Rate)$conf.int[2]) %>%
ungroup() %>%
group_by(decile) %>%
filter(n() > 1)
ggplot(titanic.surv, mapping = aes(x = decile, y = Surv.Rate, fill = Sex)) +
geom_bar(stat = "identity", position = "dodge") +
geom_errorbar(mapping = aes(ymin = surv.min, ymax = surv.max), position = "dodge") +
scale_x_continuous(breaks = c(0:10))
```
We'd expect equal survival probabilities for the same deciles if there is calibration. As you can see, that is not exactly the case; furthermore, there are no male passangers at all in the top three deciles.
We can try again, without including sex as a predictor:
```{r}
fit <- glm(Survived ~ Sex + Age + Pclass, family = binomial, data = titanic)
titanic$decile <- round(predict(fit, newdata = titanic, type = "response")*10)
titanic.surv <- titanic %>% group_by(decile, Sex) %>%
summarize(Surv.Rate = mean(Survived), Surv.Total = sum(Survived), n = n()) %>%
rowwise() %>%
mutate(surv.min = binom.test(Surv.Total, n, p = Surv.Rate)$conf.int[1],
surv.max = binom.test(Surv.Total, n, p = Surv.Rate)$conf.int[2]) %>%
ungroup() %>%
group_by(decile) %>%
filter(n() > 1)
ggplot(titanic.surv, mapping = aes(x = decile, y = Surv.Rate, fill = Sex)) +
geom_bar(stat = "identity", position = "dodge") +
geom_errorbar(mapping = aes(ymin = surv.min, ymax = surv.max), position = "dodge") +
scale_x_continuous(breaks = c(0:10))
```
We can't reject the idea that there is calibration here as well. But since sex is a predictor of survival, we'd expect that there would be no calibration.