---
title: "Precept 6 Problem Set"
output:
  html_document:
    df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Problem 1: `replicate`
Recall that you can use `replicate` to run the same experiment repeatedly.
```{r}
res <- replicate(10, sample(c(1, 2, 3, 4)))
res
```
Here, we ran `sample(c(1, 2, 3, 4))` 10 times. Each column holds the result of one experiment. Usually, a single experiment produces just one number. Here is an example:
```{r}
replicate(10, mean(sample(c(1, 2, 3, 4), size = 2)))
```
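When each experiment returns more than one number, `replicate` binds the results into a matrix with one column per run, as in the first example. The sketch below illustrates this with a made-up helper, `two_stats` (a hypothetical name, not part of the course code), that returns a named vector of length 2.

```{r}
# Hypothetical experiment that returns two numbers instead of one.
two_stats <- function(n) {
  x <- sample(c(1, 2, 3, 4), size = n, replace = TRUE)
  c(mean = mean(x), max = max(x))
}

set.seed(1)  # arbitrary seed, only so that the sketch is reproducible
res2 <- replicate(5, two_stats(3))
dim(res2)       # 2 rows (one per returned value), 5 columns (one per run)
rownames(res2)  # the names of the returned vector: "mean" and "max"
```

Transposing with `t(res2)` gives one row per run, which is usually the more convenient shape for plotting.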
Repeatedly sample a training set of size 25 from `titanic`, and create two histograms: one for the performances (i.e., CCRs) on the training set, and one for the performances (i.e., CCRs) on the test set. You should use `ggplot`'s `geom_histogram` geom.
#### Hints and suggestions
The [Precept 5 solutions](http://guerzhoy.mycpanel.princeton.edu/201s19/pre/P5/p5_soln.html) (see the linked [videos](http://guerzhoy.mycpanel.princeton.edu/201s19/index.html#assignments) as well if you like) should be helpful, as should [the code](http://guerzhoy.mycpanel.princeton.edu/201s19/lec/W07/lec1/probintro.R) from the Tuesday lecture.
Here is a suggestion for how to proceed:
* First, write a function that samples a small training set, fits a model on it, and computes the performance on the small training set as well as on the validation set.
* Second, use something like `replicate(1000, func(arg1, arg2))` to get a matrix like the one in the Precept 5 solutions.
* Observe that you get the same kind of result as in Precept 5.
* Make histograms (rather than curves, as in Precept 5).
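The steps above can be sketched roughly as follows. This is only an illustration on simulated data, not the Precept 5 code: the data frame `sim`, its columns `x` and `y`, the logistic regression model, the 0.5 threshold, and the helpers `ccr` and `one_run` are all assumptions made for the example, and the rows left out of the training sample stand in for the held-out set. It also assumes that `ggplot2` is installed.

```{r}
library(ggplot2)

set.seed(201)  # arbitrary seed, only so that the sketch is reproducible

# Simulated stand-in data; `x` and `y` are hypothetical column names.
n <- 500
sim <- data.frame(x = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(1.5 * sim$x))

# Correct classification rate of a fitted model on a data frame.
ccr <- function(fit, dat) {
  pred <- predict(fit, newdata = dat, type = "response") > 0.5
  mean(pred == (dat$y == 1))
}

# One experiment: sample a small training set, fit a model on it,
# and score the model on the training rows and on the remaining rows.
one_run <- function(dat, n_train) {
  idx <- sample(nrow(dat), n_train)
  train <- dat[idx, ]
  rest <- dat[-idx, ]
  # Tiny training sets can trigger separation warnings; silence them here.
  fit <- suppressWarnings(glm(y ~ x, family = binomial, data = train))
  c(train = ccr(fit, train), test = ccr(fit, rest))
}

runs <- replicate(200, one_run(sim, 25))  # matrix: 2 rows, 200 columns
perf <- as.data.frame(t(runs))            # one row per run

ggplot(perf, aes(x = train)) + geom_histogram(bins = 20)
ggplot(perf, aes(x = test)) + geom_histogram(bins = 20)
```

Returning a named vector from `one_run` means the transposed matrix already has `train` and `test` columns, so each histogram needs only one `aes` mapping.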
### Problem 2
Work on your project. If your project partner is not the same as your precept partner, please work separately.