---
title: "Precept 6 Problem Set"
output:
  html_document:
    df_print: paged
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

### Problem 1:  `replicate`

Recall that you can use `replicate` to run the same experiment repeatedly.

```{r}
res <- replicate(10, sample(c(1, 2, 3, 4)))
res
```


Here, we ran `sample(c(1, 2, 3, 4))` 10 times. Each column of `res` holds the result of one experiment. Usually, though, a single experiment produces just a single number. Here is an example:

```{r}
replicate(10, mean(sample(c(1, 2, 3, 4), size = 2)))

```
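An experiment can also return several numbers at once. The sketch below uses a hypothetical helper, `two_numbers` (not part of the precept code), to illustrate that `replicate` then returns a matrix with one row per returned value and one column per run.

```{r}
# Hypothetical experiment that returns two numbers instead of one
two_numbers <- function() {
  s <- sample(c(1, 2, 3, 4), size = 2)
  c(smallest = min(s), largest = max(s))
}

res2 <- replicate(10, two_numbers())
dim(res2)           # rows = values returned per run, columns = runs
res2["largest", ]   # the 10 values of one of the two quantities
```

Naming the elements of the returned vector gives the matrix row names, which makes it easier to pull out one quantity across all runs.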


Repeatedly sample a training set of size 25 from `titanic`, and create two histograms: one for the performances (i.e., CCRs) on the training set, and one for the performances (i.e., CCRs) on the test set. You should use `ggplot`'s `geom_histogram` geom.


#### Hints and suggestions

The [Precept 5 solutions](http://guerzhoy.mycpanel.princeton.edu/201s19/pre/P5/p5_soln.html) (see the linked [videos](http://guerzhoy.mycpanel.princeton.edu/201s19/index.html#assignments) as well if you like) should be helpful, as should [the code](http://guerzhoy.mycpanel.princeton.edu/201s19/lec/W07/lec1/probintro.R) from the Tuesday lecture.

Here is a suggestion for how to proceed:

* First, write a function that samples a small training set, fits a model on it, and computes the performance on the small training set as well as the validation set.

* Second, use something like `replicate(1000, func(arg1, arg2))` to get a matrix like the one in the Precept 5 solutions.

* Observe that you get the same kind of result as we got in Precept 5.

* Make histograms (rather than curves, as in Precept 5)
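
The last two steps can be sketched on made-up data. This is only an illustration of the mechanics, not a solution: `toy_experiment` is a hypothetical stand-in that returns two random proportions, and it does not use `titanic` or fit any model. The sketch assumes the `ggplot2` package is installed.

```{r}
library(ggplot2)

set.seed(1)  # make the toy example reproducible

# Hypothetical stand-in for one experiment: returns two numbers per run.
# In the problem, these would be the CCRs your own function computes.
toy_experiment <- function(n) {
  c(train = mean(rnorm(n) > 0), test = mean(rnorm(100) > 0))
}

res_toy <- replicate(1000, toy_experiment(25))  # 2 x 1000 matrix

# Transpose so that each run is a row and each quantity is a column
res_df <- as.data.frame(t(res_toy))

ggplot(res_df, aes(x = train)) +
  geom_histogram(bins = 20)
```

Transposing before converting to a data frame is one way to get a column per quantity, which is the shape `ggplot` expects; the second histogram would use `aes(x = test)`.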

### Problem 2

Work on your project. If your project partner is not the same as your precept partner, please work separately.