replicate
Recall that you can use replicate to repeatedly run the same experiment:
res <- replicate(10, sample(c(1, 2, 3, 4)))
res
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    3    2    1    4    3    2    1    1    4     4
## [2,]    1    4    3    1    2    4    3    4    3     1
## [3,]    2    1    4    3    1    1    4    3    1     3
## [4,]    4    3    2    2    4    3    2    2    2     2
Here, we ran sample(c(1, 2, 3, 4)) 10 times. Each column holds the result of one experiment. Often an experiment produces just a single number, in which case replicate returns a vector rather than a matrix. Here is an example:
replicate(10, mean(sample(c(1, 2, 3, 4), size = 2)))
## [1] 1.5 2.5 1.5 2.0 2.5 1.5 2.5 2.0 1.5 1.5
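When each experiment returns more than one number, replicate stacks the results as columns of a matrix, one column per run. The following is a minimal sketch of that behavior; the helper name one_experiment is hypothetical and is used only for illustration.

```r
# Hypothetical helper: one "experiment" that returns two named numbers.
one_experiment <- function() {
  x <- sample(c(1, 2, 3, 4), size = 2)
  c(mean = mean(x), max = max(x))
}

set.seed(1)  # make the random draws reproducible
res2 <- replicate(10, one_experiment())

dim(res2)       # 2 rows (one per returned value), 10 columns (one per run)
res2["mean", ]  # the 10 means, as a plain numeric vector
```

Naming the returned values gives the matrix row names, so each quantity can be pulled out by name rather than by position.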
Repeatedly sample a training set of size 25 from titanic, and create two histograms: one for the performances (i.e., CCRs) on the training set, and one for the performances (i.e., CCRs) on the test set. You should use ggplot’s geom_histogram geom.
The Precept 5 solutions (see the linked videos as well if you like) should be helpful, as should the code from the Tuesday lecture.
Here is a suggestion for how to proceed:
First, write a function that samples a small training set, fits a model on it, and computes the performance on the small training set as well as the validation set.
Second, use something like replicate(1000, func(arg1, arg2)) to get a matrix like the one in the Precept 5 solutions.
Observe that you get the same kind of result as what we got in Precept 5.
Make histograms (rather than curves, as in Precept 5).
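The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Precept 5 solution: it uses synthetic stand-in data instead of titanic, a one-predictor logistic regression as the model, a 0.5 threshold for classification, and hypothetical names (dat, pool, valid, ccr, train_valid_ccr). Adapt the data, formula, and split to your own setup.

```r
library(ggplot2)

set.seed(42)  # make the simulation reproducible

# Synthetic stand-in data (NOT the real titanic data): a binary outcome y
# that depends on one numeric predictor x.
n <- 500
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(1.5 * dat$x))

# Hold out a fixed validation set; training sets are drawn from the rest.
valid_idx <- sample(n, 200)
valid <- dat[valid_idx, ]
pool  <- dat[-valid_idx, ]

# Correct classification rate: share of 0.5-thresholded predictions matching y.
ccr <- function(fit, data) {
  pred <- predict(fit, newdata = data, type = "response") > 0.5
  mean(pred == (data$y == 1))
}

# One experiment: sample a small training set, fit a model, return both CCRs.
train_valid_ccr <- function(pool, valid, train_size = 25) {
  train <- pool[sample(nrow(pool), train_size), ]
  # Tiny training sets can trigger separation warnings; silence them here.
  fit <- suppressWarnings(glm(y ~ x, family = binomial, data = train))
  c(train = ccr(fit, train), valid = ccr(fit, valid))
}

# 2 x 1000 matrix: row "train" and row "valid", one column per experiment.
res <- replicate(1000, train_valid_ccr(pool, valid))

# Reshape to long format (as.vector reads the matrix column by column).
perf <- data.frame(
  set = rep(rownames(res), times = ncol(res)),
  ccr = as.vector(res)
)

# One histogram panel per set.
ggplot(perf, aes(x = ccr)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ set, ncol = 1)
```

Faceting puts the two histograms on a shared x-axis, which makes it easier to compare how training and validation performance are spread out across the repeated experiments.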
Work on your project. If your project partner is not the same as your precept partner, please work separately.