library(ggplot2)
titanic <- read.csv("http://guerzhoy.princeton.edu/201s20/titanic.csv")
A more typical situation is plotting the histogram of a continuous variable like age.
ggplot(data = titanic, mapping = aes(x = Age)) +
  geom_histogram(bins = 10)
Varying the number of bins lets us display the data more appropriately. With too many bins, we see patterns that are only there because the sample in each bin is too small; with too few bins, we miss trends that are actually in the data.
# Too many bins: bar heights are dominated by sampling noise
ggplot(data = titanic, mapping = aes(x = Age)) +
  geom_histogram(bins = 100)
# Too few bins: the shape of the age distribution is lost
ggplot(data = titanic, mapping = aes(x = Age)) +
  geom_histogram(bins = 3)
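To see the trade-off numerically, the sketch below counts observations per bin for 3, 10, and 100 equal-width bins. It uses simulated ages as a stand-in for titanic$Age (an assumption, so that the example runs without the data file), and the helper bin_counts is hypothetical. ggplot2 places its bin edges somewhat differently, so these counts would not exactly match the plots above.

```r
# Simulated ages standing in for titanic$Age (an assumption, not the real data)
set.seed(1)
ages <- runif(200, min = 0, max = 80)

# Hypothetical helper: count observations in `bins` equal-width bins
# spanning the range of x
bin_counts <- function(x, bins) {
  width <- (max(x) - min(x)) / bins
  # Bin index for each observation; pmin() keeps the maximum in the last bin
  idx <- pmin(floor((x - min(x)) / width) + 1, bins)
  tabulate(idx, nbins = bins)
}

counts_3   <- bin_counts(ages, 3)
counts_10  <- bin_counts(ages, 10)
counts_100 <- bin_counts(ages, 100)

# Average number of observations per bin for each choice
mean(counts_3)
mean(counts_10)
mean(counts_100)
```

With 200 observations and 100 bins, each bin holds only two observations on average, so differences between neighbouring bars are mostly sampling noise. With 3 bins, each bar averages over a wide age range and finer structure cannot show up.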
We can display overlapping histograms. We specify alpha = 0.4 to make the histograms partially transparent, and position = "identity" so that the bars for each group are drawn over one another rather than stacked.
ggplot(data = titanic, mapping = aes(x = Age, fill = Sex)) +
  geom_histogram(alpha = 0.4, bins = 10, position = "identity")
Note that the default position is "stack", which places the bars for each group on top of one another within each bin.
ggplot(data = titanic, mapping = aes(x = Age, fill = Sex)) +
  geom_histogram(alpha = 0.4, bins = 10, position = "stack")
Here is the same histogram with position "dodge", which places the bars for each group side by side.
ggplot(data = titanic, mapping = aes(x = Age, fill = Sex)) +
  geom_histogram(alpha = 0.4, bins = 10, position = "dodge")
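The three position settings differ only in where each group's bar is drawn within a bin. The sketch below works through the arithmetic on made-up counts (illustrative numbers, not computed from the Titanic data). The variable names are hypothetical, and the order in which ggplot2 stacks the groups may differ from the order used here.

```r
# Made-up per-bin counts for two groups (illustrative, not from the Titanic data)
counts <- rbind(female = c(10, 40, 25),
                male   = c(15, 60, 50))

# position = "identity": every bar starts at 0, so the top of each bar is
# just its own count and the bars overlap
identity_top <- counts

# position = "stack": each group's bar starts where the previous one ends,
# so the tops are cumulative sums down each column (bin)
stack_top <- apply(counts, 2, cumsum)

# position = "dodge": every bar starts at 0 again, but the groups share the
# bin's width, here an equal share each
dodge_width_share <- 1 / nrow(counts)

identity_top
stack_top
dodge_width_share
```

Under stacking, the top of the uppermost bar in each bin equals the total count for that bin, which is why a stacked histogram has the same outline as the histogram of all passengers combined.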