UC San Diego
COGS 137 - Fall 2024
2024-10-15
ggplot2
Q: I’m curious about the differences between the base R pipe (|>) and the %>% operator! I’ve only learned about %>%, and it’d be interesting to look at the differences between the two.
A: There’s a blog post for that!
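As a rough illustration of the question above (not a summary of the blog post), here is a minimal sketch of two commonly cited differences between the pipes. It assumes R 4.2 or later and the magrittr package; the toy vector and the use of the built-in mtcars data are illustrative choices, not part of the lecture.

```r
library(magrittr)

# Toy vector; values are arbitrary
x <- c(1, 4, 9)

# Both pipes pass the left-hand side as the first argument
base_result     <- x |> sqrt() |> sum()
magrittr_result <- x %>% sqrt() %>% sum()

# Difference 1: placeholders for a non-first argument.
# magrittr uses `.`; the base pipe (R >= 4.2) uses `_` with a named argument.
fit_magrittr <- mtcars %>% lm(mpg ~ wt, data = .)
fit_base     <- mtcars |> lm(mpg ~ wt, data = _)

# Difference 2: %>% accepts a bare function name; |> requires a call
roots <- x %>% sqrt
# x |> sqrt   # error: the right-hand side must be a function call
```

For simple chains like the dplyr code in these notes, the two pipes behave the same way.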
Q: The most confusing part of this lecture was getting started with dplyr functions, but it will become easier with practice.
A: Lots of people said something similar. And, this is definitely the right attitude! Lecture is for first exposure. Practice in lecture is to start understanding. Labs are for guided practice where you have more time to practice. Case studies and homeworks are where we check our understanding! So, you’re not supposed to “get it” all the first time you see it in lecture.
Due Dates:
Notes:
Note: this is the code from the end of 03-dplyr notes, combined into a single chunk.
# Recode and reorder the treatment labels (the column is still named Treatment here)
WB <- WB |>
  mutate(Treatment = fct_recode(Treatment,
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)")) |>
  # Standardize column names to snake_case (Treatment becomes treatment)
  janitor::clean_names() |>
  # Shorten the compound names
  rename(thcoh = x11_oh_thc,
         thccooh = thc_cooh,
         thccooh_gluc = thc_cooh_gluc,
         thcv = thc_v) |>
  # Bin time from start into labeled windows
  mutate(timepoint = case_when(time_from_start < 0 ~ "pre-smoking",
                               time_from_start > 0 & time_from_start <= 30 ~ "0-30 min",
                               time_from_start > 30 & time_from_start <= 70 ~ "31-70 min",
                               time_from_start > 70 & time_from_start <= 100 ~ "71-100 min",
                               time_from_start > 100 & time_from_start <= 180 ~ "101-180 min",
                               time_from_start > 180 & time_from_start <= 210 ~ "181-210 min",
                               time_from_start > 210 & time_from_start <= 240 ~ "211-240 min",
                               time_from_start > 240 & time_from_start <= 270 ~ "241-270 min",
                               time_from_start > 270 & time_from_start <= 300 ~ "271-300 min",
                               time_from_start > 300 ~ "301+ min"))
Why are there two mutates? Could they have all been in a single mutate?
WB |>
  filter(timepoint == "0-30 min") |>
  ggplot(mapping = aes(x = thc, y = thccooh,
                       color = treatment)) +
  geom_point() +
  labs(title = "THC and THC-COOH levels (0-30 min)",
       subtitle = "THC levels remain low in placebo group; THC-COOH is variable",
       x = "THC (ng/mL)", y = "THC-COOH (ng/mL)",
       color = "Treatment Group") +
  scale_color_viridis_d()
Start with the WB data frame, map thc levels to the x-axis and map thccooh levels to the y-axis.
Represent each observation with a point and map treatment group to the color of each point.
Title and subtitle the plot, label the x and y axes, and title the legend.
Finally, use a discrete color scale that is designed to be perceived by viewers with common forms of color blindness.
Tip
You can omit the names of the first two arguments (data and mapping) when building plots with ggplot().
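To make the tip concrete, here is a small sketch comparing the named and unnamed forms. The `toy` data frame is a hypothetical stand-in for WB with made-up values, used only so the example runs on its own.

```r
library(ggplot2)

# Hypothetical stand-in for WB; values are made up for illustration
toy <- data.frame(
  thc       = c(0.5, 12.1, 30.4, 0.2, 18.7, 45.0),
  thccooh   = c(1.0, 8.3, 15.2, 0.8, 10.5, 22.9),
  treatment = rep(c("Placebo", "5.9% THC (low dose)", "13.4% THC (high dose)"),
                  times = 2)
)

# Fully named arguments
p_named <- ggplot(data = toy,
                  mapping = aes(x = thc, y = thccooh, color = treatment)) +
  geom_point()

# Same plot with `data` and `mapping` matched by position
p_short <- ggplot(toy, aes(x = thc, y = thccooh, color = treatment)) +
  geom_point()
```

Both objects should draw the same scatterplot; the shorter form is the one seen most often in practice.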
Generate a basic plot in ggplot2 using different filtering and/or variables than those in the last example (last example: thc & thccooh, “0-30 min” timepoint).
Commonly used characteristics of plotting characters that can be mapped to a specific variable in the data are
color
shape
size
alpha (transparency)
Mapped to a different variable than treatment
Mapped to the same variable as color
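The aesthetics listed above can be sketched as follows. This assumes the two "Mapped to" cases refer to the shape aesthetic; the `toy` data frame and its `group` column are hypothetical stand-ins with made-up values, not the real WB data.

```r
library(ggplot2)

# Hypothetical stand-in for WB; values and the `group` column are made up
toy <- data.frame(
  thc       = c(0.5, 12.1, 30.4, 0.2, 18.7, 45.0),
  thccooh   = c(1.0, 8.3, 15.2, 0.8, 10.5, 22.9),
  treatment = rep(c("Placebo", "5.9% THC (low dose)", "13.4% THC (high dose)"),
                  times = 2),
  group     = rep(c("Occasional user", "Frequent user"), each = 3)
)

# Shape mapped to a different variable than color: two separate legends
p_shape_other <- ggplot(toy, aes(x = thc, y = thccooh,
                                 color = treatment, shape = group)) +
  geom_point()

# Shape mapped to the same variable as color: one combined legend
p_shape_same <- ggplot(toy, aes(x = thc, y = thccooh,
                                color = treatment, shape = treatment)) +
  geom_point()

# Size and alpha mapped to a numeric variable
p_size_alpha <- ggplot(toy, aes(x = thc, y = thccooh, color = treatment,
                                size = thccooh, alpha = thccooh)) +
  geom_point()
```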
aes()
geom_*() (this was geom_point() in the previous example, but we’ll learn about other geoms soon!)
Edit the basic plot you created earlier to change something about its aesthetics.
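One way to read the aes() versus geom_*() contrast above is as mapping versus setting: an aesthetic placed inside aes() varies with a variable in the data, while one given directly to the geom is a fixed value for every point. The sketch below assumes that reading and uses a hypothetical `toy` data frame with made-up values.

```r
library(ggplot2)

# Hypothetical stand-in for WB; values are made up for illustration
toy <- data.frame(
  thc     = c(0.5, 12.1, 30.4, 0.2, 18.7, 45.0),
  thccooh = c(1.0, 8.3, 15.2, 0.8, 10.5, 22.9)
)

# Mapping: point size depends on a variable, so it goes inside aes()
p_mapped <- ggplot(toy, aes(x = thc, y = thccooh, size = thccooh)) +
  geom_point()

# Setting: size and alpha are constants, so they go in geom_point(), outside aes()
p_set <- ggplot(toy, aes(x = thc, y = thccooh)) +
  geom_point(size = 3, alpha = 0.5)
```

A mapped aesthetic produces a legend; a set aesthetic does not, because it carries no information about the data.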
facet_grid: 2d grid; `rows ~ cols`
facet_wrap: “1d ribbon wrapped according to number of rows and columns specified or available plotting area”
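A brief sketch of both faceting functions follows. The `toy` data frame and its `group` column are hypothetical stand-ins with made-up values, chosen so that every group and treatment combination has a point.

```r
library(ggplot2)

# Hypothetical stand-in for WB; values and the `group` column are made up
toy <- data.frame(
  thc       = c(0.5, 12.1, 30.4, 0.2, 18.7, 45.0),
  thccooh   = c(1.0, 8.3, 15.2, 0.8, 10.5, 22.9),
  treatment = rep(c("Placebo", "5.9% THC (low dose)", "13.4% THC (high dose)"),
                  times = 2),
  group     = rep(c("Occasional user", "Frequent user"), each = 3)
)

base_plot <- ggplot(toy, aes(x = thc, y = thccooh)) +
  geom_point()

# facet_grid: rows ~ cols gives a 2 x 3 grid of panels here
p_grid <- base_plot + facet_grid(group ~ treatment)

# facet_wrap: one variable, panels wrapped into the requested number of columns
p_wrap <- base_plot + facet_wrap(~ treatment, ncol = 2)
```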
geoms
| geom | Description |
|---|---|
| geom_point | scatterplot |
| geom_bar | barplot |
| geom_line | line plot |
| geom_density | density plot |
| geom_histogram | histogram |
| geom_boxplot | boxplot |
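To show how swapping the geom changes the plot type, here is a short sketch using three geoms from the table. The `toy` data frame is a hypothetical stand-in for WB with made-up values.

```r
library(ggplot2)

# Hypothetical stand-in for WB; values are made up for illustration
toy <- data.frame(
  thc       = c(0.5, 12.1, 30.4, 0.2, 18.7, 45.0),
  treatment = rep(c("Placebo", "5.9% THC (low dose)", "13.4% THC (high dose)"),
                  times = 2)
)

# Boxplot: distribution of a numeric variable within each category
p_box <- ggplot(toy, aes(x = treatment, y = thc)) +
  geom_boxplot()

# Histogram: distribution of one numeric variable (bins chosen arbitrarily)
p_hist <- ggplot(toy, aes(x = thc)) +
  geom_histogram(bins = 5)

# Barplot: geom_bar() counts the rows in each category by default
p_bar <- ggplot(toy, aes(x = treatment)) +
  geom_bar()
```

Note that the required aesthetics differ by geom: the histogram and barplot take only x, while the boxplot takes both x and y.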
Generate a plot in ggplot2 using a different geom than what you did previously. Customize as much as you can before time is “up.”
ggplot2?
ggplot2 code?
Can I create plots using ggplot2?
geom is, and do I know the basic plots available?