Explore more advanced data visualization techniques with ggplot2.
Using the penguins dataset, create a faceted plot showing the relationship between flipper length and body mass for different species.
Experiment with different themes and color palettes to enhance the visual appeal of your plots.
Key Visualization Techniques Used
1. Faceting (facet_wrap) 🧩
The core technique is faceting, which breaks the data into subsets and displays each subset in its own plot panel.
facet_wrap(~ island, ncol = 2): This creates a separate panel for each unique value in the island column (Torgersen, Biscoe, Dream), laid out in two columns. This allows for a quick visual comparison of the flipper length/body mass relationship across the different geographic colonies.
Note: While the prompt asked to facet by species, faceting by island while coloring by species provides a richer, three-way comparison, revealing how species distribution and relationships differ geographically.
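For comparison, the faceting the exercise literally asks for (one panel per species) can be sketched as follows. This is a minimal sketch, assuming the ggplot2 and palmerpenguins packages are installed; the names penguins_complete and p_species are illustrative and not part of the sample answer.

```r
library(ggplot2)
library(palmerpenguins)

# Drop rows with missing measurements up front so ggplot2 does not
# warn about removed rows when drawing the points.
penguins_complete <- subset(
  penguins,
  !is.na(flipper_length_mm) & !is.na(body_mass_g)
)

# One panel per species; color is left out because it would only
# repeat the information already carried by the panels.
p_species <- ggplot(
  penguins_complete,
  aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(alpha = 0.7) +
  facet_wrap(~ species, ncol = 3)

p_species
```

Swapping `~ species` for `~ island` and adding `color = species` to aes() gives the three-way view described in the note.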
2. Multi-Layer Geoms (geom_point and geom_smooth) 📊
Two distinct geometric layers are stacked to show both the raw data and the trend:
geom_point(alpha = 0.7, size = 3): Visualizes the raw data. The color and shape aesthetics are mapped to the species column in the global aes(), so this layer inherits them. The alpha setting makes the points partly transparent, so points that overlap are still visible.
geom_smooth(method = "lm", se = FALSE, linewidth = 1.2): Overlays a linear model (LM) regression line. Because color and shape are mapped to species in the global aes(), one line is fitted per species within each panel. This makes the general trend easy to read: as flipper length increases, body mass also increases. Removing the standard error ribbon (se = FALSE) declutters the visual.
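The slope behind each regression line can be checked numerically by fitting the same kind of model with lm(). The sketch below is an illustration under assumptions: it fits one model per species across all islands, whereas the plot fits one line per species within each island panel, and the name slope_by_species is hypothetical.

```r
library(palmerpenguins)

# Fit body mass against flipper length separately for each species.
# lm() drops rows with missing values by default (na.action = na.omit).
slope_by_species <- sapply(
  split(penguins, penguins$species),
  function(d) {
    fit <- lm(body_mass_g ~ flipper_length_mm, data = d)
    coef(fit)[["flipper_length_mm"]]
  }
)

# Slopes are in grams of body mass per millimetre of flipper length.
print(round(slope_by_species, 1))
```

A positive slope for every species corresponds to the upward trend the lines show in the plot.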
3. Custom Color Palette (scale_color_viridis_d) 🌈
Instead of the default ggplot2 colors, a custom scale is used:
scale_color_viridis_d(option = "D", ...): The viridis color palettes are perceptually uniform, meaning equal steps along the palette look like equally sized changes in color, and they stay distinguishable in grayscale and under common forms of color-vision deficiency. The _d suffix indicates a discrete scale, which suits categorical data such as species: each species gets a clearly distinct color drawn from evenly spaced points on the palette.
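To inspect which colors a discrete viridis scale would assign to the three species, the palette can be queried directly. This is a small sketch, assuming the viridisLite package is available (it is normally installed alongside ggplot2 as a dependency of scales) and that the scale keeps its default begin, end, and direction settings; species_colors is an illustrative name.

```r
library(palmerpenguins)

# Ask for as many colors as there are species, evenly spaced along option "D".
n_species <- nlevels(penguins$species)
species_colors <- viridisLite::viridis(n_species, option = "D")

# Pair each hex code with the species it would be assigned to
# (discrete scales assign colors in factor-level order).
names(species_colors) <- levels(penguins$species)
print(species_colors)
```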
4. Custom Theming (theme_minimal and theme(...)) 🖼️
The plot starts with theme_minimal() for a clean, white background and then uses the theme() function for fine-grained control:
Titles: Setting plot.title = element_text(face = "bold", size = 16, hjust = 0.5) centers the title and makes it bold for emphasis.
Facet Labels: strip.text = element_text(face = "bold", size = 10) ensures the facet labels (the island names) stand out clearly within each panel.
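The two theme() settings can be bundled into a reusable theme object and added to any plot. This is a sketch under assumptions: the title text and the names custom_theme, penguins_complete, and p_themed are made up for illustration, and rows with missing measurements are dropped first to avoid warnings.

```r
library(ggplot2)
library(palmerpenguins)

# Start from theme_minimal() and override only the title and facet-label text.
custom_theme <- theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
    strip.text = element_text(face = "bold", size = 10)
  )

penguins_complete <- subset(
  penguins,
  !is.na(flipper_length_mm) & !is.na(body_mass_g)
)

p_themed <- ggplot(
  penguins_complete,
  aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point(alpha = 0.7) +
  facet_wrap(~ island, ncol = 2) +
  labs(title = "Flipper length vs. body mass") +  # illustrative title
  custom_theme

p_themed
```

Keeping the theme in its own object makes it easy to apply the same look to several plots.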
Sample Answer
Advanced ggplot2 Visualization: Penguin Flipper Length vs. Body Mass
The following code uses the penguins dataset (available via the palmerpenguins package) and demonstrates faceting, geom layers, and custom aesthetics to enhance the visualization.
R Code for the Faceted Plot
# Assuming the ggplot2 and palmerpenguins packages are installed
library(ggplot2)
library(palmerpenguins)
# Create the advanced faceted plot
ggplot(
  data = penguins,
  mapping = aes(
    x = flipper_length_mm,
    y = body_mass_g,
    color = species, # Map species to color
    shape = species  # Map species to shape
  )
) +
  # 1. Add Scatter Plot Points
  geom_point(alpha = 0.7, size = 3) +
  # 2. Add Regression Line (Linear Model) for the Trend
  # The 'se = FALSE' removes the standard error ribbon for a cleaner look.
  geom_smooth(method = "lm", se = FALSE, linewidth = 1.2) +
  # 3. Faceting: Separate plots by 'island'
  # 'scales = "free_x"' allows the x-axis limits to vary for each island,
  # but we'll stick to 'fixed' for direct comparison here.
  facet_wrap(~ island, ncol = 2) +