What is a pirateplot()?
A pirateplot is the RDI (Raw data, Descriptive statistics, and Inferential statistics) plotting choice of R pirates who are displaying the relationship between one to three categorical independent variables and one continuous dependent variable.
A pirateplot has 4 main elements:
- points: symbols representing the raw data (jittered horizontally)
- bar: a vertical bar showing central tendencies
- bean: a smoothed density (inspired by Kampstra et al. (2008))
- inf: a rectangle representing an inference interval (e.g., a Bayesian Highest Density Interval or a frequentist confidence interval)
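To make the four elements concrete, here is a rough base-R sketch of the quantity behind each one, computed for a single group of the ChickWeight data. This is not how pirateplot() computes things internally; in particular, the t-based 95% confidence interval is just one assumed choice of inference interval.

```r
# Base-R sketch of the quantity behind each pirateplot element
# (an illustration only, not the internals of pirateplot())
w <- ChickWeight$weight[ChickWeight$Time == 0]  # one group: weights at Time 0

raw_points <- w                   # points: the raw data
central    <- mean(w)             # bar: a measure of central tendency
bean       <- density(w)          # bean: a smoothed density estimate
ci         <- t.test(w)$conf.int  # inf: a 95% confidence interval (assumed choice)

central
ci
```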
Main arguments
Here are the main arguments to pirateplot()
Main Pirateplot Arguments
| Argument | Description | Examples |
|:--|:--|:--|
| formula | A formula | height ~ sex + eyepatch, weight ~ Time |
| data | A dataframe | pirates, ChickWeight |
| main | Plot title | 'Pirate heights', 'Chicken Weights' |
| pal | A color palette | 'xmen', 'black' |
| theme | A plotting theme | 0, 1, 2 |
| inf.method | Type of inference | 'ci', 'hdi', 'iqr' |
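As a quick illustration, the sketch below combines the main arguments in a single call. It assumes the yarrr package is installed and uses its bundled pirates data; the specific values (the title, the 'xmen' palette, theme 2, and a confidence interval) are only examples picked from the table.

```r
library(yarrr)

# One call using each of the main arguments
pirateplot(
  formula = height ~ sex + eyepatch,  # DV ~ IV1 + IV2
  data = pirates,                     # data frame containing the variables
  main = "Pirate heights",            # plot title
  pal = "xmen",                       # named color palette
  theme = 2,                          # plotting theme
  inf.method = "ci"                   # frequentist confidence interval
)
```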
Themes
pirateplot() currently supports four themes (1 to 4), which change the default look of the plot, plus theme = 0, which turns all elements off. To specify a theme, use the theme argument:
Theme 1
theme = 1 is the default:
# Theme 1 (the default)
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 1,
main = "theme = 1"
)
Theme 2
Here is theme = 2:
# Theme 2
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 2,
main = "theme = 2"
)
Theme 3
And now, theme = 3!
# Theme 3
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 3,
main = "theme = 3"
)
Theme 4
theme = 4 tries to maintain a classic barplot look (but with added raw data):
# Theme 4
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 4,
main = "theme = 4"
)
Theme 0
theme = 0 allows you to start a pirateplot from scratch: it turns off all elements. You can then selectively turn elements on with individual arguments (e.g., bean.f.o, point.o).
# Theme 0 (start from scratch)
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 0,
main = "theme = 0\nStart from scratch"
)
Color palettes
You can specify a general color palette using the pal argument. You can do this in two ways.
The first way is to specify the name of a color palette in the piratepal() function. Running piratepal("all") displays the available palettes.
For example, here is a pirateplot using the "pony" palette:
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
pal = "pony",
theme = 1,
main = "pony color palette"
)
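If you want to see which colors a named palette contains before plotting, you can call piratepal() directly. The sketch below assumes yarrr is installed; trans is the piratepal() argument that controls transparency, and the value 0.5 is just an illustrative choice.

```r
library(yarrr)

# Look up the colors in the "pony" palette (a named vector of color strings)
pony <- piratepal(palette = "pony")
print(pony)

# The same palette with 50% transparency applied to every color
pony_light <- piratepal(palette = "pony", trans = 0.5)
print(pony_light)
```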
The second method is to simply enter a vector of one or more colors. Here, I’ll create a black and white pirateplot from theme 2 by specifying pal = 'black':
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 2,
pal = "black",
main = "pal = 'black'"
)
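The same approach works with more than one color. In the sketch below, which assumes yarrr is installed, I use Diet as the independent variable because it has four levels, and supply one base-R color name per level; the particular colors are arbitrary choices.

```r
library(yarrr)

# One color per level of Diet (ChickWeight has four diets)
diet_cols <- c("steelblue", "tomato", "goldenrod", "seagreen")

pirateplot(
  formula = weight ~ Diet,
  data = ChickWeight,
  theme = 2,
  pal = diet_cols,  # a plain vector of colors instead of a palette name
  main = "pal = a vector of four colors"
)
```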
Customising elements
Regardless of the theme you use, you can always customize the color and opacity of graphical elements. To do this, specify one of the following arguments. Note: arguments with .f. correspond to the filling of an element, while arguments with .b. correspond to the border of an element:
Customising plotting elements
| Element | Color | Opacity |
|:--|:--|:--|
| points | point.col, point.bg | point.o |
| beans | bean.f.col, bean.b.col | bean.f.o, bean.b.o |
| bar | bar.f.col, bar.b.col | bar.f.o, bar.b.o |
| inf | inf.f.col, inf.b.col | inf.f.o, inf.b.o |
| avg.line | avg.line.col | avg.line.o |
For example, I could create the following pirateplot using theme = 0 and specifying elements explicitly:
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
theme = 0,
main = "Fully customized pirateplot",
pal = "southpark", # southpark color palette
bean.f.o = .6, # Bean fill
point.o = .3, # Points
inf.f.o = .7, # Inference fill
inf.b.o = .8, # Inference border
avg.line.o = 1, # Average line
bar.f.o = .5, # Bar
inf.f.col = "white", # Inf fill col
inf.b.col = "black", # Inf border col
avg.line.col = "black", # avg line col
bar.f.col = gray(.8), # bar filling color
point.pch = 21,
point.bg = "white",
point.col = "black",
point.cex = .7
)
If you don’t want to start from scratch, you can also start with a
theme, and then make selective adjustments:
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
main = "Adjusting an existing theme",
theme = 2, # Start with theme 2
inf.f.o = 0, # Turn off inf fill
inf.b.o = 0, # Turn off inf border
point.o = .2, # Turn up points
bar.f.o = .5, # Turn up bars
bean.f.o = .4, # Light bean filling
bean.b.o = .2, # Light bean border
avg.line.o = 0, # Turn off average line
point.col = "black" # Black points
)
Just to drive the point home, as a barplot is a special case of a
pirateplot, you can even reduce a pirateplot into a horrible
barplot:
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
main = "Reducing a pirateplot to a barplot",
theme = 0, # Start from scratch
bar.f.o = .7 # Just turn on the bars
)
Additional arguments
There are several more arguments that you can use to customize your
plot:
Additional pirateplot elements
| Element | Argument | Examples |
|:--|:--|:--|
| Background color | back.col | back.col = gray(.9, .9) |
| Gridlines | gl.col, gl.lwd, gl.lty | gl.col = 'gray', gl.lwd = c(.75, 0), gl.lty = 1 |
| y-axis locations | yaxt.y | yaxt.y = seq(0, 400, 50) |
| Quantiles | quant, quant.lwd, quant.col | quant = c(.1, .9), quant.lwd = 1, quant.col = 'black' |
| Average line | avg.line.fun | avg.line.fun = median |
| Inference calculation | inf.method | inf.method = 'hdi', inf.method = 'ci' |
| Inference display | inf.disp | inf.disp = 'line', inf.disp = 'bean', inf.disp = 'rect' |
Here’s an example using a background color and quantile lines:
pirateplot(
formula = weight ~ Time,
data = ChickWeight,
main = "Adding quantile lines and background colors",
theme = 2,
back.col = gray(.98), # Add light gray background
gl.col = "gray", # Gray gridlines
gl.lwd = c(.75, 0), # Gridline widths (alternating)
inf.f.o = .6, # Turn up inf filling
inf.disp = "bean", # Wrap inference around bean
bean.b.o = .4, # Turn down bean borders
quant = c(.1, .9), # 10th and 90th quantiles
quant.col = "black", # Black quantile lines
yaxt.y = seq(0, 400, 50) # Locations of y-axis tick marks
)
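The example above leaves avg.line.fun and inf.method at their defaults. As a minimal sketch (assuming yarrr is installed), here is how those entries from the table could be combined: the average line is drawn at the median, and the inference interval is calculated as a confidence interval and displayed as a line.

```r
library(yarrr)

pirateplot(
  formula = weight ~ Diet,
  data = ChickWeight,
  theme = 2,
  main = "Median line and confidence intervals",
  avg.line.fun = median,  # Draw the average line at the median
  inf.method = "ci",      # Frequentist confidence interval
  inf.disp = "line"       # Show the interval as a line
)
```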
Multiple IVs
You can use up to 3 categorical IVs in your plot. Here are some
examples:
pirateplot(
formula = height ~ sex + eyepatch,
data = pirates,
theme = 2,
inf.disp = "bean"
)
By including beside = FALSE, you can put the levels of the second IV on different plots:
# Same as before, but with second IV on different plots by including beside = FALSE
pirateplot(
formula = height ~ sex + eyepatch,
data = pirates,
theme = 2,
beside = FALSE,
inf.disp = "bean"
)
# Another beside = FALSE example, this time with the ChickWeight data
pirateplot(
formula = weight ~ Time + Diet,
data = ChickWeight,
theme = 2,
beside = FALSE
)
If you use 3 IVs, values of the second IV will be beside each other.
pirateplot(
formula = height ~ sex + eyepatch + headband,
data = pirates,
theme = 2,
inf.disp = "bean"
)
Output
If you include the plot = FALSE argument in a call to pirateplot(), the function will return some values associated with the plot instead of drawing it.
times.pp <- pirateplot(
formula = time ~ sequel + genre,
data = subset(
movies,
genre %in% c("Action", "Adventure", "Comedy", "Horror") &
rating %in% c("G", "PG", "PG-13", "R") &
time > 0
),
plot = FALSE
)
Here’s the result. The most interesting element is $summary, which shows summary statistics for each bean:
## $summary
##   sequel     genre bean.num   n       avg    inf.lb   inf.ub
## 1      0    Action        1 233 114.73391 112.43863 117.0773
## 2      1    Action        2  80 120.47500 116.22305 123.9434
## 3      0 Adventure        3 206 106.36408 103.43893 109.1944
## 4      1 Adventure        4  78 118.64103 112.26745 124.9704
## 5      0    Comedy        5 400 102.01500 100.77308 102.9995
## 6      1    Comedy        6  51 101.21569  98.14873 103.9931
## 7      0    Horror        7  79 102.13924  98.47381 105.3967
## 8      1    Horror        8  23  97.65217  92.80024 101.7407
##
## $avg.line.fun
## [1] "mean"
##
## $inf.method
## [1] "hdi"
##
## $inf.p
## [1] 0.95
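Because $summary is printed like an ordinary data frame, you can work with it like any other. The sketch below assumes yarrr is installed and uses the ChickWeight data with inf.method = "ci" to keep it quick; the inf.width column is a name I made up for illustration.

```r
library(yarrr)

# Return the plot's values instead of drawing it
weight.pp <- pirateplot(
  formula = weight ~ Diet,
  data = ChickWeight,
  inf.method = "ci",
  plot = FALSE
)

# Pull out the per-bean summary statistics
bean.summary <- weight.pp$summary

# Width of each inference interval (upper bound minus lower bound)
bean.summary$inf.width <- bean.summary$inf.ub - bean.summary$inf.lb

# Beans ordered from highest to lowest average
bean.summary[order(bean.summary$avg, decreasing = TRUE), ]
```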
Specifying individual point parameters
If you want to customize the look of individual points in the plot, you can specify them with an additional pointpars argument. This should be a dataframe with the same number of rows as data, and column names in the set col, bg, pch, labels, jitter.
For example, here is a plot with additional point labels and custom
jitter values:
# Create a smaller version of the ChickWeight data
ChickWeight2 <- ChickWeight[sample(nrow(ChickWeight), size = 100), ]
# Define labels, jitter values, and colors for all individual points
pointpars <- data.frame(
"labels" = 1:100,
"jitter" = rep(c(-.1, .1),
length.out = 100
),
"col" = sample(colors(),
size = 100,
replace = TRUE
)
)
# Create pirateplot with custom point parameters
pirateplot(
formula = weight ~ Diet,
data = ChickWeight2,
pointpars = pointpars,
point.o = 1
)
Contribute!
I am very happy to receive new contributions and suggestions to improve the pirateplot. If you come up with a new theme (i.e., customization) that you like, or have a favorite color palette that you’d like to have implemented, please contact me ([email protected]) or post an issue at www.github.com/ndphillips/yarrr/issues and I might include it in a future update.
References
The pirateplot is really a knock-off of the great beanplot package and visualization from Kampstra et al. (2008).
Kampstra, Peter et al. 2008. “Beanplot: A Boxplot Alternative for
Visual Comparison of Distributions.” Journal of Statistical
Software 28 (1): 1–9.