--- title: "pirateplot" author: "Nathaniel Phillips" date: "`r Sys.Date()`" output: rmarkdown::html_vignette bibliography: yarrr.bib vignette: > %\VignetteIndexEntry{pirateplot: The plotting choice of R pirates} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- # What is a pirateplot()? A pirateplot, is the RDI (**Raw** data, **Descriptive** statistics, and **Inferential** statistics) plotting choice of R pirates who are displaying the relationship between 1 to 3 categorical independent variables, and one continuous dependent variable. ```{r, echo = F, message = F, result = 'hide'} library(yarrr) ``` A pirateplot has 4 main elements 1. points, symbols representing the raw data (jittered horizontally) 2. bar, a vertical bar showing central tendencies 3. bean, a smoothed density (inspired by @kampstra2008beanplot) representing a smoothed density 4. inf, a rectangle representing an inference interval (e.g.; Bayesian Highest Density Interval or frequentist confidence interval) ```{r, fig.width = 6, fig.height = 5, echo = F, fig.align='center'} pirateplot( formula = weight ~ Diet, data = ChickWeight, theme = 1, back.col = "white", gl.col = "white", bean.f.o = c(0, .1, .7, .1), # bean.b.o = c(0, .1, 1, .1), point.o = c(.4, .1, .1, .1), avg.line.o = c(.3, 1, .3, .3), inf.f.o = c(.1, .1, .1, .9), bar.f.o = c(.1, .8, .1, .1), inf.f.col = c("white", "white", "white", piratepal("xmen")[4]), main = "4 Elements of a pirateplot", pal = "xmen" ) text(.7, 350, labels = "Points") text(.7, 345, labels = "Raw Data", pos = 1, cex = .8) arrows(.7, 310, .97, 270, length = .1) text(1.4, 200, labels = "Bar/Line") text(1.4, 200, labels = "Center", pos = 1, cex = .8) arrows(1.4, 170, 1.54, 125, length = .1) text(2.4, 250, labels = "Bean") text(2.4, 250, labels = "Density", pos = 1, cex = .8) arrows(2.4, 220, 2.85, 200, length = .1) text(3.55, 300, labels = "Band") text(3.55, 290, labels = "Inference\n95% HDI or CI", pos = 1, cex = .8) arrows(3.55, 240, 3.8, 150, length = .1) ``` # Main arguments Here are the main arguments to `pirateplot()` ```{r, echo = FALSE} pp.elements <- data.frame( "Argument" = c("formula", "data", "main", "pal", "theme", "inf"), "Description" = c( "A formula", "A dataframe", "Plot title", "A color palette", "A plotting theme", "Type of inference" ), "Examples" = c( "height ~ sex + eyepatch, weight ~ Time", "pirates, ChickWeight", "'Pirate heights', 'Chicken Weights", "'xmen', 'black'", "0, 1, 2", "'ci', 'hdi', 'iqr'" ) ) knitr::kable(pp.elements, caption = "Main Pirateplot Arguments") ``` # Themes `pirateplot()` currently supports three themes which change the default look of the plot. To specify a theme, use the `theme` argument: ## Theme 1 `theme = 1` is the default ```{r fig.align='center', fig.width = 6, fig.height = 4} # Theme 1 (the default) pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 1, main = "theme = 1" ) ``` ## Theme 2 Here is `theme = 2` ```{r fig.align='center', fig.width = 6, fig.height = 4} # Theme 2 pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 2, main = "theme = 2" ) ``` ## Theme 3 And now...`theme = 3`! ```{r fig.align='center', fig.width = 6, fig.height = 4} # Theme 3 pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 3, main = "theme = 3" ) ``` ## Theme 4 `theme = 4` tries to maintain a classic barplot look (but with added raw data). ```{r fig.align='center', fig.width = 6, fig.height = 4} # Theme 4 pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 4, main = "theme = 4" ) ``` ## Theme 0 `theme = 0` allows you to start a pirateplot from scratch -- that is, it turns of *all* elements. You can then selectively turn elements on with individual arguments (e.g.; `bean.f.o`, `point.o`) ```{r fig.align='center', fig.width = 6, fig.height = 4} # Default theme pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 0, main = "theme = 0\nStart from scratch" ) ``` # Color palettes You can specify a general color palette using the `pal` argument. You can do this in two ways. The first way is to specify the name of a color palette in the `piratepal()` function. Here they are: ```{r fig.width = 6, fig.height = 4, fig.align = "center"} piratepal("all") ``` For example, here is a pirateplot using the `"pony"` palette ```{r fig.width = 6, fig.height = 4, fig.align = "center"} pirateplot( formula = weight ~ Time, data = ChickWeight, pal = "pony", theme = 1, main = "pony color palette" ) ``` The second method is to simply enter a vector of one or more colors. Here, I'll create a black and white pirateplot from theme 2 by specifying `pal = 'black'` ```{r fig.width = 6, fig.height = 4, fig.align = "center"} pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 2, pal = "black", main = "pal = 'black" ) ``` # Customising elements Regardless of the theme you use, you can always customize the color and opacity of graphical elements. To do this, specify one of the following arguments. Note: Arguments with `.f.` correspond to the *filling* of an element, while `.b.` correspond to the *border* of an element: ```{r, echo = FALSE} pp.elements <- data.frame( "element" = c("points", "beans", "bar", "inf", "avg.line"), "color" = c( "point.col, point.bg", "bean.f.col, bean.b.col", "bar.f.col, bar.b.col", "inf.f.col, inf.b.col", "avg.line.col" ), "opacity" = c( "point.o", "bean.f.o, bean.b.o", "bar.f.o, bar.b.o", "inf.f.o, inf.b.o", "avg.line.o" ) ) knitr::kable(pp.elements, caption = "Customising plotting elements") ``` For example, I could create the following pirateplots using `theme = 0` and specifying elements explicitly: ```{r fig.width = 6, fig.height = 4, fig.align = "center"} pirateplot( formula = weight ~ Time, data = ChickWeight, theme = 0, main = "Fully customized pirateplot", pal = "southpark", # southpark color palette bean.f.o = .6, # Bean fill point.o = .3, # Points inf.f.o = .7, # Inference fill inf.b.o = .8, # Inference border avg.line.o = 1, # Average line bar.f.o = .5, # Bar inf.f.col = "white", # Inf fill col inf.b.col = "black", # Inf border col avg.line.col = "black", # avg line col bar.f.col = gray(.8), # bar filling color point.pch = 21, point.bg = "white", point.col = "black", point.cex = .7 ) ``` If you don't want to start from scratch, you can also start with a theme, and then make selective adjustments: ```{r fig.width = 6, fig.height = 4, fig.align = "center"} pirateplot( formula = weight ~ Time, data = ChickWeight, main = "Adjusting an existing theme", theme = 2, # Start with theme 2 inf.f.o = 0, # Turn off inf fill inf.b.o = 0, # Turn off inf border point.o = .2, # Turn up points bar.f.o = .5, # Turn up bars bean.f.o = .4, # Light bean filling bean.b.o = .2, # Light bean border avg.line.o = 0, # Turn off average line point.col = "black" # Black points ) ``` Just to drive the point home, as a barplot is a special case of a pirateplot, you can even reduce a pirateplot into a horrible barplot: ```{r fig.width = 6, fig.height = 4, fig.align = "center"} pirateplot( formula = weight ~ Time, data = ChickWeight, main = "Reducing a pirateplot to a barplot", theme = 0, # Start from scratch bar.f.o = .7 ) # Just turn on the bars ``` # Additional arguments There are several more arguments that you can use to customize your plot: ```{r, echo = FALSE} pp.elements <- data.frame( "element" = c("Background color", "Gridlines", "y-axis Locations", "Quantiles", "Average line", "Inference Calculation", "Inference Display"), "arguments" = c( "back.col", "gl.col, gl.lwd, gl.lty", "yaxt.y", "quant, quant.lwd, quant.col", "avg.line.fun", "inf.method", "inf.disp" ), "examples" = c( "back.col = 'gray(.9, .9)'", "gl.col = 'gray', gl.lwd = c(.75, 0), gl.lty = 1", "yaxt.y = seq(0, 400, 50)", "quant = c(.1, .9), quant.lwd = 1, quant.col = 'black'", "avg.line.fun = median", "inf.method = 'hdi', inf.method = 'ci'", "inf.disp = 'line', inf.disp = 'bean', inf.disp = 'rect'" ) ) knitr::kable(pp.elements, caption = "Additonal pirateplot elements") ``` Here's an example using a background color, and quantile lines. ```{r fig.width = 8, fig.height = 5, fig.align = "center", out.width = "100%"} pirateplot( formula = weight ~ Time, data = ChickWeight, main = "Adding quantile lines and background colors", theme = 2, back.col = gray(.98), # Add light gray background gl.col = "gray", # Gray gridlines gl.lwd = c(.75, 0), # Gridline widths (alternating) inf.f.o = .6, # Turn up inf filling inf.disp = "bean", # Wrap inference around bean bean.b.o = .4, # Turn down bean borders quant = c(.1, .9), # 10th and 90th quantiles quant.col = "black", # Black quantile lines yaxt.y = seq(0, 400, 50) # Locations of y-axis tick marks ) ``` # Multiple IVs You can use up to 3 categorical IVs in your plot. Here are some examples: ```{r fig.width = 10, fig.height = 5, fig.align = "center", out.width = "100%"} pirateplot( formula = height ~ sex + eyepatch, data = pirates, theme = 2, inf.disp = "bean" ) ``` ## beside = FALSE ```{r fig.width = 10, fig.height = 5, fig.align = "center", out.width = "100%"} # Same as before, but with second IV on different plots by including beside = FALSE pirateplot( formula = height ~ sex + eyepatch, data = pirates, theme = 2, beside = FALSE, inf.disp = "bean" ) ``` ```{r fig.width = 10, fig.height = 10, fig.align = "center", out.width = "100%"} # Same as before, but with second IV on different plots by including beside = FALSE pirateplot( formula = weight ~ Time + Diet, data = ChickWeight, theme = 2, beside = FALSE ) ``` If you use 3 ivs, values of the second iv will be beside each other. ```{r fig.width = 10, fig.height = 5, fig.align = "center", out.width = "100%"} pirateplot( formula = height ~ sex + eyepatch + headband, data = pirates, theme = 2, inf.disp = "bean" ) ``` # Output If you include the `plot = FALSE` argument to a pirateplot, the function will return some values associated with the plot. ```{r fig.width = 10, fig.height = 7, fig.align = "center"} times.pp <- pirateplot( formula = time ~ sequel + genre, data = subset( movies, genre %in% c("Action", "Adventure", "Comedy", "Horror") & rating %in% c("G", "PG", "PG-13", "R") & time > 0 ), plot = FALSE ) ``` Here's the result. The most interesting element is `$summary` which shows summary statistics for each bean: ```{r} times.pp ``` # Specifying individual point parameters If you want to customize the look of individual points in the plot, you can specify them with an additional `pointpars` argument. This should be a dataframe with the same number of rows as `data`, and column names in the set col, bg, pch, labels, jitter. For example, here is a plot with additional point labels and custom jitter values: ```{r, eval = FALSE} # Create a smaller version of the ChickWeight data ChickWeight2 <- ChickWeight[sample(nrow(ChickWeight), size = 100), ] # Define labels, jitter values, and colors for all individual points pointpars <- data.frame( "labels" = 1:100, "jitter" = rep(c(-.1, .1), length.out = 100 ), "col" = sample(colors(), size = 100, replace = TRUE ) ) # Createpireatplot with custom point parameters pirateplot( formula = weight ~ Diet, data = ChickWeight2, pointpars = pointpars, point.o = 1 ) ``` # Contribute! I am very happy to receive new contributions and suggestions to improve the pirateplot. If you come up a new theme (i.e.; customization) that you like, or have a favorite color palette that you'd like to have implemented, please contact me (nathaniel.d.phillips.is@gmail.com) or post an issue at [www.github.com/ndphillips/yarrr/issues](www.github.com/ndphillips/yarrr/issues) and I might include it in a future update. # References The pirateplot is really a knock-off of the great beanplot package and visualization from @kampstra2008beanplot.