---
title: "Plotting Functions"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Plotting Functions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## Overview

The plot module provides five publication-ready plotting functions:

| Function | Purpose |
|---|---|
| `plot_bar()` | Bar chart with optional grouping and sorting |
| `plot_density()` | Density / distribution curves with optional faceting |
| `plot_pie()` | Pie / donut chart from a vector or data frame |
| `plot_venn()` | Venn diagram for 2–4 sets |
| `plot_forest()` | Publication-ready forest plot |

```{r load}
library(evanverse)
```

> **Note:** All code examples in this vignette are static (`eval = FALSE`).
> Output is hand-written to reflect the current implementation.

---

## 1 Bar Chart

### `plot_bar()` — Bar chart with optional grouping

Creates a vertical or horizontal bar chart. Bars can be sorted, and an optional
`group_col` produces a grouped (side-by-side) chart. All inputs are character
column names, so the function works naturally in pipelines.

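
To make the sorting behaviour concrete, here is a minimal base-R sketch of one way bars can be ordered by height: compute the height per category, then reorder the factor levels. The helper `order_levels()` is hypothetical and written only for illustration; it is not part of evanverse and may differ from what `plot_bar()` does internally.

```r
# Hypothetical helper (not an evanverse function): return category names
# ordered by their total bar height.
order_levels <- function(data, x_col, y_col, decreasing = TRUE) {
  # Sum the y values within each category; the result is a named vector.
  heights <- vapply(split(data[[y_col]], data[[x_col]]), sum, numeric(1))
  names(heights)[order(heights, decreasing = decreasing)]
}

df <- data.frame(
  category = c("Alpha", "Beta", "Gamma", "Delta", "Epsilon"),
  count = c(42, 18, 35, 27, 51)
)

# Columns are passed by name as strings, mirroring the plot_bar() interface.
df$category <- factor(df$category,
                      levels = order_levels(df, "category", "count"))
levels(df$category)
#> [1] "Epsilon" "Alpha"   "Gamma"   "Delta"   "Beta"
```

Because a plotting layer draws categories in factor-level order, fixing the levels up front is usually enough to control bar order.
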

**Key parameters**

| Parameter | Default | Description |
|---|---|---|
| `data` | — | Data frame |
| `x_col` | — | Column for x-axis categories |
| `y_col` | — | Column for bar heights |
| `horizontal` | `FALSE` | Flip to a horizontal bar chart |
| `sort` | `FALSE` | Sort bars by height |
| `decreasing` | `TRUE` | Sort order when `sort = TRUE` |
| `group_col` | `NULL` | Column for grouped bars |
| `sort_by` | `"union"` | Which groups to base sort order on |

#### Basic usage

```{r bar-basic}
df <- data.frame(
  category = c("Alpha", "Beta", "Gamma", "Delta", "Epsilon"),
  count = c(42, 18, 35, 27, 51)
)

plot_bar(df, x_col = "category", y_col = "count")
#> # A ggplot object — vertical bars, categories in original order
```

#### Sorted bars

```{r bar-sorted}
plot_bar(df, x_col = "category", y_col = "count", sort = TRUE)
#> # Bars ordered from tallest (Epsilon = 51) to shortest (Beta = 18)
```

#### Horizontal layout

```{r bar-horizontal}
plot_bar(df, x_col = "category", y_col = "count",
         sort = TRUE, horizontal = TRUE)
#> # Horizontal bars, sorted descending
```

#### Grouped bars

```{r bar-grouped}
df2 <- data.frame(
  category = rep(c("Alpha", "Beta", "Gamma"), each = 2),
  group = rep(c("Control", "Treatment"), 3),
  value = c(10, 15, 8, 22, 18, 20)
)

plot_bar(df2, x_col = "category", y_col = "value", group_col = "group")
#> # Side-by-side bars, two groups per category
```

---

## 2 Density Plot

### `plot_density()` — Distribution curves

Draws kernel density curves via `geom_density`. Supports multi-group overlays,
optional faceting, and palette customisation.

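
As background for the bandwidth multiplier (the `adjust` parameter), the base-R sketch below shows what a multiplicative bandwidth adjustment does in `stats::density()`. It assumes `plot_density()` forwards `adjust` to `geom_density` unchanged; that forwarding is an assumption, not something this vignette confirms.

```r
# Base-R illustration of a bandwidth multiplier; this is not evanverse code.
set.seed(42)
score <- rnorm(200, mean = 50, sd = 10)

d_default <- density(score)              # adjust = 1
d_smooth  <- density(score, adjust = 2)  # doubled bandwidth, smoother curve

# The bandwidth actually used is the default bandwidth times `adjust`.
all.equal(d_smooth$bw, 2 * d_default$bw)
#> [1] TRUE
```

Values above 1 smooth out small bumps; values below 1 follow the data more closely and can look noisy with few observations.
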

**Key parameters**

| Parameter | Default | Description |
|---|---|---|
| `data` | — | Data frame |
| `x_col` | — | Numeric column for the distribution |
| `group_col` | `NULL` | Column for overlaid group curves |
| `facet_col` | `NULL` | Column for faceted panels |
| `alpha` | `0.3` | Fill transparency |
| `adjust` | `1` | Bandwidth multiplier |
| `palette` | `NULL` | Named or unnamed character vector of colours |

#### Single distribution

```{r density-basic}
set.seed(42)
df <- data.frame(score = rnorm(200, mean = 50, sd = 10))

plot_density(df, x_col = "score")
#> # Density curve for all 200 observations
```

#### Multiple groups

```{r density-groups}
df2 <- data.frame(
  score = c(rnorm(100, 50, 10), rnorm(100, 65, 8)),
  group = rep(c("Control", "Treatment"), each = 100)
)

plot_density(df2, x_col = "score", group_col = "group")
#> # Two overlaid density curves, one per group
```

#### Faceted panels

```{r density-facet}
df3 <- data.frame(
  score = c(rnorm(150, 50, 10), rnorm(150, 60, 12)),
  group = rep(c("A", "B"), each = 150),
  cohort = rep(c("Discovery", "Replication"), 150)
)

plot_density(df3, x_col = "score", group_col = "group", facet_col = "cohort")
#> # Two facet panels (Discovery / Replication), each with two group curves
```

---

## 3 Pie Chart

### `plot_pie()` — Pie / donut chart

Accepts either a plain named vector of counts or a data frame with separate
group and count columns. Labels can show counts, percentages, both, or none.

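
The four label modes can be sketched in base R as follows. The helper `make_labels()` and its one-decimal percentage format are assumptions made for illustration; the labels `plot_pie()` actually draws may be formatted differently.

```r
# Hypothetical helper (not an evanverse function): build slice labels
# from a named vector of counts.
make_labels <- function(counts, label = c("percent", "count", "both", "none")) {
  label <- match.arg(label)
  pct <- sprintf("%.1f%%", 100 * counts / sum(counts))
  out <- switch(label,
    percent = pct,
    count = as.character(counts),
    both = paste0(counts, " (", pct, ")"),
    none = rep("", length(counts))
  )
  setNames(out, names(counts))
}

counts <- c(T_cell = 423, B_cell = 187, NK_cell = 95, Monocyte = 312)
make_labels(counts, "both")[["T_cell"]]
#> [1] "423 (41.6%)"

# A data frame with group and count columns reduces to the same named vector.
df <- data.frame(cell_type = names(counts), n = unname(counts))
identical(setNames(df$n, df$cell_type), counts)
#> [1] TRUE
```
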

**Key parameters**

| Parameter | Default | Description |
|---|---|---|
| `data` | — | Named numeric vector or data frame |
| `group_col` | `NULL` | Column for slice labels (data frame only) |
| `count_col` | `NULL` | Column for slice sizes (data frame only) |
| `label` | `"percent"` | `"count"`, `"percent"`, `"both"`, or `"none"` |
| `palette` | `NULL` | Character vector of fill colours |

#### From a named vector

```{r pie-vector}
counts <- c(T_cell = 423, B_cell = 187, NK_cell = 95, Monocyte = 312)

plot_pie(counts)
#> # Pie chart with four slices, percentage labels
```

#### From a data frame

```{r pie-df}
df <- data.frame(
  cell_type = c("T_cell", "B_cell", "NK_cell", "Monocyte"),
  n = c(423, 187, 95, 312)
)

plot_pie(df, group_col = "cell_type", count_col = "n", label = "both")
#> # Pie chart with count + percentage labels
```

#### Custom palette

```{r pie-palette}
plot_pie(counts, palette = c("#4C72B0", "#DD8452", "#55A868", "#C44E52"))
#> # Same chart with custom fill colours
```

---

## 4 Venn Diagram

### `plot_venn()` — Venn diagram for 2–4 sets

Draws classic (`ggvenn`) or gradient-filled (`ggVennDiagram`) Venn diagrams.
Both packages are in `Suggests` — install the one you need.

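
Because both backends are optional, it can help to check which one is available before calling `plot_venn()`. The sketch below uses base R's `requireNamespace()`; the `backend` lookup vector is an illustrative name, not an evanverse object.

```r
# Map each plot_venn() method to the package it relies on.
# `backend` is a name chosen for this sketch only.
backend <- c(classic = "ggvenn", gradient = "ggVennDiagram")

# TRUE where the package is installed, FALSE otherwise.
available <- vapply(backend, requireNamespace, logical(1), quietly = TRUE)
available

# Report any missing backend together with the command that installs it.
for (pkg in backend[!available]) {
  message("Missing backend: install.packages(\"", pkg, "\")")
}
```
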

**Key parameters**

| Parameter | Default | Description |
|---|---|---|
| `set1`, `set2` | — | Required input vectors |
| `set3`, `set4` | `NULL` | Optional third and fourth sets |
| `set_names` | `NULL` | Override labels (default: variable names) |
| `method` | `"classic"` | `"classic"` (ggvenn) or `"gradient"` (ggVennDiagram) |
| `label` | `"count"` | `"count"`, `"percent"`, `"both"`, or `"none"` |
| `palette` | `NULL` | Fill colours (vector for classic; palette name for gradient) |
| `return_sets` | `FALSE` | Return `list(plot, sets)` instead of just the plot |

#### Two sets

```{r venn-two}
set.seed(42)
genes_a <- sample(paste0("GENE", 1:100), 40)
genes_b <- sample(paste0("GENE", 1:100), 35)

plot_venn(genes_a, genes_b, set_names = c("Set A", "Set B"))
#> # Two-circle Venn with overlap count labels
```

#### Three sets

```{r venn-three}
genes_c <- sample(paste0("GENE", 1:100), 30)

plot_venn(genes_a, genes_b, genes_c,
          set_names = c("RNA-seq", "ATAC-seq", "ChIP-seq"),
          label = "percent")
#> # Three-circle Venn with percentage labels
```

#### Gradient method

```{r venn-gradient}
plot_venn(genes_a, genes_b, genes_c,
          method = "gradient",
          palette = "Blues")
#> # Gradient-filled Venn (requires ggVennDiagram)
```

#### Return intersection sets

```{r venn-return}
result <- plot_venn(genes_a, genes_b, return_sets = TRUE)
names(result)
#> [1] "plot" "sets"

lapply(result$sets, length)
#> $genes_a
#> [1] 40
#>
#> $genes_b
#> [1] 35
```

---

## 5 Forest Plot

### `plot_forest()` — Publication-ready forest plot

Produces a publication-ready forest plot with automatic OR (95% CI) label
generation, p-value formatting, row bolding, background shading, and three-line
borders. Built on `forestploter`.

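
To illustrate what automatic label generation and p-value formatting involve, here is a minimal base-R sketch. Both helpers (`format_ci()` and `format_p()`), the two-decimal format, and the `<0.001` cut-off are assumptions chosen for this example; the exact strings `plot_forest()` produces may differ.

```r
# Hypothetical helpers (not evanverse functions).

# Build "estimate (lower-upper)" text; header rows (NA estimate) stay blank.
format_ci <- function(est, lower, upper, digits = 2) {
  fmt <- paste0("%.", digits, "f (%.", digits, "f-%.", digits, "f)")
  ifelse(is.na(est), "", sprintf(fmt, est, lower, upper))
}

# Format p-values to three decimals, collapsing very small values.
format_p <- function(p, threshold = 0.001) {
  ifelse(is.na(p), "",
         ifelse(p < threshold, "<0.001", sprintf("%.3f", p)))
}

format_ci(est = c(NA, 1.52, 1.43),
          lower = c(NA, 1.18, 1.11),
          upper = c(NA, 1.96, 1.85))
#> [1] ""                 "1.52 (1.18-1.96)" "1.43 (1.11-1.85)"

format_p(c(NA, 0.0004, 0.006))
#> [1] ""       "<0.001" "0.006"
```

Treating `NA` as "header row" in both helpers matches the convention in the parameter table, where an `NA` estimate marks a row without a confidence interval.
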

**Key parameters**

| Parameter | Default | Description |
|---|---|---|
| `data` | — | Data frame; first column is the row label |
| `est` | — | Numeric vector of point estimates (`NA` = header row) |
| `lower`, `upper` | — | CI lower and upper bounds |
| `ci_column` | `2L` | Column position for the CI graphic |
| `ref_line` | `1` | Reference line (use `0` for beta coefficients) |
| `xlim` | `c(0, 2)` | X-axis limits |
| `p_cols` | `NULL` | Numeric column(s) to auto-format as p-values |
| `indent` | `NULL` | Per-row indentation level (0 = parent, 1+ = sub-row) |
| `bold_label` | `NULL` | Logical vector for label bolding |
| `background` | `"zebra"` | `"zebra"`, `"bold_label"`, or `"none"` |
| `save` | `FALSE` | Save output to file |
| `dest` | `NULL` | File path when `save = TRUE` |

#### Basic forest plot

```{r forest-basic}
df <- data.frame(
  item = c("Exposure vs. control", "Unadjusted", "Fully adjusted"),
  `Cases/N` = c("", "89/4521", "89/4521"),
  p_value = c(NA_real_, 0.001, 0.006),
  check.names = FALSE
)

p <- plot_forest(
  data = df,
  est = c(NA, 1.52, 1.43),
  lower = c(NA, 1.18, 1.11),
  upper = c(NA, 1.96, 1.85),
  ci_column = 2L,
  indent = c(0L, 1L, 1L),
  p_cols = "p_value",
  xlim = c(0.5, 3.0)
)
plot(p)
#> # Forest plot — header row bolded, two indented model rows,
#> # OR (95% CI) column auto-generated, significant p-values bolded
```

#### Multi-exposure plot

```{r forest-multi}
df2 <- data.frame(
  item = c("Biomarker A", " Model 1", " Model 2",
           "Biomarker B", " Model 1", " Model 2"),
  N = c("", "4521", "4521", "", "4389", "4389"),
  p_value = c(NA, 0.001, 0.012, NA, 0.034, 0.21),
  check.names = FALSE
)

p2 <- plot_forest(
  data = df2,
  est = c(NA, 1.52, 1.43, NA, 0.88, 0.91),
  lower = c(NA, 1.18, 1.11, NA, 0.72, 0.74),
  upper = c(NA, 1.96, 1.85, NA, 1.07, 1.12),
  ci_column = 2L,
  indent = c(0L, 1L, 1L, 0L, 1L, 1L),
  bold_label = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE),
  p_cols = "p_value",
  xlim = c(0.5, 2.5),
  background = "bold_label"
)
plot(p2)
#> # Two parent rows (bolded, shaded) with four indented sub-rows
```

#### Custom CI colours

```{r forest-colors}
p3 <- plot_forest(
  data = df,
  est = c(NA, 1.52, 1.43),
  lower = c(NA, 1.18, 1.11),
  upper = c(NA, 1.96, 1.85),
  ci_column = 2L,
  indent = c(0L, 1L, 1L),
  ci_col = c(NA, "#E74C3C", "#3498DB"),
  p_cols = "p_value",
  xlim = c(0.5, 3.0)
)
plot(p3)
#> # Red CI for Unadjusted model, blue CI for Fully adjusted model
```

#### Save to file

```{r forest-save}
plot_forest(
  data = df,
  est = c(NA, 1.52, 1.43),
  lower = c(NA, 1.18, 1.11),
  upper = c(NA, 1.96, 1.85),
  ci_column = 2L,
  indent = c(0L, 1L, 1L),
  p_cols = "p_value",
  xlim = c(0.5, 3.0),
  save = TRUE,
  dest = "results/forest_plot",
  save_width = 20,
  save_height = 8
)
#> # Saves forest_plot.pdf, .png, .svg, .tiff to results/
```

---

## Combined Workflow

A typical analysis might use several plot functions together:

```{r workflow}
library(evanverse)

# Group sizes as a sorted bar chart
summary_df <- data.frame(
  group = c("T_cell", "B_cell", "NK_cell", "Monocyte"),
  n = c(423, 187, 95, 312)
)
plot_bar(summary_df, x_col = "group", y_col = "n",
         sort = TRUE, horizontal = TRUE)
#> # Horizontal sorted bar chart

# Proportional composition as a pie chart
plot_pie(summary_df, group_col = "group", count_col = "n", label = "percent")
#> # Pie chart with percentage labels

# Score distributions per cell type
score_df <- data.frame(
  score = c(rnorm(423, 55, 10), rnorm(187, 62, 9),
            rnorm(95, 70, 8), rnorm(312, 48, 12)),
  group = rep(c("T_cell", "B_cell", "NK_cell", "Monocyte"),
              c(423, 187, 95, 312))
)
plot_density(score_df, x_col = "score", group_col = "group")
#> # Overlaid density curves for all four cell types

# DEG overlaps across comparisons
deg_tc <- paste0("GENE", sample(1:500, 80))
deg_bc <- paste0("GENE", sample(1:500, 65))
deg_nk <- paste0("GENE", sample(1:500, 45))

plot_venn(deg_tc, deg_bc, deg_nk,
          set_names = c("T_cell DEGs", "B_cell DEGs", "NK_cell DEGs"))
#> # Three-set Venn of differentially expressed genes
```

---

## Getting Help

```{r help}
?plot_bar
?plot_density
?plot_pie
?plot_venn
?plot_forest
help(package = "evanverse")
```