
This function generates boxplots that visualize changes in within-group and between-group beta diversity across multiple time points, optionally split by group.

Usage

generate_beta_change_boxplot_long(
  data.obj = NULL,
  dist.obj = NULL,
  subject.var,
  time.var,
  t0.level = NULL,
  ts.levels = NULL,
  group.var = NULL,
  strata.var = NULL,
  adj.vars = NULL,
  dist.name = c("BC", "Jaccard"),
  base.size = 16,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5,
  ...
)

Arguments

data.obj

A list object in a format specific to MicrobiomeStat, which can include components such as feature.tab (matrix), feature.ann (matrix), meta.dat (data.frame), tree, and feature.agg.list (list). The data.obj can be converted from other formats using several functions from the MicrobiomeStat package, including: 'mStat_convert_DGEList_to_data_obj', 'mStat_convert_DESeqDataSet_to_data_obj', 'mStat_convert_phyloseq_to_data_obj', 'mStat_convert_SummarizedExperiment_to_data_obj', 'mStat_import_qiime2_as_data_obj', 'mStat_import_mothur_as_data_obj', 'mStat_import_dada2_as_data_obj', and 'mStat_import_biom_as_data_obj'. Alternatively, users can construct their own data.obj. Note that not all components of data.obj may be required for all functions in the MicrobiomeStat package.

dist.obj

Distance matrix between samples, usually calculated using mStat_calculate_beta_diversity function. If NULL, beta diversity will be automatically computed from data.obj using mStat_calculate_beta_diversity.

subject.var

A string specifying the name of the subject variable.

time.var

A string specifying the name of the time variable.

t0.level

Character or numeric, baseline time point for longitudinal analysis, e.g. "week_0" or 0. Required.

ts.levels

Character vector, names of follow-up time points, e.g. c("week_4", "week_8"). Required.
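
The examples on this page derive both values from the time column of the metadata. A minimal sketch of that pattern is shown here, using a hypothetical data frame `meta` in place of `data.obj$meta.dat`; the column names and values are illustrative assumptions.

```r
# Hypothetical metadata standing in for data.obj$meta.dat
meta <- data.frame(
  subject = rep(c("s1", "s2"), each = 3),
  visit   = rep(c("week_0", "week_4", "week_8"), times = 2),
  stringsAsFactors = FALSE
)

# Distinct time points, in order of first appearance
time.levels <- unique(meta$visit)

t0.level  <- time.levels[1]   # baseline time point
ts.levels <- time.levels[-1]  # all remaining (follow-up) time points
```

Because `unique()` keeps the order of first appearance, this shortcut only picks the intended baseline when the baseline value occurs first in the column. Otherwise it is safer to name `t0.level` explicitly.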

group.var

A string specifying the name of the group variable, or NULL (default).

strata.var

A string specifying the name of the strata variable, or NULL (default).

adj.vars

A string specifying the name of the adjustment variable, or NULL (default).

dist.name

A character vector specifying which beta diversity indices to calculate. Supported indices are "BC" (Bray-Curtis), "Jaccard", "UniFrac" (unweighted UniFrac), "GUniFrac" (generalized UniFrac), "WUniFrac" (weighted UniFrac), and "JS" (Jensen-Shannon divergence). If a name is provided but the corresponding object does not exist within dist.obj, it will be computed internally. If the specific index is not supported, an error message will be returned. Default is c('BC', 'Jaccard').
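
To make the first two indices concrete, the sketch below computes a Bray-Curtis dissimilarity and a presence/absence Jaccard distance by hand for two hypothetical samples. The helper functions `bray_curtis()` and `jaccard_binary()` are illustrative and not part of MicrobiomeStat, and the package may use a different Jaccard variant (for example an abundance-based one).

```r
# Two hypothetical samples with counts for four features
x <- c(10, 0, 5, 5)
y <- c(5, 5, 0, 0)

# Bray-Curtis dissimilarity: summed absolute differences over total abundance
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# Binary Jaccard distance: 1 - (features shared) / (features present in either)
jaccard_binary <- function(a, b) {
  pa <- a > 0
  pb <- b > 0
  1 - sum(pa & pb) / sum(pa | pb)
}

bc  <- bray_curtis(x, y)     # 20 / 30
jac <- jaccard_binary(x, y)  # 1 - 1 / 4
```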

base.size

(Optional) Base font size for the plot (default is 16).

theme.choice

Plot theme choice. Specifies the visual style of the plot. Can be one of the following pre-defined themes:

- "prism": Utilizes the ggprism::theme_prism() function from the ggprism package, offering a polished and visually appealing style.
- "classic": Applies theme_classic() from ggplot2, providing a clean and traditional look with minimal styling.
- "gray": Uses theme_gray() from ggplot2, which offers a simple and modern look with a light gray background.
- "bw": Employs theme_bw() from ggplot2, creating a classic black and white plot, ideal for formal publications and situations where color is best minimized.
- "light": Implements theme_light() from ggplot2, featuring a light theme with subtle grey lines and axes, suitable for a fresh, modern look.
- "dark": Uses theme_dark() from ggplot2, offering a dark background, ideal for presentations or situations where a high-contrast theme is desired.
- "minimal": Applies theme_minimal() from ggplot2, providing a minimalist theme with the least amount of background annotations and colors.
- "void": Employs theme_void() from ggplot2, creating a blank canvas with no axes, gridlines, or background, ideal for custom, creative plots.

Each theme option adjusts various elements like background color, grid lines, and font styles to match the specified aesthetic. Default is "bw", offering a universally compatible black and white theme suitable for a wide range of applications.

custom.theme

A custom ggplot theme provided as a ggplot2 theme object. This allows users to override the default theme and provide their own theme for plotting. Custom themes are useful for creating publication-ready figures with specific formatting requirements.

To use a custom theme, create a theme object with ggplot2::theme(), including any desired customizations. Common customizations for publication-ready figures might include adjusting text size for readability, altering line sizes for clarity, and repositioning or formatting the legend. For example:

```r
my_theme <- ggplot2::theme(
  axis.title = ggplot2::element_text(size = 14, face = "bold"),  # Bold axis titles with larger font
  axis.text = ggplot2::element_text(size = 12),  # Slightly larger axis text
  legend.position = "top",  # Move legend to the top
  legend.background = ggplot2::element_rect(fill = "lightgray"),  # Light gray background for legend
  panel.background = ggplot2::element_rect(fill = "white", colour = "black"),  # White panel background with black border
  panel.grid.major = ggplot2::element_line(colour = "grey90"),  # Lighter color for major grid lines
  panel.grid.minor = ggplot2::element_blank(),  # Remove minor grid lines
  plot.title = ggplot2::element_text(size = 16, hjust = 0.5)  # Centered plot title with larger font
)
```

Then pass `my_theme` to `custom.theme`. If `custom.theme` is NULL (the default), the theme is determined by `theme.choice`. This flexibility allows for both easy theme selection for general use and detailed customization for specific presentation or publication needs.

palette

An optional parameter specifying the color palette to be used for the plot. It can be either a character string specifying the name of a predefined palette or a vector of color codes in a format accepted by ggplot2 (e.g., hexadecimal color codes). Available predefined palettes include 'npg', 'aaas', 'nejm', 'lancet', 'jama', 'jco', and 'ucscgb', inspired by various scientific publications and the `ggsci` package. If `palette` is not provided or an unrecognized palette name is given, a default color palette will be used. Ensure the number of colors in the palette is at least as large as the number of groups being plotted.
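
When a vector of color codes is supplied, the length requirement can be checked before plotting. The following is a small sketch with hypothetical group labels and hex codes; `groups`, `my.palette`, and `palette.ok` are illustrative names, and this check is not something the function is documented to run itself.

```r
# Hypothetical group labels (one per sample) and a user-supplied colour vector
groups     <- c("control", "treated", "control", "treated", "placebo")
my.palette <- c("#00468B", "#ED0000", "#42B540")

# Number of distinct groups that need their own colour
n.groups <- length(unique(groups))

# TRUE when the palette has at least one colour per group
palette.ok <- length(my.palette) >= n.groups
```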

pdf

A logical value indicating whether to save the plot as a PDF file. Default is TRUE.

file.ann

A character string specifying a custom annotation for the PDF file name, or NULL (default).

pdf.wid

(Optional) The width of the PDF file if `pdf` is set to `TRUE` (default is 11).

pdf.hei

(Optional) The height of the PDF file if `pdf` is set to `TRUE` (default is 8.5).

...

Additional parameters passed on to ggsave().

Value

A named list of ggplot objects visualizing within-group and between-group beta diversity across time points. The list contains one plot for each distance metric specified in dist.name. Each plot shows boxplots of the within-group and between-group distances at each time point, with boxes dodge-positioned by the group.var variable if provided. A line plot connecting the medians of the boxes is overlaid to show the trajectory over time.

Details

For each time point, the function calculates the within-group and between-group beta diversity (distance) between all pairs of samples at that time point.

Boxplots are generated to compare the distribution of within-group and between-group distances at each time point. The boxes are dodge-positioned to allow for easy comparison.

A line plot connecting the medians of boxes is overlaid to show the trajectory over time.

Adjustment for covariates is supported by using adjusted distances.
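
The within-group versus between-group split described above can be sketched in base R for a single time point. This is an illustration under stated assumptions, not the package's implementation: the count matrix, the group labels, and the hand-written Bray-Curtis formula are all hypothetical, and covariate adjustment is left out.

```r
# Hypothetical feature table: 4 samples (rows) x 3 features (columns),
# all collected at the same time point
counts <- matrix(
  c(10,  0, 5,
     8,  2, 5,
     1,  9, 5,
     0, 10, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("a1", "a2", "b1", "b2"), NULL)
)
group <- c(a1 = "A", a2 = "A", b1 = "B", b2 = "B")

# Every unordered pair of samples, one pair per row
pairs <- t(combn(rownames(counts), 2))

# Bray-Curtis dissimilarity for each pair
bc <- apply(pairs, 1, function(p) {
  a <- counts[p[1], ]
  b <- counts[p[2], ]
  sum(abs(a - b)) / sum(a + b)
})

# A pair is "within" when both samples share a group, otherwise "between"
type <- unname(ifelse(group[pairs[, 1]] == group[pairs[, 2]], "within", "between"))

# Median distance per pair type: the quantity an overlaid median line would track
med <- tapply(bc, type, median)
```

Repeating this per time point and stacking the results would give the long-format table that a boxplot with time on the x-axis and pair type as the dodge variable needs.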

See also

mStat_calculate_beta_diversity for creating the distance object.

Examples

if (FALSE) { # \dontrun{
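# Load required libraries and example data
library(MicrobiomeStat)
library(vegan)
data(peerj32.obj)
# Optional: precompute Bray-Curtis distances. The calls below pass
# dist.obj = NULL, so the distances are computed internally from data.obj.
dist.obj <- mStat_calculate_beta_diversity(peerj32.obj, dist.name = "BC")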
generate_beta_change_boxplot_long(
  data.obj = peerj32.obj,
  dist.obj = NULL,
  subject.var = "subject",
  time.var = "time",
  group.var = "group",
  strata.var = "sex",
  adj.vars = "sex",
  t0.level = "1",
  dist.name = c('BC'),
  base.size = 20,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = "lancet",
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

data("subset_pairs.obj")
generate_beta_change_boxplot_long(
  data.obj = subset_pairs.obj,
  dist.obj = NULL,
  subject.var = "MouseID",
  time.var = "Antibiotic",
  group.var = "Sex",
  t0.level = "Baseline",
  dist.name = c('BC'),
  base.size = 20,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = "lancet",
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

data("subset_T2D.obj")
generate_beta_change_boxplot_long(
  data.obj = subset_T2D.obj,
  dist.obj = NULL,
  subject.var = "subject_id",
  time.var = "visit_number",
  t0.level = unique(subset_T2D.obj$meta.dat$visit_number)[1],
  ts.levels = unique(subset_T2D.obj$meta.dat$visit_number)[-1],
  group.var = "subject_gender",
  dist.name = c('BC'),
  base.size = 20,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = "lancet",
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

generate_beta_change_boxplot_long(
  data.obj = subset_T2D.obj,
  dist.obj = NULL,
  subject.var = "subject_id",
  time.var = "visit_number",
  t0.level = unique(subset_T2D.obj$meta.dat$visit_number)[1],
  ts.levels = unique(subset_T2D.obj$meta.dat$visit_number)[-1],
  group.var = "subject_race",
  dist.name = c('BC'),
  base.size = 20,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = "lancet",
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

data("ecam.obj")
generate_beta_change_boxplot_long(
  data.obj = ecam.obj,
  dist.obj = NULL,
  subject.var = "subject.id",
  time.var = "month",
  t0.level = unique(ecam.obj$meta.dat$month)[1],
  ts.levels = unique(ecam.obj$meta.dat$month)[-1],
  group.var = "diet",
  dist.name = c('BC'),
  base.size = 20,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = "lancet",
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)
generate_beta_change_boxplot_long(
  data.obj = ecam.obj,
  dist.obj = NULL,
  subject.var = "subject.id",
  time.var = "month",
  t0.level = unique(ecam.obj$meta.dat$month)[1],
  ts.levels = unique(ecam.obj$meta.dat$month)[-1],
  group.var = "diet",
  strata.var = "delivery",
  dist.name = c('BC'),
  base.size = 20,
  theme.choice = "bw",
  custom.theme = NULL,
  palette = "lancet",
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)
} # }