Generate taxa area plots over time
Source: R/generate_taxa_areaplot_long.R
This function generates taxa area plots for a given data object. The plots will show the relative abundance of different taxa over time. Raw count data will be automatically normalized using rarefaction and total sum scaling (TSS). The function also supports the generation of plots for grouped data and stratified data.
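To make the normalization step concrete, the following is a minimal base R sketch of total sum scaling on a toy count matrix. The matrix and its taxon and sample names are hypothetical, and this is only an illustration of the TSS idea, not the package's internal code. Rarefaction is not shown.

```r
# Toy count matrix: rows are taxa, columns are samples (hypothetical data).
counts <- matrix(
  c(10,  0, 30,
    40,  5, 20,
    50, 95, 50),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC"), c("s1", "s2", "s3"))
)

# Total sum scaling: divide each sample (column) by its library size.
tss <- sweep(counts, 2, colSums(counts), "/")

# Each sample now sums to 1, so entries read as relative abundances.
colSums(tss)
```

Because every column sums to 1 after scaling, the stacked areas in a relative abundance plot fill the full height at each time point.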
Usage
generate_taxa_areaplot_long(
data.obj,
subject.var,
time.var,
group.var = NULL,
strata.var = NULL,
feature.level = "original",
feature.dat.type = c("count", "proportion", "other"),
feature.number = 20,
features.plot = NULL,
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
custom.theme = NULL,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5,
...
)
Arguments
- data.obj
A list object in a format specific to MicrobiomeStat, which can include components such as feature.tab (matrix), feature.ann (matrix), meta.dat (data.frame), tree, and feature.agg.list (list). The data.obj can be converted from other formats using several functions from the MicrobiomeStat package, including: 'mStat_convert_DGEList_to_data_obj', 'mStat_convert_DESeqDataSet_to_data_obj', 'mStat_convert_phyloseq_to_data_obj', 'mStat_convert_SummarizedExperiment_to_data_obj', 'mStat_import_qiime2_as_data_obj', 'mStat_import_mothur_as_data_obj', 'mStat_import_dada2_as_data_obj', and 'mStat_import_biom_as_data_obj'. Alternatively, users can construct their own data.obj. Note that not all components of data.obj may be required for all functions in the MicrobiomeStat package.
- subject.var
Character string specifying the column name in metadata containing unique subject IDs. Required to connect samples from the same subject.
- time.var
Character string specifying the column name in metadata containing the time variable. Required to order and connect samples over time.
- group.var
Character string specifying the column name in metadata containing grouping categories. Used for coloring lines in the plot. Optional, can be NULL.
- strata.var
Character string specifying the column name in metadata containing stratification categories. Used for nested faceting in the plots. Optional, can be NULL.
- feature.level
Character vector specifying taxonomic level(s) to use for plotting, e.g. c("Phylum", "Genus"). The special value "original" can also be provided, which will use the original taxon identifiers. Multiple levels can be specified and data will be plotted separately for each. **Cannot be NULL, as NULL value will lead to errors.** Default is "original".
- feature.dat.type
The type of the feature data, which determines how the data is handled in downstream analyses. Should be one of:
  - "count": Raw count data, will be normalized by the function.
  - "proportion": Data that has already been normalized to proportions/percentages.
  - "other": Custom abundance data that has unknown scaling. No normalization applied.
The choice affects preprocessing steps as well as plot axis labels. Default is "count", which assumes raw OTU table input.
- feature.number
A numeric value indicating the number of top abundant features to retain in the plot. Features whose average relative abundance ranks below this number will be grouped into 'Other'. Default is 20.
- features.plot
A character vector specifying which feature IDs (e.g. OTU IDs) to plot. Default is NULL, in which case the most abundant features are selected according to `feature.number`.
- t0.level
Character or numeric, baseline time point for longitudinal analysis, e.g. "week_0" or 0. Default is NULL, as used in the examples below.
- ts.levels
Character vector, names of follow-up time points, e.g. c("week_4", "week_8"). Default is NULL, as used in the examples below.
- base.size
The base size for the ggplot2 theme. Default is 10.
- theme.choice
Plot theme choice. Specifies the visual style of the plot. Can be one of the following pre-defined themes:
  - "prism": Utilizes the ggprism::theme_prism() function from the ggprism package, offering a polished and visually appealing style.
  - "classic": Applies theme_classic() from ggplot2, providing a clean and traditional look with minimal styling.
  - "gray": Uses theme_gray() from ggplot2, which offers a simple and modern look with a light gray background.
  - "bw": Employs theme_bw() from ggplot2, creating a classic black and white plot, ideal for formal publications and situations where color is best minimized.
  - "light": Implements theme_light() from ggplot2, featuring a light theme with subtle grey lines and axes, suitable for a fresh, modern look.
  - "dark": Uses theme_dark() from ggplot2, offering a dark background, ideal for presentations or situations where a high-contrast theme is desired.
  - "minimal": Applies theme_minimal() from ggplot2, providing a minimalist theme with the least amount of background annotations and colors.
  - "void": Employs theme_void() from ggplot2, creating a blank canvas with no axes, gridlines, or background, ideal for custom, creative plots.
Each theme option adjusts various elements like background color, grid lines, and font styles to match the specified aesthetic. Default is "bw", offering a universally compatible black and white theme suitable for a wide range of applications.
- custom.theme
A custom ggplot theme provided as a ggplot2 theme object. This allows users to override the default theme and provide their own theme for plotting. Custom themes are useful for creating publication-ready figures with specific formatting requirements.
To use a custom theme, create a theme object with ggplot2::theme(), including any desired customizations. Common customizations for publication-ready figures might include adjusting text size for readability, altering line sizes for clarity, and repositioning or formatting the legend. For example:
```r
my_theme <- ggplot2::theme(
  axis.title = ggplot2::element_text(size = 14, face = "bold"),  # Bold axis titles with larger font
  axis.text = ggplot2::element_text(size = 12),  # Slightly larger axis text
  legend.position = "top",  # Move legend to the top
  legend.background = ggplot2::element_rect(fill = "lightgray"),  # Light gray background for legend
  panel.background = ggplot2::element_rect(fill = "white", colour = "black"),  # White panel background with black border
  panel.grid.major = ggplot2::element_line(colour = "grey90"),  # Lighter color for major grid lines
  panel.grid.minor = ggplot2::element_blank(),  # Remove minor grid lines
  plot.title = ggplot2::element_text(size = 16, hjust = 0.5)  # Centered plot title with larger font
)
```
Then pass `my_theme` to `custom.theme`. If `custom.theme` is NULL (the default), the theme is determined by `theme.choice`. This flexibility allows for both easy theme selection for general use and detailed customization for specific presentation or publication needs.
- palette
Character vector specifying colors to use for mapping features to color aesthetic. Should be same length as number of features. If NULL, default palette will be used. Colors will be mapped to features based on order of features. This parameter does not represent groups, it is only used for feature colors.
- pdf
Logical indicating if the plot should be saved as a PDF. Default is TRUE.
- file.ann
Optional, a file annotation. Default is NULL.
- pdf.wid
Width of the output PDF. Default is 11.
- pdf.hei
Height of the output PDF. Default is 8.5.
- ...
Additional arguments to pass to the function.
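The interplay between `feature.number` and `palette` described above can be sketched in base R. The example below ranks taxa by mean relative abundance, keeps the top ones, collapses the rest into "Other", and builds a color vector with one entry per plotted feature. The toy matrix is hypothetical, the code is an illustration rather than the package's implementation, and whether the palette must also cover the "Other" category is an assumption made here.

```r
# Hypothetical relative abundances: 5 taxa (rows) by 3 samples (columns).
prop <- matrix(
  c(0.50, 0.40, 0.45,
    0.20, 0.30, 0.25,
    0.15, 0.15, 0.15,
    0.10, 0.10, 0.10,
    0.05, 0.05, 0.05),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:5), paste0("s", 1:3))
)

feature.number <- 3  # keep the 3 most abundant taxa

# Rank taxa by mean relative abundance across samples and keep the top ones.
top <- names(sort(rowMeans(prop), decreasing = TRUE))[seq_len(feature.number)]

# Collapse all remaining taxa into a single "Other" row.
other <- colSums(prop[!rownames(prop) %in% top, , drop = FALSE])
collapsed <- rbind(prop[top, , drop = FALSE], Other = other)

# One color per plotted feature, in feature order
# (assumption: "Other" also needs a color).
my_palette <- grDevices::hcl.colors(nrow(collapsed), palette = "Dark 3")
```

A vector such as `my_palette` could then be supplied through the `palette` argument, with colors assigned to features in order.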
Examples
if (FALSE) { # \dontrun{
library(ggh4x)
library(vegan)
data(ecam.obj)
generate_taxa_areaplot_long(
data.obj = ecam.obj,
subject.var = "studyid",
time.var = "month_num",
group.var = "delivery",
strata.var = "diet",
feature.level = c("Genus"),
feature.dat.type = "proportion",
feature.number = 40,
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
palette = NULL,
pdf = TRUE,
file.ann = NULL
)
generate_taxa_areaplot_long(
data.obj = ecam.obj,
subject.var = "studyid",
time.var = "month_num",
group.var = "delivery",
strata.var = "diet",
feature.level = c("Genus"),
feature.dat.type = "proportion",
feature.number = 20,
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
palette = NULL,
pdf = TRUE,
file.ann = NULL
)
generate_taxa_areaplot_long(
data.obj = ecam.obj,
subject.var = "studyid",
time.var = "month_num",
group.var = "delivery",
strata.var = "diet",
feature.level = c("Genus"),
feature.dat.type = "proportion",
feature.number = 20,
features.plot = unique(ecam.obj$feature.ann[,"Genus"])[1:15],
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
palette = NULL,
pdf = TRUE,
file.ann = NULL
)
data(subset_T2D.obj)
generate_taxa_areaplot_long(
data.obj = subset_T2D.obj,
subject.var = "subject_id",
time.var = "visit_number_num",
group.var = "subject_gender",
strata.var = "subject_race",
feature.level = c("Genus"),
feature.dat.type = "count",
feature.number = 40,
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
palette = NULL,
pdf = TRUE,
pdf.wid = 49,
file.ann = NULL
)
generate_taxa_areaplot_long(
data.obj = subset_T2D.obj,
subject.var = "subject_id",
time.var = "visit_number_num",
group.var = "subject_id",
strata.var = "subject_gender",
feature.level = c("Genus"),
feature.dat.type = "count",
feature.number = 40,
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
palette = NULL,
pdf = TRUE,
pdf.wid = 49,
file.ann = NULL
)
generate_taxa_areaplot_long(
data.obj = subset_T2D.obj,
subject.var = "subject_id",
time.var = "visit_number_num",
group.var = "sample_body_site",
strata.var = "subject_race",
feature.level = c("Genus"),
feature.dat.type = "count",
feature.number = 40,
t0.level = NULL,
ts.levels = NULL,
base.size = 10,
theme.choice = "bw",
palette = NULL,
pdf = TRUE,
pdf.wid = 49,
file.ann = NULL
)
} # }
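The examples above rely on datasets bundled with the package. For users constructing their own `data.obj`, the following is a minimal sketch of assembling the list by hand from the components named in the `data.obj` argument description. All values and column names are hypothetical, and the layout (features in rows and samples in columns of `feature.tab`, with sample IDs as row names of `meta.dat`) is an assumption that should be checked against the package documentation.

```r
# Hypothetical toy data: 3 features measured in 4 samples from 2 subjects.
feature.tab <- matrix(
  c(12, 30,  8, 25,
    40, 10, 35, 15,
     5, 20,  7, 18),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("otu1", "otu2", "otu3"), c("s1", "s2", "s3", "s4"))
)

# Taxonomic annotation: one row per feature, one column per rank.
feature.ann <- matrix(
  c("Firmicutes",    "Lactobacillus",
    "Bacteroidetes", "Bacteroides",
    "Firmicutes",    "Streptococcus"),
  nrow = 3, byrow = TRUE,
  dimnames = list(rownames(feature.tab), c("Phylum", "Genus"))
)

# Sample metadata: one row per sample, with subject and time columns.
meta.dat <- data.frame(
  subject = c("A", "A", "B", "B"),
  month = c(0, 1, 0, 1),
  row.names = colnames(feature.tab)
)

data.obj <- list(
  feature.tab = feature.tab,
  feature.ann = feature.ann,
  meta.dat = meta.dat
)

# Basic consistency checks before plotting.
stopifnot(
  identical(colnames(data.obj$feature.tab), rownames(data.obj$meta.dat)),
  identical(rownames(data.obj$feature.tab), rownames(data.obj$feature.ann))
)

# Not run here: a call on this object might look like
# generate_taxa_areaplot_long(data.obj = data.obj, subject.var = "subject",
#   time.var = "month", feature.level = "Genus",
#   feature.dat.type = "count", pdf = FALSE)
```

The consistency checks catch the most common assembly mistake, mismatched sample or feature identifiers across components, before any plotting function is called.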