Generate Taxonomic Heatmap Single
Source: R/generate_taxa_heatmap_single.R
This function performs hierarchical clustering on microbiome data based on grouping variables and strata variables in sample metadata and generates stacked heatmaps using the “pheatmap” package. It can also save the resulting heatmap as a PDF file.
Usage
generate_taxa_heatmap_single(
data.obj,
subject.var,
time.var = NULL,
t.level = NULL,
group.var = NULL,
strata.var = NULL,
other.vars = NULL,
feature.level = NULL,
feature.dat.type = c("count", "proportion", "other"),
features.plot = NULL,
top.k.plot = NULL,
top.k.func = NULL,
prev.filter = 0.01,
abund.filter = 1e-04,
base.size = 10,
palette = NULL,
cluster.cols = NULL,
cluster.rows = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5,
...
)
Arguments
- data.obj
A list object in a format specific to MicrobiomeStat, which can include components such as feature.tab (matrix), feature.ann (matrix), meta.dat (data.frame), tree, and feature.agg.list (list). The data.obj can be converted from other formats using several functions from the MicrobiomeStat package, including: 'mStat_convert_DGEList_to_data_obj', 'mStat_convert_DESeqDataSet_to_data_obj', 'mStat_convert_phyloseq_to_data_obj', 'mStat_convert_SummarizedExperiment_to_data_obj', 'mStat_import_qiime2_as_data_obj', 'mStat_import_mothur_as_data_obj', 'mStat_import_dada2_as_data_obj', and 'mStat_import_biom_as_data_obj'. Alternatively, users can construct their own data.obj. Note that not all components of data.obj may be required for all functions in the MicrobiomeStat package.
- subject.var
The name of the subject variable in the sample metadata (meta.dat).
- time.var
The name of the time variable in the sample metadata (meta.dat).
- t.level
Character string specifying the time level/value to subset data to, if a time variable is provided. Default NULL does not subset data.
- group.var
The name of the grouping variable in the sample metadata (meta.dat).
- strata.var
The name of the strata variable in the sample metadata (meta.dat).
- other.vars
A character vector specifying additional variables from the metadata to include in the heatmap annotation. These variables will be added to the annotation columns alongside `group.var` and `strata.var`. This allows for the visualization of additional metadata information in the heatmap. Default is NULL, which means no additional variables are included.
- feature.level
The column name(s) in the feature annotation matrix (feature.ann) of data.obj to use for summarization and plotting. This can be a taxonomic level such as "Phylum" or "Genus", or any other annotation column such as "OTU_ID". Supply a character vector of one or more column names; when several are given, the data are plotted separately for each column. Default is NULL, which uses all columns in feature.ann if `features.plot` is also NULL.
- feature.dat.type
The type of the feature data, which determines how the data is handled in downstream analyses. Should be one of "count" (raw count data, which will be normalized by the function), "proportion" (data that has already been normalized to proportions/percentages), or "other" (custom abundance data with unknown scaling; no normalization is applied). The choice affects preprocessing steps as well as plot axis labels. Default is "count", which assumes raw OTU table input.
- features.plot
A character vector specifying which feature IDs (e.g. OTU IDs) to plot. Default is NULL, in which case features will be selected based on `top.k.plot` and `top.k.func`.
- top.k.plot
Integer specifying number of top k features to plot, when `features.plot` is NULL. Default is NULL, in which case all features passing filters will be plotted.
- top.k.func
Function used to rank features when selecting the top k, when `features.plot` is NULL. Options include built-in functions such as "mean" or "sd", or a custom function. Default is NULL, in which case features will be selected by abundance.
- prev.filter
Numeric value specifying the minimum prevalence threshold for filtering taxa before analysis. Taxa with prevalence below this value will be removed. Prevalence is calculated as the proportion of samples where the taxon is present. The default shown in Usage is 0.01; a value of 0 removes no taxa by prevalence filtering.
- abund.filter
Numeric value specifying the minimum abundance threshold for filtering taxa before analysis. Taxa with mean abundance below this value will be removed. Abundance refers to counts or proportions depending on `feature.dat.type`. The default shown in Usage is 1e-04; a value of 0 removes no taxa by abundance filtering.
- base.size
Base font size for the generated plots.
- palette
The color palette to be used for annotating the plots. It can be given either as a character string naming a predefined palette (available names include 'npg', 'aaas', 'nejm', 'lancet', 'jama', 'jco', and 'ucscgb') or as a vector of color codes in a format accepted by ggplot2 (e.g., hexadecimal color codes). The function uses `mStat_get_palette` to retrieve or generate the color palette. If `palette` is NULL or an unrecognized string, a default color palette is used. The colors are applied to the specified grouping variables (`group.var`, `strata.var`) in the heatmap so that each level of these variables has a unique color. If both `group.var` and `strata.var` are specified, colors are assigned to `group.var` from the start of the palette and to `strata.var` from the end, so the two annotation layers remain visually distinct.
- cluster.cols
A logical variable indicating if columns should be clustered. Default is NULL.
- cluster.rows
A logical variable indicating if rows should be clustered. Default is NULL.
- pdf
If TRUE, save the plot as a PDF file (default: TRUE)
- file.ann
The file name annotation (default: NULL)
- pdf.wid
Width of the PDF plots.
- pdf.hei
Height of the PDF plots.
- ...
Additional arguments passed to the pheatmap() function from the “pheatmap” package.
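The prevalence and abundance definitions given for `prev.filter` and `abund.filter`, and the ranking step behind `top.k.plot` and `top.k.func`, can be illustrated on a toy matrix. The following is a minimal base R sketch under stated assumptions, not the package's internal implementation: the matrix values, the thresholds, and the names `counts`, `props`, `prev_cut`, `abund_cut`, and `top_k` are all hypothetical, and taxa are assumed to be rows with samples as columns.

```r
# Toy count matrix (hypothetical values): rows are taxa, columns are samples.
counts <- matrix(
  c(10,  0,  5, 20,
     0,  0,  1,  0,
    50, 40, 60, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("TaxonA", "TaxonB", "TaxonC"), paste0("S", 1:4))
)

# Convert counts to per-sample proportions so that each column sums to 1.
props <- sweep(counts, 2, colSums(counts), "/")

# Prevalence: proportion of samples in which a taxon is present (count > 0).
prevalence <- rowMeans(counts > 0)

# Mean abundance of each taxon across samples, on the proportion scale.
mean_abund <- rowMeans(props)

# Keep taxa that meet both thresholds (illustrative cut-offs, not package defaults).
prev_cut  <- 0.5
abund_cut <- 0.01
keep <- prevalence >= prev_cut & mean_abund >= abund_cut
filtered <- props[keep, , drop = FALSE]

# Rank the remaining taxa with a summary function (here the mean) and keep the top k.
top_k <- 1
scores <- apply(filtered, 1, mean)
top_taxa <- names(sort(scores, decreasing = TRUE))[seq_len(min(top_k, length(scores)))]
```

With these toy values, TaxonB is dropped because it is present in only one of four samples, and TaxonC ranks first by mean proportion. Swapping `mean` for another summary such as `sd` in the ranking step changes which taxa count as "top".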
Examples
if (FALSE) { # \dontrun{
# Load required libraries and example data
library(MicrobiomeStat)
library(pheatmap)
data(peerj32.obj)
generate_taxa_heatmap_single(
data.obj = peerj32.obj,
subject.var = "subject",
time.var = "time",
t.level = "1",
group.var = "group",
strata.var = "sex",
other.vars = NULL,
feature.level = c("Phylum", "Family", "Genus"),
feature.dat.type = "count",
features.plot = NULL,
top.k.plot = NULL,
top.k.func = NULL,
prev.filter = 0.001,
abund.filter = 0.01,
base.size = 10,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5
)
# Same call with row clustering disabled
generate_taxa_heatmap_single(
data.obj = peerj32.obj,
subject.var = "subject",
time.var = "time",
t.level = "1",
group.var = "group",
strata.var = "sex",
other.vars = NULL,
feature.level = c("Phylum", "Family", "Genus"),
feature.dat.type = "count",
features.plot = NULL,
top.k.plot = NULL,
top.k.func = NULL,
cluster.rows = FALSE,
prev.filter = 0.001,
abund.filter = 0.01,
base.size = 10,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5
)
# Group annotation only (strata.var = NULL)
generate_taxa_heatmap_single(
data.obj = peerj32.obj,
subject.var = "subject",
time.var = "time",
t.level = "1",
group.var = "group",
strata.var = NULL,
other.vars = NULL,
feature.level = c("Phylum", "Family", "Genus"),
feature.dat.type = "count",
features.plot = NULL,
top.k.plot = NULL,
top.k.func = NULL,
prev.filter = 0.001,
abund.filter = 0.01,
base.size = 10,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5
)
# No group or strata annotation
generate_taxa_heatmap_single(
data.obj = peerj32.obj,
subject.var = "subject",
time.var = "time",
t.level = "1",
group.var = NULL,
strata.var = NULL,
other.vars = NULL,
feature.level = c("Phylum", "Family", "Genus"),
feature.dat.type = "count",
features.plot = NULL,
top.k.plot = NULL,
top.k.func = NULL,
prev.filter = 0.001,
abund.filter = 0.01,
base.size = 10,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5
)
# ECAM data supplied as proportions, with an extra annotation variable
data(ecam.obj)
generate_taxa_heatmap_single(
data.obj = ecam.obj,
subject.var = "subject.id",
time.var = "month",
t.level = "0",
group.var = "antiexposedall",
strata.var = "diet",
other.vars = "delivery",
feature.level = c("Order", "Family", "Genus"),
feature.dat.type = "proportion",
features.plot = NULL,
top.k.plot = NULL,
top.k.func = NULL,
prev.filter = 0.001,
abund.filter = 0.01,
base.size = 10,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5
)
# Restrict the heatmap to selected genera via features.plot
generate_taxa_heatmap_single(
data.obj = ecam.obj,
subject.var = "subject.id",
time.var = "month",
t.level = "0",
group.var = "antiexposedall",
strata.var = "diet",
other.vars = "delivery",
feature.level = c("Genus"),
feature.dat.type = "proportion",
features.plot = unique(ecam.obj$feature.ann[,"Genus"])[-c(1,9)],
top.k.plot = NULL,
top.k.func = NULL,
prev.filter = 0.001,
abund.filter = 0.01,
base.size = 10,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5
)
} # }
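As noted under Arguments, a `data.obj` can also be assembled by hand instead of being loaded with `data()`. The sketch below builds a small list with the three components named there (`feature.tab`, `feature.ann`, `meta.dat`) using base R only. All values and the name `toy.obj` are hypothetical, and the matching of row and column names across components is an assumption made here for consistency; whether such a minimal object is sufficient for `generate_taxa_heatmap_single()` has not been verified.

```r
set.seed(1)

# Hypothetical feature table: 4 features (rows) by 6 samples (columns).
feature.tab <- matrix(
  rpois(24, lambda = 20), nrow = 4,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:6))
)

# Feature annotation matrix: one row per feature, one column per annotation level.
# matrix() fills column by column, so the first four values form the Phylum column.
feature.ann <- matrix(
  c("Firmicutes", "Firmicutes", "Bacteroidetes", "Bacteroidetes",
    "Lactobacillus", "Clostridium", "Bacteroides", "Prevotella"),
  nrow = 4,
  dimnames = list(rownames(feature.tab), c("Phylum", "Genus"))
)

# Sample metadata: one row per sample; row names are assumed to match
# the column names of feature.tab.
meta.dat <- data.frame(
  subject = paste0("P", rep(1:3, each = 2)),
  time    = rep(c("1", "2"), times = 3),
  group   = rep(c("A", "B", "A"), each = 2),
  row.names = colnames(feature.tab),
  stringsAsFactors = FALSE
)

# Assemble the list; tree and feature.agg.list are omitted in this sketch.
toy.obj <- list(
  feature.tab = feature.tab,
  feature.ann = feature.ann,
  meta.dat    = meta.dat
)
```

The column names of `meta.dat` ("subject", "time", "group") are the values that would be supplied to `subject.var`, `time.var`, and `group.var`, and the column names of `feature.ann` are the candidates for `feature.level`.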