This function performs hierarchical clustering on microbiome data based on grouping variables and strata variables in sample metadata and generates stacked heatmaps using the “pheatmap” package. It can also save the resulting heatmap as a PDF file.

Usage

generate_taxa_heatmap_single(
  data.obj,
  subject.var,
  time.var = NULL,
  t.level = NULL,
  group.var = NULL,
  strata.var = NULL,
  other.vars = NULL,
  feature.level = NULL,
  feature.dat.type = c("count", "proportion", "other"),
  features.plot = NULL,
  top.k.plot = NULL,
  top.k.func = NULL,
  prev.filter = 0.01,
  abund.filter = 1e-04,
  base.size = 10,
  palette = NULL,
  cluster.cols = NULL,
  cluster.rows = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5,
  ...
)

Arguments

data.obj

A list object in a format specific to MicrobiomeStat, which can include components such as feature.tab (matrix), feature.ann (matrix), meta.dat (data.frame), tree, and feature.agg.list (list). The data.obj can be converted from other formats using several functions from the MicrobiomeStat package, including: 'mStat_convert_DGEList_to_data_obj', 'mStat_convert_DESeqDataSet_to_data_obj', 'mStat_convert_phyloseq_to_data_obj', 'mStat_convert_SummarizedExperiment_to_data_obj', 'mStat_import_qiime2_as_data_obj', 'mStat_import_mothur_as_data_obj', 'mStat_import_dada2_as_data_obj', and 'mStat_import_biom_as_data_obj'. Alternatively, users can construct their own data.obj. Note that not all components of data.obj may be required for all functions in the MicrobiomeStat package.
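For users constructing their own data.obj, the following is a minimal sketch of the list structure described above, using only base R. The toy values, the feature and sample names, and the convention that features are rows and samples are columns are assumptions made for illustration; the optional tree and feature.agg.list components are omitted.

```r
# Hypothetical toy data: 4 features (rows) x 3 samples (columns).
feature.tab <- matrix(
  c(10, 0, 5,
     3, 8, 0,
     0, 2, 7,
     6, 6, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:3))
)

# One annotation row per feature; row names match feature.tab (assumed convention).
feature.ann <- matrix(
  c("Firmicutes", "Lactobacillus",
    "Firmicutes", "Clostridium",
    "Bacteroidetes", "Bacteroides",
    "Bacteroidetes", "Prevotella"),
  nrow = 4, byrow = TRUE,
  dimnames = list(rownames(feature.tab), c("Phylum", "Genus"))
)

# One metadata row per sample; row names match the columns of feature.tab
# (assumed convention).
meta.dat <- data.frame(
  subject = c("A", "B", "C"),
  time = c("1", "1", "1"),
  group = c("Placebo", "LGG", "Placebo"),
  row.names = colnames(feature.tab)
)

# Assemble the list; tree and feature.agg.list are left out of this sketch.
data.obj <- list(
  feature.tab = feature.tab,
  feature.ann = feature.ann,
  meta.dat = meta.dat
)
```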

subject.var

The name of the subject variable in the sample metadata (meta.dat).

time.var

The name of the time variable in the sample metadata (meta.dat). Default is NULL.

t.level

Character string specifying the time level/value to subset data to, if a time variable is provided. Default NULL does not subset data.

group.var

The name of the grouping variable in the sample metadata (meta.dat). Default is NULL.

strata.var

The name of the strata variable in the sample metadata (meta.dat). Default is NULL.

other.vars

A character vector specifying additional variables from the metadata to include in the heatmap annotation. These variables will be added to the annotation columns alongside `group.var` and `strata.var`. This allows for the visualization of additional metadata information in the heatmap. Default is NULL, which means no additional variables are included.

feature.level

The column name(s) in the feature annotation matrix (feature.ann) of data.obj to use for summarization and plotting, given as a character vector. This can be a taxonomic level such as "Phylum" or "Genus", or any other annotation column such as "OTU_ID". When multiple columns are provided, data are plotted separately for each column. Default is NULL, in which case all columns in feature.ann are used if `features.plot` is also NULL.

feature.dat.type

The type of the feature data, which determines how the data is handled in downstream analyses. Should be one of:

- "count": Raw count data, will be normalized by the function.
- "proportion": Data that has already been normalized to proportions/percentages.
- "other": Custom abundance data that has unknown scaling. No normalization applied.

The choice affects preprocessing steps as well as plot axis labels. Default is "count", which assumes raw OTU table input.

features.plot

A character vector specifying which feature IDs (e.g. OTU IDs) to plot. Default is NULL, in which case features will be selected based on `top.k.plot` and `top.k.func`.

top.k.plot

Integer specifying number of top k features to plot, when `features.plot` is NULL. Default is NULL, in which case all features passing filters will be plotted.

top.k.func

Function used to rank features when selecting the top k, when `features.plot` is NULL. Options include built-in functions such as "mean" or "sd", or a custom function. Default is NULL, in which case features will be selected by abundance.
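The idea behind `top.k.plot` and `top.k.func` can be sketched in base R as follows. This is an illustration only, not the package's implementation: `select_top_k` and the toy matrix are hypothetical, and features are assumed to be rows.

```r
# Hypothetical proportions: 4 features (rows) x 3 samples (columns).
abund <- matrix(
  c(0.50, 0.40, 0.45,
    0.30, 0.10, 0.05,
    0.15, 0.45, 0.10,
    0.05, 0.05, 0.40),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Taxon", 1:4), paste0("S", 1:3))
)

# Hypothetical helper: score every feature with `score.func`,
# then return the names of the k highest-scoring features.
select_top_k <- function(mat, k, score.func = mean) {
  scores <- apply(mat, 1, score.func)
  names(sort(scores, decreasing = TRUE))[seq_len(min(k, nrow(mat)))]
}

top.by.mean <- select_top_k(abund, k = 2, score.func = mean)
top.by.sd <- select_top_k(abund, k = 2, score.func = sd)
```

Ranking by mean favors consistently abundant features, whereas ranking by standard deviation favors features that vary most across samples, so the two choices can select different sets.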

prev.filter

Numeric value specifying the minimum prevalence threshold for filtering taxa before analysis. Taxa with prevalence below this value will be removed. Prevalence is calculated as the proportion of samples where the taxon is present. Default is 0.01; a value of 0 removes no taxa by prevalence filtering.

abund.filter

Numeric value specifying the minimum abundance threshold for filtering taxa before analysis. Taxa with mean abundance below this value will be removed. Abundance refers to counts or proportions depending on feature.dat.type. Default is 1e-04; a value of 0 removes no taxa by abundance filtering.
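The two filters can be illustrated with a short base R sketch. It assumes features are rows, that a taxon counts as present when its value is greater than zero, and that both conditions must hold for a taxon to be kept; the toy matrix and thresholds are hypothetical, and the package's internal code may differ.

```r
# Hypothetical proportions: 4 features (rows) x 4 samples (columns).
abund <- matrix(
  c(0.60, 0.5000, 0.70, 0.40,
    0.40, 0.0000, 0.00, 0.00,
    0.00, 0.4999, 0.30, 0.60,
    0.00, 0.0001, 0.00, 0.00),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Taxon", LETTERS[1:4]), paste0("S", 1:4))
)

# Prevalence: proportion of samples in which the taxon is present (> 0).
prevalence <- rowMeans(abund > 0)
# Mean abundance across samples.
mean.abund <- rowMeans(abund)

# Illustrative thresholds, chosen for this toy data.
prev.filter <- 0.5
abund.filter <- 1e-04

# Keep taxa that reach both thresholds; taxa below either one are removed.
keep <- prevalence >= prev.filter & mean.abund >= abund.filter
filtered <- abund[keep, , drop = FALSE]
```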

base.size

Base font size for the generated plots.

palette

The color palette to be used for annotating the plots. This parameter can be specified in several ways:

- As a character string representing a predefined palette name. Available predefined palettes include 'npg', 'aaas', 'nejm', 'lancet', 'jama', 'jco', and 'ucscgb'.
- As a vector of color codes in a format accepted by ggplot2 (e.g., hexadecimal color codes).

The function uses `mStat_get_palette` to retrieve or generate the color palette. If `palette` is NULL or an unrecognized string, a default color palette will be used. The colors are applied to the specified grouping variables (`group.var`, `strata.var`) in the heatmap, ensuring each level of these variables is associated with a unique color. If both `group.var` and `strata.var` are specified, the function assigns colors to `group.var` from the start of the palette and to `strata.var` from the end, ensuring distinct color representations for each annotation layer.
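The start-versus-end assignment described above can be sketched in base R. The palette values, level names, and variable names below are hypothetical, and this is not the code of `mStat_get_palette`; it only shows how taking colors from opposite ends keeps the two annotation layers distinct.

```r
# Hypothetical palette of hex colors (any vector of valid R colors would do).
pal <- c("#E64B35", "#4DBBD5", "#00A087", "#3C5488", "#F39B7F", "#8491B4")

# Hypothetical levels of the grouping and strata variables.
group.levels <- c("Placebo", "LGG")
strata.levels <- c("female", "male")

# Group colors are taken from the start of the palette ...
group.cols <- setNames(pal[seq_along(group.levels)], group.levels)
# ... and strata colors from the end. The two layers share no colors as long
# as the palette holds at least as many colors as there are levels in total.
strata.cols <- setNames(rev(pal)[seq_along(strata.levels)], strata.levels)

# pheatmap() accepts a named list like this via its `annotation_colors` argument.
annotation.colors <- list(group = group.cols, strata = strata.cols)
```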

cluster.cols

A logical value indicating whether columns should be clustered. Default is NULL.

cluster.rows

A logical value indicating whether rows should be clustered. Default is NULL.

pdf

If TRUE, save the plot as a PDF file (default: TRUE)

file.ann

The file name annotation (default: NULL)

pdf.wid

Width of the PDF plots.

pdf.hei

Height of the PDF plots.

...

Additional arguments passed to the pheatmap() function from the “pheatmap” package.

Value

An object of class pheatmap, the generated heatmap plot

Examples

if (FALSE) { # \dontrun{
# Load required libraries and example data
library(MicrobiomeStat)
library(pheatmap)

# peerj32 data: group and strata annotations at time point "1"
data(peerj32.obj)
generate_taxa_heatmap_single(
  data.obj = peerj32.obj,
  subject.var = "subject",
  time.var = "time",
  t.level = "1",
  group.var = "group",
  strata.var = "sex",
  other.vars = NULL,
  feature.level = c("Phylum", "Family", "Genus"),
  feature.dat.type = "count",
  features.plot = NULL,
  top.k.plot = NULL,
  top.k.func = NULL,
  prev.filter = 0.001,
  abund.filter = 0.01,
  base.size = 10,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

# Same call with row clustering disabled
generate_taxa_heatmap_single(
  data.obj = peerj32.obj,
  subject.var = "subject",
  time.var = "time",
  t.level = "1",
  group.var = "group",
  strata.var = "sex",
  other.vars = NULL,
  feature.level = c("Phylum", "Family", "Genus"),
  feature.dat.type = "count",
  features.plot = NULL,
  top.k.plot = NULL,
  top.k.func = NULL,
  cluster.rows = FALSE,
  prev.filter = 0.001,
  abund.filter = 0.01,
  base.size = 10,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

# Group annotation only (no strata variable)
generate_taxa_heatmap_single(
  data.obj = peerj32.obj,
  subject.var = "subject",
  time.var = "time",
  t.level = "1",
  group.var = "group",
  strata.var = NULL,
  other.vars = NULL,
  feature.level = c("Phylum", "Family", "Genus"),
  feature.dat.type = "count",
  features.plot = NULL,
  top.k.plot = NULL,
  top.k.func = NULL,
  prev.filter = 0.001,
  abund.filter = 0.01,
  base.size = 10,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

# No group or strata annotation
generate_taxa_heatmap_single(
  data.obj = peerj32.obj,
  subject.var = "subject",
  time.var = "time",
  t.level = "1",
  group.var = NULL,
  strata.var = NULL,
  other.vars = NULL,
  feature.level = c("Phylum", "Family", "Genus"),
  feature.dat.type = "count",
  features.plot = NULL,
  top.k.plot = NULL,
  top.k.func = NULL,
  prev.filter = 0.001,
  abund.filter = 0.01,
  base.size = 10,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

# ecam data: proportion input with an additional annotation variable
data(ecam.obj)
generate_taxa_heatmap_single(
  data.obj = ecam.obj,
  subject.var = "subject.id",
  time.var = "month",
  t.level = "0",
  group.var = "antiexposedall",
  strata.var = "diet",
  other.vars = "delivery",
  feature.level = c("Order", "Family", "Genus"),
  feature.dat.type = "proportion",
  features.plot = NULL,
  top.k.plot = NULL,
  top.k.func = NULL,
  prev.filter = 0.001,
  abund.filter = 0.01,
  base.size = 10,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)

# Plot only selected genera via features.plot
generate_taxa_heatmap_single(
  data.obj = ecam.obj,
  subject.var = "subject.id",
  time.var = "month",
  t.level = "0",
  group.var = "antiexposedall",
  strata.var = "diet",
  other.vars = "delivery",
  feature.level = c("Genus"),
  feature.dat.type = "proportion",
  features.plot = unique(ecam.obj$feature.ann[,"Genus"])[-c(1,9)],
  top.k.plot = NULL,
  top.k.func = NULL,
  prev.filter = 0.001,
  abund.filter = 0.01,
  base.size = 10,
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5
)
} # }