Longitudinal Differential Abundance Test in Microbiome Data: generate_taxa_per_time_test_long

This function performs differential abundance testing at each time point in longitudinal microbiome data. It is designed to show how the abundance of microbial taxa differs between groups or conditions as time progresses.

Usage

generate_taxa_per_time_test_long(
  data.obj,
  subject.var,
  time.var = NULL,
  group.var = NULL,
  adj.vars = NULL,
  feature.level,
  prev.filter = 0,
  abund.filter = 0,
  feature.dat.type = c("count", "proportion"),
  ...
)

Arguments

data.obj: A MicrobiomeStat data object containing microbiome data and metadata.
subject.var: A string specifying the column name in meta.dat that uniquely identifies each subject.
time.var: Optional; a string naming the time variable in meta.dat. If provided, it enables longitudinal analysis, with tests run for each time point.
group.var: Optional; a string specifying the group variable in meta.dat for between-group comparisons.
adj.vars: Optional; a vector of strings representing covariates in meta.dat for adjustment in the analysis.
feature.level: A string or vector of strings indicating the taxonomic level(s) for analysis (e.g., "Phylum", "Class").
prev.filter: Numeric; the minimum prevalence a taxon must reach to be included in the analysis. Defaults to 0.
abund.filter: Numeric; the minimum abundance a taxon must reach to be included in the analysis. Defaults to 0.
feature.dat.type: Character; "count" or "proportion", indicating the type of feature data.
...: Additional arguments passed to other methods.
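
The following is a minimal base R sketch of what prevalence and abundance thresholds of this kind typically mean. It is an illustration, not the package's own code: the toy count table, the definition of prevalence as the fraction of samples with a non-zero count, the use of mean relative abundance, and the `>=` comparisons are all assumptions.

```r
# Toy count table (hypothetical data): taxa in rows, samples in columns.
counts <- matrix(
  c(10,   0,  5, 20,
     0,   0,  1,  0,
    90, 100, 94, 80),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC"), paste0("s", 1:4))
)

# Convert counts to per-sample proportions.
props <- sweep(counts, 2, colSums(counts), "/")

# Assumed definition: prevalence is the fraction of samples where the taxon is present.
prevalence <- rowMeans(counts > 0)

# Assumed definition: abundance is the mean relative abundance across samples.
mean_abund <- rowMeans(props)

# Thresholds named after the function arguments; the values are illustrative.
prev.filter <- 0.5
abund.filter <- 0.01

# Keep taxa that pass both thresholds.
keep <- prevalence >= prev.filter & mean_abund >= abund.filter
filtered <- counts[keep, , drop = FALSE]
print(rownames(filtered))
```

In this toy table, taxonB appears in only one of four samples and falls below both thresholds, so it is the only taxon dropped.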

Value

A nested list. The top level has one element per time point, and each of those elements is a list of data frames, one per taxonomic level. Each data frame holds the statistical results for the taxa at that level and time point.
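
As a rough illustration of working with a result of this shape, the sketch below flattens a mock nested list (time point, then taxonomic level, then data frame) into one long data frame. The object `mock_result`, its element names, and the columns `taxon` and `p_value` are hypothetical placeholders; the real output may use different names and columns, so inspect it with `str()` first.

```r
# Mock object mirroring the documented shape: time point -> level -> data frame.
# All names and columns here are hypothetical, not the package's actual output.
mock_result <- list(
  "1" = list(
    Phylum = data.frame(taxon = c("p1", "p2"), p_value = c(0.01, 0.40)),
    Class  = data.frame(taxon = "c1", p_value = 0.20)
  ),
  "2" = list(
    Phylum = data.frame(taxon = c("p1", "p2"), p_value = c(0.03, 0.04)),
    Class  = data.frame(taxon = "c1", p_value = 0.90)
  )
)

# Walk both levels of nesting and tag every row with its time point and level.
flat <- do.call(rbind, lapply(names(mock_result), function(tp) {
  do.call(rbind, lapply(names(mock_result[[tp]]), function(lvl) {
    data.frame(time = tp, level = lvl, mock_result[[tp]][[lvl]])
  }))
}))
rownames(flat) <- NULL

# One row per taxon, level, and time point.
print(flat)
```

A flat table like this is convenient for filtering on a significance column or for passing to plotting code.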

Details

The function combines data manipulation, normalization, and statistical testing to assess whether taxa differ significantly over time or between groups. It supports covariate adjustment and accepts both count and proportion data.

The function builds a mixed-effects model formula from the supplied variables, with fixed effects and random effects that account for repeated measures within subjects. It filters taxa by the prevalence and abundance thresholds and applies normalization and aggregation as needed.
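
To make the formula construction concrete, here is a small sketch of how such a formula could be assembled from the argument values. The helper `build_formula` is hypothetical, and the lme4-style random intercept `(1 | subject)` is an assumption about the syntax; the function's internal formula may differ.

```r
# Hypothetical helper: assemble a one-sided model formula from variable names.
build_formula <- function(group.var = NULL, adj.vars = NULL, subject.var = NULL) {
  # Fixed effects: the group variable followed by any adjustment covariates.
  fixed <- c(group.var, adj.vars)
  if (length(fixed) == 0) fixed <- "1"  # intercept-only fallback
  rhs <- paste(fixed, collapse = " + ")

  # Assumed lme4-style random intercept per subject for repeated measures.
  if (!is.null(subject.var)) {
    rhs <- paste0(rhs, " + (1 | ", subject.var, ")")
  }
  as.formula(paste("~", rhs))
}

f <- build_formula(group.var = "delivery", adj.vars = "diet", subject.var = "studyid")
print(f)
```

Building the formula from strings keeps the optional arguments simple to handle: a NULL group.var or adj.vars just drops out of the fixed-effects vector.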

Importantly, the function runs the differential abundance analysis separately for each time point in the longitudinal data. This identifies taxa that differ significantly at specific time points and shows how those differences develop over the course of the study.
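
The per-time-point strategy can be sketched in base R as follows. This is only an illustration of the split-then-test pattern: the simulated data are hypothetical, and an ordinary linear model (`lm`) stands in for whatever differential abundance method the function actually applies.

```r
set.seed(1)

# Hypothetical long-format data: 6 subjects, 3 time points, one taxon.
meta <- expand.grid(subject = paste0("id", 1:6), time = c(0, 1, 2))
meta$group <- ifelse(meta$subject %in% paste0("id", 1:3), "A", "B")

# Simulate a group difference that appears only at the last time point.
shift <- ifelse(meta$group == "B" & meta$time == 2, 3, 0)
meta$abund <- rnorm(nrow(meta), mean = shift)

# Split by time point and fit one model per subset (lm is a stand-in test).
per_time <- lapply(split(meta, meta$time), function(d) {
  fit <- lm(abund ~ group, data = d)
  coef(summary(fit))["groupB", c("Estimate", "Pr(>|t|)")]
})

# One result per time point, named by the time value.
print(names(per_time))
```

Because each subset contains a single observation per subject, every fit here is a plain cross-sectional comparison at that time point.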

Examples

if (FALSE) { # \dontrun{
# Example 1: Analyzing the ECAM dataset
data("ecam.obj")

# Analyzing the impact of delivery method on microbial composition over months
result1 <- generate_taxa_per_time_test_long(
  data.obj = ecam.obj,
  subject.var = "studyid",
  time.var = "month_num",
  group.var = "delivery",
  adj.vars = "diet",
  feature.level = c("Phylum", "Class"),
  feature.dat.type = "proportion"
)

# Visualizing the results for the ECAM dataset
dotplot_ecam <- generate_taxa_per_time_dotplot_long(
  data.obj = ecam.obj,
  test.list = result1,
  group.var = "delivery",
  time.var = "month_num",
  feature.level = c("Phylum", "Class")
)

# Example 2: Analyzing the Type 2 Diabetes dataset
data("subset_T2D.obj")

# Longitudinal analysis of microbial changes in different racial groups
result2 <- generate_taxa_per_time_test_long(
  data.obj = subset_T2D.obj,
  subject.var = "subject_id",
  time.var = "visit_number_num",
  group.var = "subject_race",
  adj.vars = "sample_body_site",
  prev.filter = 0.1,
  abund.filter = 0.001,
  feature.level = c("Genus", "Family"),
  feature.dat.type = "count"
)

# Visualizing the results for the Type 2 Diabetes dataset
dotplot_T2D <- generate_taxa_per_time_dotplot_long(
  data.obj = subset_T2D.obj,
  test.list = result2,
  group.var = "subject_race",
  time.var = "visit_number_num",
  t0.level = unique(subset_T2D.obj$meta.dat$visit_number_num)[1],
  ts.levels = unique(subset_T2D.obj$meta.dat$visit_number_num)[-1],
  feature.level = c("Genus", "Family")
)
} # }