Filter a Microbiome Data Matrix by Prevalence and Average Abundance

This function filters taxa in a microbiome matrix based on specified prevalence and average abundance thresholds.

Usage

mStat_filter(x, prev.filter, abund.filter)

Arguments

x: A matrix of microbial abundance data with taxa in rows and samples in columns.
prev.filter: Numeric, the minimum prevalence a taxon must have to be retained. Prevalence is the proportion of samples in which the taxon is present, so the value should lie between 0 and 1.
abund.filter: Numeric, the minimum average abundance (mean across all samples) a taxon must have to be retained, expressed in the same units as x.

Value

The input matrix restricted to the taxa that meet both the prevalence and the average abundance thresholds.

Details

The function first converts the input matrix to long format (one row per taxon-sample pair) and groups the values by taxon. For each taxon it computes the average abundance and the prevalence, then removes every taxon that fails either threshold.
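
The steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name filter_sketch is hypothetical, and the sketch assumes that "present" means an abundance greater than zero, that both thresholds are inclusive (>=), and that x has row names.

```r
# Hypothetical helper (not part of the package): a base-R sketch of the
# long-format filtering described in Details.
# Assumptions: "present" means abundance > 0, thresholds are inclusive,
# and x has row names identifying the taxa.
filter_sketch <- function(x, prev.filter, abund.filter) {
  x <- as.matrix(x)

  # Long format: one row per (taxon, sample) pair, columns Var1, Var2, Freq
  long <- as.data.frame(as.table(x), stringsAsFactors = FALSE)

  # Per-taxon summaries; both vectors are named and ordered by taxon
  avg_abundance <- tapply(long$Freq, long$Var1, mean)
  prevalence <- tapply(long$Freq > 0, long$Var1, mean)

  # Taxa that meet both thresholds
  keep <- names(avg_abundance)[prevalence >= prev.filter &
                                 avg_abundance >= abund.filter]

  # drop = FALSE keeps the result a matrix even if only one taxon remains
  x[rownames(x) %in% keep, , drop = FALSE]
}

m <- matrix(c(0, 3, 4, 0, 2, 7, 8, 9, 10), ncol = 3,
            dimnames = list(c("taxa1", "taxa2", "taxa3"),
                            c("sample1", "sample2", "sample3")))
kept <- filter_sketch(m, prev.filter = 0.5, abund.filter = 5)
rownames(kept)  # "taxa3"
```

In this toy matrix taxa1 is present in only one of three samples, and taxa2 has a mean abundance of about 4.67, so only taxa3 (prevalence 1, mean 7) passes both thresholds.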

Examples


# Example with simulated data
if (FALSE) { # \dontrun{
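# matrix() fills column by column, so each group of three values is one sample:
# taxa1 = (0, 0, 8), taxa2 = (3, 2, 9), taxa3 = (4, 7, 10)
data_matrix <- matrix(c(0, 3, 4, 0, 2, 7, 8, 9, 10), ncol = 3)
colnames(data_matrix) <- c("sample1", "sample2", "sample3")
rownames(data_matrix) <- c("taxa1", "taxa2", "taxa3")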

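# Retain taxa with prevalence of at least 0.5 and average abundance of at least 5
filtered_data_simulated <- mStat_filter(data_matrix, prev.filter = 0.5, abund.filter = 5)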
print(filtered_data_simulated)
} # }

# Example with real dataset: peerj32.obj
if (FALSE) { # \dontrun{
data(peerj32.obj)
data_matrix_real <- peerj32.obj$feature.tab

# Assuming the matrix contains counts with taxa in rows and samples in columns
filtered_data_real <- mStat_filter(data_matrix_real, prev.filter = 0.5, abund.filter = 5)
print(filtered_data_real)
} # }