Biomarker detection for disease classification in longitudinal microbiome data
By Chao Cheng, Hanteng Ma, Yujie Zhong, Anne-Catrin Uhlemann, Xingdong Feng, Jianhua Hu in Compositional data, Functional data analysis, High-dimensional data, Logistic regression
June 1, 2025
The microbiome is closely linked to human health. Advances in sequencing technologies have enabled in-depth studies of microbial communities and their associations with various diseases. Microbiome data are commonly normalized to the compositional scale to ensure statistical validity, and the resulting compositional covariates require special treatment. Furthermore, biomedical studies often involve repeated measurements of microbial samples, which adds complexity to the data analysis. In this paper, we focus on a liver transplant microbiome study. The main objective is to investigate the association between the colonization status of multidrug-resistant bacteria (MDRB) and the longitudinal microbial abundance profile. To accomplish this, we employ a regularized functional logistic regression model. Specifically, we use the log-contrast model with a low-rank approximation to handle the compositional covariates, and nonconvex penalties to select the important components in the covariate space. We propose an efficient estimation algorithm and establish the oracle property of the estimator. We call the new method the Functional Compositional data Quadratic Method (FCQM). We demonstrate the promise of the proposed method with extensive simulation studies and the liver transplant application.
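To make the log-contrast idea concrete, here is a minimal sketch of a static (non-longitudinal) log-contrast logistic model. It is an illustration under simplifying assumptions, not the FCQM estimator and not the LogisticFAR API: the function names, the coefficient values, and the toy counts are all hypothetical, and the functional low-rank approximation and nonconvex penalty are omitted. It only shows why a zero-sum constraint on the coefficients makes the fitted probability invariant to the scale of the abundances.

```python
import numpy as np


def log_contrast_predictor(x, beta, intercept=0.0):
    """Linear predictor intercept + sum_j beta_j * log(x_j), with sum_j beta_j = 0."""
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # The zero-sum constraint is what makes the model a log-contrast model.
    if not np.isclose(beta.sum(), 0.0):
        raise ValueError("log-contrast coefficients must sum to zero")
    return intercept + np.log(x) @ beta


def colonization_probability(x, beta, intercept=0.0):
    """Logistic link applied to the log-contrast linear predictor."""
    eta = log_contrast_predictor(x, beta, intercept)
    return 1.0 / (1.0 + np.exp(-eta))


# Hypothetical abundances for three taxa (strictly positive, so log is defined).
counts = np.array([120.0, 30.0, 50.0])
# Hypothetical coefficients chosen to sum to zero.
beta = np.array([0.8, -0.5, -0.3])

# Normalizing to proportions rescales every entry by the same factor.
composition = counts / counts.sum()

p_counts = colonization_probability(counts, beta)
p_comp = colonization_probability(composition, beta)

# Both agree, because beta.sum() * log(scale) vanishes under the constraint.
print(np.isclose(p_counts, p_comp))
```

Because the coefficients sum to zero, multiplying all abundances by a common factor adds `beta.sum() * log(factor) = 0` to the linear predictor, so raw counts and normalized proportions give the same probability.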
An R package, LogisticFAR, accompanies this paper; its source code is hosted on GitHub.
