Variable Selection under Logistic Regression for Compositional Functional Data

By Chao Cheng in Functional data analysis Compositional data Variable selection Microbiome

June 12, 2022

Abstract

This is my talk at the 2022 International Workshop on Complex Functional Data Analysis.

Date

June 11 – 12, 2022

Time

8:20 AM – 6:00 PM

Location

Online

Event

The gut microbiome has been shown to be closely related to human health. Over the course of a study, researchers often take multiple samples for sequencing and identifying the microbiome, which naturally yields a set of trajectories describing this dynamic ecosystem. In this paper we propose a logistic model for high-dimensional functional compositional data to analyze the relationship between the gut microbiome and the colonization status of multidrug-resistant bacteria (MDRB) after liver transplantation. The proposed model builds on the linear log-contrast model for compositional data, with one extension: it incorporates both scalar and functional covariates for greater flexibility. A set of basis functions is chosen to give a low-rank approximation of both the functional covariates and their corresponding functional coefficients. In this way we achieve dimension reduction for the infinite-dimensional functions, and the functional variable selection problem becomes a group-wise variable selection problem. The resulting model takes the form of a logistic regression with a group structure, restricted to an affine subspace. We develop an algorithm based on the MM principle to solve this problem and establish its convergence. We also give the statistical properties of the estimators for several penalty functions. Finally, the proposed method is used to study the relationship between MDRB status and the gut microbiome of patients before and after liver transplant operations. The analysis is conducted at different biological levels and with different variable selection approaches, and the selected variables are consistent across these levels, suggesting that the proposed method is promising for such studies.
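To make the reduction from functional coefficients to grouped coefficients concrete, here is a minimal Python sketch. It is not the implementation from the talk. It assumes the functional part of the linear predictor has the usual log-contrast form, the sum over taxa j of the integral of beta_j(t) log x_j(t) dt, with the beta_j(t) summing to zero at every t. The Fourier basis, the trapezoidal quadrature, the simulated data, and all function names are illustrative assumptions. Each taxon contributes one row of basis coefficients (one group), and the zero-sum constraint becomes a linear constraint on the columns of the coefficient matrix.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate a small Fourier basis on a grid t in [0, 1]; returns shape (T, K)."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def trapezoid_weights(t):
    """Weights w such that sum(w * f(t)) approximates the integral of f over the grid."""
    w = np.zeros_like(t)
    dt = np.diff(t)
    w[:-1] += dt / 2
    w[1:] += dt / 2
    return w

def grouped_design(X, t, n_basis):
    """X has shape (n, p, T): n subjects, p taxa, T time points, all entries positive.
    Returns Z of shape (n, p, K), where Z[i, j, k] approximates the integral of
    phi_k(t) * log x_ij(t) dt. The K columns for taxon j form one group."""
    Phi = fourier_basis(t, n_basis)
    w = trapezoid_weights(t)
    return np.einsum("ijt,t,tk->ijk", np.log(X), w, Phi)

def linear_predictor(Z, B):
    """B has shape (p, K); row j holds the basis coefficients of beta_j(t)."""
    return np.einsum("ijk,jk->i", Z, B)

# Simulated compositions (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n, p, T, K = 5, 4, 50, 3
t = np.linspace(0.0, 1.0, T)
raw = rng.gamma(2.0, size=(n, p, T))
X = raw / raw.sum(axis=1, keepdims=True)  # proportions sum to 1 at each time point

# Coefficients: taxon 2 is unselected (its whole group is zero), and the
# remaining rows are centred so that every column of B sums to zero.
B = rng.normal(size=(p, K))
B[2] = 0.0
active = [0, 1, 3]
B[active] -= B[active].mean(axis=0)

Z = grouped_design(X, t, K)
eta = linear_predictor(Z, B)

# Rescaling every taxon of a subject by the same positive factor at each time
# point leaves the predictor unchanged when the columns of B sum to zero.
scale = rng.uniform(0.5, 2.0, size=(n, 1, T))
eta_scaled = linear_predictor(grouped_design(X * scale, t, K), B)

selected = np.linalg.norm(B, axis=1) > 0  # group-wise selection indicator per taxon
print(np.allclose(eta, eta_scaled), selected)
```

In this sketch a taxon is dropped exactly when its whole coefficient row is zero, which is why a group-wise penalty is the natural fit. The zero-sum constraint on the columns of `B` is what keeps the predictor invariant to the arbitrary total of each sample.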


See Also:
Robust analysis of cancer heterogeneity for high-dimensional data
Robust Analysis of Cancer Heterogeneity for High-dimensional Data
Robust Subgroup Analysis for High-dimensional Data(preprint)