
This function picks the optimal lambda from a solution path using the BIC criterion.

Usage

BIC_Pick(
  y_vec,
  x_mat,
  solution_path,
  real_logit_vec,
  k_n,
  a = 1,
  bic_kn = k_n,
  complex_bound
)

Arguments

y_vec

response vector, 0 for control, 1 for case. n = length(y_vec) is the number of observations.

x_mat

covariate matrix consisting of two parts, with dim(x_mat) = (n, h + p * k_n). The first h columns hold demographic covariates (and may include an intercept term). The remaining columns hold the p functional covariates, each represented by a set of basis functions that yields k_n covariates.

solution_path

A solution path from function Logistic_FAR_Path

real_logit_vec

NOT used in this function

k_n

number of basis functions. (This is also the number of covariates in each group.)

a

a scalar adjusting the loglik in the first part of BIC

bic_kn

a scalar adjusting the model complexity part of BIC

complex_bound

Numeric, the upper bound on the model complexity to be considered. If not supplied, all functional covariates will be considered. When p > n, this may lead to model saturation, which makes the BIC criterion favor a much more complex model because it offers a near-perfect fit on the training set.
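The column layout described for x_mat can be sketched as follows. This is only an illustration: the sizes are made up, the helper group_cols is hypothetical (not part of the package), and the sketch assumes that the k_n columns of each functional covariate are stored contiguously and in order.

```r
# Hypothetical sizes, chosen only for illustration
n   <- 6  # observations
h   <- 2  # demographic columns (intercept + one covariate)
p   <- 3  # functional covariates
k_n <- 4  # basis functions per functional covariate

set.seed(1)
demo_part <- cbind(1, rnorm(n))                    # n x h
func_part <- matrix(rnorm(n * p * k_n), nrow = n)  # n x (p * k_n)
x_mat <- cbind(demo_part, func_part)               # n x (h + p * k_n)

# Hypothetical helper: column indices of the j-th functional covariate,
# assuming contiguous groups placed after the first h columns
group_cols <- function(j, h, k_n) h + (j - 1) * k_n + seq_len(k_n)

dim(x_mat)             # 6 rows, 2 + 3 * 4 = 14 columns
group_cols(2, h, k_n)  # columns 7 to 10
```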

BIC

In this function, BIC is defined as $$ \mathrm{BIC} = \frac{1}{a} \cdot \mathrm{loglik} + \frac{\mathrm{df} \cdot \log(n)}{\mathrm{bic\_kn}}, $$ where df is the degrees of freedom of the model. Here, df is the number of active covariates in the functional part of x_mat. Because the algorithm casts the problem as a group lasso problem, the number of active covariates equals the number of active functional covariates \(x(t)\) times the number of basis functions k_n.
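As a rough sketch, the expression above can be evaluated like this. The helper bic_value and all numbers are hypothetical and are not taken from the package. The selection step further assumes that a smaller BIC is better (that is, loglik is a loss-type quantity) and that complex_bound is counted in active functional covariates; the documentation does not state either convention.

```r
# Hypothetical helper: evaluates the BIC expression defined above
bic_value <- function(loglik, n_active, k_n, n, a = 1, bic_kn = k_n) {
  df <- n_active * k_n  # active functional covariates times basis size
  loglik / a + df * log(n) / bic_kn
}

# Made-up values for three candidates along a solution path
loglik   <- c(60, 40, 38)
n_active <- c(1, 2, 6)
bic <- bic_value(loglik, n_active, k_n = 5, n = 100)

# Assumption: restrict to candidates within the bound, then take the smallest BIC
complex_bound <- 4
eligible <- which(n_active <= complex_bound)
best <- eligible[which.min(bic[eligible])]
best
```

With the default bic_kn = k_n, the penalty reduces to the number of active functional covariates times log(n), so the second candidate has BIC 40 + 2 * log(100).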