This function picks the optimal lambda in a solution path using the BIC criterion.
Usage
BIC_Pick(
y_vec,
x_mat,
solution_path,
real_logit_vec,
k_n,
a = 1,
bic_kn = k_n,
complex_bound
)
Arguments
- y_vec
response vector, 0 for control, 1 for case. n = length(y_vec) is the number of observations.
- x_mat
covariate matrix consisting of two parts, with dim(x_mat) = (n, h + p * k_n). The first h columns are demographic covariates (and can include an intercept term). The remaining columns hold the p functional covariates, each represented by a set of basis functions, which gives k_n columns per functional covariate.
- solution_path
A solution path from function
Logistic_FAR_Path
- real_logit_vec
NOT used in this function
- k_n
number of basis functions. (This is also the number of covariates in each group.)
- a
a scalar adjusting the loglik in the first part of BIC
- bic_kn
a scalar adjusting the model complexity part of BIC
- complex_bound
Numeric, the upper bound on the model complexity to be considered. If not supplied, all functional covariates will be considered. When p > n, this may lead to model saturation, which makes the BIC criterion favor a much more complex model because it fits the training set almost perfectly.
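The column layout described for x_mat can be sketched as follows. This is only an illustration: the dimensions, the simulated values, and the helper `group_cols` are hypothetical, and it assumes that the k_n columns of each functional covariate are stored contiguously after the first h columns.

```r
# Illustrative dimensions only (assumed, not taken from the package).
n <- 6; h <- 2; p <- 3; k_n <- 4

set.seed(1)
demo_part <- cbind(1, rnorm(n))                    # intercept + one demographic covariate
fun_part  <- matrix(rnorm(n * p * k_n), nrow = n)  # p groups of k_n basis columns
x_mat <- cbind(demo_part, fun_part)                # n rows, h + p * k_n columns

# Hypothetical helper: column indices of the j-th functional covariate,
# assuming groups are laid out contiguously after the first h columns.
group_cols <- function(j, h, k_n) h + (j - 1) * k_n + seq_len(k_n)

dim(x_mat)
group_cols(2, h, k_n)
```

Under this layout, selecting or dropping a functional covariate means selecting or dropping all k_n of its columns at once.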
BIC
In this function, BIC is defined as
$$
\mathrm{BIC} = \frac{1}{a}\,\mathrm{loglik} + \frac{\mathrm{df}\,\log(n)}{\mathrm{bic\_kn}},
$$
where df is the degrees of freedom of the model. Here it is the number of active covariates in the functional part of x_mat. Because the algorithm casts the problem as a group lasso, the number of active covariates equals the number of active functional covariates \(x(t)\) times the number of basis functions k_n.
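As a rough illustration of the definition above, the BIC value for a single candidate model could be evaluated as below. The helper `bic_sketch`, its argument `n_active_fun`, and the numbers passed to it are hypothetical and not part of the package; `loglik` stands for whatever log-likelihood term the function computes internally, and its sign convention is not stated here.

```r
# Hypothetical helper, not part of the package: evaluates the BIC formula
# from this section for one candidate model on the solution path.
bic_sketch <- function(loglik, n_active_fun, n, k_n, a = 1, bic_kn = k_n) {
  # Group lasso keeps or drops all k_n basis coefficients of a functional
  # covariate together, so df = (active functional covariates) * k_n.
  df <- n_active_fun * k_n
  1 / a * loglik + df * log(n) / bic_kn
}

# Made-up inputs for illustration only.
bic_value <- bic_sketch(loglik = 120.5, n_active_fun = 3, n = 200, k_n = 5)
```

With the default bic_kn = k_n, the penalty term simplifies to the number of active functional covariates times log(n), so the complexity penalty counts functional covariates rather than individual basis coefficients.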