R/pre_define_functions.r
RSAVS_Summary_Iteration.Rd
This function is designed to summarize and improve the results at each iteration of the ADMM algorithm.
RSAVS_Summary_Iteration(
y_vec,
x_mat,
beta_vec,
mu_vec,
s_vec,
w_vec,
loss_type,
loss_param,
phi
)
y_vec: numerical response vector. n = length(y_vec) is the number of observations.
x_mat: numerical covariate matrix. p = ncol(x_mat) is the number of covariates.
beta_vec: covariate effect vector during the ADMM algorithm.
mu_vec: subgroup effect vector during the ADMM algorithm.
s_vec: augmented vector for pair-wise difference of mu_vec in ADMM algorithm.
w_vec: augmented vector for beta_vec in ADMM algorithm.
loss_type: character string indicating type of loss function.
loss_param: numerical vector for necessary parameters in loss function.
phi: a parameter needed in mBIC. It controls how strong mBIC penalizes the complexity of the candidate model.
a list, containing:
bic: the bic value.
mu_vec: the improved mu vector.
group_num: number of subgroups in the improved mu_vec.
active_num: number of active covariates in the beta_vec.
This function has two purposes:
Determine and improve beta_vec and mu_vec, if possible.
Compute BIC.
For large-scale data sets, especially those with a large number of observations, it is infeasible to first
save all the variables produced by the ADMM algorithm over the lam1_length * lam2_length grid points
of lambdas and only then pick the best solution with mBIC. For s_vec alone, this would mean saving a
matrix with n * (n - 1) / 2 rows and lam1_length * lam2_length columns, which is hard
for a single computer. Instead, we summarise each iteration during the algorithm, so there is no need
to store that much data.
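The storage argument above can be made concrete with a quick back-of-the-envelope calculation. The sketch below is only illustrative: the values of n, lam1_length and lam2_length are made up, and it assumes each entry is stored as an 8-byte double.

```r
# Hypothetical problem size, chosen only for illustration.
n <- 10000          # number of observations
lam1_length <- 50   # grid size for the first lambda
lam2_length <- 50   # grid size for the second lambda

# One s_vec holds every pair-wise difference of mu_vec.
s_len <- n * (n - 1) / 2

# Keeping s_vec at every grid point needs a matrix with
# s_len rows and lam1_length * lam2_length columns.
n_entries <- s_len * lam1_length * lam2_length

# Assumption: each entry is a double, which takes 8 bytes in R.
size_gib <- n_entries * 8 / 2^30

cat(sprintf("s_vec length: %.0f\n", s_len))
cat(sprintf("matrix entries: %.0f\n", n_entries))
cat(sprintf("approximate size: %.0f GiB\n", size_gib))
```

With these made-up sizes the matrix would need on the order of 900 GiB, whereas the per-iteration summary keeps only bic, group_num, active_num and one mu_vec per grid point.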
The ADMM algorithm can take many iterations to reach a tight tolerance.
One can instead stop the algorithm early by setting a small max_iter, which is
equivalent to setting a loose tolerance. In that case mu_vec and beta_vec will
not be close to their augmented counterparts s_vec and w_vec. These
counterparts are what actually carry the sparsity information during the algorithm, so
w_vec provides the estimate of the covariate effect, while beta_vec
is just an intermediate variable.
mu_vec is also an intermediate variable, and it needs to be improved, for example by forming
a reasonable subgroup structure. One possible solution is to use s_vec to
improve mu_vec. Another is to apply a clustering method to mu_vec. See
RSAVS_Determine_Mu
for more details.
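As a rough illustration of the clustering option, the sketch below groups a noisy mu_vec with base R hierarchical clustering and counts active covariates from w_vec. It is not the implementation in RSAVS_Determine_Mu: the input values, the use of hclust, and the cut height of 0.5 are all assumptions made for this example.

```r
# Hypothetical intermediate values from an early-stopped ADMM run.
mu_vec   <- c(-1.02, -0.98, -1.01, 2.03, 1.97, 2.00)
beta_vec <- c(0.51, 0.003, -0.002, -1.24)  # dense intermediate variable
w_vec    <- c(0.50, 0, 0, -1.25)           # sparse augmented counterpart

# Illustrative stand-in for "a clustering method": complete-linkage
# hierarchical clustering of mu_vec, cut at an assumed height of 0.5.
cluster_id <- cutree(hclust(dist(mu_vec), method = "complete"), h = 0.5)

# Replace each entry by its cluster mean to obtain a clean subgroup structure.
mu_improved <- ave(mu_vec, cluster_id)

# Summaries analogous to group_num and active_num in the returned list.
group_num  <- length(unique(cluster_id))
active_num <- sum(w_vec != 0)

print(mu_improved)
print(c(group_num = group_num, active_num = active_num))
```

Here beta_vec has no exact zeros, so counting its non-zero entries would report four active covariates; taking the count from w_vec reflects the sparsity that the augmented variable encodes.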