7 Competitor Methods
We also provide implementations of some competitor feature selection methods. We used these in the simulation studies in our paper to compare cluster stability selection to the protolasso (Reid and Tibshirani, 2016) and the cluster representative lasso (Bühlmann et al., 2013), two other feature selection methods designed for data with clustered features. The two methods are closely related, so their implementations share helper functions.
- protolasso()
  - processClusterLassoInputs() checks and formats the function inputs.
  - getXglmnet() formats the provided design matrix Xglmnet for the lasso as implemented by glmnet (for the protolasso, this means discarding all features from each cluster except the one most highly correlated with the response; for the cluster representative lasso, this means replacing the clustered features with a simple average of the cluster members).
    - checkGetXglmnetInputs() verifies the inputs to getXglmnet().
  - Finally, getClusterSelsFromGlmnet() extracts the relevant output from the results yielded by a glmnet lasso fit.
    - getSelectedSets() takes in a single selected set from Xglmnet and yields a selected feature set in the original feature space (with each selected cluster from Xglmnet replaced by its prototype) as well as a selected set of clusters.
- clusterRepLasso()
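The two design-matrix constructions named in the list above can be illustrated with a minimal base-R sketch. This is not the package's getXglmnet(); the toy data, the variable names (X_proto, X_rep), and the use of absolute correlation to pick each prototype are assumptions made for illustration only.

```r
# Toy data (hypothetical): 6 features, where features 1-3 form one cluster
set.seed(1)
n <- 20
z <- rnorm(n)
x <- cbind(z + rnorm(n, sd = 0.1), z + rnorm(n, sd = 0.1),
           z + rnorm(n, sd = 0.1), matrix(rnorm(n * 3), n, 3))
y <- z + rnorm(n)

# Every feature is in exactly one cluster; unclustered features are singletons
clusters <- list(c1 = 1:3, c2 = 4L, c3 = 5L, c4 = 6L)

# Prototype of each cluster: the member most correlated with y
# (assumption: correlation is compared in absolute value)
prototypes <- vapply(clusters, function(cl) {
  cl[which.max(abs(cor(x[, cl, drop = FALSE], y)))]
}, integer(1))

# Protolasso design: keep only the prototype column of each cluster
X_proto <- x[, prototypes, drop = FALSE]

# Cluster representative lasso design: average the members of each cluster
X_rep <- vapply(clusters, function(cl) rowMeans(x[, cl, drop = FALSE]),
                numeric(n))
```

Both matrices have one column per cluster, so the lasso sees the same number of candidate predictors under either procedure; only the content of the columns for multi-member clusters differs.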
clusterLassoCore() is the shared core of protolasso() and clusterRepLasso(). The two procedures differ only in how getXglmnet() constructs the design matrix (its type argument); the rest of the body (input handling, the glmnet fit, and extracting selected sets) is identical, so it lives here once.
#' Shared implementation of protolasso and clusterRepLasso
#'
#' Internal helper holding the common body of `protolasso()` and
#' `clusterRepLasso()`, which differ only in the `type` passed to `getXglmnet()`
#' (the sole point where the two procedures diverge: protolasso discards
#' non-prototype cluster members, clusterRepLasso replaces each cluster with its
#' representative).
#' @param X,y,clusters,nlambda As documented in `protolasso()` /
#' `clusterRepLasso()`.
#' @param type Character; either "protolasso" or "clusterRepLasso", passed to
#' `getXglmnet()` to select the design-matrix construction.
#' @return As documented in `protolasso()` / `clusterRepLasso()`.
#' @author Gregory Faletto, Jacob Bien
#' @keywords internal
#' @noRd
clusterLassoCore <- function(X, y, clusters, nlambda, type){
# Handle and format inputs; get cluster prototypes
ret <- processClusterLassoInputs(X, y, clusters, nlambda)
x <- ret$x
clusters <- ret$clusters
prototypes <- ret$prototypes
feat_names <- ret$var_names
rm(ret)
# Format the design matrix for glmnet according to the chosen procedure
# (type = "protolasso" or "clusterRepLasso"); see getXglmnet().
X_glmnet <- getXglmnet(x, clusters, type=type, prototypes=prototypes)
# getXglmnet returns one column per cluster, so a single all-encompassing
# cluster (or a genuine p < 2 input, which processClusterLassoInputs does not
# rule out) yields a 1-column design that glmnet cannot fit ("x should be a
# matrix with 2 or more columns"). Fail early with a message naming both
# degenerate causes instead of surfacing glmnet's opaque error. This catches
# both protolasso() and clusterRepLasso(), which route through here.
if(ncol(X_glmnet) < 2){
stop("protolasso()/clusterRepLasso() need at least 2 cluster representatives to fit the lasso, but the provided data yields only 1 (all features are in a single cluster, or p < 2).")
}
# Estimate the lasso on the cluster prototypes / representatives
fit <- glmnet::glmnet(x=X_glmnet, y=y, family="gaussian", nlambda=nlambda)
lasso_sets <- unique(glmnet::predict.glmnet(fit, type="nonzero"))
# Obtain a tidy list of selected sets--one for each model size
cluster_sel_results <- getClusterSelsFromGlmnet(lasso_sets, clusters,
prototypes, feat_names)
return(list(selected_sets=cluster_sel_results$selected_sets,
selected_clusts_list=cluster_sel_results$selected_clusts_list,
beta=fit$beta))
}
protolasso():
#' Select features via the protolasso (Reid and Tibshirani 2016)
#'
#' @param X An n x p numeric matrix (preferably) or a data.frame (which will
#' be coerced internally to a matrix by the function model.matrix) containing
#' p >= 2 features/predictors. Must not contain missing (`NA`) values.
#' @param y The response; A length n numeric (or integer) real-valued vector.
#' @param clusters A list of integer vectors; each vector should contain the
#' indices of a cluster of features (a subset of 1:p). (If there is only one
#' cluster, clusters can either be a list of length 1 or an integer vector.)
#' All of the provided clusters must be non-overlapping. Every feature not
#' appearing in any cluster will be assumed to be unclustered (that is, they
#' will be treated as if they are in a "cluster" containing only themselves).
#' CAUTION: if the provided X is a data.frame that contains a categorical
#' feature with more than two levels, then the resulting matrix made from
#' model.matrix will have a different number of columns than the provided
#' data.frame, some of the feature numbers will change, and the clusters
#' argument will not work properly (in the current version of the package).
#' To get correct results in this case, please use model.matrix to convert
#' the data.frame to a numeric matrix on your own, then provide this matrix
#' and cluster assignments with respect to this matrix. Default is list() (so no
#' clusters are specified).
#' @param nlambda Integer; the number of lambda values to use in the lasso fit
#' for the protolasso. Default is 100 (following the default for glmnet). For
#' now, nlambda must be at least 2 (using a single lambda is not supported).
#' @return A list with three elements. \item{selected_sets}{A list of integer
#' vectors. Entry k of this list contains a selected set (an integer vector) of
#' size k yielded by the protolasso. (If no set of size k was selected, entry k
#' will be empty.)} \item{selected_clusts_list}{A list; each element of the list
#' is a named list of selected clusters. (That is, if a selected set of size k
#' was yielded by the protolasso, then `selected_clusts_list[[k]]` is a named
#' list of length k, where each member of the list is an integer vector
#' of cluster members. In particular, `selected_clusts_list[[k]][[j]]` will be
#' the cluster that contains feature `selected_sets[[k]][j]`.)} \item{beta}{The
#' beta output from glmnet when the lasso was estimated on a matrix of
#' prototypes. (See documentation for the function glmnet from the glmnet
#' package for details.)}
#' @author Gregory Faletto, Jacob Bien
#' @references Reid, S., & Tibshirani, R. (2016). Sparse regression and marginal
#' testing using cluster prototypes. \emph{Biostatistics}, 17(2), 364–376.
#' \url{https://doi.org/10.1093/biostatistics/kxv049}.
#' @examples
#' set.seed(1)
#' data <- genClusteredData(n = 50, p = 11, k_unclustered = 2,
#' cluster_size = 4, n_clusters = 1, snr = 3)
#' clusters <- list(cluster1 = 1:4)
#' res <- protolasso(X = data$X, y = data$y, clusters = clusters)
#' str(res, max.level = 1)
#' @export
protolasso <- function(X, y, clusters=list(), nlambda=100){
clusterLassoCore(X, y, clusters, nlambda, type="protolasso")
}
#' Check the inputs to protolasso and clusterRepLasso, format clusters, and
#' identify prototypes for each cluster
#'
#' @param X An n x p numeric matrix (preferably) or a data.frame (which will
#' be coerced internally to a matrix by the function model.matrix) containing
#' p >= 2 features/predictors
#' @param y The response; A length n numeric (or integer) real-valued vector.
#' @param clusters A list of integer vectors; each vector should contain the
#' indices of a cluster of features (a subset of 1:p). (If there is only one
#' cluster, clusters can either be a list of length 1 or an integer vector.)
#' All of the provided clusters must be non-overlapping. Every feature not
#' appearing in any cluster will be assumed to be unclustered (that is, they
#' will be treated as if they are in a "cluster" containing only themselves).
#' Default is list() (so no clusters are specified).
#' @param nlambda Integer; the number of lambda values to use in the lasso fit
#' for the protolasso. Default is 100 (following the default for glmnet). For
#' now, nlambda must be at least 2 (using a single lambda is not supported).
#' @return A list with four elements. \item{x}{The provided X, converted to a
#' matrix if it was provided as a data.frame, and with column names removed.}
#' \item{clusters}{A named list where each entry is an integer vector of indices
#' of features that are in a common cluster. (The length of list clusters is
#' equal to the number of clusters.) All identified clusters are
#' non-overlapping. All features appear in exactly one cluster (any unclustered
#' features will be put in their own "cluster" of size 1).}
#' \item{prototypes}{An integer vector whose length is equal to the number of
#' clusters. Entry i is the index of the feature belonging to cluster i that is
#' most highly correlated with y (that is, the prototype for the cluster, as in
#' the protolasso; see Reid and Tibshirani 2016).} \item{var_names}{If the
#' provided X matrix had column names, the names of the features in the provided
#' X matrix. If no names were provided, var_names will be NA.}
#' @author Gregory Faletto, Jacob Bien
#' @references Reid, S., & Tibshirani, R. (2016). Sparse regression and marginal
#' testing using cluster prototypes. \emph{Biostatistics}, 17(2), 364–376.
#' \url{https://doi.org/10.1093/biostatistics/kxv049}.
#' @keywords internal
#' @noRd
processClusterLassoInputs <- function(X, y, clusters, nlambda){
stopifnot(is.matrix(X) | is.data.frame(X))
checkNoNAs(X, "X")
# If X is a data.frame, convert it to a matrix.
X <- coerceDataFrameToMatrix(X, clusters)
stopifnot(is.matrix(X))
feat_names <- as.character(NA)
if(!is.null(colnames(X))){
feat_names <- colnames(X)
if(any(is.na(feat_names))){
stop("Some features in provided X matrix had valid names and some had NA names; please either name all features in X or remove the names altogether.")
}
}
n <- nrow(X)
colnames(X) <- character()
stopifnot(is.numeric(y) | is.integer(y))
stopifnot(n == length(y))
stopifnot(all(is.finite(y)))
# Check clusters argument
clusters <- checkCssClustersInput(clusters)
# Format clusters into a list where all features are in exactly one
# cluster (any unclustered features are put in their own "cluster" of size
# 1).
clust_names <- as.character(NA)
if(!is.null(names(clusters)) & is.list(clusters)){
clust_names <- names(clusters)
}
cluster_results <- formatClusters(clusters, p=ncol(X),
clust_names=clust_names, get_prototypes=TRUE, x=X, y=y)
clusters <- cluster_results$clusters
prototypes <- cluster_results$prototypes
rm(cluster_results)
stopifnot(length(clusters) == length(prototypes))
stopifnot(is.numeric(nlambda) | is.integer(nlambda))
stopifnot(length(nlambda) == 1)
stopifnot(!is.na(nlambda))
stopifnot(nlambda >= 2)
stopifnot(nlambda == round(nlambda))
return(list(x=X, clusters=clusters, prototypes=prototypes,
var_names=feat_names))
}
getXglmnet():
#' Converts the provided design matrix to an appropriate format for either the
#' protolasso or the cluster representative lasso.
#'
#' Creates design matrix for glmnet by dealing with clusters (for
#' type="protolasso", discards all cluster members except prototype; for
#' type="clusterRepLasso", replaces all cluster members with a simple
#' average of all the cluster members).
#' @param x A numeric matrix; the provided matrix with n observations and p
#' features.
#' @param clusters A named list where each entry is an integer vector of indices
#' of features that are in a common cluster. (The length of list clusters should
#' be equal to the number of clusters.) All identified clusters should be
#' non-overlapping. All features should appear in exactly one cluster (any
#' unclustered features should be put in their own "cluster" of size 1).
#' @param type Character; "protolasso" for the protolasso or "clusterRepLasso"
#' for the cluster representative lasso.
#' @param prototypes Only required for type "protolasso". An integer vector
#' whose length is equal to the number of clusters. Entry i should be the
#' prototype for cluster i (the feature belonging to cluster i that is most
#' highly correlated with y; see Reid and Tibshirani 2016).
#' @return A numeric matrix; the design matrix as required for the protolasso or
#' cluster representative lasso, prepared for input to glmnet.
#' @author Gregory Faletto, Jacob Bien
#' @references Reid, S., & Tibshirani, R. (2016). Sparse regression and marginal
#' testing using cluster prototypes. \emph{Biostatistics}, 17(2), 364–376.
#' \url{https://doi.org/10.1093/biostatistics/kxv049}.
#' @keywords internal
#' @noRd
getXglmnet <- function(x, clusters, type, prototypes=NA){
# Check inputs
checkGetXglmnetInputs(x, clusters, type, prototypes)
n <- nrow(x)
p <- ncol(x)
X_glmnet_cols <- vector("list", length(clusters))
for(i in 1:length(clusters)){
cluster_i <- clusters[[i]]
if(length(cluster_i) == 1){
X_glmnet_i <- x[, cluster_i]
} else{
stopifnot(length(cluster_i) > 1)
if(type == "protolasso"){
prototype_ind_i <- which(prototypes %in% cluster_i)
stopifnot(length(prototype_ind_i) == 1)
prototype_i <- prototypes[prototype_ind_i]
X_glmnet_i <- x[, prototype_i]
} else {
stopifnot(type == "clusterRepLasso")
X_glmnet_i <- rowMeans(x[, cluster_i])
}
}
stopifnot(length(X_glmnet_i) == n)
X_glmnet_cols[[i]] <- X_glmnet_i
}
# do.call(cbind, ...) reproduces cbind's exact type-promotion (and so
# preserves the integer storage of an integer x) while avoiding the
# O(n*k^2) copying of growing X_glmnet column by column (#58).
X_glmnet <- do.call(cbind, X_glmnet_cols)
stopifnot(ncol(X_glmnet) == length(clusters))
colnames(X_glmnet) <- character()
# Check output
stopifnot(is.matrix(X_glmnet))
stopifnot(nrow(X_glmnet) == n)
stopifnot(ncol(X_glmnet) <= p)
stopifnot(ncol(X_glmnet) >= 1)
return(X_glmnet)
}
#' Verifies the inputs for getXglmnet.
#'
#' @param x A numeric matrix; the provided matrix with n observations and p
#' features.
#' @param clusters A named list where each entry is an integer vector of indices
#' of features that are in a common cluster. (The length of list clusters should
#' be equal to the number of clusters.) All identified clusters should be
#' non-overlapping. All features should appear in exactly one cluster (any
#' unclustered features should be put in their own "cluster" of size 1).
#' @param type Character; "protolasso" for the protolasso or "clusterRepLasso"
#' for the cluster representative lasso.
#' @param prototypes Only required for type "protolasso". An integer vector
#' whose length is equal to the number of clusters. Entry i should be the
#' prototype for cluster i (the feature belonging to cluster i that is most
#' highly correlated with y; see Reid and Tibshirani 2016).
#' @author Gregory Faletto, Jacob Bien
#' @references Reid, S., & Tibshirani, R. (2016). Sparse regression and marginal
#' testing using cluster prototypes. \emph{Biostatistics}, 17(2), 364–376.
#' \url{https://doi.org/10.1093/biostatistics/kxv049}.
#' @keywords internal
#' @noRd
checkGetXglmnetInputs <- function(x, clusters, type, prototypes){
stopifnot(is.matrix(x))
stopifnot(is.list(clusters))
stopifnot(all(lengths(clusters) >= 1))
stopifnot(length(type) == 1)
stopifnot(is.character(type))
stopifnot(!is.na(type))
stopifnot(type %in% c("protolasso", "clusterRepLasso"))
stopifnot(!is.na(prototypes))
stopifnot(is.integer(prototypes))
stopifnot(all(!is.na(prototypes)))
stopifnot(length(prototypes) == length(unique(prototypes)))
stopifnot(all(prototypes %in% 1:ncol(x)))
for(i in 1:length(clusters)){
cluster_i <- clusters[[i]]
stopifnot(sum(prototypes %in% cluster_i) == 1)
}
}
#' Extracts selected clusters and cluster prototypes from the glmnet lasso
#' output
#'
#' @param lasso_sets A list of integer vectors. Each vector represents a set of
#' features selected by the lasso for a given value of the penalty parameter
#' lambda.
#' @param clusters A named list where each entry is an integer vector of indices
#' of features that are in a common cluster. (The length of list clusters is
#' equal to the number of clusters.) All identified clusters must be
#' non-overlapping. All features appear in exactly one cluster (any unclustered
#' features must be in their own "cluster" of size 1).
#' @param prototypes An integer vector whose length must be equal to the number
#' of clusters. Entry i should be the index of the feature belonging to cluster
#' i that is most highly correlated with y (that is, the prototype for the
#' cluster, as in the protolasso; see Reid and Tibshirani 2016).
#' @param feat_names Character vector; the names of the features in X. (If the
#' X provided to protolasso or clusterRepLasso did not have feature names,
#' feat_names will be NA.)
#' @return A list containing the following items: \item{selected_sets}{A list of
#' integer vectors. Entry k of this list contains a selected set of size k
#' yielded by glmnet--each member of the set is the index of a single feature
#' from a cluster selected by either the protolasso or the cluster
#' representative lasso (the prototype from that cluster--the cluster member
#' most highly correlated with y). (If no set of size k was selected, entry k
#' will be NULL.)} \item{selected_clusts_list}{A list of lists; entry k of this
#' list is a list of length k of clusters (the clusters that were selected by
#' the cluster representative lasso). Again, if no set of size k was selected,
#' entry k will be NULL.}
#' @author Gregory Faletto, Jacob Bien
#' @references Reid, S., & Tibshirani, R. (2016). Sparse regression and marginal
#' testing using cluster prototypes. \emph{Biostatistics}, 17(2), 364–376.
#' \url{https://doi.org/10.1093/biostatistics/kxv049}. \cr Bühlmann, P.,
#' Rütimann, P., van de Geer, S., & Zhang, C. H. (2013). Correlated variables in
#' regression: Clustering and sparse estimation.
#' \emph{Journal of Statistical Planning and Inference}, 143(11), 1835–1858.
#' \url{https://doi.org/10.1016/j.jspi.2013.05.019}.
#' @keywords internal
#' @noRd
getClusterSelsFromGlmnet <- function(lasso_sets, clusters, prototypes,
feat_names){
if(any(!is.na(feat_names))){
stopifnot(all(!is.na(feat_names)))
}
# Compute each set's length once (used both for max_length and the
# per-size subset below) instead of re-walking lengths every iteration (#58)
set_lengths <- lengths(lasso_sets)
# Largest selected set among all those in lasso_sets
max_length <- max(set_lengths)
# Preparing lists to store
selected_sets <- list()
selected_clusts_list <- list()
for(j in seq_len(max_length)){
# Lasso selected set of size j
lasso_sets_j <- lasso_sets[set_lengths == j]
# Are there any lasso selected sets of size j? (If not, we will skip to
# the next j, and slot j in the list will be empty.)
if(length(lasso_sets_j) > 0){
# Select the first set of size j
lasso_set_j <- lasso_sets_j[[1]]
stopifnot(length(lasso_set_j) == j)
ret <- getSelectedSets(lasso_set=lasso_set_j, clusters=clusters,
prototypes=prototypes, feat_names=feat_names)
selected_sets[[j]] <- ret$selected_set
selected_clusts_list[[j]] <- ret$selected_clusts_list
rm(ret)
}
}
stopifnot(length(selected_sets) <= max_length)
stopifnot(length(selected_clusts_list) <= max_length)
return(list(selected_sets=selected_sets,
selected_clusts_list=selected_clusts_list))
}
#' Converts a selected set from X_glmnet to selected sets and selected clusters
#' from the original feature space of X.
#'
#' @param lasso_set A vector containing the indices of selected cluster
#' representatives or prototypes.
#' @param clusters A named list where each entry is an integer vector of indices
#' of features that are in a common cluster. (The length of list clusters is
#' equal to the number of clusters.) All identified clusters must be
#' non-overlapping. All features appear in exactly one cluster (any unclustered
#' features must be in their own "cluster" of size 1).
#' @param prototypes An integer vector whose length must be equal to the number
#' of clusters. Entry i should be the index of the feature belonging to cluster
#' i that is most highly correlated with y (that is, the prototype for the
#' cluster, as in the protolasso).
#' @param feat_names Character vector; the names of the features in X.
#' @return A list containing two items: \item{selected_set}{An integer vector
#' with length equal to lasso_set containing a set of selected features in the
#' original X matrix. (Selections in lasso_set corresponding to a cluster will
#' be replaced by the cluster's prototype from X.)}
#' \item{selected_clusts_list}{A named list of integer vectors with length equal
#' to selected_set. `selected_clusts_list[[k]]` will be an integer vector
#' containing the indices of the features in X that are in the cluster
#' containing prototype `selected_set[k]`.}
#' @author Gregory Faletto, Jacob Bien
#' @keywords internal
#' @noRd
getSelectedSets <- function(lasso_set, clusters, prototypes, feat_names){
model_size <- length(lasso_set)
stopifnot(model_size > 0)
stopifnot(length(unique(lasso_set)) == model_size)
stopifnot(all(lasso_set <= length(clusters)))
selected_set <- integer()
selected_clusts_list <- list()
# Recover features from original feature space
for(k in 1:model_size){
selected_cluster_k <- clusters[[lasso_set[k]]]
stopifnot(is.integer(selected_cluster_k))
selected_clusts_list[[k]] <- selected_cluster_k
if(length(selected_cluster_k) == 1){
stopifnot(!(selected_cluster_k %in% selected_set))
selected_set <- c(selected_set, selected_cluster_k)
} else{
sel_prototype <- which(prototypes %in% selected_cluster_k)
stopifnot(length(sel_prototype) == 1)
stopifnot(!(prototypes[sel_prototype] %in% selected_set))
selected_set <- c(selected_set, prototypes[sel_prototype])
}
}
stopifnot(length(selected_set) == model_size)
stopifnot(length(unique(selected_set)) == model_size)
if(any(!is.na(feat_names))){
names(selected_set) <- feat_names[selected_set]
}
stopifnot(length(selected_clusts_list) == model_size)
all_feats <- unlist(selected_clusts_list)
stopifnot(length(all_feats) == length(unique(all_feats)))
return(list(selected_set=selected_set,
selected_clusts_list=selected_clusts_list))
}
#' Select features via the cluster representative lasso (Bühlmann et al. 2013)
#'
#' @param X An n x p numeric matrix (preferably) or a data.frame (which will
#' be coerced internally to a matrix by the function model.matrix) containing
#' p >= 2 features/predictors. Must not contain missing (`NA`) values.
#' @param y The response; A length n numeric (or integer) real-valued vector.
#' @param clusters A list of integer vectors; each vector should contain the
#' indices of a cluster of features (a subset of 1:p). (If there is only one
#' cluster, clusters can either be a list of length 1 or an integer vector.)
#' All of the provided clusters must be non-overlapping. Every feature not
#' appearing in any cluster will be assumed to be unclustered (that is, they
#' will be treated as if they are in a "cluster" containing only themselves).
#' CAUTION: if the provided X is a data.frame that contains a categorical
#' feature with more than two levels, then the resulting matrix made from
#' model.matrix will have a different number of columns than the provided
#' data.frame, some of the feature numbers will change, and the clusters
#' argument will not work properly (in the current version of the package).
#' To get correct results in this case, please use model.matrix to convert
#' the data.frame to a numeric matrix on your own, then provide this matrix
#' and cluster assignments with respect to this matrix. Default is list() (so no
#' clusters are specified).
#' @param nlambda Integer; the number of lambda values to use in the lasso fit
#' for the cluster representative lasso. Default is 100 (following the default
#' for glmnet). For now, nlambda must be at least 2 (using a single lambda is
#' not supported).
#' @return A list with three elements. \item{selected_sets}{A list of integer
#' vectors. Entry k of this list contains a selected set (an integer vector) of
#' size k yielded by the lasso--each member of the set is the index of a single
#' feature from a cluster selected by the cluster representative lasso (the
#' prototype from that cluster--the cluster member most highly correlated with
#' y). (If no set of size k was selected, entry k will be empty.)}
#' \item{selected_clusts_list}{A list; each element of the list is a named list
#' of selected clusters. (That is, if a selected set of size k was yielded by
#' the cluster representative lasso, then `selected_clusts_list[[k]]` is a named
#' list of length k, where each member of the list is an integer vector
#' of cluster members. Note that `selected_clusts_list[[k]][[j]]` will be the
#' cluster that contains feature `selected_sets[[k]][j]`.)} \item{beta}{The beta
#' output from glmnet when the lasso was estimated on a matrix of cluster
#' representatives.
#' (See documentation for the function glmnet from the glmnet package for
#' details.)}
#' @author Gregory Faletto, Jacob Bien
#' @references Bühlmann, P., Rütimann, P., van de Geer, S., & Zhang, C. H.
#' (2013). Correlated variables in regression: Clustering and sparse estimation.
#' \emph{Journal of Statistical Planning and Inference}, 143(11), 1835–1858.
#' \url{https://doi.org/10.1016/j.jspi.2013.05.019}. \cr Jerome Friedman, Trevor
#' Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear
#' Models via Coordinate Descent. \emph{Journal of Statistical Software}, 33(1)
#' 1-22. URL \url{https://www.jstatsoft.org/v33/i01/}.
#' @examples
#' set.seed(1)
#' data <- genClusteredData(n = 50, p = 11, k_unclustered = 2,
#' cluster_size = 4, n_clusters = 1, snr = 3)
#' clusters <- list(cluster1 = 1:4)
#' res <- clusterRepLasso(X = data$X, y = data$y, clusters = clusters)
#' str(res, max.level = 1)
#' @export
clusterRepLasso <- function(X, y, clusters=list(), nlambda=100){
clusterLassoCore(X, y, clusters, nlambda, type="clusterRepLasso")
}
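As a rough illustration of what getClusterSelsFromGlmnet() and getSelectedSets() do together, the base-R sketch below keeps the first selected set of each size and maps its cluster-level indices back to prototypes and clusters in the original feature space. The inputs are made-up values, not real glmnet output. As a simplification, the sketch assumes entry i of prototypes is the prototype of cluster i, whereas the package code looks prototypes up by cluster membership.

```r
# Hypothetical inputs: four clusters and the prototype of each (in order)
clusters <- list(c1 = 1:3, c2 = 4L, c3 = 5L, c4 = 6L)
prototypes <- c(2L, 4L, 5L, 6L)

# Hypothetical lasso path output: columns of the cluster-level design matrix
# selected at successive lambda values (duplicates already removed)
lasso_sets <- list(integer(0), 1L, c(1L, 3L), c(1L, 4L), c(1L, 3L, 4L))

set_lengths <- lengths(lasso_sets)
selected_sets <- list()
selected_clusts_list <- list()
for (j in seq_len(max(set_lengths))) {
  sets_j <- lasso_sets[set_lengths == j]
  if (length(sets_j) > 0) {
    # Keep only the first set of size j encountered along the path
    set_j <- sets_j[[1]]
    # Replace each selected cluster by its prototype in the original X
    selected_sets[[j]] <- prototypes[set_j]
    # Record the full membership of each selected cluster
    selected_clusts_list[[j]] <- clusters[set_j]
  }
}
```

Here two sets of size 2 appear along the path, and only the first, c(1, 3), is kept, so selected_sets[[2]] holds the prototypes of clusters c1 and c3.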