Title: | Regularized Multi-Task Learning |
---|---|
Description: | Efficient solvers for 10 regularized multi-task learning algorithms applicable to regression, classification, joint feature selection, task clustering, low-rank learning, sparse learning and network incorporation. Based on the accelerated gradient descent method, the algorithms feature a state-of-the-art convergence rate of O(1/k^2). Sparse model structure is induced by solving the proximal operator. The details of the package are described in the paper by Han Cao and Emanuel Schwarz (2018) <doi:10.1093/bioinformatics/bty831>. |
Authors: | Han Cao [cre, aut, cph], Emanuel Schwarz [aut] |
Maintainer: | Han Cao <[email protected]> |
License: | GPL-3 |
Version: | 0.9 |
Built: | 2025-02-10 05:14:51 UTC |
Source: | https://github.com/transbiozi/rmtl |
This package provides an efficient implementation of regularized multi-task learning (MTL), comprising 10 algorithms applicable to regression, classification, joint feature selection, task clustering, low-rank learning, sparse learning and network incorporation. All algorithms are implemented based on the accelerated gradient descent method and feature a convergence rate of O(1/k^2). Parallel computing is supported to improve efficiency. Sparse model structure is induced by solving the proximal operator.
This package provides 10 multi-task learning algorithms (5 for classification and 5 for regression), which incorporate five regularization strategies for knowledge transfer among tasks. All algorithms share the same framework:

$$\min_{W,C} \sum_{i=1}^{t} L(W_i, C_i \mid X_i, Y_i) + \lambda_1 \Omega(W) + \lambda_2 \|W\|_F^2$$

where $L(\cdot)$ is the loss function (logistic loss for classification or least-squares loss for linear regression), $\Omega(W)$ is the cross-task regularization for knowledge transfer, and $\lambda_2 \|W\|_F^2$ is used to improve generalization. $X = \{X_1, \ldots, X_t\}$ and $Y = \{Y_1, \ldots, Y_t\}$ are the predictor matrices and responses of the $t$ tasks, respectively, where task $i$ contains $n_i$ subjects and $p$ predictors. $W \in \mathbb{R}^{p \times t}$ is the coefficient matrix, where $W_i$, the $i$-th column of $W$, refers to the coefficient vector of task $i$.
The function $\Omega(W)$ jointly modulates the multi-task models ($W_1, \ldots, W_t$) according to a specific prior structure of $W$. In this package, 5 common regularization methods are implemented to incorporate different priors: sparse structure (Lasso, $\Omega(W) = \|W\|_1$), joint feature selection (L21, $\Omega(W) = \|W\|_{2,1}$), low-rank structure (Trace, $\Omega(W) = \|W\|_*$), network-based relatedness across tasks (Graph, $\Omega(W) = \|WG\|_F^2$) and task clustering (CMTL). To call a specific method correctly, the corresponding "short name" (Lasso, L21, Trace, Graph or CMTL) has to be passed via the Regularization argument, as in the example below.
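As a quick illustration, the short name selects the cross-task regularizer when training a model (low-rank learning via Trace in this sketch; see the MTL() documentation below for the full argument list):

# select the cross-task regularizer via its short name
data  <- Create_simulated_data(Regularization="Trace", type="Regression")
model <- MTL(data$X, data$Y, type="Regression", Regularization="Trace",
             Lam1=0.1, Lam2=0)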
For all algorithms, we implemented a solver based on the accelerated gradient descent method, which uses information from the previous two iterations to calculate the current gradient and thereby achieves an improved convergence rate. To handle the non-smooth and convex regularizers, the proximal operator is applied. Moreover, backtracking line search is used to determine an appropriate step size in each iteration. Overall, the solver achieves a convergence rate of $O(1/k^2)$, which is optimal among first-order gradient descent methods.
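The following minimal R sketch illustrates this scheme for least-squares tasks with the L21 regularizer. It is not the package's internal code: the names prox_L21 and fista_l21 are hypothetical, and backtracking line search is omitted (a fixed step size is assumed):

# row-wise soft-thresholding: shrinks whole rows (features) to zero
prox_L21 <- function(W, lam) {
  norms <- sqrt(rowSums(W^2))
  W * pmax(0, 1 - lam / pmax(norms, .Machine$double.eps))
}

# X: list of n_i x p feature matrices, Y: list of n_i response vectors
fista_l21 <- function(X, Y, lam, step, maxIter = 100) {
  p <- ncol(X[[1]]); t <- length(X)
  W <- W_old <- matrix(0, p, t)
  theta <- 1
  for (iter in seq_len(maxIter)) {
    # search point combines the previous two iterates (acceleration)
    theta_new <- (1 + sqrt(1 + 4 * theta^2)) / 2
    S <- W + ((theta - 1) / theta_new) * (W - W_old)
    # gradient of the averaged least-squares loss at the search point
    G <- sapply(seq_len(t), function(i)
      crossprod(X[[i]], X[[i]] %*% S[, i] - Y[[i]]) / nrow(X[[i]]))
    W_old <- W
    # proximal step induces the sparse row structure
    W <- prox_L21(S - step * G, step * lam)
    theta <- theta_new
  }
  W
}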
For academic references of the implemented algorithms, readers are referred to the paper (doi:10.1093/bioinformatics/bty831) or the package vignettes.
Calculate the averaged prediction error across tasks. For classification problems, the misclassification rate is returned; for regression problems, the mean squared error (MSE) is returned.
calcError(m, newX = NULL, newY = NULL)
Argument | Description
---|---
m | A trained MTL model
newX | The feature matrices of new individuals
newY | The responses of new individuals
The averaged prediction error
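Conceptually, the per-task error corresponds to the following computation (a sketch, not the package's internal code; classification assumes the {1, -1} label coding and predicted probabilities P(y == 1)):

# illustrative per-task error metric (not RMTL internals)
task_error <- function(pred, y, type) {
  if (type == "Classification") {
    mean(ifelse(pred > 0.5, 1, -1) != y)  # misclassification rate
  } else {
    mean((pred - y)^2)                    # mean squared error
  }
}
# calcError() reports the average of this quantity across tasks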
#create example data
data <- Create_simulated_data(Regularization="L21", type="Regression")
#train a model
model <- MTL(data$X, data$Y, type="Regression", Regularization="L21", Lam1=0.1,
             Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500))
#calculate the training error
calcError(model, newX=data$X, newY=data$Y)
#calculate the test error
calcError(model, newX=data$tX, newY=data$tY)
Create an example dataset containing 1) training data (X: feature matrices, Y: response vectors); 2) test data (tX: feature matrices, tY: response vectors); 3) the ground-truth model (W: coefficient matrix); and 4) extra information required by some algorithms (e.g. a matrix encoding the network information is necessary for calling the MTL method with network structure, Regularization="Graph").
Create_simulated_data(t = 5, p = 50, n = 20, type = "Regression", Regularization = "L21")
Argument | Description
---|---
t | Number of tasks
p | Number of features
n | Number of samples in each task. For simplicity, all tasks contain the same number of samples.
type | The type of problem, must be "Regression" or "Classification"
Regularization | The type of MTL algorithm (cross-task regularizer). The value must be one of {"L21", "Lasso", "Trace", "Graph", "CMTL"}
The example dataset.
data <- Create_simulated_data(t=5, p=50, n=20, type="Regression", Regularization="L21")
str(data)
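For the network-based method, the simulated dataset also carries the network-encoding matrix; a sketch of its use (the element name G follows the package vignette and is an assumption here):

# simulate data with a network structure and pass the encoding matrix
data  <- Create_simulated_data(Regularization="Graph", type="Regression")
model <- MTL(data$X, data$Y, type="Regression", Regularization="Graph",
             Lam1=0.1, Lam2=0, G=data$G)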
Perform k-fold cross-validation to estimate the optimal regularization parameter Lam1.
cvMTL(X, Y, type = "Classification", Regularization = "L21",
      Lam1_seq = 10^seq(1, -4, -1), Lam2 = 0, G = NULL, k = 2,
      opts = list(init = 0, tol = 10^-3, maxIter = 1000),
      stratify = FALSE, nfolds = 5, ncores = 2, parallel = FALSE)
Argument | Description
---|---
X | A set of feature matrices
Y | A set of responses, which can be binary (classification problem) or continuous (regression problem). The valid values of a binary outcome are {1, -1}.
type | The type of problem, must be "Regression" or "Classification"
Regularization | The type of MTL algorithm (cross-task regularizer). The value must be one of {"L21", "Lasso", "Trace", "Graph", "CMTL"}
Lam1_seq | A positive sequence of candidate Lam1 values over which the cross-validation searches
Lam2 | A positive constant Lam2 to improve the generalization performance
G | A matrix to encode the network information. This parameter is only used in MTL with graph structure (Regularization="Graph")
k | A positive number to modulate the structure of clusters, with a default of 2. This parameter is only used in MTL with clustering structure (Regularization="CMTL")
opts | Options of the optimization procedure. One can set the initial search point, the tolerance and the maximum number of iterations through this parameter. The default value is list(init=0, tol=10^-3, maxIter=1000)
stratify | If TRUE, stratified cross-validation is performed (applicable to classification problems only)
nfolds | The number of folds
ncores | The number of cores used for parallel computing, with a default of 2
parallel | If TRUE, the cross-validation is run in parallel using ncores cores
The estimated optimal Lam1 and related information
#create the example data
data <- Create_simulated_data(Regularization="L21", type="Classification")
#perform the cross-validation
cvfit <- cvMTL(data$X, data$Y, type="Classification", Regularization="L21",
               Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500), nfolds=5,
               stratify=TRUE, Lam1_seq=10^seq(1,-4, -1))
#show meta-information
str(cvfit)
#plot the CV accuracies across the Lam1 sequence
plot(cvfit)
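The search can also be distributed over CPU cores through the parallel and ncores arguments; a sketch (the name of the returned optimum, Lam1.min, follows the package vignette and is an assumption here):

# run the same cross-validation in parallel on 2 cores
cvfit <- cvMTL(data$X, data$Y, type="Classification", Regularization="L21",
               Lam1_seq=10^seq(1, -4, -1), Lam2=0, nfolds=5,
               parallel=TRUE, ncores=2)
cvfit$Lam1.min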
Train a multi-task learning model.
MTL(X, Y, type = "Classification", Regularization = "L21",
    Lam1 = 0.1, Lam1_seq = NULL, Lam2 = 0,
    opts = list(init = 0, tol = 10^-3, maxIter = 1000), G = NULL, k = 2)
Argument | Description
---|---
X | A set of feature matrices
Y | A set of responses, which can be binary (classification problem) or continuous (regression problem). The valid values of a binary outcome are {1, -1}.
type | The type of problem, must be "Regression" or "Classification"
Regularization | The type of MTL algorithm (cross-task regularizer). The value must be one of {"L21", "Lasso", "Trace", "Graph", "CMTL"}
Lam1 | A positive constant Lam1 to control the strength of the cross-task regularization
Lam1_seq | A positive sequence of Lam1 values. If given, the models are trained sequentially with warm starts along the sequence, and the model corresponding to Lam1 is returned
Lam2 | A non-negative constant Lam2 to improve the generalization performance
opts | Options of the optimization procedure. One can set the initial search point, the tolerance and the maximum number of iterations using this parameter. The default value is list(init=0, tol=10^-3, maxIter=1000)
G | A matrix to encode the network information. This parameter is only used in MTL with graph structure (Regularization="Graph")
k | A positive number to modulate the structure of clusters, with a default of 2. This parameter is only used in MTL with clustering structure (Regularization="CMTL")
The trained model, including the coefficient matrix W, the intercepts C, and related meta information
#create the example data
data <- Create_simulated_data(Regularization="L21", type="Regression")
#train an MTL model
#cold start
model <- MTL(data$X, data$Y, type="Regression", Regularization="L21", Lam1=0.1,
             Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500))
#warm start
model <- MTL(data$X, data$Y, type="Regression", Regularization="L21", Lam1=0.1,
             Lam1_seq=10^seq(1,-4, -1), Lam2=0,
             opts=list(init=0, tol=10^-6, maxIter=1500))
#meta-information
str(model)
#plot the historical objective values
plotObj(model)
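The fitted coefficients can be inspected directly; a short sketch using the element names W and C from the Value section above:

# one coefficient column per task
dim(model$W)   # p x t coefficient matrix
model$C        # per-task intercepts
# features selected jointly across tasks (non-zero rows under L21)
which(rowSums(abs(model$W)) != 0)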
Plot the cross-validation curve
## S3 method for class 'cvMTL'
plot(x, ...)
Argument | Description
---|---
x | The object returned by the function cvMTL
... | Other parameters
#create the example data
data <- Create_simulated_data(Regularization="L21", type="Classification")
#perform the cross-validation
cvfit <- cvMTL(data$X, data$Y, type="Classification", Regularization="L21",
               Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500), nfolds=5,
               stratify=TRUE, Lam1_seq=10^seq(1,-4, -1))
#plot the curve
plot(cvfit)
Plot the values of the objective function across iterations of the optimization procedure. This function shows the "inner status" of the solver during optimization and can be used to diagnose the solver and the training procedure.
plotObj(m)
Argument | Description
---|---
m | A trained MTL model
#create the example data
data <- Create_simulated_data(Regularization="L21", type="Regression")
#train an MTL model
model <- MTL(data$X, data$Y, type="Regression", Regularization="L21", Lam1=0.1,
             Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500))
#plot the objective values
plotObj(model)
Predict the outcomes of new individuals. For classification, the probability of an individual being assigned the positive label, P(y == 1), is estimated; for regression, the prediction score is returned.
## S3 method for class 'MTL'
predict(object, newX = NULL, ...)
Argument | Description
---|---
object | A trained MTL model
newX | The feature matrices of new individuals
... | Other parameters
The predicted outcomes
#create data
data <- Create_simulated_data(Regularization="L21", type="Regression")
#train
model <- MTL(data$X, data$Y, type="Regression", Regularization="L21", Lam1=0.1,
             Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500))
#predict
predict(model, newX=data$tX)
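For a classification model, the returned values are probabilities P(y == 1); a sketch converting them to the {1, -1} labels (assuming predict() returns one probability vector per task):

#classification: predicted probabilities of the positive class
data  <- Create_simulated_data(Regularization="L21", type="Classification")
model <- MTL(data$X, data$Y, type="Classification", Regularization="L21",
             Lam1=0.1, Lam2=0)
prob  <- predict(model, newX=data$tX)
#convert probabilities to class labels per task
labels <- lapply(prob, function(p) ifelse(p > 0.5, 1, -1))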
Print the meta information of the model
## S3 method for class 'MTL'
print(x, ...)
Argument | Description
---|---
x | A trained MTL model
... | Other parameters
#create data
data <- Create_simulated_data(Regularization="L21", type="Regression")
#train an MTL model
model <- MTL(data$X, data$Y, type="Regression", Regularization="L21", Lam1=0.1,
             Lam2=0, opts=list(init=0, tol=10^-6, maxIter=1500))
#print the information of the model
print(model)