Title: | Asymptotic Covariance Matrix of Regression Models for Multiple Outcomes |
---|---|
Description: | Regression models can be fitted for multiple outcomes simultaneously. This package computes estimates of parameters across fitted models and returns the matrix of asymptotic covariance. Various applications of this package, including PATED (Prognostic Variables Assisted Treatment Effect Detection), multiple comparison adjustment, are illustrated. |
Authors: | Han Zhang [aut, cre] |
Maintainer: | Han Zhang <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.8 |
Built: | 2025-02-18 05:39:57 UTC |
Source: | https://github.com/zhangh12/multipleoutcomes |
actg dataset from Hosmer et al.
A data frame
Identification Code
Time to AIDS diagnosis or death (days).
Event indicator. 1 = AIDS defining diagnosis, 0 = Otherwise.
Time to death (days)
Event indicator for death (only). 1 = Death, 0 = Otherwise.
Treatment indicator. 1 = Treatment includes IDV, 0 = Control group.
Treatment group indicator. 1 = ZDV + 3TC. 2 = ZDV + 3TC + IDV. 3 = d4T + 3TC. 4 = d4T + 3TC + IDV.
CD4 stratum at screening. 0 = CD4 <= 50. 1 = CD4 > 50.
0 = Male. 1 = Female.
Race/Ethnicity. 1 = White Non-Hispanic. 2 = Black Non-Hispanic. 3 = Hispanic. 4 = Asian, Pacific Islander. 5 = American Indian, Alaskan Native. 6 = Other/unknown.
IV drug use history. 1 = Never. 2 = Currently. 3 = Previously.
Hemophiliac. 1 = Yes. 0 = No.
Karnofsky Performance Scale. 100 = Normal; no complaint no evidence of disease. 90 = Normal activity possible; minor signs/symptoms of disease. 80 = Normal activity with effort; some signs/symptoms of disease. 70 = Cares for self; normal activity/active work not possible.
Baseline CD4 count (Cells/Milliliter).
Months of prior ZDV use (months).
Age at Enrollment (years).
ftp://ftp.wiley.com/public/sci_tech_med/survival
Hosmer, D.W. and Lemeshow, S. and May, S. (2008) Applied Survival Analysis: Regression Modeling of Time to Event Data: Second Edition, John Wiley and Sons Inc., New York, NY
data(actg)
data(actg)
Compute asymptotic variance-covariance matrix of parameters in given models It is used for models where bootstrap is not needed
asymptoticMultipleOutcomes( ..., data, family, data_index = NULL, score_epsilon = 1e-06 )
asymptoticMultipleOutcomes( ..., data, family, data_index = NULL, score_epsilon = 1e-06 )
Compute bootstrapped variance-covariance matrix of parameters in given models It is used when at least one of the specified model needs bootstrap, for example, Kaplan-Merier estimate for probability of survival, or quantiles are used for prognostic variables.
bootstrapMultipleOutcomes( ..., data, data_index = NULL, nboot = 10, compute_cov = TRUE, seed = NULL )
bootstrapMultipleOutcomes( ..., data, data_index = NULL, nboot = 10, compute_cov = TRUE, seed = NULL )
multipleOutcomes
when bootstrap will be used to estimate
variance-covariance matrixProcess inputs of multipleOutcomes
when bootstrap will be used to estimate
variance-covariance matrix
checkBootstrapInput(..., data, data_index)
checkBootstrapInput(..., data, data_index)
multipleOutcomes
when asymptotic properties are used
to estimate variance-covariance matrixProcess inputs of multipleOutcomes
when asymptotic properties are used
to estimate variance-covariance matrix
checkDefaultInput(..., family, data, data_index)
checkDefaultInput(..., family, data, data_index)
coef
is a generic function.
## S3 method for class 'multipleOutcomes' coef(object, model_index = NULL, ...)
## S3 method for class 'multipleOutcomes' coef(object, model_index = NULL, ...)
object |
an object returned by |
model_index |
|
... |
for debugging only |
a vector of coefficient estimates
PATED adjusted KM curve
Conventional KM curve
Generate two curves of survival probability with pointwise 95% confidence interval:
PATED adjusted KM curve
Conventional KM curve
comparePointwiseConfidenceIntervalWidth( pated_res, km_res, transform = "identity" )
comparePointwiseConfidenceIntervalWidth( pated_res, km_res, transform = "identity" )
return estimate of log HR from coxph model to be used when bootstrap is needed.
coxphMO(formula, data = NULL, ties = c("efron", "breslow", "exact"))
coxphMO(formula, data = NULL, ties = c("efron", "breslow", "exact"))
pated
about the format of input
.
Transformations are supported when computing confidence intervals.
Refer to https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm
or the LIFETEST proc in SAS for more details of transformation.Create curve of survival probability based on conventional KM method, or
PATED adjusted KM estimates.
Refer to km_res or pated_res in pated
about the format of input
.
Transformations are supported when computing confidence intervals.
Refer to https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm
or the LIFETEST proc in SAS for more details of transformation.
createKaplanMeierCurve(input, title = "", transform = "identity")
createKaplanMeierCurve(input, title = "", transform = "identity")
Figure out the time points with at least one event. This is used to generate KM-like curves.
extractKaplanMeierTimes(coef, sort = TRUE)
extractKaplanMeierTimes(coef, sort = TRUE)
return estimates from GEE model to be used when bootstrap is needed.
geeMO(formula, id, data = NULL, family, corstr)
geeMO(formula, id, data = NULL, family, corstr)
g(S(t)), a transformed estimate of survival probability. See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm
gFunction(conf.type = c("log", "log-log", "plain", "logit"))
gFunction(conf.type = c("log", "log-log", "plain", "logit"))
Inversed function of g(S(t)). See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm
gInverseFunction(conf.type = c("log", "log-log", "plain", "logit"))
gInverseFunction(conf.type = c("log", "log-log", "plain", "logit"))
return estimates from GLM model to be used when bootstrap is needed.
glmMO(formula, family, data = NULL)
glmMO(formula, family, data = NULL)
df is of length that equals to the number of specified models dfi is the number of parameters in model i This function returns index of parameters of model i in the vector of parameters of all models
IDMapping(df, i, var_name = NULL)
IDMapping(df, i, var_name = NULL)
multipleOutcomes
can fit different types of models for multiple outcomes
simultaneously and return model parameters and variance-covariance matrix
for further analysis.
multipleOutcomes( ..., data, family = NULL, data_index = NULL, nboot = 0, compute_cov = TRUE, seed = NULL, score_epsilon = 1e-06 )
multipleOutcomes( ..., data, family = NULL, data_index = NULL, nboot = 0, compute_cov = TRUE, seed = NULL, score_epsilon = 1e-06 )
... |
formulas of models to be fitted, or moment functions for gmm. |
data |
a data frame if all models are fitted on the same dataset;
otherwise a list of data frames for fitting models in |
family |
a character vector of families to be used in the models.
Currently only |
data_index |
|
nboot |
non-zero integer if bootstrap is adopted. By default 0. |
score_epsilon |
whatever. |
It returns an object of class "multipleOutcomes", which is a list containing the following components:
coefficients |
an unnamed vector of coefficients of all fitted models.
Use id_map for variable mapping. |
mcov |
a unnamed matrix of covariance of coefficients . Use id_map
for variable mapping. |
id_map |
a list mapping the elements in coefficients and mcov to
variable names. |
n_shared_sample_sizes |
a matrix of shared sample sizes between datasets being used to fit the models. |
call |
the matched call. |
## More examples can be found in the vignettes. library(mvtnorm) genData <- function(seed = NULL){ set.seed(seed) n <- 400 sigma <- matrix(c(1, .6, .6, 1), 2) x <- rmvnorm(n, sigma = sigma) gam <- c(.1, -.2) z <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%gam+.1*rnorm(n))))) bet <- c(-.2,.2) #y <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%bet + .2*z-.3*rnorm(n))))) y <- -.5+x%*%bet + .2*z-.3*rnorm(n) data.frame(y = y, z = z, x1 = x[, 1], x2 = x[, 2]) } dat <- genData(123456) dat1 <- head(dat,200) dat2 <- tail(dat,200) ## fitting four models simultaneously. fit <- multipleOutcomes( y ~ z + x1 - 1, z ~ x1 + x2, z ~ x1 - 1, y ~ x2, ## z can be fitted with a linear or logistic regression family = c('gaussian', 'binomial', 'gaussian','gaussian'), data = list(dat1, dat2), ## each dataset is used to fit two models data_index = c(1, 1, 2, 2) ) ## unnamed coefficients of all model parameters coef(fit) ## named coefficients of a specific model coef(fit, 2) ## unnamed covariance matrix of all model parameters vcov(fit) ## named covariance matrix of a specific model vcov(fit, 1) ## summary of all parameter estimates summary(fit) ## summary of parameters in a specific model summary(fit, 4)
## More examples can be found in the vignettes. library(mvtnorm) genData <- function(seed = NULL){ set.seed(seed) n <- 400 sigma <- matrix(c(1, .6, .6, 1), 2) x <- rmvnorm(n, sigma = sigma) gam <- c(.1, -.2) z <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%gam+.1*rnorm(n))))) bet <- c(-.2,.2) #y <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%bet + .2*z-.3*rnorm(n))))) y <- -.5+x%*%bet + .2*z-.3*rnorm(n) data.frame(y = y, z = z, x1 = x[, 1], x2 = x[, 2]) } dat <- genData(123456) dat1 <- head(dat,200) dat2 <- tail(dat,200) ## fitting four models simultaneously. fit <- multipleOutcomes( y ~ z + x1 - 1, z ~ x1 + x2, z ~ x1 - 1, y ~ x2, ## z can be fitted with a linear or logistic regression family = c('gaussian', 'binomial', 'gaussian','gaussian'), data = list(dat1, dat2), ## each dataset is used to fit two models data_index = c(1, 1, 2, 2) ) ## unnamed coefficients of all model parameters coef(fit) ## named coefficients of a specific model coef(fit, 2) ## unnamed covariance matrix of all model parameters vcov(fit) ## named covariance matrix of a specific model vcov(fit, 1) ## summary of all parameter estimates summary(fit) ## summary of parameters in a specific model summary(fit, 4)
pated
is a wrapper function of multipleOutcomes
for testing treatment effect
in randomized clinical trials. It assumes that prognostic variables are fully
randomized. This assumption can help enhancing statistical power of conventional
approaches in detecting the treatment effect. Specifically, the sensitivity
of the conventional models specified in ...
are improved by pated
.
pated( ..., data, family = NULL, data_index = NULL, nboot = 0, compute_cov = TRUE, seed = NULL, transform = "identity" )
pated( ..., data, family = NULL, data_index = NULL, nboot = 0, compute_cov = TRUE, seed = NULL, transform = "identity" )
... |
formulas of models to be fitted, or moment functions for gmm. |
data |
a data frame if all models are fitted on the same dataset;
otherwise a list of data frames for fitting models in |
family |
a character vector of families to be used in the models.
All families supported by |
data_index |
|
a data frame of testing results.
## More examples can be found in the vignettes. library(survival) library(mvtnorm) library(tidyr) genData <- function(seed = NULL){ set.seed(seed) n <- 200 sigma <- matrix(c(1, .6, .6, 1), 2) x <- rmvnorm(n, sigma = sigma) z1 <- rbinom(n, 1, .6) z2 <- rnorm(n) gam <- c(.1, -.2) trt <- rbinom(n, 1, .5) bet <- c(-.2,.2) y <- -.5+x %*% bet + z1 * .3 - z2 * .1 + .1 * trt-.1 * rnorm(n) death <- rbinom(n, 1, .8) id <- 1:n data.frame( y = y, trt = trt, z1 = z1, z2 = z2, x1 = x[, 1], x2 = x[, 2], death, id) } dat1 <- genData(seed = 31415926) ## create a dataset with repeated measurements x dat2 <- dat1 %>% pivot_longer(c(x1, x2), names_to='tmp', values_to='x') %>% dplyr::select(x, trt, id) %>% as.data.frame() fit <- pated( Surv(time=y, event=death) ~ trt, z1 ~ trt, z2 ~ trt, x ~ trt, family=c('logrank', 'binomial', 'gaussian', 'gee+id+gaussian'), data=list(dat1, dat2), data_index = c(1, 1, 1, 2)) fit
## More examples can be found in the vignettes. library(survival) library(mvtnorm) library(tidyr) genData <- function(seed = NULL){ set.seed(seed) n <- 200 sigma <- matrix(c(1, .6, .6, 1), 2) x <- rmvnorm(n, sigma = sigma) z1 <- rbinom(n, 1, .6) z2 <- rnorm(n) gam <- c(.1, -.2) trt <- rbinom(n, 1, .5) bet <- c(-.2,.2) y <- -.5+x %*% bet + z1 * .3 - z2 * .1 + .1 * trt-.1 * rnorm(n) death <- rbinom(n, 1, .8) id <- 1:n data.frame( y = y, trt = trt, z1 = z1, z2 = z2, x1 = x[, 1], x2 = x[, 2], death, id) } dat1 <- genData(seed = 31415926) ## create a dataset with repeated measurements x dat2 <- dat1 %>% pivot_longer(c(x1, x2), names_to='tmp', values_to='x') %>% dplyr::select(x, trt, id) %>% as.data.frame() fit <- pated( Surv(time=y, event=death) ~ trt, z1 ~ trt, z2 ~ trt, x ~ trt, family=c('logrank', 'binomial', 'gaussian', 'gee+id+gaussian'), data=list(dat1, dat2), data_index = c(1, 1, 1, 2)) fit
Summarize an analysis of multiple outcomes.
## S3 method for class 'summary.multipleOutcomes' print(x, ...)
## S3 method for class 'summary.multipleOutcomes' print(x, ...)
x |
an object returned by |
... |
for debugging only. |
an invisible object.
## no example
## no example
Return difference in quantile between arms formula can be endpoint ~ trt
quantileMO(formula, data = NULL, probs = c(0.25, 0.5, 0.75))
quantileMO(formula, data = NULL, probs = c(0.25, 0.5, 0.75))
Generate bootstrap dataset. When missing data presents, dataset is split into groups. Patients in the same group have missing on the same covariates. Bootstrap is carried out in each of the groups
sampleWithReplacement(data)
sampleWithReplacement(data)
simulateMoData
generates data for simulation and testing purposes.
simulateMoData(n = 500, hr = 0.8, seed = NULL)
simulateMoData(n = 500, hr = 0.8, seed = NULL)
n |
an integer for total sample size of a randomized control trial of two arms. |
hr |
hazard ratio of treatment. |
seed |
random seed. By default |
summary
method for class multipleOutcomes
.
## S3 method for class 'multipleOutcomes' summary(object, model_index = NULL, ...)
## S3 method for class 'multipleOutcomes' summary(object, model_index = NULL, ...)
object |
an object returned by |
model_index |
|
... |
for debugging only |
a list
Returns the variance-covariance matrix of the main parameters of fitted model
objects. The "main" parameters of models correspond to those returned by coef
.
## S3 method for class 'multipleOutcomes' vcov(object, model_index = NULL, ...)
## S3 method for class 'multipleOutcomes' vcov(object, model_index = NULL, ...)
object |
an object returned by |
model_index |
|
... |
for debugging only |
a matrix of covariance of all estimates