Package 'multipleOutcomes' reference manual

Title:	Asymptotic Covariance Matrix of Regression Models for Multiple Outcomes
Description:	Regression models can be fitted for multiple outcomes simultaneously. This package computes estimates of parameters across fitted models and returns the matrix of asymptotic covariance. Various applications of this package, including PATED (Prognostic Variables Assisted Treatment Effect Detection), multiple comparison adjustment, are illustrated.
Authors:	Han Zhang [aut, cre]
Maintainer:	Han Zhang <[email protected]>
License:	MIT + file LICENSE
Version:	0.8
Built:	2025-03-20 04:32:37 UTC
Source:	https://github.com/zhangh12/multipleoutcomes

ACTG 320 Clinical Trial Dataset

Description

actg dataset from Hosmer et al.

Format

A data frame

id: Identification Code
time: Time to AIDS diagnosis or death (days).
censor: Event indicator. 1 = AIDS defining diagnosis, 0 = Otherwise.
time_d: Time to death (days)
censor_d: Event indicator for death (only). 1 = Death, 0 = Otherwise.
tx: Treatment indicator. 1 = Treatment includes IDV, 0 = Control group.
txgrp: Treatment group indicator. 1 = ZDV + 3TC. 2 = ZDV + 3TC + IDV. 3 = d4T + 3TC. 4 = d4T + 3TC + IDV.
strat2: CD4 stratum at screening. 0 = CD4 <= 50. 1 = CD4 > 50.
sex: 0 = Male. 1 = Female.
raceth: Race/Ethnicity. 1 = White Non-Hispanic. 2 = Black Non-Hispanic. 3 = Hispanic. 4 = Asian, Pacific Islander. 5 = American Indian, Alaskan Native. 6 = Other/unknown.
ivdrug: IV drug use history. 1 = Never. 2 = Currently. 3 = Previously.
hemophil: Hemophiliac. 1 = Yes. 0 = No.
karnof: Karnofsky Performance Scale. 100 = Normal; no complaint no evidence of disease. 90 = Normal activity possible; minor signs/symptoms of disease. 80 = Normal activity with effort; some signs/symptoms of disease. 70 = Cares for self; normal activity/active work not possible.
cd4: Baseline CD4 count (Cells/Milliliter).
priorzdv: Months of prior ZDV use (months).
age: Age at Enrollment (years).

Source

ftp://ftp.wiley.com/public/sci_tech_med/survival

References

Hosmer, D.W. and Lemeshow, S. and May, S. (2008) Applied Survival Analysis: Regression Modeling of Time to Event Data: Second Edition, John Wiley and Sons Inc., New York, NY

Examples

data(actg)

data(actg)

Compute asymptotic variance-covariance matrix of parameters in given models It is used for models where bootstrap is not needed

Description

Compute asymptotic variance-covariance matrix of parameters in given models It is used for models where bootstrap is not needed

Usage

asymptoticMultipleOutcomes(
  ...,
  data,
  family,
  data_index = NULL,
  score_epsilon = 1e-06
)
asymptoticMultipleOutcomes(
  ...,
  data,
  family,
  data_index = NULL,
  score_epsilon = 1e-06
)

Compute bootstrapped variance-covariance matrix of parameters in given models It is used when at least one of the specified model needs bootstrap, for example, Kaplan-Merier estimate for probability of survival, or quantiles are used for prognostic variables.

Description

Compute bootstrapped variance-covariance matrix of parameters in given models It is used when at least one of the specified model needs bootstrap, for example, Kaplan-Merier estimate for probability of survival, or quantiles are used for prognostic variables.

Usage

bootstrapMultipleOutcomes(
  ...,
  data,
  data_index = NULL,
  nboot = 10,
  compute_cov = TRUE,
  seed = NULL
)
bootstrapMultipleOutcomes(
  ...,
  data,
  data_index = NULL,
  nboot = 10,
  compute_cov = TRUE,
  seed = NULL
)

Process inputs of `multipleOutcomes` when bootstrap will be used to estimate variance-covariance matrix

Description

Process inputs of multipleOutcomes when bootstrap will be used to estimate variance-covariance matrix

Usage

checkBootstrapInput(..., data, data_index)
checkBootstrapInput(..., data, data_index)

Process inputs of `multipleOutcomes` when asymptotic properties are used to estimate variance-covariance matrix

Description

Process inputs of multipleOutcomes when asymptotic properties are used to estimate variance-covariance matrix

Usage

checkDefaultInput(..., family, data, data_index)
checkDefaultInput(..., family, data, data_index)

Extract Model Coefficients

Description

coef is a generic function.

Usage

## S3 method for class 'multipleOutcomes'
coef(object, model_index = NULL, ...)
## S3 method for class 'multipleOutcomes'
coef(object, model_index = NULL, ...)

Arguments

`object`	an object returned by `multipleOutcomes()`.
`model_index`	`NULL` if displaying coefficients of all fitted models; otherwise, an integer indicating the fitted model.
`...`	for debugging only

Value

a vector of coefficient estimates

Generate two curves of survival probability with pointwise 95% confidence interval:

PATED adjusted KM curve

Conventional KM curve

Description

Generate two curves of survival probability with pointwise 95% confidence interval:

PATED adjusted KM curve
Conventional KM curve

Usage

comparePointwiseConfidenceIntervalWidth(
  pated_res,
  km_res,
  transform = "identity"
)
comparePointwiseConfidenceIntervalWidth(
  pated_res,
  km_res,
  transform = "identity"
)

return estimate of log HR from coxph model to be used when bootstrap is needed.

Description

return estimate of log HR from coxph model to be used when bootstrap is needed.

Usage

coxphMO(formula, data = NULL, ties = c("efron", "breslow", "exact"))
coxphMO(formula, data = NULL, ties = c("efron", "breslow", "exact"))

Create curve of survival probability based on conventional KM method, or PATED adjusted KM estimates. Refer to km_res or pated_res in `pated` about the format of `input`. Transformations are supported when computing confidence intervals. Refer to https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm or the LIFETEST proc in SAS for more details of transformation.

Description

Create curve of survival probability based on conventional KM method, or PATED adjusted KM estimates. Refer to km_res or pated_res in pated about the format of input. Transformations are supported when computing confidence intervals. Refer to https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm or the LIFETEST proc in SAS for more details of transformation.

Usage

createKaplanMeierCurve(input, title = "", transform = "identity")
createKaplanMeierCurve(input, title = "", transform = "identity")

Figure out the time points with at least one event. This is used to generate KM-like curves.

Description

Figure out the time points with at least one event. This is used to generate KM-like curves.

Usage

extractKaplanMeierTimes(coef, sort = TRUE)
extractKaplanMeierTimes(coef, sort = TRUE)

return estimates from GEE model to be used when bootstrap is needed.

Description

return estimates from GEE model to be used when bootstrap is needed.

Usage

geeMO(formula, id, data = NULL, family, corstr)
geeMO(formula, id, data = NULL, family, corstr)

g(S(t)), a transformed estimate of survival probability. See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm

Description

g(S(t)), a transformed estimate of survival probability. See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm

Usage

gFunction(conf.type = c("log", "log-log", "plain", "logit"))
gFunction(conf.type = c("log", "log-log", "plain", "logit"))

Inversed function of g(S(t)). See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm

Description

Inversed function of g(S(t)). See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm

Usage

gInverseFunction(conf.type = c("log", "log-log", "plain", "logit"))
gInverseFunction(conf.type = c("log", "log-log", "plain", "logit"))

return estimates from GLM model to be used when bootstrap is needed.

Description

return estimates from GLM model to be used when bootstrap is needed.

Usage

glmMO(formula, family, data = NULL)
glmMO(formula, family, data = NULL)

df is of length that equals to the number of specified models dfi is the number of parameters in model i This function returns index of parameters of model i in the vector of parameters of all models

Description

df is of length that equals to the number of specified models dfi is the number of parameters in model i This function returns index of parameters of model i in the vector of parameters of all models

Usage

IDMapping(df, i, var_name = NULL)
IDMapping(df, i, var_name = NULL)

Fitting Regression Models for Multiple Outcomes and Returning the Matrix of Covariance

Description

multipleOutcomes can fit different types of models for multiple outcomes simultaneously and return model parameters and variance-covariance matrix for further analysis.

Usage

multipleOutcomes(
  ...,
  data,
  family = NULL,
  data_index = NULL,
  nboot = 0,
  compute_cov = TRUE,
  seed = NULL,
  score_epsilon = 1e-06
)
multipleOutcomes(
  ...,
  data,
  family = NULL,
  data_index = NULL,
  nboot = 0,
  compute_cov = TRUE,
  seed = NULL,
  score_epsilon = 1e-06
)

Arguments

`...`	formulas of models to be fitted, or moment functions for gmm.
`data`	a data frame if all models are fitted on the same dataset; otherwise a list of data frames for fitting models in `...`. Note that a dataset can be used to fit multiple models, thus, `length(data)` is unnecessary to be equal to the number of models in `...`. The row names in a data frame are treated as sample IDs. Consequently, for any two records in different data frames that correspond to the same sample, their row names should be consistent.
`family`	a character vector of families to be used in the models. Currently only `gaussian`, `binomial`, `coxph`, `logrank`, `gmm` and `gee` are supported. To analyze longitudinal data and repeated measurements, `family` should be something like `gee+id+family+corstr`, where id is the column name defining cluster (i.e. one patient per cluster). The covariance estimate may be inaccurate if number of clusters is too small. The supported families for `gee` include `gaussian`, `binomial`, `poisson`, `Gamma`, and `quasi`. The supported correlation structures include `independence`, `stat_M_dep`, `non_stat_M_dep`, `exchangeable`, `AR-M` and `unstructured`. So a valid family string is like `gee+user_id+binomial+exchangeable`. `family` can be of length 1 if all models are fitted in the same family; otherwise family should be specified for each of the models.
`data_index`	`NULL` if `data` is a data frame; otherwise, a vector in integer specifying mapping a model in `...` to a data frame in `data` (a list).
`nboot`	non-zero integer if bootstrap is adopted. By default 0.
`score_epsilon`	whatever.

Value

It returns an object of class "multipleOutcomes", which is a list containing the following components:

`coefficients`	an unnamed vector of coefficients of all fitted models. Use `id_map` for variable mapping.

`mcov`	a unnamed matrix of covariance of `coefficients`. Use `id_map` for variable mapping.

`id_map`	a list mapping the elements in `coefficients` and `mcov` to variable names.

`n_shared_sample_sizes`	a matrix of shared sample sizes between datasets being used to fit the models.

`call`	the matched call.

Examples

## More examples can be found in the vignettes.
library(mvtnorm)
genData <- function(seed = NULL){

  set.seed(seed)
  n <- 400
  sigma <- matrix(c(1, .6, .6, 1), 2)
  x <- rmvnorm(n, sigma = sigma)
  gam <- c(.1, -.2)
  z <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%gam+.1*rnorm(n)))))

  bet <- c(-.2,.2)
  #y <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%bet + .2*z-.3*rnorm(n)))))
  y <- -.5+x%*%bet + .2*z-.3*rnorm(n)

  data.frame(y = y, z = z, x1 = x[, 1], x2 = x[, 2])

}

dat <- genData(123456)
dat1 <- head(dat,200)
dat2 <- tail(dat,200)
## fitting four models simultaneously.
fit <-
  multipleOutcomes(
    y ~ z + x1 - 1,
    z ~ x1 + x2,
    z ~ x1 - 1,
    y ~ x2,
    ## z can be fitted with a linear or logistic regression
    family = c('gaussian', 'binomial', 'gaussian','gaussian'),
    data = list(dat1, dat2),
    ## each dataset is used to fit two models
    data_index = c(1, 1, 2, 2)
  )

  ## unnamed coefficients of all model parameters
  coef(fit)

  ## named coefficients of a specific model
  coef(fit, 2)

  ## unnamed covariance matrix of all model parameters
  vcov(fit)

  ## named covariance matrix of a specific model
  vcov(fit, 1)

  ## summary of all parameter estimates
  summary(fit)

  ## summary of parameters in a specific model
  summary(fit, 4)

## More examples can be found in the vignettes.
library(mvtnorm)
genData <- function(seed = NULL){

  set.seed(seed)
  n <- 400
  sigma <- matrix(c(1, .6, .6, 1), 2)
  x <- rmvnorm(n, sigma = sigma)
  gam <- c(.1, -.2)
  z <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%gam+.1*rnorm(n)))))

  bet <- c(-.2,.2)
  #y <- rbinom(n, 1, plogis(1-1/(1+exp(-.5+x%*%bet + .2*z-.3*rnorm(n)))))
  y <- -.5+x%*%bet + .2*z-.3*rnorm(n)

  data.frame(y = y, z = z, x1 = x[, 1], x2 = x[, 2])

}

dat <- genData(123456)
dat1 <- head(dat,200)
dat2 <- tail(dat,200)
## fitting four models simultaneously.
fit <-
  multipleOutcomes(
    y ~ z + x1 - 1,
    z ~ x1 + x2,
    z ~ x1 - 1,
    y ~ x2,
    ## z can be fitted with a linear or logistic regression
    family = c('gaussian', 'binomial', 'gaussian','gaussian'),
    data = list(dat1, dat2),
    ## each dataset is used to fit two models
    data_index = c(1, 1, 2, 2)
  )

  ## unnamed coefficients of all model parameters
  coef(fit)

  ## named coefficients of a specific model
  coef(fit, 2)

  ## unnamed covariance matrix of all model parameters
  vcov(fit)

  ## named covariance matrix of a specific model
  vcov(fit, 1)

  ## summary of all parameter estimates
  summary(fit)

  ## summary of parameters in a specific model
  summary(fit, 4)

Prognostic Variables Assisted Treatment Effect Detection

Description

pated is a wrapper function of multipleOutcomes for testing treatment effect in randomized clinical trials. It assumes that prognostic variables are fully randomized. This assumption can help enhancing statistical power of conventional approaches in detecting the treatment effect. Specifically, the sensitivity of the conventional models specified in ... are improved by pated.

Usage

pated(
  ...,
  data,
  family = NULL,
  data_index = NULL,
  nboot = 0,
  compute_cov = TRUE,
  seed = NULL,
  transform = "identity"
)
pated(
  ...,
  data,
  family = NULL,
  data_index = NULL,
  nboot = 0,
  compute_cov = TRUE,
  seed = NULL,
  transform = "identity"
)

Arguments

`...`	formulas of models to be fitted, or moment functions for gmm.
`data`	a data frame if all models are fitted on the same dataset; otherwise a list of data frames for fitting models in `...`. Note that a dataset can be used to fit multiple models, thus, `length(data)` is unnecessary to be equal to the number of models in `...`. The row names in a data frame are treated as sample IDs. Consequently, for any two records in different data frames that correspond to the same sample, their row names should be consistent.
`family`	a character vector of families to be used in the models. All families supported by `multipleOutcomes` are also supported by `pated`. `family` can be of length 1 if all models are fitted in the same family; otherwise family should be specified for each of the models in `...`.
`data_index`	`NULL` if `data` is a data frame; otherwise, a vector in integer specifying mapping a model in `...` to a data frame in `data` (a list).

Value

a data frame of testing results.

Examples

## More examples can be found in the vignettes.
library(survival)
library(mvtnorm)
library(tidyr)
genData <- function(seed = NULL){

  set.seed(seed)
  n <- 200
  sigma <- matrix(c(1, .6, .6, 1), 2)
  x <- rmvnorm(n, sigma = sigma)
  z1 <- rbinom(n, 1, .6)
  z2 <- rnorm(n)
  gam <- c(.1, -.2)
  trt <- rbinom(n, 1, .5)

  bet <- c(-.2,.2)
  y <- -.5+x %*% bet + z1 * .3 - z2 * .1 + .1 * trt-.1 * rnorm(n)
  death <- rbinom(n, 1, .8)
  id <- 1:n
  data.frame(
    y = y, trt = trt, 
    z1 = z1, z2 = z2, 
    x1 = x[, 1], x2 = x[, 2], 
    death, id)

}

dat1 <- genData(seed = 31415926)

## create a dataset with repeated measurements x
dat2 <- dat1 %>% pivot_longer(c(x1, x2), names_to='tmp', values_to='x') %>% 
dplyr::select(x, trt, id) %>% as.data.frame()

fit <- 
  pated(
    Surv(time=y, event=death) ~ trt,
    z1 ~ trt, 
    z2 ~ trt, 
    x ~ trt, 
    family=c('logrank', 'binomial', 'gaussian', 'gee+id+gaussian'), 
    data=list(dat1, dat2), data_index = c(1, 1, 1, 2))

fit

## More examples can be found in the vignettes.
library(survival)
library(mvtnorm)
library(tidyr)
genData <- function(seed = NULL){

  set.seed(seed)
  n <- 200
  sigma <- matrix(c(1, .6, .6, 1), 2)
  x <- rmvnorm(n, sigma = sigma)
  z1 <- rbinom(n, 1, .6)
  z2 <- rnorm(n)
  gam <- c(.1, -.2)
  trt <- rbinom(n, 1, .5)

  bet <- c(-.2,.2)
  y <- -.5+x %*% bet + z1 * .3 - z2 * .1 + .1 * trt-.1 * rnorm(n)
  death <- rbinom(n, 1, .8)
  id <- 1:n
  data.frame(
    y = y, trt = trt, 
    z1 = z1, z2 = z2, 
    x1 = x[, 1], x2 = x[, 2], 
    death, id)

}

dat1 <- genData(seed = 31415926)

## create a dataset with repeated measurements x
dat2 <- dat1 %>% pivot_longer(c(x1, x2), names_to='tmp', values_to='x') %>% 
dplyr::select(x, trt, id) %>% as.data.frame()

fit <- 
  pated(
    Surv(time=y, event=death) ~ trt,
    z1 ~ trt, 
    z2 ~ trt, 
    x ~ trt, 
    family=c('logrank', 'binomial', 'gaussian', 'gee+id+gaussian'), 
    data=list(dat1, dat2), data_index = c(1, 1, 1, 2))

fit

Title Summarize an Analysis of Multiple Outcomes.

Description

Summarize an analysis of multiple outcomes.

Usage

## S3 method for class 'summary.multipleOutcomes'
print(x, ...)
## S3 method for class 'summary.multipleOutcomes'
print(x, ...)

Arguments

`x`	an object returned by `multipleOutcomes()`.
`...`	for debugging only.

Value

an invisible object.

Examples

## no example
## no example

Return difference in quantile between arms formula can be endpoint ~ trt

Description

Return difference in quantile between arms formula can be endpoint ~ trt

Usage

quantileMO(formula, data = NULL, probs = c(0.25, 0.5, 0.75))
quantileMO(formula, data = NULL, probs = c(0.25, 0.5, 0.75))

Generate bootstrap dataset. When missing data presents, dataset is split into groups. Patients in the same group have missing on the same covariates. Bootstrap is carried out in each of the groups

Description

Generate bootstrap dataset. When missing data presents, dataset is split into groups. Patients in the same group have missing on the same covariates. Bootstrap is carried out in each of the groups

Usage

sampleWithReplacement(data)
sampleWithReplacement(data)

Generating Data for Simulation and Testing

Description

simulateMoData generates data for simulation and testing purposes.

Usage

simulateMoData(n = 500, hr = 0.8, seed = NULL)
simulateMoData(n = 500, hr = 0.8, seed = NULL)

Arguments

`n`	an integer for total sample size of a randomized control trial of two arms.
`hr`	hazard ratio of treatment.
`seed`	random seed. By default `NULL` for no seed being specified.

Object Summaries

Description

summary method for class multipleOutcomes.

Usage

## S3 method for class 'multipleOutcomes'
summary(object, model_index = NULL, ...)
## S3 method for class 'multipleOutcomes'
summary(object, model_index = NULL, ...)

Arguments

`object`	an object returned by `multipleOutcomes()`.
`model_index`	`NULL` if displaying summary of all fitted models; otherwise, an integer indicating the fitted model.
`...`	for debugging only

Value

a list

Calculate Variance-Covariance Matrix for a Fitted Model Object

Description

Returns the variance-covariance matrix of the main parameters of fitted model objects. The "main" parameters of models correspond to those returned by coef.

Usage

## S3 method for class 'multipleOutcomes'
vcov(object, model_index = NULL, ...)
## S3 method for class 'multipleOutcomes'
vcov(object, model_index = NULL, ...)

Arguments

`object`	an object returned by `multipleOutcomes()`.
`model_index`	`NULL` if displaying covariance matrix of all fitted models; otherwise, an integer indicating the fitted model.
`...`	for debugging only

Value

a matrix of covariance of all estimates

Package 'multipleOutcomes'

Help Index

ACTG 320 Clinical Trial Dataset

Description

Format

Source

References

Examples

Compute asymptotic variance-covariance matrix of parameters in given models It is used for models where bootstrap is not needed

Description

Usage

Compute bootstrapped variance-covariance matrix of parameters in given models It is used when at least one of the specified model needs bootstrap, for example, Kaplan-Merier estimate for probability of survival, or quantiles are used for prognostic variables.

Description

Usage

Process inputs of multipleOutcomes when bootstrap will be used to estimate variance-covariance matrix

Description

Usage

Process inputs of multipleOutcomes when asymptotic properties are used to estimate variance-covariance matrix

Description

Usage

Extract Model Coefficients

Description

Usage

Arguments

Value

Generate two curves of survival probability with pointwise 95% confidence interval: PATED adjusted KM curve Conventional KM curve

Description

Usage

return estimate of log HR from coxph model to be used when bootstrap is needed.

Description

Usage

Description

Usage

Figure out the time points with at least one event. This is used to generate KM-like curves.

Description

Usage

return estimates from GEE model to be used when bootstrap is needed.

Description

Usage

g(S(t)), a transformed estimate of survival probability. See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm

Description

Usage

Inversed function of g(S(t)). See https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_lifetest_details08.htm

Description

Usage

return estimates from GLM model to be used when bootstrap is needed.

Description

Usage

df is of length that equals to the number of specified models dfi is the number of parameters in model i This function returns index of parameters of model i in the vector of parameters of all models

Description

Usage

Fitting Regression Models for Multiple Outcomes and Returning the Matrix of Covariance

Description

Usage

Arguments

Value

Examples

Prognostic Variables Assisted Treatment Effect Detection

Description

Usage

Arguments

Value

Examples

Title Summarize an Analysis of Multiple Outcomes.

Description

Usage

Arguments

Value

Examples

Return difference in quantile between arms formula can be endpoint ~ trt

Description

Usage

Generate bootstrap dataset. When missing data presents, dataset is split into groups. Patients in the same group have missing on the same covariates. Bootstrap is carried out in each of the groups

Description

Usage

Generating Data for Simulation and Testing

Description

Usage

Arguments

Object Summaries

Process inputs of `multipleOutcomes` when bootstrap will be used to estimate variance-covariance matrix

Process inputs of `multipleOutcomes` when asymptotic properties are used to estimate variance-covariance matrix

Generate two curves of survival probability with pointwise 95% confidence interval:

PATED adjusted KM curve

Conventional KM curve