Package ‘ORTH.Ord’

Introduction

ORTH.Ord is a package designed for analyzing correlated ordinal outcomes which are commonly seen in longitudinal studies or clustered clinical trials. It implements a modified version of alternating logistic regressions (ALR) with estimation based on orthogonalized residuals (ORTH), which use paired estimating equations to jointly estimate parameters in marginal mean and within-association models. The within-cluster association between ordinal responses is modeled by global pairwise odds ratios (POR). This R package also provides a finite-sample bias correction to POR parameter estimates based on matrix multiplicative adjusted orthogonalized residuals (MMORTH) for correcting estimating equations, and different bias-corrected variance estimators such as BC1, BC2, and BC3. We refer users to our published paper (Meng et al., 2023) for details.

Statistical Methods

ALR uses marginal models with generalized estimating equations (GEE) to jointly estimate marginal means and within-cluster associations. Let Oij be an ordinal response with C + 1 levels, say 1, …, C + 1, for the jth observation in the ith cluster, where cluster i has ni observations and i = 1, …, N. Let Yij(c) denote a binary indicator for the level of ordinal response Oij such that Yij(c) = I(Oij ≤ c) where c = 1, …, C. For a given cluster i, the response vector Yi = (Yi1(1), …, Yi1(C), …, Yini(1), …, Yini(C))′ has Cni elements. Ordinal outcomes are usually modeled with the proportional odds assumption, which means the covariate vector Xij will have the same effect across all levels of Oij. A marginal mean model for Yij(c) using a proportional-odds cumulative logit model can be written as: where the intercept δc represents the log-odds of falling into or below level c (Oij ≤ c) when Xij = 0, and coefficient vector β represents the effect of each element of Xij and remains constant across levels of Oij under the proportional odds assumption.

        In addition to estimating parameters in model (1) using GEE, a within-cluster association structure is specified. The POR, denoted as ψij, k(a, b) for measuring within-cluster association, is defined for the response pair (Yij(a), Yik(b)) as:

where μij(a) = E(Yij(a)) = Pr(Yij(a) = 1), μik(b) = E(Yik(b)) = Pr(Yik(b) = 1), and μij, k(a, b) = E(Yij(a)Yik(b)) = Pr(Yij(a) = Yik(b) = 1) for 1 ≤ a, b ≤ C,1 ≤ j < k ≤ ni. If we let α be a vector of association parameters, then a generalized linear model for ψij, k(a, b) is specified as:
where Zij, k(a, b) is a covariate vector. We assume the POR is independent from cutpoints a and b, namely ψij, k(a, b) = ψij, k, so that a parsimonious model for the within-cluster association can be obtained. Note that model (2) is the association model for ALR. The mean parameter β from model (1) and association parameter α from model (2) are jointly estimated through GEE. The estimate of β from model (1) is the solution to the estimating equations: where Di = ∂μi/∂β, μi is determined by model (1), and Vi is a working variance matrix for the binary response Yi.

        ALR also relies on another estimating equations to estimate association parameter vector α. In this package, the estimating equation for α is based on ORTH, which could reduce the dependence of the variance estimate on observation ordering and increase efficiency when dealing with unequal cluster sizes. Define σij, j(a) = Var(Yij(a)) = μij(a)(1 − μij(a)), σik, k(b) = Var(Yik(b)) = μik(b)(1 − μik(b)), and σij, k(a, b) = Cov(Yij(a), Yik(b)) = μij, k(a, b) − μij(a)μik(b) for 1 ≤ a, b ≤ C,1 ≤ j < k ≤ ni. In ORTH, orthogonalized residuals, denoted as Tij, k(a, b), are based on the expectations of cross-products Yij(a)Yik(b) conditional on Yij(a) and Yik(b): E(Yij(a)Yik(b)|Yij(a), Yik(b)). Then Tij, k(a, b) can be expressed as:

where We also define to be vectors for the orthogonalized residuals, where the dimension of Ti is C2ni(ni − 1)/2. In ORTH, association parameter α is estimated by the solution to where Si = E(−∂Ti/∂α), and Pi ≈ Var(Ti) with elements defined by Cov(Tij, k(a, b), Tij′, k(a′, b′)).

        In order to adjust for small-sample bias in the ORTH procedure, we extend MMORTH to correlated ordinal data analysis. In MMORTH, matrix multiplicative adjusted orthogonalized residuals are used by substituting Rij, k(a, b) with a bias-corrected correlation ij, k(a, b). Let the cluster leverage matrix be $H_{1i}=D_{i}\left(\sum_{i=1}^N D_{i}'V_{i}^{-1}D_{i}\right)^{-1}D_{i}'V_{i}^{-1}$, and define Gi = (ICni − H1i)−1 where ICni is an identity matrix. Let ri be a Cni × 1 vector (ri1(1), ⋯, ri1(C), ⋯, rini(1), ⋯, rini(C)) where rijc = {(Yij(c) − μij(c))/(σij, j(c))1/2} with 1 ≤ c ≤ C and j = 1, ⋯ni, and define matrix Ri = riri. We further define Gij to be the jth row of matrix Gi, and Ri ⋅ k(a, b) to be the kth column of matrix Ri, then ij, k(a, b) = GijRi ⋅ k(a, b). The bias-corrected orthogonalized residual, denoted as ij, k(a, b), is defined as:

MMORTH uses ij, k(a, b) as an estimate of Tij, k(a, b) in the estimating equation (4).

      The GEE sandwich variance estimators also tend to be biased when the number of clusters is small. We implement three popular bias-corrected sandwich variance estimators in this R package, which can be used in combination with ORTH and MMORTH. Let $\Omega_{1i}=\sum_{i=1}^{N} D_{i}'V_{i}^{-1}D_{i}$ be the inverse of the model-based variance, then the sandwich variance estimator for β̂ can be expressed as:

Further, let $\Omega_{2i}=\sum_{i=1}^{N} S_{i}'P_{i}^{-1}S_{i}$, then the sandwich variance estimator for α̂ is: When B1i = I, B2i = I, C1i = I and C2i = I, there is no bias-correction, and estimators (5) and (6) refer to the uncorrected sandwich estimators for β and α; we will refer to these estimators as BC0. Different choices for B1i, B2i, C1i, and C2i will give different bias-corrected sandwich estimators. The three commonly used approaches for bias corrections considered here are illustrated as follows. Define H1i and H2i as cluster leverage matrices based on the marginal mean and association regression models. Let B1i = (I − H1i)−1/2, B2i = (I − H2i)−1/2, C1i = I and C2i = I; then the estimators (5) and (6) will be equal to the bias-corrected covariance estimators of BC1. When setting B1i = (I − H1i)−1, B2i = (I − H2i)−1, C1i = I and C2i = I, the estimators (5) and (6) become the bias-corrected covariance estimators of BC2. To obtain the bias-corrected covariance estimators of BC3, one can set B1i and B2i as identity matrix I, and let C1i = diag{(1 − min {ζ, [Q1i]jj})−1/2} and C2i = diag{(1 − min {ζ, [Q2i]jj})−1/2}, where Q1i = DiVi−1DiΩ1i−1, Q2i = SiPi−1SiΩ2i−1, and set the bound parameter ζ = 0.75 to avoid over-correction.The three bias-corrected sandwich estimators are summarized in Table 1. BC2 was reported to have a greater amount of correction than both BC1 and BC3 in general, which will result in a larger standard error of parameter estimates.

Table 1: Summary of bias-corrected sandwich variance estimators for β̂ and α̂.
Label           Sandwich variance estimator for β̂           Sandwich variance estimator for α̂
                  C1i                           B1i                   C2i                           B2i
BC0                  I                                     I                  I                                     I
BC1                 I                           (I − H1i)−1/2                 I                            (I − H2i)−1/2
BC2                 I                            (I − H1i)−1                 I                             (I − H2i)−1
BC3 diag{(1 − min {ζ, [Q1i]jj})−1/2}*         I diag{(1 − min {ζ, [Q2i]jj})−1/2}**         I
*Q1i = DiVi−1DiΩ1i−1; * * Q2i = SiPi−1SiΩ2i−1

Function Description

This package ORTH.Ord provides an ALR modeling framework to jointly estimate marginal means and within-cluster association of ordinal outcomes with the ability to adjust for small-sample bias. When using this package, the input data must be numeric for both the response variable and all covariates. For example, if an ordinal response is coded as “never”, “sometimes”, and “always”, the user need to convert it to numeric values, i.e. 1, 2 and 3 accordingly. The arguments of ORTH.Ord function are:


ORTH.Ord(formula_mean, data_mean, cluster, formula_por = NULL, data_por = NULL, 
         MMORTH = FALSE, BC = NULL, init_beta = NULL, init_alpha = NULL,
         miter = 30, crit_level = 0.0001)

The details and defaults of arguments are summarized in Table 2, and further explanation is provided below.

Table 2: Arguments of ORTH.Ord

Argument Description Default
formula_mean the symbolic description of the marginal mean model that contains the ordinal outcome and marginal mean covariates.
data_mean the data set containing the ordinal outcome and marginal mean covariates.
cluster cluster ID (consecutive integers) in data_mean.
formula_por the symbolic description of marginal association model in the form of a one-sided formula. NULL
data_por a data set for marginal association model. NULL
MMORTH a logical value to indicate if matrix-adjusted estimating equations will be applied for association estimation. FALSE
BC an option to apply bias-correction on covariance estimation. NULL
init_beta pre-specified starting values for parameters in the mean model. NULL
init_alpha pre-specified starting values for parameters in the association model. NULL
miter maximum number of iterations for Fisher scoring. 30
crit_level tolerance for convergence. 0.0001
         The input argument formula_mean is the symbolic description of the marginal mean model, e.g., formula_mean=Y ∼ x1 + x2. The argument data_mean is an R data set to fit the mean model (1), which should include all the variables required for fitting the mean model, i.e. an ordinal variable as response and one or more variables as covariates. All the variables in the R data set must be coded numerically; character values are required to be converted into numerical values during the data preprocessing step. The argument cluster is the column name for the cluster variable. The argument formula_por is the symbolic description of the marginal association model. Unlike formula_mean, formula_por is defined as a one-sided formula, e.g., formula_por= ∼ a0 + a1. The argument data_por is an R data set to which we will fit association model (2), and should include a variable for cluster and covariates for pairwise association parameters α which often are indicator variables. The default for arguments data_por and formula_por is NULL. When either of the two arguments is not specified, independence working correlation will be used for β estimating equations (1), meaning Ri(ρ) = Ini and Vi = Ai. The data for data_por must be all numeric too.


        The argument MMORTH is used to indicate whether one wants to apply MMORTH to the estimating equations to adjust for small-sample bias. The default for MMORTH is FALSE, which will use ORTH without bias correction on the estimating equation for correlation model; when MMORTH= TRUE, MMORTH method will be employed. Please note that MMORTH=TRUE works only when both formula_por and data_por are specified. Using the independence working correlation will automatically suppress MMORTH. The argument BC offers an option to adjust for bias on the sandwich estimators for both β and α with BC1, BC2, or BC3 methods. When BC is set to the default, the program will only output the standard errors, z-values, and corresponding p-values obtained from the uncorrected sandwich estimator (BC0). The possible values for BC include “BC1”, “BC2”, and “BC3”. One can specify a single method (e.g., BC=“BC1”) or multiple methods (e.g., BC=c(“BC1”,“BC2”,“BC3”)) in ORTH.Ord to output the standard errors, z-values, and p-values from each method. The arguments init_beta and init_alpha offer the options to pre-specify the initial values for parameters of the mean model and the association model. The dimension of the vectors for pre-specified starting values should match that of data_mean and data_por. The argument miter is the maximum number of iterations whose default is 30. The argument crit_level is a critical value for determining model convergence; the default is 0.0001, which means the model is considered converged if the absolute difference between parameter estimations from two consecutive iterations is smaller or equal to 0.0001.


       The value returned by the function ORTH.Ord is a list. If argument BC is not specified (i.e., BC=NULL), the output will be a list with two elements. The first element is a data frame including point estimates, standard errors, z-values, and p-values for model parameters; the second element is a variance-covariance matrix of model parameters without bias-correction (BC0). When argument BC is specified, for example BC=c(“BC1”,“BC2”,“BC3”), additional elements will be included in the output list which are variance-covariance matrices of model parameters based on BC1, BC2, and BC3.

Reference

Can Meng, Mary Ryan, Paul Rathouz, Elizabeth Turner, John S Preisser, and Fan Li. 2023. ORTH.Ord: An R package for analyzing correlated ordinal outcomes using alternating logistic regressions with orthogonalized residuals. Computer Methods and Programs in Biomedicine, 237, DOI:10.1016/j.cmpb.2023.107567.