Package ‘ORTH.Ord’

Introduction

ORTH.Ord is a package designed for analyzing correlated ordinal outcomes which are commonly seen in longitudinal studies or clustered clinical trials. It implements a modified version of alternating logistic regressions (ALR) with estimation based on orthogonalized residuals (ORTH), which use paired estimating equations to jointly estimate parameters in marginal mean and within-association models. The within-cluster association between ordinal responses is modeled by global pairwise odds ratios (POR). This R package also provides a finite-sample bias correction to POR parameter estimates based on matrix multiplicative adjusted orthogonalized residuals (MMORTH) for correcting estimating equations, and different bias-corrected variance estimators such as BC1, BC2, and BC3. We refer users to our published paper (Meng et al., 2023) for details.

Statistical Methods

ALR uses marginal models with generalized estimating equations (GEE) to jointly estimate marginal means and within-cluster associations. Let O_ij be an ordinal response with C + 1 levels, say 1, …, C + 1, for the j^th observation in the i^th cluster, where cluster i has n_i observations and i = 1, …, N. Let Y_ij^(c) denote a binary indicator for the level of ordinal response O_ij such that Y_ij^(c) = I(O_ij ≤ c) where c = 1, …, C. For a given cluster i, the response vector Y_i = (Y_i1⁽¹⁾, …, Y_i1^(C), …, Y_{in_i}⁽¹⁾, …, Y_{in_i}^(C))′ has Cn_i elements. Ordinal outcomes are usually modeled with the proportional odds assumption, which means the covariate vector X_ij will have the same effect across all levels of O_ij. A marginal mean model for Y_ij^(c) using a proportional-odds cumulative logit model can be written as: where the intercept δ_c represents the log-odds of falling into or below level c (O_ij ≤ c) when X_ij = 0, and coefficient vector β represents the effect of each element of X_ij and remains constant across levels of O_ij under the proportional odds assumption.

In addition to estimating parameters in model (1) using GEE, a within-cluster association structure is specified. The POR, denoted as ψ_ij, k^(a, b) for measuring within-cluster association, is defined for the response pair (Y_ij^(a), Y_ik^(b)) as:

where μ_ij^(a) = E(Y_ij^(a)) = Pr(Y_ij^(a) = 1), μ_ik^(b) = E(Y_ik^(b)) = Pr(Y_ik^(b) = 1), and μ_ij, k^(a, b) = E(Y_ij^(a)Y_ik^(b)) = Pr(Y_ij^(a) = Y_ik^(b) = 1) for 1 ≤ a, b ≤ C,1 ≤ j < k ≤ n_i. If we let α be a vector of association parameters, then a generalized linear model for ψ_ij, k^(a, b) is specified as:
where Z_ij, k^(a, b) is a covariate vector. We assume the POR is independent from cutpoints a and b, namely ψ_ij, k^(a, b) = ψ_ij, k, so that a parsimonious model for the within-cluster association can be obtained. Note that model (2) is the association model for ALR. The mean parameter β from model (1) and association parameter α from model (2) are jointly estimated through GEE. The estimate of β from model (1) is the solution to the estimating equations: where D_i = ∂μ_i/∂β^′, μ_i is determined by model (1), and V_i is a working variance matrix for the binary response Y_i.

ALR also relies on another estimating equations to estimate association parameter vector α. In this package, the estimating equation for α is based on ORTH, which could reduce the dependence of the variance estimate on observation ordering and increase efficiency when dealing with unequal cluster sizes. Define σ_ij, j^(a) = Var(Y_ij^(a)) = μ_ij^(a)(1 − μ_ij^(a)), σ_ik, k^(b) = Var(Y_ik^(b)) = μ_ik^(b)(1 − μ_ik^(b)), and σ_ij, k^(a, b) = Cov(Y_ij^(a), Y_ik^(b)) = μ_ij, k^(a, b) − μ_ij^(a)μ_ik^(b) for 1 ≤ a, b ≤ C,1 ≤ j < k ≤ n_i. In ORTH, orthogonalized residuals, denoted as T_ij, k^(a, b), are based on the expectations of cross-products Y_ij^(a)Y_ik^(b) conditional on Y_ij^(a) and Y_ik^(b): E(Y_ij^(a)Y_ik^(b)|Y_ij^(a), Y_ik^(b)). Then T_ij, k^(a, b) can be expressed as:

where We also define to be vectors for the orthogonalized residuals, where the dimension of T_i is C²n_i(n_i − 1)/2. In ORTH, association parameter α is estimated by the solution to where S_i = E(−∂T_i/∂α), and P_i ≈ Var(T_i) with elements defined by Cov(T_ij, k^(a, b), T_{ij′, k′}^{(a′, b′)}).

In order to adjust for small-sample bias in the ORTH procedure, we extend MMORTH to correlated ordinal data analysis. In MMORTH, matrix multiplicative adjusted orthogonalized residuals are used by substituting R_ij, k^(a, b) with a bias-corrected correlation R̃_ij, k^(a, b). Let the cluster leverage matrix be $H_{1i}=D_{i}\left(\sum_{i=1}^N D_{i}'V_{i}^{-1}D_{i}\right)^{-1}D_{i}'V_{i}^{-1}$, and define G_i = (I_{Cn_i} − H_1i)⁻¹ where I_{Cn_i} is an identity matrix. Let r_i be a Cn_i × 1 vector (r_i1⁽¹⁾, ⋯, r_i1^(C), ⋯, r_{in_i}⁽¹⁾, ⋯, r_{in_i}^(C)) where r_ij^c = {(Y_ij^(c) − μ_ij^(c))/(σ_ij, j^(c))^1/2} with 1 ≤ c ≤ C and j = 1, ⋯n_i, and define matrix R_i = r_ir′_i. We further define G_ij⋅ to be the j^th row of matrix G_i, and R_i ⋅ k^(a, b) to be the k^th column of matrix R_i, then R̃_ij, k^(a, b) = G_ij⋅R_i ⋅ k^(a, b). The bias-corrected orthogonalized residual, denoted as T̃_ij, k^(a, b), is defined as:

MMORTH uses T̃_ij, k^(a, b) as an estimate of T_ij, k^(a, b) in the estimating equation (4).

The GEE sandwich variance estimators also tend to be biased when the number of clusters is small. We implement three popular bias-corrected sandwich variance estimators in this R package, which can be used in combination with ORTH and MMORTH. Let $\Omega_{1i}=\sum_{i=1}^{N} D_{i}'V_{i}^{-1}D_{i}$ be the inverse of the model-based variance, then the sandwich variance estimator for β̂ can be expressed as:

Further, let $\Omega_{2i}=\sum_{i=1}^{N} S_{i}'P_{i}^{-1}S_{i}$, then the sandwich variance estimator for α̂ is: When B_1i = I, B_2i = I, C_1i = I and C_2i = I, there is no bias-correction, and estimators (5) and (6) refer to the uncorrected sandwich estimators for β and α; we will refer to these estimators as BC0. Different choices for B_1i, B_2i, C_1i, and C_2i will give different bias-corrected sandwich estimators. The three commonly used approaches for bias corrections considered here are illustrated as follows. Define H_1i and H_2i as cluster leverage matrices based on the marginal mean and association regression models. Let B_1i = (I − H_1i)^−1/2, B_2i = (I − H_2i)^−1/2, C_1i = I and C_2i = I; then the estimators (5) and (6) will be equal to the bias-corrected covariance estimators of BC1. When setting B_1i = (I − H_1i)⁻¹, B_2i = (I − H_2i)⁻¹, C_1i = I and C_2i = I, the estimators (5) and (6) become the bias-corrected covariance estimators of BC2. To obtain the bias-corrected covariance estimators of BC3, one can set B_1i and B_2i as identity matrix I, and let C_1i = diag{(1 − min {ζ, [Q_1i]_jj})^−1/2} and C_2i = diag{(1 − min {ζ, [Q_2i]_jj})^−1/2}, where Q_1i = D_i′V_i⁻¹D_iΩ_1i⁻¹, Q_2i = S_i′P_i⁻¹S_iΩ_2i⁻¹, and set the bound parameter ζ = 0.75 to avoid over-correction.The three bias-corrected sandwich estimators are summarized in Table 1. BC2 was reported to have a greater amount of correction than both BC1 and BC3 in general, which will result in a larger standard error of parameter estimates.

Table 1: Summary of bias-corrected sandwich variance estimators for β̂ and α̂.
Label	Sandwich variance estimator for β̂	Sandwich variance estimator for α̂
	C_1i B_1i	C_2i B_2i
BC0	I I	I I
BC1	I (I − H_1i)^−1/2	I (I − H_2i)^−1/2
BC2	I (I − H_1i)⁻¹	I (I − H_2i)⁻¹
BC3	diag{(1 − min {ζ, [Q_1i]_jj})^−1/2}^* I	diag{(1 − min {ζ, [Q_2i]_jj})^−1/2}^** I
Q_1i = D_i′V_i⁻¹D_iΩ_1i⁻¹; * Q_2i = S_i′P_i⁻¹S_iΩ_2i⁻¹

Function Description

This package ORTH.Ord provides an ALR modeling framework to jointly estimate marginal means and within-cluster association of ordinal outcomes with the ability to adjust for small-sample bias. When using this package, the input data must be numeric for both the response variable and all covariates. For example, if an ordinal response is coded as “never”, “sometimes”, and “always”, the user need to convert it to numeric values, i.e. 1, 2 and 3 accordingly. The arguments of ORTH.Ord function are:

ORTH.Ord(formula_mean, data_mean, cluster, formula_por = NULL, data_por = NULL, 
         MMORTH = FALSE, BC = NULL, init_beta = NULL, init_alpha = NULL,
         miter = 30, crit_level = 0.0001)

The details and defaults of arguments are summarized in Table 2, and further explanation is provided below.

Table 2: Arguments of **ORTH.Ord**
Argument	Description	Default
formula_mean	the symbolic description of the marginal mean model that contains the ordinal outcome and marginal mean covariates.
data_mean	the data set containing the ordinal outcome and marginal mean covariates.
cluster	cluster ID (consecutive integers) in data_mean.
formula_por	the symbolic description of marginal association model in the form of a one-sided formula.	NULL
data_por	a data set for marginal association model.	NULL
MMORTH	a logical value to indicate if matrix-adjusted estimating equations will be applied for association estimation.	FALSE
BC	an option to apply bias-correction on covariance estimation.	NULL
init_beta	pre-specified starting values for parameters in the mean model.	NULL
init_alpha	pre-specified starting values for parameters in the association model.	NULL
miter	maximum number of iterations for Fisher scoring.	30
crit_level	tolerance for convergence.	0.0001

The input argument formula_mean is the symbolic description of the marginal mean model, e.g., formula_mean=Y ∼ x1 + x2. The argument data_mean is an R data set to fit the mean model (1), which should include all the variables required for fitting the mean model, i.e. an ordinal variable as response and one or more variables as covariates. All the variables in the R data set must be coded numerically; character values are required to be converted into numerical values during the data preprocessing step. The argument cluster is the column name for the cluster variable. The argument formula_por is the symbolic description of the marginal association model. Unlike formula_mean, formula_por is defined as a one-sided formula, e.g., formula_por= ∼ a0 + a1. The argument data_por is an R data set to which we will fit association model (2), and should include a variable for cluster and covariates for pairwise association parameters α which often are indicator variables. The default for arguments data_por and formula_por is NULL. When either of the two arguments is not specified, independence working correlation will be used for β estimating equations (1), meaning R_i(ρ) = I_{n_i} and V_i = A_i. The data for data_por must be all numeric too.

The argument MMORTH is used to indicate whether one wants to apply MMORTH to the estimating equations to adjust for small-sample bias. The default for MMORTH is FALSE, which will use ORTH without bias correction on the estimating equation for correlation model; when MMORTH= TRUE, MMORTH method will be employed. Please note that MMORTH=TRUE works only when both formula_por and data_por are specified. Using the independence working correlation will automatically suppress MMORTH. The argument BC offers an option to adjust for bias on the sandwich estimators for both β and α with BC1, BC2, or BC3 methods. When BC is set to the default, the program will only output the standard errors, z-values, and corresponding p-values obtained from the uncorrected sandwich estimator (BC0). The possible values for BC include “BC1”, “BC2”, and “BC3”. One can specify a single method (e.g., BC=“BC1”) or multiple methods (e.g., BC=c(“BC1”,“BC2”,“BC3”)) in ORTH.Ord to output the standard errors, z-values, and p-values from each method. The arguments init_beta and init_alpha offer the options to pre-specify the initial values for parameters of the mean model and the association model. The dimension of the vectors for pre-specified starting values should match that of data_mean and data_por. The argument miter is the maximum number of iterations whose default is 30. The argument crit_level is a critical value for determining model convergence; the default is 0.0001, which means the model is considered converged if the absolute difference between parameter estimations from two consecutive iterations is smaller or equal to 0.0001.

The value returned by the function ORTH.Ord is a list. If argument BC is not specified (i.e., BC=NULL), the output will be a list with two elements. The first element is a data frame including point estimates, standard errors, z-values, and p-values for model parameters; the second element is a variance-covariance matrix of model parameters without bias-correction (BC0). When argument BC is specified, for example BC=c(“BC1”,“BC2”,“BC3”), additional elements will be included in the output list which are variance-covariance matrices of model parameters based on BC1, BC2, and BC3.

Reference

Can Meng, Mary Ryan, Paul Rathouz, Elizabeth Turner, John S Preisser, and Fan Li. 2023. ORTH.Ord: An R package for analyzing correlated ordinal outcomes using alternating logistic regressions with orthogonalized residuals. Computer Methods and Programs in Biomedicine, 237, DOI:10.1016/j.cmpb.2023.107567.