PowerTOST


The package contains functions to calculate power and estimate sample size for various study designs used in (not only bio-) equivalence studies.
Version 1.4.9.9999 built 2020-08-04 with R 4.0.2.

Supported Designs

#    design                        name    df
#  parallel           2 parallel groups   n-2
#       2x2               2x2 crossover   n-2
#     2x2x2             2x2x2 crossover   n-2
#       3x3               3x3 crossover 2*n-4
#     3x6x3             3x6x3 crossover 2*n-4
#       4x4               4x4 crossover 3*n-6
#     2x2x3   2x2x3 replicate crossover 2*n-3
#     2x2x4   2x2x4 replicate crossover 3*n-4
#     2x4x4   2x4x4 replicate crossover 3*n-4
#     2x3x3   partial replicate (2x3x3) 2*n-3
#     2x4x2            Balaam's (2x4x2)   n-2
#    2x2x2r Liu's 2x2x2 repeated x-over 3*n-2
#    paired                paired means   n-1

Codes of designs follow this pattern: treatments x sequences x periods.
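As a small illustration of the naming pattern and the df column above, the following base-R sketch splits a three-part design code into its components and evaluates the degrees of freedom for a given total sample size. The helper `parse_design()` and the list `df_rules` are hypothetical names introduced here and are not part of the package; the df rules are copied from the table for a subset of designs only.

```r
# df rules copied from the table of supported designs (subset only)
df_rules <- list(
  "2x2"   = function(n) n - 2,
  "2x2x3" = function(n) 2 * n - 3,
  "2x2x4" = function(n) 3 * n - 4,
  "2x3x3" = function(n) 2 * n - 3
)

# Hypothetical helper: split a three-part code into its components
parse_design <- function(code) {
  parts <- as.integer(strsplit(code, "x", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3)
  setNames(parts, c("treatments", "sequences", "periods"))
}

parse_design("2x3x3")        # treatments 2, sequences 3, periods 3
df_rules[["2x2x4"]](24)      # 3 * 24 - 4 = 68
```

Codes such as “2x2”, “parallel”, or “paired” do not have three numeric parts, so the helper deliberately rejects them.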

Although some replicate designs are more ‘popular’ than others, sample size estimations are valid for all of the following designs:

design  type     sequences          periods
2x2x4   full     2 (TRTR|RTRT)      4
2x2x4   full     2 (TRRT|RTTR)      4
2x2x4   full     2 (TTRR|RRTT)      4
2x2x3   full     2 (TRT|RTR)        3
2x2x3   full     2 (TRR|RTT)        3
2x3x3   partial  3 (TRR|RTR|RRT)    3

Whilst “2x4x4” four period full replicate designs with four sequences (TRTR|RTRT|TRRT|RTTR or TRRT|RTTR|TTRR|RRTT) are supported, they should be avoided due to confounded effects.


Purpose

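For various methods power can be calculated based on the nominal α, the coefficient of variation (CV), the deviation of test from reference (θ0), the acceptance limits {θ1, θ2}, the sample size (n), and the design.

For all methods the sample size can be estimated based on the nominal α, the coefficient of variation (CV), the deviation of test from reference (θ0), the acceptance limits {θ1, θ2}, the target power, and the design.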


Supported

Power and Sample Size

Power covers balanced as well as unbalanced sequences in crossover or replicate designs and equal/unequal group sizes in two-group parallel designs. Sample sizes are always rounded up to achieve balanced sequences or equal group sizes.
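The rounding rule can be pictured with a one-line base-R helper. `balance_up()` is a hypothetical name used only for illustration; the package applies this kind of rounding internally and this sketch does not claim to mirror its code.

```r
# Round a sample size up to the next multiple of the number of sequences
balance_up <- function(n, sequences) {
  sequences * ceiling(n / sequences)
}

balance_up(37, sequences = 3)  # partial replicate TRR|RTR|RRT: 39
balance_up(19, sequences = 2)  # 2x2 crossover TR|RT: 20
```

This is why a sample size search for a three-sequence design moves in steps of three subjects.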


Methods


Helpers


Defaults

Average Bioequivalence

θ0 0.95, target power 0.80, design “2x2” (TR|RT), exact method (Owen’s Q).

Reference-Scaled Average Bioequivalence

α 0.05, point estimate constraint (0.80, 1.25), homoscedasticity (CVwT = CVwR), scaling is based on CVwR, target power 0.80, design “2x3x3” (TRR|RTR|RRT), approximation by the non-central t-distribution, 100,000 simulations.

Highly Variable Drugs / Drug Products

θ0 0.90 as recommended by Tóthfalusi and Endrényi (2011).

EMA

Regulatory constant 0.76, upper cap of scaling at CVwR 50%, evaluation by ANOVA.

Health Canada

Regulatory constant 0.76, upper cap of scaling at CVwR ~57.4%, evaluation by intra-subject contrasts.

FDA

Regulatory constant log(1.25)/0.25, linearized scaled ABE (Howe’s approximation).
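To make the regulatory settings above more tangible, here is a base-R sketch that computes scaled limits as exp(∓ constant × swR), where swR = sqrt(log(CVwR² + 1)). The function `scaled_limits()` is a hypothetical helper, not the package's implementation. Two assumptions are made: scaling starts above CVwR 0.30 (the switching CV printed in the ABEL output), and the Health Canada cap corresponds to maximum limits of 0.667 ... 1.500.

```r
# Scaled limits; conventional limits at or below cv_switch, frozen above cv_cap
scaled_limits <- function(cv_wr, r_const, cv_switch = 0.30, cv_cap = Inf) {
  if (cv_wr <= cv_switch) return(c(lower = 0.80, upper = 1.25))
  s_wr <- sqrt(log(min(cv_wr, cv_cap)^2 + 1))   # SD on the log scale
  exp(c(lower = -1, upper = 1) * r_const * s_wr)
}

ema        <- scaled_limits(0.35, r_const = 0.76, cv_cap = 0.50)
ema_capped <- scaled_limits(0.60, r_const = 0.76, cv_cap = 0.50)
r_fda      <- log(1.25) / 0.25                     # FDA regulatory constant
hc_cap     <- sqrt(exp((log(1.5) / 0.76)^2) - 1)   # CV giving limits up to 1.5

print(round(rbind(ema, ema_capped), 4))
print(round(c(r_fda = r_fda, hc_cap = hc_cap), 4))
```

For CVwR 0.35 the sketch gives 0.7723 ... 1.2948, and above the EMA cap the limits stay at 0.6984 ... 1.4319.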

Narrow Therapeutic Index Drugs (FDA)

θ0 0.975, regulatory constant log(1.11111)/0.1, upper cap of scaling at CVwR ~21.4%, design “2x2x4” (TRTR|RTRT), linearized scaled ABE (Howe’s approximation), upper limit of the confidence interval of swT/swR ≤2.5.
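The NTID constants can be checked with a few lines of base R. This is only a sketch: it assumes that 1.11111 is the rounded value of 1/0.9, and it interprets the cap as the CVwR at which the implied limits reach the conventional 0.80 ... 1.25. `implied_limits()` is a hypothetical helper name.

```r
sigma_w0 <- 0.10                        # regulatory within-subject SD
r_const  <- log(1 / 0.9) / sigma_w0     # assumes 1.11111 is a rounded 1/0.9

# CVwR at which the implied limits reach 0.80 ... 1.25
cv_cap <- sqrt(exp((log(1.25) / r_const)^2) - 1)

# Implied limits for a given CVwR
implied_limits <- function(cv_wr) {
  s_wr <- sqrt(log(cv_wr^2 + 1))
  exp(c(lower = -1, upper = 1) * r_const * s_wr)
}

print(round(c(r_const = r_const, cv_cap = cv_cap), 4))
print(implied_limits(0.09433))
```

The constant evaluates to about 1.0536 and the cap to about 0.2142. For CVwR 0.09433 the implied limits are close to 0.9056 ... 1.1043, far narrower than the conventional limits.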

Dose-Proportionality

β0 (slope) 1+log(0.95)/log(rd) where rd is the ratio of the highest and lowest dose, target power 0.80, crossover design, details of the sample size search suppressed.
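As a quick check of the default slope, the formula can be evaluated directly in base R. The doses 1, 2, and 8 are taken from the dose-proportionality example in this document; the variable names are illustrative.

```r
doses <- c(1, 2, 8)                 # doses from the example in this document
rd    <- max(doses) / min(doses)    # ratio of the highest and lowest dose
beta0 <- 1 + log(0.95) / log(rd)    # default slope
round(beta0, 4)                     # 0.9753
```

The wider the dose range, the closer the default slope moves to 1.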

Power Analysis

Minimum acceptable power 0.70. θ0, design, conditions, and sample size method depend on defaults of the respective approaches (ABE, ABEL, RSABE, NTID).


Examples

Before running the examples attach the library.

library(PowerTOST)

If not noted otherwise, defaults are employed.

Parallel Design

Power for total CV 0.35, θ0 0.95, group sizes 52 and 49, design “parallel”.

power.TOST(CV = 0.35, theta0 = 0.95, n = c(52, 49), design = "parallel")
# [1] 0.8011186

Crossover Design

Sample size for assumed intra-subject CV 0.20.

sampleN.TOST(CV = 0.20)
# 
# +++++++++++ Equivalence test - TOST +++++++++++
#             Sample size estimation
# -----------------------------------------------
# Study design: 2x2 crossover 
# log-transformed data (multiplicative model)
# 
# alpha = 0.05, target power = 0.8
# BE margins = 0.8 ... 1.25 
# True ratio = 0.95,  CV = 0.2
# 
# Sample size (total)
#  n     power
# 20   0.834680

Sample size for equivalence of the ratio of two means with normality on original scale based on Fieller’s (‘fiducial’) confidence interval. CVw 0.20, CVb 0.40.
Note that this function defaults to α 0.025 (95% CI) because it is intended for studies with clinical endpoints.

sampleN.RatioF(CV = 0.20, CVb = 0.40)
# 
# +++++++++++ Equivalence test - TOST +++++++++++
#     based on Fieller's confidence interval
#             Sample size estimation
# -----------------------------------------------
# Study design: 2x2 crossover
# Ratio of means with normality on original scale
# alpha = 0.025, target power = 0.8
# BE margins = 0.8 ... 1.25 
# True ratio = 0.95,  CVw = 0.2,  CVb = 0.4
# 
# Sample size
#  n     power
# 28   0.807774


Replicate Designs

ABE

Sample size for assumed intra-subject CV 0.45, θ0 0.90, three period full replicate study “2x2x3” (TRT|RTR or TRR|RTT).

sampleN.TOST(CV = 0.45, theta0 = 0.90, design = "2x2x3")
# 
# +++++++++++ Equivalence test - TOST +++++++++++
#             Sample size estimation
# -----------------------------------------------
# Study design: 2x2x3 (3 period full replicate) 
# log-transformed data (multiplicative model)
# 
# alpha = 0.05, target power = 0.8
# BE margins = 0.8 ... 1.25 
# True ratio = 0.9,  CV = 0.45
# 
# Sample size (total)
#  n     power
# 124   0.800125

Note that the conventional model assumes homoscedasticity. For heteroscedasticity we can ‘switch off’ all conditions of one of the methods for reference-scaled ABE. We assume a σ² ratio of ⅔ (i.e., T has a lower variability than R). Only relevant columns of the data.frame are shown.

reg <- reg_const("USER", r_const = NA, CVswitch = Inf,
                 CVcap = Inf, pe_constr = FALSE)
CV  <- CVp2CV(CV = 0.45, ratio = 2/3)
res <- sampleN.scABEL(CV=CV, design = "2x2x3", regulator = reg,
                      details = FALSE, print = FALSE)
print(res[c(3:4, 8:9)], digits = 5, row.names = FALSE)
#    CVwT    CVwR Sample size Achieved power
#  0.3987 0.49767         126         0.8052

The sample size is similar (126 vs. 124) because the pooled CV is still 0.45.
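The split of a pooled CV into CVwT and CVwR can be reproduced in base R. The sketch below assumes that the pooled within-subject variance is the arithmetic mean of the two variances on the log scale; `split_pooled_cv()` is a hypothetical name and is not claimed to be the code behind `CVp2CV()`, although it gives the same values as shown above.

```r
split_pooled_cv <- function(cv_pooled, ratio) {
  s2_pooled <- log(cv_pooled^2 + 1)       # pooled variance on the log scale
  s2_wr <- 2 * s2_pooled / (1 + ratio)    # so that (s2_wt + s2_wr) / 2 = s2_pooled
  s2_wt <- ratio * s2_wr                  # ratio = s2_wt / s2_wr
  sqrt(exp(c(CVwT = s2_wt, CVwR = s2_wr)) - 1)
}

cv <- split_pooled_cv(0.45, ratio = 2/3)
signif(cv, 5)                             # CVwT 0.3987, CVwR 0.49767
```

Pooling the two resulting variances again returns the original CV of 0.45.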


ABEL

Sample size assuming homoscedasticity (CVw = 0.45).

sampleN.scABEL(CV = 0.45, details = TRUE)
# 
# +++++++++++ scaled (widened) ABEL +++++++++++
#             Sample size estimation
#    (simulation based on ANOVA evaluation)
# ---------------------------------------------
# Study design: 2x3x3 (partial replicate)
# log-transformed data (multiplicative model)
# 1e+05 studies for each step simulated.
# 
# alpha  = 0.05, target power = 0.8
# CVw(T) = 0.45; CVw(R) = 0.45
# True ratio = 0.9
# ABE limits / PE constraint = 0.8 ... 1.25 
# EMA regulatory settings
# - CVswitch            = 0.3 
# - cap on scABEL if CVw(R) > 0.5
# - regulatory constant = 0.76 
# - pe constraint applied
# 
# 
# Sample size search
#  n     power
# 36   0.7755 
# 39   0.8059

Iteratively adjust α to control the Type I Error (Labes, Schütz). Slight heteroscedasticity (CVwT 0.30, CVwR 0.35), four period full replicate “2x2x4” study, 30 subjects, balanced sequences.

scABEL.ad(CV = c(0.30, 0.35), design = "2x2x4", n = 30)
# +++++++++++ scaled (widened) ABEL ++++++++++++
#          iteratively adjusted alpha
#    (simulations based on ANOVA evaluation)
# ----------------------------------------------
# Study design: 2x2x4 (4 period full replicate)
# log-transformed data (multiplicative model)
# 1,000,000 studies in each iteration simulated.
# 
# CVwR 0.35, CVwT 0.3, n(i) 15|15 (N 30)
# Nominal alpha                 : 0.05 
# True ratio                    : 0.9000 
# Regulatory settings           : EMA (ABEL)
# Switching CVwR                : 0.3 
# Regulatory constant           : 0.76 
# Expanded limits               : 0.7723 ... 1.2948
# Upper scaling cap             : CVwR > 0.5 
# PE constraints                : 0.8000 ... 1.2500
# Empiric TIE for alpha 0.0500  : 0.06651
# Power for theta0 0.9000       : 0.814
# Iteratively adjusted alpha    : 0.03540
# Empiric TIE for adjusted alpha: 0.05000
# Power for theta0 0.9000       : 0.771

With the nominal α 0.05 the Type I Error will be inflated (0.0665). With the adjusted α 0.0354 (i.e., the 92.92% confidence interval) the TIE will be controlled, although with a slight loss in power (decreases from 0.814 to 0.771).
Consider sampleN.scABEL.ad(CV = c(0.30, 0.35), design = "2x2x4") to estimate the sample size which both controls the TIE and maintains the target power. In this example 34 subjects will be required.
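The confidence level quoted above follows from the two one-sided tests: the interval has a coverage of 100(1 − 2α)%. A minimal base-R check, using the numbers from the output above:

```r
alpha_adj <- 0.0354                      # iteratively adjusted alpha
ci_level  <- 100 * (1 - 2 * alpha_adj)   # two one-sided tests: 1 - 2 * alpha
ci_level                                 # 92.92

inflation <- 0.06651 / 0.05              # empiric TIE relative to nominal alpha
round(inflation, 2)                      # 1.33
```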


RSABE

HVD(P)s

Sample size for a four period full replicate “2x2x4” study (any of TRTR|RTRT, TRRT|RTTR, TTRR|RRTT) assuming heteroscedasticity (CVwT 0.40, CVwR 0.50). Details of the sample size search suppressed.

sampleN.RSABE(CV = c(0.40, 0.50), design = "2x2x4", details = FALSE)
# 
# ++++++++ Reference scaled ABE crit. +++++++++
#            Sample size estimation
# ---------------------------------------------
# Study design: 2x2x4 (4 period full replicate)
# log-transformed data (multiplicative model)
# 1e+05 studies for each step simulated.
# 
# alpha  = 0.05, target power = 0.8
# CVw(T) = 0.4; CVw(R) = 0.5
# True ratio = 0.9
# ABE limits / PE constraints = 0.8 ... 1.25 
# Regulatory settings: FDA 
# 
# Sample size
#  n    power
# 20   0.81509


NTIDs

Sample size assuming heteroscedasticity (CVw 0.125, σ² ratio 2.5, i.e., T has a substantially higher variability than R). The design is TRTR|RTRT according to the FDA’s guidance. Assess additionally which one of the three components (scaled, ABE, swT/swR ratio) drives the sample size.

CV <- signif(CVp2CV(CV = 0.125, ratio = 2.5), 4)
n  <- sampleN.NTIDFDA(CV = CV)[["Sample size"]]
# 
# +++++++++++ FDA method for NTIDs ++++++++++++
#            Sample size estimation
# ---------------------------------------------
# Study design:  2x2x4 (TRTR|RTRT) 
# log-transformed data (multiplicative model)
# 1e+05 studies for each step simulated.
# 
# alpha  = 0.05, target power = 0.8
# CVw(T) = 0.1497, CVw(R) = 0.09433
# True ratio     = 0.975 
# ABE limits     = 0.8 ... 1.25 
# Implied scABEL = 0.9056 ... 1.1043 
# Regulatory settings: FDA 
# - Regulatory const. = 1.053605 
# - 'CVcap'           = 0.2142 
# 
# Sample size search
#  n     power
# 28   0.665530 
# 30   0.701440 
# 32   0.734240 
# 34   0.764500 
# 36   0.792880 
# 38   0.816080
suppressMessages(power.NTIDFDA(CV = CV, n = n, details = TRUE))
#        p(BE)  p(BE-sABEc)    p(BE-ABE) p(BE-sratio) 
#      0.81608      0.93848      1.00000      0.85794

The swT/swR component shows the lowest power and hence drives the sample size.
Compare that with homoscedasticity (CVwT = CVwR = 0.125):

CV <- 0.125
n  <- sampleN.NTIDFDA(CV = CV, details = FALSE)[["Sample size"]]
# 
# +++++++++++ FDA method for NTIDs ++++++++++++
#            Sample size estimation
# ---------------------------------------------
# Study design:  2x2x4 (TRTR|RTRT) 
# log-transformed data (multiplicative model)
# 1e+05 studies for each step simulated.
# 
# alpha  = 0.05, target power = 0.8
# CVw(T) = 0.125, CVw(R) = 0.125
# True ratio     = 0.975 
# ABE limits     = 0.8 ... 1.25 
# Regulatory settings: FDA 
# 
# Sample size
#  n     power
# 16   0.822780
suppressMessages(power.NTIDFDA(CV = CV, n = n, details = TRUE))
#        p(BE)  p(BE-sABEc)    p(BE-ABE) p(BE-sratio) 
#      0.82278      0.84869      1.00000      0.95128

Here the scaled ABE component shows the lowest power and drives the sample size, which is much lower than in the previous example.
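Picking the driving component amounts to finding the smallest of the three component powers. The sketch below copies the values from the two outputs above into named vectors; `driver()` is a hypothetical helper.

```r
# Component powers copied from the two power.NTIDFDA() outputs above
hetero <- c("p(BE-sABEc)" = 0.93848, "p(BE-ABE)" = 1.00000, "p(BE-sratio)" = 0.85794)
homo   <- c("p(BE-sABEc)" = 0.84869, "p(BE-ABE)" = 1.00000, "p(BE-sratio)" = 0.95128)

# The component with the lowest power limits the overall power
driver <- function(p) names(which.min(p))

driver(hetero)   # "p(BE-sratio)"
driver(homo)     # "p(BE-sABEc)"
```

In both cases the overall power p(BE) is lower than the smallest component (0.81608 vs. 0.85794 and 0.82278 vs. 0.84869), because all three conditions have to be passed simultaneously.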


Dose-Proportionality

CV 0.20, Doses 1, 2, and 8 units, β0 1, target power 0.90.

sampleN.dp(CV = 0.20, doses = c(1, 2, 8), beta0 = 1, targetpower = 0.90)
# 
# ++++ Dose proportionality study, power model ++++
#             Sample size estimation
# -------------------------------------------------
# Study design: crossover (3x3 Latin square) 
# alpha = 0.05, target power = 0.9
# Equivalence margins of R(dnm) = 0.8 ... 1.25 
# Doses = 1 2 8 
# True slope = 1, CV = 0.2
# Slope acceptance range = 0.89269 ... 1.1073 
# 
# Sample size (total)
#  n     power
# 18   0.915574

Note that the acceptance range of the slope depends on the ratio of the highest and lowest doses (i.e., it gets tighter for wider dose ranges and therefore, higher sample sizes will be required).
In an exploratory setting wider equivalence margins {θ1, θ2} (0.50, 2.00) are recommended, which would translate in this example to an acceptance range of 0.66667 ... 1.3333 and a sample size of only six subjects.
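The dependence of the slope acceptance range on the dose ratio can be verified in base R with 1 + log(θ)/log(rd). `slope_range()` is a hypothetical helper written for this illustration; the third call uses a made-up wider dose range to show the tightening.

```r
slope_range <- function(doses, theta1 = 0.80, theta2 = 1 / theta1) {
  rd <- max(doses) / min(doses)    # ratio of the highest and lowest dose
  1 + log(c(lower = theta1, upper = theta2)) / log(rd)
}

signif(slope_range(c(1, 2, 8)), 5)                  # 0.89269 ... 1.1073
signif(slope_range(c(1, 2, 8), theta1 = 0.50), 5)   # 0.66667 ... 1.3333
signif(slope_range(c(1, 8, 64)), 5)                 # hypothetical wider range
```

With a dose ratio of 64 instead of 8 the range shrinks to roughly 0.946 ... 1.054.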


Power Analysis

Explore impact of deviations from assumptions (higher CV, higher deviation of θ0 from 1, dropouts) on power. Assumed intra-subject CV 0.20, target power 0.90. Suppress the plot.

res <- pa.ABE(CV = 0.20, targetpower = 0.90)
print(res, plotit = FALSE)
# Sample size plan ABE
#  Design alpha  CV theta0 theta1 theta2 Sample size Achieved power
#     2x2  0.05 0.2   0.95    0.8   1.25          26      0.9176333
# 
# Power analysis
# CV, theta0 and number of subjects which lead to min. acceptable power of at least 0.7:
#  CV= 0.2729, theta0= 0.9044
#  n = 16 (power= 0.7354)

If the study starts with 26 subjects (power ~0.92), the CV can increase to ~0.27 or θ0 decrease to ~0.90 or the sample size decrease to 16 (i.e., 10 dropouts) whilst power will still be ≥0.70.
However, this is not a substitute for the “Sensitivity Analysis” recommended in ICH-E9, since in a real study a combination of all effects occurs simultaneously. It is up to you to decide on reasonable combinations and analyze their respective power.
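One way to explore such combinations is a small grid of scenarios. The sketch below is self-contained base R and uses a noncentral-t approximation of TOST power for a balanced 2x2 crossover; `power_nct()` is a hypothetical stand-in written for this illustration and is not the package's exact method. The scenario values (CV 0.25, θ0 0.92, n 22) are made-up examples of plausible deviations.

```r
# Noncentral-t approximation of TOST power, balanced 2x2 crossover, total n
power_nct <- function(CV, theta0, n, alpha = 0.05,
                      theta1 = 0.80, theta2 = 1.25) {
  df   <- n - 2
  se   <- sqrt(log(CV^2 + 1)) * sqrt(2 / n)  # SE of the T-R difference (log scale)
  tval <- qt(1 - alpha, df)
  ncp1 <- (log(theta0) - log(theta1)) / se
  ncp2 <- (log(theta0) - log(theta2)) / se
  max(0, pt(-tval, df, ncp2) - pt(tval, df, ncp1))
}

# All combinations of planned and deviating values (hypothetical scenarios)
scenarios <- expand.grid(CV = c(0.20, 0.25), theta0 = c(0.95, 0.92),
                         n = c(26, 22))
scenarios$power <- mapply(power_nct, scenarios$CV, scenarios$theta0,
                          scenarios$n)
print(scenarios, digits = 4)
```

The first row reproduces the planned study (power about 0.92); the last row combines all three deviations. In practice the same grid could be evaluated with power.TOST() instead of the approximation.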


Speed Comparisons

Performed on a Xeon E3-1245v3 3.4 GHz, 8 MB cache, 16 GB RAM, R 4.0.2 64 bit on Windows 7.

ABE

“2x2” crossover design, CV 0.17. Sample sizes and achieved power for the supported methods (the 1st one is the default).

#      method  n    power seconds
#       owenq 14 0.805683  0.0015
#         mvt 14 0.805690  0.1220
#  noncentral 14 0.805683  0.0010
#     shifted 16 0.852301  0.0005

The 2nd exact method is substantially slower than the 1st. The approximation based on the noncentral t-distribution is slightly faster and matches the 1st exact method closely. The approximation based on the shifted central t-distribution is the fastest but might estimate a sample size higher than necessary (here 16 instead of 14). Hence, it should be used only for comparative purposes.

ABEL

Four period full replicate study, homoscedasticity (CVwT = CVwR = 0.45). Sample sizes and achieved power for the supported methods (‘key’ statistics or subject simulations).

#               method  n   power seconds
#     ‘key’ statistics 28 0.81116    0.16
#  subject simulations 28 0.81196    2.32

Simulating via the ‘key’ statistics is the method of choice for speed reasons.
However, subject simulations are recommended if


Installation

You can install the released version of PowerTOST from CRAN with

package <- "PowerTOST"
inst    <- package %in% installed.packages()
if (length(package[!inst]) > 0) install.packages(package[!inst])

… and the development version from GitHub with

# install.packages("remotes")
remotes::install_github("Detlew/PowerTOST")

install_github() skips installation from a GitHub remote if the SHA-1 has not changed since the last install. Use force = TRUE to force installation.
