Introduction

pwr4exp provides tools for statistical power calculation in experimental designs analyzed within the linear mixed model framework, including both fixed-effects and mixed-effects models. Currently, the package does not support generalized linear models; thus, only normal response variables are supported. Degrees of freedom are approximated using the Satterthwaite method.

This vignette is based on the development version of the package, which incorporates various correlation structures in repeated measures and spatial data. It also includes options for generating input templates, simplifying the setup and execution of power analyses. The development version can be downloaded from GitHub:

devtools::install_github("an-ethz/pwr4exp")

Package overview

Performing a power analysis in pwr4exp involves two main steps:

  1. Create a design object: First, create the design object on which the power analysis will be conducted. This is done with the function mkdesign for any design, or with the functions designCRD, designRCBD, designLSD, designCOD, and designSPD for specific standard designs.

  2. Calculate power: The statistical power of the F-test for model terms and of the t-test for specific contrasts can be evaluated using the functions pwr.anova and pwr.contrast, respectively.

The sections following the introduction are dedicated to these two steps. Additional theory and concepts are listed at the end.

Create a design object

mkdesign

The function mkdesign is designed to create any design by specifying the design’s data structure, the intended model, and the necessary parameters.

Inputs

  • formula: The right-hand-side model formula (uses lme4::lmer() syntax for random effects).
  • data: A long-format data frame with one row per observation and columns for all independent variables specified in the formula (e.g., treatments, blocks, subjects). The data should reflect the design structure but exclude the response variable. Unused variables in the formula are ignored.
  • beta or means: Fixed-effects expectations. Specify either beta (a vector of regression coefficients) or means (expected mean values for fixed effects); exactly one of the two must be provided.
  • vcomp: Expected variance-covariance components for the random effects.
  • sigma2: Expected error variance (residual variance).
  • correlation: Correlation structure for repeated measures or spatial data.
  • template: Logical; when TRUE, generates templates for beta, means, and vcomp instead of the full design object.

Output

  • Returns a list of essential components required for power calculation functions.

Templates

Templates for beta, means, and vcomp can be generated when template = TRUE or when only the formula and data are provided. These templates serve as guides, outlining the elements and the order of parameter values in beta, means, and vcomp, based on the model and data structure.

Below, we create an example data frame to illustrate these templates, which may not represent a realistic design. The data frame includes:

  • Four categorical variables (fA, fB, fC, fD), and
  • Two numerical variables (x, z).
 df1 <- expand.grid(
   fA = factor(1:2), # factor A with 2 levels
   fB = factor(1:2), # factor B with 2 levels
   fC = factor(1:3), # factor C with 3 levels
   fD = factor(1:3), # factor D with 3 levels
   subject = factor(1:10)  # 10 subjects for each combination of factors
 )
 df1$x <- rnorm(nrow(df1))  # Numerical variable x
 df1$z <- rnorm(nrow(df1))  # Numerical variable z

beta template:

The template for beta specifies the order of model coefficients as they would appear in a fitted model. It is especially useful when directly providing expected model coefficients as measures of effect size.

For example, the expected values of the model coefficients for ~ fA*fB + x should be provided in the following sequence:

mkdesign( ~ fA * fB + x, df1)$fixeff$beta
#> (Intercept)         fA2         fB2           x     fA2:fB2 
#>           1           2           3           4           5

means template:

In many situations, it is more direct and convenient to provide means for categorical variables. The template for means represents marginal means for categorical variables, regression coefficients for numerical variables, and conditional means or regression coefficients for interactions.

Categorical variables without interactions

Marginal means for each level of the categorical variable(s) are required. For example, the expectations for each level of fA and fB follow this sequence:

mkdesign(~ fA + fB, df1, template = T)$fixeff$means
#> fA1 fA2 fB1 fB2 
#>   1   2   3   4

Interactions among categorical variables

Conditional (cell) means are required for all combinations of levels of the interacting categorical variables. For instance, cell means for all 12 combinations of the levels of fA, fB, and fC are required in the following order:

mkdesign(~ fA * fB * fC, df1)$fixeff$means
#> fA1:fB1:fC1 fA2:fB1:fC1 fA1:fB2:fC1 fA2:fB2:fC1 fA1:fB1:fC2 fA2:fB1:fC2 
#>           1           2           3           4           5           6 
#> fA1:fB2:fC2 fA2:fB2:fC2 fA1:fB1:fC3 fA2:fB1:fC3 fA1:fB2:fC3 fA2:fB2:fC3 
#>           7           8           9          10          11          12

Numerical variables

Regression coefficients are required for numerical variables. If the model includes only numerical variables, the (intercept) must also be included. In such cases, means are identical to beta.

mkdesign(~ x + z, df1)$fixeff$means
#> (Intercept)           x           z 
#>           1           2           3

If there are interactions between numerical variables, regression coefficients for both main effects and interaction terms are required. For example:

mkdesign(~ x * z, df1)$fixeff$means
#> (Intercept)           x           z         x:z 
#>           1           2           3           4

Categorical-by-numerical interactions

Marginal means for the categorical variable and regression coefficients for the numerical variable at each level of the categorical variable are required. For example, the means for levels of fA and regression coefficients for x at each level of fA are required for the model ~ fA * x:

mkdesign(~ fA * x, df1)$fixeff$means
#>   fA1   fA2 fA1:x fA2:x 
#>     1     2     3     4

Combining multiple situations

The rules for categorical and numerical variables described above also apply in mixed situations. For example, consider a model combining:

  • interactions among three categorical variables (fA, fB, fC),
  • a categorical-by-numerical interaction (fD * x), and
  • main effects for another numerical variable (z).

mkdesign(~ fA * fB * fC + fD * x + z, df1)$fixeff$means
#>         fD1         fD2         fD3           z       fD1:x       fD2:x 
#>           1           2           3           4           5           6 
#>       fD3:x fA1:fB1:fC1 fA2:fB1:fC1 fA1:fB2:fC1 fA2:fB2:fC1 fA1:fB1:fC2 
#>           7           8           9          10          11          12 
#> fA2:fB1:fC2 fA1:fB2:fC2 fA2:fB2:fC2 fA1:fB1:fC3 fA2:fB1:fC3 fA1:fB2:fC3 
#>          13          14          15          16          17          18 
#> fA2:fB2:fC3 
#>          19

The required elements and their order in means are:

  • means for each level of fD (positions 1-3),
  • the regression coefficient for z (position 4),
  • regression coefficients for x at each level of fD (positions 5-7), and
  • cell means for all combinations of levels of fA, fB, and fC (positions 8-19).

vcomp template:

The template for vcomp represents a variance-covariance matrix, where integers indicate the order of variance components in the input vector.

For a single random effect, a template is unnecessary, as it corresponds to a single variance value. For multiple random effects, the template outlines the sequence of variance and covariance components to be provided. It helps users specify and align variance components for random effects in the model.

For example, consider a model containing both a random intercept and a random slope for fA by subject:

mkdesign(~ fA * fB * fC * fD + (1 + fA | subject), df1)$varcov
#> $subject
#>             (Intercept) fA2
#> (Intercept)           1   2
#> fA2                   2   3

The template specifies the following required inputs:

  1. Variance of the random intercept (1st component)
  2. Covariance between the random intercept and fA2 (2nd component)
  3. Variance of fA2 (3rd component)

Note: This example may not be statistically correct for the given data. It is provided solely to illustrate the structure of variance-covariance templates.

Correlation structures

Various correlation structures can be specified following the instructions in nlme::corClasses, including:

  • corAR1
  • corARMA
  • corCAR1
  • corCompSymm
  • corExp
  • corGaus
  • corLin
  • corSymm
  • corRatio
  • corSpher

Note: In nlme::corAR1(), and in nlme::corARMA() with p = 1 and q = 0, the time variable must be an integer. In pwr4exp, this restriction has been relaxed: factors are also supported.

Specific design functions

While mkdesign is highly flexible for creating any design, as long as the design’s data structure is provided, pwr4exp also includes specific functions for some common standard designs. These specialized functions define a design by its characteristics, such as treatment structure and replications, simplifying the process of manually creating a data frame.

Completely randomized design

designCRD(treatments, label, replicates, formula, beta, means, sigma2, template = FALSE)

Randomized complete block design

designRCBD(treatments, label, blocks, formula, beta, means, vcomp, sigma2, template = FALSE)

Latin square design

designLSD(treatments, label, squares, reuse, formula, beta, means, vcomp, sigma2, template = FALSE)

Crossover design

This is a special case of LSD where time periods and individuals act as blocks. Period blocks are reused when replicating squares.

designCOD(treatments, label, squares, formula, beta, means, vcomp, sigma2, template = FALSE)

Split-plot design

designSPD(trt.main, trt.sub, label, replicates, formula, beta, means, vcomp, sigma2, template = FALSE)

The inputs for these functions are the same as those for mkdesign, except that data is replaced by specifying the treatment profile and the number of replications.

Specific inputs

  • treatments: an integer-valued vector specifying the number of levels of the treatment factors. A maximum of two factors is allowed; when two are given, they are arranged factorially. For example, treatments = c(3, 2) specifies one treatment factor with 3 levels and another with 2 levels, forming a factorial treatment design.

  • trt.main and trt.sub: define the treatments at the main plot and subplot levels, respectively, following the same rules as treatments.

  • label: an optional list whose entries are the labels to be used for factor levels and whose names are the treatment factor names. If label is not specified, default names and labels are assigned. For one treatment factor, the default is list(trt = c("1", "2", ...)). For two factors, the default is list(facA = c("1", "2", ...), facB = c("1", "2", ...)), where “facA” and “facB” represent the two factors, and “1”, “2”, etc., the levels of each factor.

  • replicates: the number of experimental units per treatment in a CRD, or the number of main plots (i.e., the number of experimental units per treatment at the main plot level) in an SPD.

  • blocks: the number of blocks in a RCBD.

  • squares: the number of squares in a replicated LSD or COD.

  • reuse: specifies how to replicate squares when there are multiple squares. One of: row for reusing row blocks, col for reusing column blocks, or both for reusing both row and column blocks to replicate a single square.

Each of these specific design-generating functions has a default formula based on the treatment structure (e.g., one factor or factorial factors). If formula is not specified, a default formula with main effects and all interactions (if applicable) is used internally. In RCBD, LSD, COD, and SPD designs, all block factors are fitted as random effects. The formula component in the output list can be inspected.

Power calculation

Once the design object has been created, calculating power is straightforward. The statistical power of the F-test for omnibus tests can be calculated using the pwr.anova function.

Inputs:

  • object: the design object created in the previous step
  • sig.level: significance level, default 0.05
  • type: the type of ANOVA table, default Type III

The statistical power of the t-test for specific contrasts can be evaluated using the pwr.contrast function.

Inputs:

  • object: the design object
  • which: the factor of interest
  • by: the variable to condition on
  • contrast: contrast method; one of “pairwise”, “poly”, or “trt.vs.ctrl”, or any manually defined contrast vector
  • sig.level: significance level, default 0.05
  • p.adj: whether sig.level should be adjusted using the Bonferroni method, default FALSE
  • alternative: “two.sided” or “one.sided”
  • strict: if TRUE (default), the power in the two-sided case includes the probability of rejection in the opposite direction of the true effect; otherwise, the power is half the significance level when the true difference is zero

Practical Examples

Example 1. Completely Randomized Design

In this example, we will create a CRD with one treatment factor. The design parameters are as follows:

  1. Treatments: 1 treatment factor with 4 levels.
  2. Replicates: 8 experimental units per treatment.
  3. Mean and effect size: The expected means for the four levels are 35, 30, 37, and 38.
  4. Error variance: 15

Create the CRD

crd <- designCRD(
  treatments = 4,
  replicates = 8,
  means = c(35, 30, 37, 38),
  sigma2 = 15
)

Power for omnibus test

pwr.anova(crd)
#> Power of type III F-test
#>     NumDF DenDF sig.level   power
#> trt     3    28      0.05 0.95467

Power for specific contrasts

The power for detecting differences between treatments and the control:

pwr.contrast(crd, which =  "trt", contrast = "trt.vs.ctrl")
#>             effect df sig.level     power alternative
#> trt2 - trt1     -5 28      0.05 0.7028739   two.sided
#> trt3 - trt1      2 28      0.05 0.1694975   two.sided
#> trt4 - trt1      3 28      0.05 0.3216803   two.sided
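For a balanced CRD, these contrast powers can be cross-checked by hand from the noncentral t-distribution. A minimal base-R sketch for the trt2 - trt1 comparison, assuming the usual pairwise-difference standard error sqrt(2 * sigma2 / n):

```r
sigma2 <- 15   # error variance
n      <- 8    # replicates per treatment
effect <- -5   # expected trt2 - trt1 difference (30 - 35)
df     <- 4 * (n - 1)               # 28 residual df in this CRD
se     <- sqrt(2 * sigma2 / n)      # SE of a pairwise difference
ncp    <- effect / se               # noncentrality of the t-statistic
tcrit  <- qt(0.975, df)             # two-sided critical value at alpha = 0.05
power  <- pt(-tcrit, df, ncp) + pt(tcrit, df, ncp, lower.tail = FALSE)
round(power, 4)  # ~0.70, in line with the trt2 - trt1 row above
```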

The power for detecting polynomial contrasts across the treatment levels:

pwr.contrast(crd, which =  "trt", contrast = "poly")
#>           effect df sig.level     power alternative
#> linear        16 28      0.05 0.7130735   two.sided
#> quadratic      6 28      0.05 0.5617849   two.sided
#> cubic        -18 28      0.05 0.8098383   two.sided

The power for detecting pairwise comparisons:

pwr.contrast(crd, which =  "trt", contrast = "pairwise")
#>             effect df sig.level      power alternative
#> trt1 - trt2      5 28      0.05 0.70287390   two.sided
#> trt1 - trt3     -2 28      0.05 0.16949749   two.sided
#> trt1 - trt4     -3 28      0.05 0.32168033   two.sided
#> trt2 - trt3     -7 28      0.05 0.93677955   two.sided
#> trt2 - trt4     -8 28      0.05 0.97860686   two.sided
#> trt3 - trt4     -1 28      0.05 0.07896844   two.sided

In data analysis, the P-value or significance level often needs to be adjusted for multiple comparisons (MCP). Most commonly used methods for MCP in experimental data are post-hoc, meaning they cannot be directly applied during the power analysis stage. However, one can tune the significance level to mimic these situations—for instance, by using a lower significance level to account for MCP.

pwr.contrast(crd, which =  "trt", contrast = "pairwise", sig.level = 0.01)
#>             effect df sig.level     power alternative
#> trt1 - trt2      5 28      0.01 0.4418907   two.sided
#> trt1 - trt3     -2 28      0.01 0.0546995   two.sided
#> trt1 - trt4     -3 28      0.01 0.1320866   two.sided
#> trt2 - trt3     -7 28      0.01 0.7946290   two.sided
#> trt2 - trt4     -8 28      0.01 0.9042775   two.sided
#> trt3 - trt4     -1 28      0.01 0.0194487   two.sided

The pwr.contrast function also includes an option to adjust the significance level using the Bonferroni method, though this approach may be overly conservative.

pwr.contrast(crd, which =  "trt", contrast = "pairwise", p.adj = TRUE)

Example 2. Randomized complete block design

In this example, we will create an RCBD with two treatment factors. The design parameters are as follows:

  1. Treatments: A 2x2 factorial design.
  2. Replicates: 8 blocks.
  3. Means:

         B1   B2
     A1  35   38
     A2  40   41

     The corresponding beta values are as follows:
     • The intercept (A1B1): 35.
     • The effect of A2 alone: 5 units.
     • The effect of B2 alone: 3 units.
     • The interaction between A2 and B2: -2 units, meaning the combined effect of A2 and B2 is 2 units below the additive effects.
  4. Variance among blocks: 11.
  5. Error variance: 4. The total variance of the response variable (15) is decomposed into variance between blocks (11) and variance within blocks (4).
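The correspondence between the two parameterizations can be checked by hand: with R's default treatment contrasts, beta consists of the A1B1 mean (intercept), the two simple effects, and the interaction. A small sketch:

```r
# Cell means of the 2x2 factorial: A1B1, A2B1, A1B2, A2B2
mu <- c(A1B1 = 35, A2B1 = 40, A1B2 = 38, A2B2 = 41)
beta <- c(
  mu[["A1B1"]],                                               # intercept
  mu[["A2B1"]] - mu[["A1B1"]],                                # effect of A2 alone
  mu[["A1B2"]] - mu[["A1B1"]],                                # effect of B2 alone
  mu[["A2B2"]] - mu[["A2B1"]] - mu[["A1B2"]] + mu[["A1B1"]]   # interaction
)
beta  # 35  5  3 -2, matching the beta values listed above
```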

Create the RCBD

  • templates:
designRCBD(treatments = c(2, 2), blocks = 8, template = TRUE)
#> $fixeff
#> $fixeff$beta
#> (Intercept)       facA2       facB2 facA2:facB2 
#>           1           2           3           4 
#> 
#> $fixeff$means
#> facA1:facB1 facA2:facB1 facA1:facB2 facA2:facB2 
#>           1           2           3           4 
#> 
#> 
#> $varcov
#> $varcov$block
#>             (Intercept)
#> (Intercept)           1

Provide the design parameters, beta or means, and vcomp in the order shown above.

rcbd <- designRCBD(
  treatments = c(2, 2),
  blocks = 8,
  # beta = c(35, 5, 3, -2), # identical to means
  means = c(35, 40, 38, 41),
  vcomp = 11,
  sigma2 = 4
)

Evaluate statistical power

The power for main and interaction effects.

pwr.anova(rcbd)
#> Power of type III F-test
#>           NumDF DenDF sig.level   power
#> facA          1    21      0.05 0.99969
#> facB          1    21      0.05 0.76950
#> facA:facB     1    21      0.05 0.27138

The power for testing difference between levels of factor A conditioned on factor B:

pwr.contrast(rcbd, which = "facA", by = "facB")
#> $`facB = 1`
#>               effect df sig.level     power alternative
#> facA1 - facA2     -5 21      0.05 0.9974502   two.sided
#> 
#> $`facB = 2`
#>               effect df sig.level     power alternative
#> facA1 - facA2     -3 21      0.05 0.8160596   two.sided

Example 3. Latin square design

In this example, we extend the design from Example 2 by introducing another blocking factor, thus creating an LSD. The treatment structure and effect sizes remain the same as in Example 2. The design controls for two sources of variability (row and column blocks) while evaluating the treatment effects. In the LSD, the total variance (15) is further decomposed into three components:

  • Variance between row blocks (11),
  • Variance between column blocks (2), and
  • Residual error variance (2).

Input templates

designLSD(
  treatments = c(2, 2),
  squares = 4,
  reuse = "both",
  template = TRUE
)
#> $fixeff
#> $fixeff$beta
#> (Intercept)       facA2       facB2 facA2:facB2 
#>           1           2           3           4 
#> 
#> $fixeff$means
#> facA1:facB1 facA2:facB1 facA1:facB2 facA2:facB2 
#>           1           2           3           4 
#> 
#> 
#> $varcov
#> $varcov$row
#>             (Intercept)
#> (Intercept)           1
#> 
#> $varcov$col
#>             (Intercept)
#> (Intercept)           2

Either beta or means can be provided as in Example 2. The variances of the row and column blocking factors are required sequentially.

Create the LSD

lsd <- designLSD(
  treatments = c(2, 2),
  label = list(temp = c("T1", "T2"), dosage = c("D1", "D2")),
  squares = 4,
  reuse = "both",
  means = c(35, 40, 38, 41),
  vcomp = c(11, 2),
  sigma2 = 2
)

Once the design has been created, pwr.anova and pwr.contrast can be used to evaluate statistical power as demonstrated above.

Example 4: Split-plot Design

In this example, we will create an SPD with two treatment factors, one at each level. The design parameters are as follows:

  1. Treatments: One main plot factor having 2 levels, and another factor with 3 levels at the sub-plot level.
  2. Replicates: There are 5 plots per main plot treatment, resulting in a total of 10 plots. This is a standard SPD, where the plot (block at the subplot level) size is assumed to equal the number of treatments. Thus, the design follows an RCBD structure at the subplot level.
  3. The corresponding cell means are:

                trt.sub1  trt.sub2  trt.sub3
     trt.main1     20        22        24
     trt.main2     22        24        28
  4. The total variance (15) is assumed to decompose into 4 for whole-plot error and 11 for subplot error.

Input templates

designSPD(
  trt.main = 2,
  trt.sub = 3, 
  replicates = 10, 
  template = T
)
#> Warning: Only rhs formula is required
#> formula coerced to: ~(1 | mainplot) + trt.main + trt.sub + trt.main:trt.sub.
#> $fixeff
#> $fixeff$beta
#>        (Intercept)          trt.main2           trt.sub2           trt.sub3 
#>                  1                  2                  3                  4 
#> trt.main2:trt.sub2 trt.main2:trt.sub3 
#>                  5                  6 
#> 
#> $fixeff$means
#> trt.main1:trt.sub1 trt.main2:trt.sub1 trt.main1:trt.sub2 trt.main2:trt.sub2 
#>                  1                  2                  3                  4 
#> trt.main1:trt.sub3 trt.main2:trt.sub3 
#>                  5                  6 
#> 
#> 
#> $varcov
#> $varcov$mainplot
#>             (Intercept)
#> (Intercept)           1

Create the SPD

spd <- designSPD(
  trt.main = 2,
  trt.sub = 3, 
  replicates = 10, 
  means = c(20, 22, 22, 24, 24, 28),
  vcomp = 4,
  sigma2 = 11
)
#> Warning: Only rhs formula is required
#> formula coerced to: ~(1 | mainplot) + trt.main + trt.sub + trt.main:trt.sub.

Example 5: Repeated measures

This example illustrates a repeated measures design with three treatments (CON, TRT1, TRT2) measured hourly over 8 hours. Within-subject correlations are modeled using an AR(1) structure with \rho = 0.6 and \sigma^2 = 2.
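Under AR(1), adjacent hours correlate at \rho and the correlation decays with lag as \rho^{|i-j|}. A quick base-R sketch of the implied 8 x 8 within-subject correlation matrix:

```r
rho   <- 0.6
hours <- 1:8
R_cor <- rho^abs(outer(hours, hours, "-"))  # cor(e_i, e_j) = rho^|i - j|
round(R_cor[1:4, 1:4], 3)
```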

Design details:

  1. Subjects: 6 per treatment group (18 subjects in total).
  2. Treatments: CON, TRT1, and TRT2.
  3. Time points: 8 hourly measurements.
  4. Means:

     Treatment     1     2     3     4     5     6     7     8
     CON        1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
     TRT1       2.50  3.50  3.98  4.03  3.68  3.35  3.02  2.94
     TRT2       3.50  4.54  5.80  5.40  5.49  4.71  4.08  3.78

Create a data frame for this design

n_subject = 6 # Subjects per treatment
n_trt = 3 # Number of treatments
n_hour = 8 # Number of repeated measures (time points)
trt = c("CON", "TRT1", "TRT2")

df.rep <- data.frame(
  subject = as.factor(rep(seq_len(n_trt*n_subject), each = n_hour)),
  hour = as.factor(rep(seq_len(n_hour), n_subject*n_trt)),
  trt = rep(trt, each = n_subject*n_hour)
)

Input templates

Either values of beta or means are required in the following order:

mkdesign(formula = ~ trt*hour, data = df.rep)
#> $fixeff
#> $fixeff$beta
#>   (Intercept)       trtTRT1       trtTRT2         hour2         hour3 
#>             1             2             3             4             5 
#>         hour4         hour5         hour6         hour7         hour8 
#>             6             7             8             9            10 
#> trtTRT1:hour2 trtTRT2:hour2 trtTRT1:hour3 trtTRT2:hour3 trtTRT1:hour4 
#>            11            12            13            14            15 
#> trtTRT2:hour4 trtTRT1:hour5 trtTRT2:hour5 trtTRT1:hour6 trtTRT2:hour6 
#>            16            17            18            19            20 
#> trtTRT1:hour7 trtTRT2:hour7 trtTRT1:hour8 trtTRT2:hour8 
#>            21            22            23            24 
#> 
#> $fixeff$means
#>  trtCON:hour1 trtTRT1:hour1 trtTRT2:hour1  trtCON:hour2 trtTRT1:hour2 
#>             1             2             3             4             5 
#> trtTRT2:hour2  trtCON:hour3 trtTRT1:hour3 trtTRT2:hour3  trtCON:hour4 
#>             6             7             8             9            10 
#> trtTRT1:hour4 trtTRT2:hour4  trtCON:hour5 trtTRT1:hour5 trtTRT2:hour5 
#>            11            12            13            14            15 
#>  trtCON:hour6 trtTRT1:hour6 trtTRT2:hour6  trtCON:hour7 trtTRT1:hour7 
#>            16            17            18            19            20 
#> trtTRT2:hour7  trtCON:hour8 trtTRT1:hour8 trtTRT2:hour8 
#>            21            22            23            24 
#> 
#> 
#> $varcov
#> NULL

Create a design

design.rep <- mkdesign(
  formula = ~ trt*hour,
  data = df.rep,
  means = c(1, 2.50, 3.50,
            1, 3.50, 4.54,
            1, 3.98, 5.80,
            1, 4.03, 5.40,
            1, 3.68, 5.49,
            1, 3.35, 4.71,
            1, 3.02, 4.08,
            1, 2.94, 3.78),
  sigma2 = 2,
  correlation = corAR1(value = 0.6, form = ~ hour|subject)
)

Power calculation

Statistical power for main effects of treatment and time, and their interaction:

pwr.anova(design.rep)
#> Power of type III F-test
#>          NumDF  DenDF sig.level   power
#> trt          2 21.563      0.05 1.00000
#> hour         7 86.055      0.05 0.74687
#> trt:hour    14 86.055      0.05 0.38500

Statistical power for treatment difference at each hour:

pwr.contrast(design.rep, which = "trt", by = "hour", contrast = "trt.vs.ctrl", p.adj = TRUE)
#> $`hour = 1`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON    1.5 64.41176     0.025 0.3299823   two.sided
#> trtTRT2 - trtCON    2.5 64.41176     0.025 0.7765112   two.sided
#> 
#> $`hour = 2`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   2.50 64.41176     0.025 0.7765112   two.sided
#> trtTRT2 - trtCON   3.54 64.41176     0.025 0.9777118   two.sided
#> 
#> $`hour = 3`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   2.98 64.41176     0.025 0.9093209   two.sided
#> trtTRT2 - trtCON   4.80 64.41176     0.025 0.9997845   two.sided
#> 
#> $`hour = 4`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   3.03 64.41176     0.025 0.9187320   two.sided
#> trtTRT2 - trtCON   4.40 64.41176     0.025 0.9988191   two.sided
#> 
#> $`hour = 5`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   2.68 64.41176     0.025 0.8355960   two.sided
#> trtTRT2 - trtCON   4.49 64.41176     0.025 0.9991794   two.sided
#> 
#> $`hour = 6`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   2.35 64.41176     0.025 0.7191810   two.sided
#> trtTRT2 - trtCON   3.71 64.41176     0.025 0.9865382   two.sided
#> 
#> $`hour = 7`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   2.02 64.41176     0.025 0.5730989   two.sided
#> trtTRT2 - trtCON   3.08 64.41176     0.025 0.9273867   two.sided
#> 
#> $`hour = 8`
#>                  effect       df sig.level     power alternative
#> trtTRT1 - trtCON   1.94 64.41176     0.025 0.5351535   two.sided
#> trtTRT2 - trtCON   2.78 64.41176     0.025 0.8635795   two.sided

Fundamental concepts

pwr4exp is developed based on linear mixed model (LMM) theory. The general form of an LMM can be expressed as:

y = X\beta + Zu + \varepsilon

where y represents the observations of the response variable, \beta the fixed-effect coefficients, u the random effects with u \sim N_q(0, G), \varepsilon the random errors with \varepsilon \sim N_n(0, R), and X_{(n \times p)} and Z_{(n \times q)} the design matrices for the fixed and random effects, respectively.

It is assumed that u and \varepsilon are independent, and that the marginal distribution of y is normal, y \sim N_n(X\beta, V), where:

V = ZGZ^T + R
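As a concrete illustration (not part of the package), the compound-symmetric V implied by a toy RCBD with 2 blocks of 2 observations, block variance 11, and error variance 4 (the values of Example 2) can be built directly:

```r
Z <- kronecker(diag(2), matrix(1, 2, 1))  # 4 x 2 block-membership matrix
G <- 11 * diag(2)                         # block variance component
R <- 4 * diag(4)                          # residual variance
V <- Z %*% G %*% t(Z) + R
V  # diagonal 15; covariance 11 within a block, 0 across blocks
```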

Inference on Treatment Effects

Inference on treatment effects often involves testing omnibus hypotheses and contrasts. These can be formulated using the general linear hypothesis:

H_0: K\beta = 0

where K is a contrast matrix. When the variance-covariance parameters in G and R are known, the estimate of \beta is:

\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y

And its variance is:

C = (X^T V^{-1} X)^{-1}

The sampling distribution of K\hat{\beta} is:

K\hat{\beta} \sim N(K\beta, K C K^T)

However, in practical situations, G and R are unknown and must be estimated using methods like maximum likelihood (ML) or restricted ML (REML). The estimate of \beta is then obtained by plugging in the estimated covariance matrix \hat{V}, where:

\hat{V} = Z\hat{G}Z^T + \hat{R}

The resulting estimate of \beta is:

\hat{\beta} = (X^T\hat{V}^{-1}X)^{-1}X^T\hat{V}^{-1}y

And its estimated variance is:

\hat{C} = (X^T\hat{V}^{-1}X)^{-1}

When testing the null hypothesis H_0: K\beta = 0, an approximate F-statistic is used, given by:

F = \frac{(K\hat{\beta})^T [K\hat{C}K^T]^{-1} (K\hat{\beta})}{v_1}

F follows an approximate F-distribution F(v_1, v_2) under H_0, where v_1 = \text{rank}(K) \geq 1 is the numerator degrees of freedom (df) and v_2 is the denominator df.

When \text{rank}(K) = 1, the F-statistic simplifies to the square of the t-statistic:

F = t^2, where t = \frac{k^T\hat{\beta}}{\sqrt{k^T\hat{C}k}} with v_2 df.

In balanced designs, where data are analyzed using a variance components model—commonly applied in experimental animal research—v_2 can be determined exactly through degrees-of-freedom decomposition, as in analysis of variance (ANOVA).

For more general cases, however, v_2 must be approximated.

The Satterthwaite approximation (Satterthwaite, 1946) for the df of t-tests can be calculated as outlined by Giesbrecht and Burns (1985):

v_2 = \frac{2(k^T \hat{C} k)^2}{g^T A g}

where:

  • g is the gradient of k^T C(\hat{\theta}) k with respect to \theta, the variance-covariance parameters in V, evaluated at \hat{\theta}.
  • A is the asymptotic variance-covariance matrix of \hat{\theta}, obtained from the information matrix of the ML or REML estimation of \hat{\theta} (Stroup, 2012).

In F-tests, v_2 can be calculated following the procedures described by Fai and Cornelius (1996). First, K\hat{C}K^T is decomposed as K\hat{C}K^T = P^T D P, where P is an orthogonal matrix of eigenvectors and D is a diagonal matrix of eigenvalues. Define k_m C k_m^T, where k_m is the m-th row of PK, and let:

v_m = \frac{2 D_m^2}{g_m^T A g_m}

where:

  • D_m is the m-th diagonal element of D.
  • g_m is the gradient of k_m C k_m^T with respect to \theta, evaluated at \hat{\theta}.

Then let:

E = \sum_{m=1}^{v_1} \frac{v_m}{v_m - 2} I(v_m > 2)

where I(v_m > 2) denotes the indicator function. The denominator df v_2 is calculated as:

v_2 = \frac{2E}{E - v_1}

The Satterthwaite approximation can be applied in power analysis by plugging in the assumed values of the variance parameters (Stroup, 2002).

Power Calculation Under the Alternative Hypothesis

Under the alternative hypothesis H_A: K\beta \neq 0, the F-statistic follows a non-central distribution F(v_1, v_2, \phi), where \phi is the non-centrality parameter that measures the departure from the null hypothesis H_0. The non-centrality parameter \phi is given by:

\phi = (K\hat{\beta})^T [K\hat{C}K^T]^{-1} (K\hat{\beta})

Once the distribution of the F-statistic under H_A is known, the power of the test can be calculated as the conditional probability of rejecting H_0 when H_A is true:

\text{Power} = P(\text{reject } H_0 : F > F_{\text{crit}} \mid H_A)

where F_{\text{crit}} is the critical value of the F-statistic used to reject H_0, determined such that P(F > F_{\text{crit}} \mid H_0) = \alpha, with \alpha the type I error rate.

The determination of the degrees of freedom v_1 and v_2, as well as the non-centrality parameter \phi, are critical steps for power calculation.
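For a balanced one-factor CRD these quantities have simple closed forms, so the omnibus power of Example 1 can be reproduced directly from the noncentral F-distribution. A minimal base-R sketch, assuming \phi = n \sum_i (\mu_i - \bar{\mu})^2 / \sigma^2 and exact df:

```r
means  <- c(35, 30, 37, 38)  # expected treatment means
sigma2 <- 15                 # error variance
n      <- 8                  # replicates per treatment
v1     <- length(means) - 1                          # numerator df
v2     <- length(means) * (n - 1)                    # denominator df = 28
phi    <- n * sum((means - mean(means))^2) / sigma2  # noncentrality
Fcrit  <- qf(0.95, v1, v2)                           # critical value, alpha = 0.05
power  <- pf(Fcrit, v1, v2, ncp = phi, lower.tail = FALSE)
round(power, 5)  # ~0.955, matching pwr.anova(crd) in Example 1
```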

Generally, power analysis requires specifying the following components:

  • The design to be evaluated, which determines the matrices X (fixed effects) and Z (random effects).
  • The treatment effects, which determine \beta (the fixed-effect coefficients).
  • The variance components, which determine G (the covariance matrix of the random effects) and R (the covariance matrix of the residuals).

A key aspect of conducting a valid power analysis is obtaining reasonable estimates for the magnitude of the parameters that will be used in the model. This includes:

  • Treatment effects (\beta): the size of the treatment effect(s) you expect to detect. This can be obtained from previous studies, pilot experiments, or subject-matter expertise.
  • Variance components (G and R): these represent the variability in the random effects and residual errors. Variance estimates are often obtained from:
    • Pilot studies or preliminary data, which can provide initial estimates of variability in random effects (e.g., subject-to-subject variability or group-level variability).
    • Literature on similar experiments, where variance components have been reported.
    • Subject-matter expertise, where researchers provide estimates based on their knowledge of the system being studied.

Performing a power analysis with unrealistic parameter magnitudes can lead to incorrect conclusions, either overestimating the likelihood of detecting a treatment effect or requiring an unnecessarily large sample size.

References

Fai, A. H. T., and P. L. Cornelius. 1996. Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. Journal of Statistical Computation and Simulation 54:363-378.

Giesbrecht, F. G., and J. C. Burns. 1985. Two-stage analysis based on a mixed model: large-sample asymptotic theory and small-sample simulation results. Biometrics 41:477-486.

Satterthwaite, F. E. 1946. An approximate distribution of estimates of variance components. Biometrics Bulletin 2:110-114.

Stroup, W. W. 2002. Power analysis based on spatial effects mixed models: a tool for comparing design and analysis strategies in the presence of spatial variability. Journal of Agricultural, Biological, and Environmental Statistics 7:491-511.

Stroup, W. W. 2012. Generalized Linear Mixed Models: Modern Concepts, Methods and Applications. CRC Press, Boca Raton, FL.