Introduction
pwr4exp
provides tools for statistical power calculation
in experimental designs analyzed within the linear mixed model
framework, including both fixed-effects and mixed-effects models.
Currently, the package does not support generalized linear models; thus,
only normal response variables are supported. Degrees of freedom are
approximated using the Satterthwaite method.
This vignette is based on the development version of the package, which incorporates various correlation structures in repeated measures and spatial data. It also includes options for generating input templates, simplifying the setup and execution of power analyses. The development version can be downloaded from GitHub:
devtools::install_github("an-ethz/pwr4exp")
Package overview
Performing a power analysis in pwr4exp
involves two main
steps:
1. Create a design object: A design object, on which the power analysis will be conducted, is created first. This is done using the function mkdesign for any design, or the functions designCRD, designRCBD, designLSD, designCOD, and designSPD for specific standard designs.
2. Calculate power: The statistical power of the F-test for model terms and of the t-test for specific contrasts can be evaluated using the functions pwr.anova and pwr.contrast, respectively.
The sections following the introduction are dedicated to these two steps. Additional theory and concepts are listed at the end.
Create a design object
mkdesign
The function mkdesign
is designed to create any design
by specifying the design’s data structure, the intended model, and the
necessary parameters.
Inputs
- formula: The right-hand-side model formula (uses lme4::lmer() syntax for random effects).
- data: A long-format data frame with one row per observation and columns for all independent variables specified in the formula (e.g., treatments, blocks, subjects). The data should reflect the design structure but exclude the response variable. Variables in the data that are not used in the formula are ignored.
- beta or means: Fixed-effects expectations. Specify either beta (a vector of regression coefficients) or means (expected mean values for fixed effects). Exactly one of the two must be provided.
- vcomp: Expected variance-covariance components for the random effects.
- sigma2: Expected error (residual) variance.
- correlation: Correlation structure for repeated measures or spatial data.
- template: Logical; when TRUE, generates templates for beta, means, and vcomp instead of the full design object.
Output
- Returns a list of essential components required for power calculation functions.
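For orientation, a complete call might look like the following sketch. The data frame, model, and all parameter values here are hypothetical; the templates described below show how to determine the order of the values in means and vcomp.
df0 <- expand.grid(trt = factor(1:3), block = factor(1:6)) # hypothetical structure: 3 treatments in 6 blocks
design0 <- mkdesign(
  formula = ~ trt + (1 | block),
  data = df0,
  means = c(30, 33, 35), # hypothetical expected means for trt levels 1-3
  vcomp = 8,             # hypothetical block variance
  sigma2 = 10            # hypothetical residual variance
)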
Templates
Templates for beta
, means
, and
vcomp
can be generated when template = TRUE
or
when only the formula and data are provided. These templates serve as
guides, outlining the elements and the order of parameter values in
beta
, means
, and vcomp
, based on
the model and data structure.
Below, we create an example data frame to illustrate these templates; it may not represent a realistic design. The data frame includes:
- four categorical variables (fA, fB, fC, fD), and
- two numerical variables (x, z).
df1 <- expand.grid(
fA = factor(1:2), # factor A with 2 levels
fB = factor(1:2), # factor B with 2 levels
fC = factor(1:3), # factor C with 3 levels
fD = factor(1:3), # factor D with 3 levels
subject = factor(1:10) # 10 subjects for each combination of factors
)
df1$x <- rnorm(nrow(df1)) # Numerical variable x
df1$z <- rnorm(nrow(df1)) # Numerical variable z
beta
template:
The template for beta
specifies the order of model
coefficients as they would appear in a fitted model. It is especially
useful when directly providing expected model coefficients as measures
of effect size.
For example, the expected values of the model coefficients for
~ fA*fB + x
should be provided in the following
sequence:
mkdesign( ~ fA * fB + x, df1)$fixeff$beta
#> (Intercept) fA2 fB2 x fA2:fB2
#> 1 2 3 4 5
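Given this template, the expected coefficients are supplied in exactly that order. For example, with purely hypothetical values:
mkdesign(
  ~ fA * fB + x, df1,
  beta = c(30, 4, 2, 1.5, -1), # (Intercept), fA2, fB2, x, fA2:fB2 (hypothetical values)
  sigma2 = 10                  # hypothetical error variance
)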
means
template:
In many situations, it is more direct and convenient to provide means
for categorical variables. The template for means
represents marginal means for categorical variables, regression
coefficients for numerical variables, and conditional means or
regression coefficients for interactions.
Categorical variables without interactions
Marginal means for each level of the categorical variable(s) are
required. For example, the expectations for each level of
fA
and fB
follow this sequence:
mkdesign(~ fA + fB, df1, template = T)$fixeff$means
#> fA1 fA2 fB1 fB2
#> 1 2 3 4
Interactions among categorical variables
Conditional (cell) means are required for all combinations of levels
of the interacting categorical variables. For instance, cell means for
all 12 combinations of the levels of fA
, fB
,
and fC
are required in the following order:
mkdesign(~ fA * fB * fC, df1)$fixeff$means
#> fA1:fB1:fC1 fA2:fB1:fC1 fA1:fB2:fC1 fA2:fB2:fC1 fA1:fB1:fC2 fA2:fB1:fC2
#> 1 2 3 4 5 6
#> fA1:fB2:fC2 fA2:fB2:fC2 fA1:fB1:fC3 fA2:fB1:fC3 fA1:fB2:fC3 fA2:fB2:fC3
#> 7 8 9 10 11 12
Numerical variables
Regression coefficients are required for numerical variables. If the
model includes only numerical variables, the (intercept)
must also be included. In such cases, means
are identical
to beta
.
mkdesign(~ x + z, df1)$fixeff$means
#> (Intercept) x z
#> 1 2 3
If there are interactions between numerical variables, regression coefficients for both the main effects and the interaction terms are required. For example:
mkdesign(~ x * z, df1)$fixeff$means
#> (Intercept) x z x:z
#> 1 2 3 4
Categorical-by-numerical interactions
Marginal means for the categorical variable and regression
coefficients for the numerical variable at each level of the categorical
variable are required. For example, the means for levels of
fA
and regression coefficients for x
at each
level of fA
are required for the model
~ fA * x
:
mkdesign(~ fA * x, df1)$fixeff$means
#> fA1 fA2 fA1:x fA2:x
#> 1 2 3 4
Combining Multiple Situations
The rules for categorical and numerical variables described above also apply in mixed situations. For example, consider a model combining:
- interactions among three categorical variables (fA, fB, fC),
- a categorical-by-numerical interaction (fD * x), and
- the main effect of another numerical variable (z).
mkdesign(~ fA * fB * fC + fD * x + z, df1)$fixeff$means
#> fD1 fD2 fD3 z fD1:x fD2:x
#> 1 2 3 4 5 6
#> fD3:x fA1:fB1:fC1 fA2:fB1:fC1 fA1:fB2:fC1 fA2:fB2:fC1 fA1:fB1:fC2
#> 7 8 9 10 11 12
#> fA2:fB1:fC2 fA1:fB2:fC2 fA2:fB2:fC2 fA1:fB1:fC3 fA2:fB1:fC3 fA1:fB2:fC3
#> 13 14 15 16 17 18
#> fA2:fB2:fC3
#> 19
The required elements and their order in means are:
- means for each level of fD (positions 1-3),
- the regression coefficient for z (position 4),
- regression coefficients for x at each level of fD (positions 5-7), and
- cell means for all combinations of the levels of fA, fB, and fC (positions 8-19).
vcomp
template:
The template for vcomp
represents a variance-covariance
matrix, where integers indicate the order of variance components in the
input vector.
For a single random effect, a template is unnecessary, as it corresponds to a single variance value. For multiple random effects, the template outlines the sequence of variance and covariance components to be provided. It helps users specify and align variance components for random effects in the model.
For example, consider a model that contains both a random intercept and a random slope for fA by subject:
mkdesign(~ fA * fB * fC * fD + (1 + fA | subject), df1)$varcov
#> $subject
#> (Intercept) fA2
#> (Intercept) 1 2
#> fA2 2 3
The template specifies the following required inputs:
1. Variance of the random intercept (1st component)
2. Covariance between the random intercept and fA2 (2nd component)
3. Variance of fA2 (3rd component)
Note: This example may not be statistically correct for the given data. It is provided solely to illustrate the structure of variance-covariance templates.
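Following this template, the three components would be passed to the vcomp argument as a single vector, for example (hypothetical values):
# Var(Intercept), Cov(Intercept, fA2), Var(fA2), in the template order shown above
vcomp = c(4, 1, 2)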
Correlation structures
Various correlation structures can be specified following the instructions from nlme::corClasses, including:
- corAR1
- corARMA
- corCAR1
- corCompSymm
- corExp
- corGaus
- corLin
- corSymm
- corRatio
- corSpher
Note: In nlme::corAR1() and nlme::corARMA() with p = 1 and q = 0, the time variable must be an integer. In pwr4exp, this restriction has been relaxed: factors are also supported.
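For example, an AR(1) structure with a within-subject correlation of 0.6, as used in the repeated measures example later in this vignette, can be constructed as follows (hour and subject are assumed to be columns of the design data frame):
library(nlme) # corAR1() and the other constructors come from nlme
cor.ar1 <- corAR1(value = 0.6, form = ~ hour | subject)
# passed to mkdesign via the correlation argument, e.g. mkdesign(..., correlation = cor.ar1)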
Specific design functions
While mkdesign is highly flexible and can create any design as long as the design's data structure is provided, pwr4exp also includes dedicated functions for some common standard designs. These functions define a design by its characteristics, such as treatment structure and replication, so that a data frame does not need to be created manually.
Completely randomized design
designCRD(treatments, label, replicates, formula, beta, means, sigma2, template = FALSE)
Randomized complete block design
designRCBD(treatments, label, blocks, formula, beta, means, vcomp, sigma2, template = FALSE)
Latin square design
designLSD(treatments, label, squares, reuse, formula, beta, means, vcomp, sigma2, template = FALSE)
Crossover design
This is a special case of LSD where time periods and individuals act as blocks. Period blocks are reused when replicating squares.
designCOD(treatments, label, squares, formula, beta, means, vcomp, sigma2, template = FALSE)
Split-plot design
designSPD(trt.main, trt.sub, label, replicates, formula, beta, means, vcomp, sigma2, template = FALSE)
The inputs for these functions are the same as those for
mkdesign
, except that data
is replaced by
specifying the treatment profile and the number of replications.
Specific inputs
- treatments: an integer-valued vector specifying the number of levels of the treatment factors. A maximum of two factors is allowed, arranged in a factorial design. For example, treatments = c(3, 2) specifies one treatment factor with 3 levels and another with 2 levels, forming a factorial treatment design.
- trt.main and trt.sub: define the treatments at the main plot and subplot levels, respectively, following the same rules as treatments.
- label: an optional list whose entries are the labels to be used for the factor levels and whose names are the treatment factors. If label is not specified, default names and labels are assigned. For one treatment factor, the default is list(trt = c("1", "2", ...)); for two factors, the default is list(facA = c("1", "2", ...), facB = c("1", "2", ...)), where "facA" and "facB" represent the two factors and "1", "2", etc., represent the levels of each factor.
- replicates: the number of experimental units per treatment in a CRD, or the number of main plots (i.e., the number of experimental units per treatment at the main plot level) in an SPD.
- blocks: the number of blocks in an RCBD.
- squares: the number of squares in a replicated LSD or COD.
- reuse: specifies how squares are replicated when there are multiple squares. One of "row" (reuse row blocks), "col" (reuse column blocks), or "both" (reuse both row and column blocks) when replicating a single square.
Each of these specific design-generating functions has a default
formula
based on the treatment structure (e.g., one factor
or factorial factors). If formula
is not specified, a
default formula with main effects and all interactions (if applicable)
is used internally. In RCBD, LSD, COD, and SPD designs, all block
factors are fitted as random effects. The formula
component
in the output list can be inspected.
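For instance, once a design has been created with one of these functions (such as the RCBD object rcbd built in Example 2 below), the internally used model can be inspected; a sketch of what to expect for a 2 x 2 factorial RCBD:
# inspect the model formula stored in a design object (rcbd from Example 2)
rcbd$formula
# expected to correspond to ~ facA + facB + facA:facB + (1 | block)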
Power calculation
Once the design object has been created, calculating power is straightforward. The statistical power of the F-test for omnibus tests can be calculated using the pwr.anova function.
Inputs
- object: the design object created in the previous step
- sig.level: significance level, default 0.05
- type: the type of ANOVA table, default Type III
The statistical power of the t-test for specific contrasts can be evaluated using the pwr.contrast function.
Inputs
- object: the design object
- which: the factor of interest
- by: the variable to condition on
- contrast: the contrast method, one of "pairwise", "poly", or "trt.vs.ctrl", or any manually defined contrast vector
- sig.level: significance level, default 0.05
- p.adj: whether sig.level should be adjusted using the Bonferroni method, default FALSE
- alternative: "two.sided" or "one.sided"
- strict: if TRUE (the default), the power reported for a two-sided test includes the probability of rejection in the opposite direction of the true effect; otherwise, the power is half the significance level when the true difference is zero.
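In addition to the named contrast methods, a custom contrast can be supplied as a numeric vector. A sketch, assuming the CRD object crd created in Example 1 below; the vector compares the first treatment with the average of the other three:
pwr.contrast(crd, which = "trt", contrast = c(1, -1/3, -1/3, -1/3))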
Practical Examples
Example 1. Completely Randomized Design
In this example, we will create a CRD with one treatment factor. The design parameters are as follows:
- Treatments: 1 treatment factor with 4 levels.
- Replicates: 8 experimental units per treatment.
- Means and effect size: The expected means for the four treatment levels are 35, 30, 37, and 38.
- Error variance: 15
Create the CRD
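With the parameters listed above, the design object used in the power calculations below can be created along these lines:
crd <- designCRD(
  treatments = 4,            # one treatment factor with 4 levels
  replicates = 8,            # 8 experimental units per treatment
  means = c(35, 30, 37, 38), # expected means of the four levels
  sigma2 = 15                # error variance
)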
Power for omnibus test
pwr.anova(crd)
#> Power of type III F-test
#> NumDF DenDF sig.level power
#> trt 3 28 0.05 0.95467
Power for specific contrasts
The power for detecting differences between treatments and the control:
pwr.contrast(crd, which = "trt", contrast = "trt.vs.ctrl")
#> effect df sig.level power alternative
#> trt2 - trt1 -5 28 0.05 0.7028739 two.sided
#> trt3 - trt1 2 28 0.05 0.1694975 two.sided
#> trt4 - trt1 3 28 0.05 0.3216803 two.sided
The power for detecting polynomial contrasts across the treatment levels:
pwr.contrast(crd, which = "trt", contrast = "poly")
#> effect df sig.level power alternative
#> linear 16 28 0.05 0.7130735 two.sided
#> quadratic 6 28 0.05 0.5617849 two.sided
#> cubic -18 28 0.05 0.8098383 two.sided
The power for detecting pairwise comparisons:
pwr.contrast(crd, which = "trt", contrast = "pairwise")
#> effect df sig.level power alternative
#> trt1 - trt2 5 28 0.05 0.70287390 two.sided
#> trt1 - trt3 -2 28 0.05 0.16949749 two.sided
#> trt1 - trt4 -3 28 0.05 0.32168033 two.sided
#> trt2 - trt3 -7 28 0.05 0.93677955 two.sided
#> trt2 - trt4 -8 28 0.05 0.97860686 two.sided
#> trt3 - trt4 -1 28 0.05 0.07896844 two.sided
In data analysis, the P-value or significance level often needs to be adjusted for multiple comparisons (MCP). Most commonly used methods for MCP in experimental data are post-hoc, meaning they cannot be directly applied during the power analysis stage. However, one can tune the significance level to mimic these situations—for instance, by using a lower significance level to account for MCP.
pwr.contrast(crd, which = "trt", contrast = "pairwise", sig.level = 0.01)
#> effect df sig.level power alternative
#> trt1 - trt2 5 28 0.01 0.4418907 two.sided
#> trt1 - trt3 -2 28 0.01 0.0546995 two.sided
#> trt1 - trt4 -3 28 0.01 0.1320866 two.sided
#> trt2 - trt3 -7 28 0.01 0.7946290 two.sided
#> trt2 - trt4 -8 28 0.01 0.9042775 two.sided
#> trt3 - trt4 -1 28 0.01 0.0194487 two.sided
The pwr.contrast
function also includes an option to
adjust the significance level using the Bonferroni
method, though this approach may be overly conservative.
Example 2. Randomized complete block design
In this example, we will create an RCBD with two treatment factors. The design parameters are as follows:
- Treatments: A 2x2 factorial design.
- Replicates: 8 blocks.
- Means: The expected cell means are 35 (A1B1), 40 (A2B1), 38 (A1B2), and 41 (A2B2). The corresponding beta values are as follows:
  - The intercept (A1B1): 35.
  - The effect of A2 alone: 5 units.
  - The effect of B2 alone: 3 units.
  - The interaction between A2 and B2: -2 units, meaning the combined effect of A2 and B2 is 2 units below the sum of their additive effects.
- Variance among blocks: 11.
- Error variance: 4. The total variance of the response variable (15) is decomposed into variance between blocks (11) and variance within blocks (4).
Create the RCBD
Input templates:
designRCBD(treatments = c(2, 2), blocks = 8, template = TRUE)
#> $fixeff
#> $fixeff$beta
#> (Intercept) facA2 facB2 facA2:facB2
#> 1 2 3 4
#>
#> $fixeff$means
#> facA1:facB1 facA2:facB1 facA1:facB2 facA2:facB2
#> 1 2 3 4
#>
#>
#> $varcov
#> $varcov$block
#> (Intercept)
#> (Intercept) 1
Provide the design parameters, either beta or means, and vcomp, following the order shown above.
rcbd <- designRCBD(
treatments = c(2, 2),
blocks = 8,
# beta = c(35, 5, 3, -2), # identical to means
means = c(35, 40, 38, 41),
vcomp = 11,
sigma2 = 4
)
Evaluate statistical power
The power for the main and interaction effects:
pwr.anova(rcbd)
#> Power of type III F-test
#> NumDF DenDF sig.level power
#> facA 1 21 0.05 0.99969
#> facB 1 21 0.05 0.76950
#> facA:facB 1 21 0.05 0.27138
The power for testing the difference between levels of factor A conditioned on factor B:
pwr.contrast(rcbd, which = "facA", by = "facB")
#> $`facB = 1`
#> effect df sig.level power alternative
#> facA1 - facA2 -5 21 0.05 0.9974502 two.sided
#>
#> $`facB = 2`
#> effect df sig.level power alternative
#> facA1 - facA2 -3 21 0.05 0.8160596 two.sided
Example 3. Latin square design
In this example, we extend the design from Example 2 by introducing another blocking factor, thus creating an LSD. The treatment structure and effect sizes remain the same as in Example 2. The design controls for two sources of variability (row and column blocks) while evaluating the treatment effects. In the LSD, the total variance (15) is further decomposed into three components:
- Variance between row blocks (11),
- Variance between column blocks (2), and
- Residual error variance (2).
Input templates
designLSD(
treatments = c(2, 2),
squares = 4,
reuse = "both",
template = TRUE
)
#> $fixeff
#> $fixeff$beta
#> (Intercept) facA2 facB2 facA2:facB2
#> 1 2 3 4
#>
#> $fixeff$means
#> facA1:facB1 facA2:facB1 facA1:facB2 facA2:facB2
#> 1 2 3 4
#>
#>
#> $varcov
#> $varcov$row
#> (Intercept)
#> (Intercept) 1
#>
#> $varcov$col
#> (Intercept)
#> (Intercept) 2
Either beta or means can be provided as in Example 2. The variances of the row and column blocking factors are required sequentially, in the order shown above.
Create the LSD
lsd <- designLSD(
treatments = c(2, 2),
label = list(temp = c("T1", "T2"), dosage = c("D1", "D2")),
squares = 4,
reuse = "both",
means = c(35, 40, 38, 41),
vcomp = c(11, 2),
sigma2 = 2
)
Once the design has been created, pwr.anova and pwr.contrast can be used to evaluate statistical power as demonstrated above.
Example 4: Split-plot Design
In this example, we will create an SPD with two treatment factors, one at each plot level. The design parameters are as follows:
- Treatments: One main plot factor having 2 levels, and another factor with 3 levels at the sub-plot level.
- Replicates: There are 5 plots per main plot treatment, resulting in a total of 10 plots. This is a standard SPD, where the plot (block at the subplot level) size is assumed to equal the number of treatments. Thus, the design follows an RCBD structure at the subplot level.
- The corresponding cell means are:
- The total variance (15) is assumed to decompose into 4 for whole-plot error and 11 for subplot error.
Input templates
designSPD(
trt.main = 2,
trt.sub = 3,
replicates = 10,
template = T
)
#> Warning: Only rhs formula is required
#> formula coerced to: ~(1 | mainplot) + trt.main + trt.sub + trt.main:trt.sub.
#> $fixeff
#> $fixeff$beta
#> (Intercept) trt.main2 trt.sub2 trt.sub3
#> 1 2 3 4
#> trt.main2:trt.sub2 trt.main2:trt.sub3
#> 5 6
#>
#> $fixeff$means
#> trt.main1:trt.sub1 trt.main2:trt.sub1 trt.main1:trt.sub2 trt.main2:trt.sub2
#> 1 2 3 4
#> trt.main1:trt.sub3 trt.main2:trt.sub3
#> 5 6
#>
#>
#> $varcov
#> $varcov$mainplot
#> (Intercept)
#> (Intercept) 1
Create the SPD
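A sketch of the call, following the template above; the cell means here are placeholders only, while the variance components follow the decomposition stated above:
spd <- designSPD(
  trt.main = 2,
  trt.sub = 3,
  replicates = 10,
  means = c(20, 24, 21, 26, 22, 27), # placeholder cell means, in the template order shown above
  vcomp = 4,                         # whole-plot (main plot) error variance
  sigma2 = 11                        # subplot (residual) error variance
)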
Example 5: Repeated measures
This example illustrates a repeated measures design with three
treatments (CON
, TRT1
, TRT2
)
measured hourly over 8 hours. Within-subject correlations are modeled using an AR(1) structure with a correlation of 0.6 and a residual variance of 2.
Design Details
1. Subjects: 6 per treatment group (18 subjects in total).
2. Treatments: CON, TRT1, and TRT2.
3. Time points: 8 hourly measurements.
4. Means: the expected treatment-by-hour means supplied in the mkdesign call below.
Create a data frame for this design
n_subject = 6 # Subjects per treatment
n_trt = 3 # Number of treatments
n_hour = 8 # Number of repeated measures (time points)
trt = c("CON", "TRT1", "TRT2")
df.rep <- data.frame(
subject = as.factor(rep(seq_len(n_trt*n_subject), each = n_hour)),
hour = as.factor(rep(seq_len(n_hour), n_subject*n_trt)),
trt = rep(trt, each = n_subject*n_hour)
)
Input templates
Either values of beta
or means
are required
in the following order:
mkdesign(formula = ~ trt*hour, data = df.rep)
#> $fixeff
#> $fixeff$beta
#> (Intercept) trtTRT1 trtTRT2 hour2 hour3
#> 1 2 3 4 5
#> hour4 hour5 hour6 hour7 hour8
#> 6 7 8 9 10
#> trtTRT1:hour2 trtTRT2:hour2 trtTRT1:hour3 trtTRT2:hour3 trtTRT1:hour4
#> 11 12 13 14 15
#> trtTRT2:hour4 trtTRT1:hour5 trtTRT2:hour5 trtTRT1:hour6 trtTRT2:hour6
#> 16 17 18 19 20
#> trtTRT1:hour7 trtTRT2:hour7 trtTRT1:hour8 trtTRT2:hour8
#> 21 22 23 24
#>
#> $fixeff$means
#> trtCON:hour1 trtTRT1:hour1 trtTRT2:hour1 trtCON:hour2 trtTRT1:hour2
#> 1 2 3 4 5
#> trtTRT2:hour2 trtCON:hour3 trtTRT1:hour3 trtTRT2:hour3 trtCON:hour4
#> 6 7 8 9 10
#> trtTRT1:hour4 trtTRT2:hour4 trtCON:hour5 trtTRT1:hour5 trtTRT2:hour5
#> 11 12 13 14 15
#> trtCON:hour6 trtTRT1:hour6 trtTRT2:hour6 trtCON:hour7 trtTRT1:hour7
#> 16 17 18 19 20
#> trtTRT2:hour7 trtCON:hour8 trtTRT1:hour8 trtTRT2:hour8
#> 21 22 23 24
#>
#>
#> $varcov
#> NULL
Create a design
design.rep <- mkdesign(
formula = ~ trt*hour,
data = df.rep,
means = c(1, 2.50, 3.5,
1, 3.50, 4.54,
1, 3.98, 5.80,
1, 4.03, 5.4,
1, 3.68, 5.49,
1, 3.35, 4.71,
1, 3.02, 4.08,
1, 2.94, 3.78),
sigma2 = 2,
correlation = corAR1(value = 0.6, form = ~ hour|subject)
)
Power calculation
Statistical power for main effects of treatment and time, and their interaction:
pwr.anova(design.rep)
#> Power of type III F-test
#> NumDF DenDF sig.level power
#> trt 2 21.563 0.05 1.00000
#> hour 7 86.055 0.05 0.74687
#> trt:hour 14 86.055 0.05 0.38500
Statistical power for treatment difference at each hour:
pwr.contrast(design.rep, which = "trt", by = "hour", contrast = "trt.vs.ctrl", p.adj = TRUE)
#> $`hour = 1`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 1.5 64.41176 0.025 0.3299823 two.sided
#> trtTRT2 - trtCON 2.5 64.41176 0.025 0.7765112 two.sided
#>
#> $`hour = 2`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 2.50 64.41176 0.025 0.7765112 two.sided
#> trtTRT2 - trtCON 3.54 64.41176 0.025 0.9777118 two.sided
#>
#> $`hour = 3`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 2.98 64.41176 0.025 0.9093209 two.sided
#> trtTRT2 - trtCON 4.80 64.41176 0.025 0.9997845 two.sided
#>
#> $`hour = 4`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 3.03 64.41176 0.025 0.9187320 two.sided
#> trtTRT2 - trtCON 4.40 64.41176 0.025 0.9988191 two.sided
#>
#> $`hour = 5`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 2.68 64.41176 0.025 0.8355960 two.sided
#> trtTRT2 - trtCON 4.49 64.41176 0.025 0.9991794 two.sided
#>
#> $`hour = 6`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 2.35 64.41176 0.025 0.7191810 two.sided
#> trtTRT2 - trtCON 3.71 64.41176 0.025 0.9865382 two.sided
#>
#> $`hour = 7`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 2.02 64.41176 0.025 0.5730989 two.sided
#> trtTRT2 - trtCON 3.08 64.41176 0.025 0.9273867 two.sided
#>
#> $`hour = 8`
#> effect df sig.level power alternative
#> trtTRT1 - trtCON 1.94 64.41176 0.025 0.5351535 two.sided
#> trtTRT2 - trtCON 2.78 64.41176 0.025 0.8635795 two.sided
Fundamental concepts
pwr4exp
is developed based on linear mixed model (LMM)
theory. The general form of an LMM can be expressed as:
$$y = X\beta + Zu + \varepsilon$$

where $y$ represents the observations of the response variable, $\beta$ represents the fixed-effect coefficients, $u$ denotes the random effects, where $u \sim N(0, G)$, $\varepsilon$ represents the random errors, where $\varepsilon \sim N(0, R)$, and $X$ and $Z$ are the design matrices for the fixed and random effects, respectively.

It is assumed that $u$ and $\varepsilon$ are independent, and the marginal distribution of $y$ follows a normal distribution, $y \sim N(X\beta, V)$, where:

$$V = ZGZ^{\top} + R$$
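As a toy numerical illustration of this marginal covariance (all numbers hypothetical: two blocks with two observations each, a block variance of 4, and a residual variance of 1):
Z <- kronecker(diag(2), matrix(1, nrow = 2, ncol = 1)) # two blocks, two observations per block
G <- diag(4, 2)                                        # random-effect (block) variance
R <- diag(1, 4)                                        # residual variance
V <- Z %*% G %*% t(Z) + R                              # V = Z G Z' + R
V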
Inference on Treatment Effects
Inference on treatment effects often involves testing omnibus hypotheses and contrasts. These can be formulated using the general linear hypothesis:

$$H_0: K'\beta = 0$$

where $K'$ is a contrast matrix. When the variance-covariance parameters in $G$ and $R$ are known, the estimate of $\beta$ is:

$$\hat{\beta} = (X^{\top}V^{-1}X)^{-}X^{\top}V^{-1}y$$

And its variance is:

$$\mathrm{Var}(\hat{\beta}) = C = (X^{\top}V^{-1}X)^{-}$$

The sampling distribution of $K'\hat{\beta}$ is:

$$K'\hat{\beta} \sim N(K'\beta,\, K'CK)$$

However, in practical situations, the matrices $G$ and $R$ are unknown and must be estimated using methods like Maximum Likelihood (ML) or Restricted ML (REML). The estimate of $\beta$ is obtained by plugging in the estimated covariance matrix $\hat{V}$, where:

$$\hat{V} = Z\hat{G}Z^{\top} + \hat{R}$$

The resulting estimate of $\beta$ is:

$$\hat{\beta} = (X^{\top}\hat{V}^{-1}X)^{-}X^{\top}\hat{V}^{-1}y$$

And its estimated variance is:

$$\widehat{\mathrm{Var}}(\hat{\beta}) = \hat{C} = (X^{\top}\hat{V}^{-1}X)^{-}$$

When testing the null hypothesis $H_0: K'\beta = 0$, an approximate F-statistic is used. The F-statistic is given by:

$$F = \frac{(K'\hat{\beta})^{\top}\left(K'\hat{C}K\right)^{-1}(K'\hat{\beta})}{\nu_1}$$

$F$ follows an approximate F-distribution $F(\nu_1, \nu_2)$ under $H_0$, where $\nu_1 = \mathrm{rank}(K)$ represents the numerator degrees of freedom (df) and $\nu_2$ is the denominator df.

When $\nu_1 = 1$, i.e., $K$ reduces to a single contrast vector $k$, the F-statistic simplifies to the square of the t-statistic:

$$F = t^2, \quad \text{where } t = \frac{k'\hat{\beta}}{\sqrt{k'\hat{C}k}} \text{ with } \nu_2 \text{ df.}$$
In balanced designs, where data are analyzed using a variance components model—commonly applied in experimental animal research—$\nu_2$ can be determined exactly through degrees of freedom decomposition, as applied in the analysis of variance (ANOVA). However, for more general cases, $\nu_2$ must be approximated, for example with the Satterthwaite method.
The Satterthwaite approximation (Satterthwaite, 1946) for the df in t-tests can be calculated as outlined by Giesbrecht and Burns (1985):

$$\nu_2 = \frac{2\left(k'\hat{C}k\right)^2}{g^{\top}Ag}$$

where $g$ is the gradient of $k'C(\theta)k$ with respect to $\theta$, the vector of variance-covariance parameters in $V$, evaluated at $\hat{\theta}$, and the matrix $A$ is the asymptotic variance-covariance matrix of $\hat{\theta}$, obtained from the information matrix of the ML or REML estimation of $\theta$ (Stroup, 2012).

In F-tests, $\nu_2$ can be calculated by following the procedures described by Fai and Cornelius (1996). First, $K'\hat{C}K$ is decomposed to yield $K'\hat{C}K = P^{\top}DP$, where $P$ is an orthogonal matrix of eigenvectors and $D$ is a diagonal matrix of eigenvalues. Define $k_m = p_m K'$, where $p_m$ is the $m$-th row of $P$, and let:

$$\nu_m = \frac{2D_{mm}^2}{g_m^{\top}Ag_m}$$

where $D_{mm}$ is the $m$-th diagonal element of $D$, and $g_m$ is the gradient of $k_m C(\theta) k_m^{\top}$ with respect to $\theta$, evaluated at $\hat{\theta}$. Then let:

$$E = \sum_{m=1}^{\nu_1} \frac{\nu_m}{\nu_m - 2} I(\nu_m > 2)$$

where $I(\nu_m > 2)$ denotes the indicator function. The denominator df is calculated as:

$$\nu_2 = \frac{2E}{E - \nu_1}$$
The Satterthwaite approximation can be applied in power analysis by plugging in the assumed values of variance parameters (Stroup, 2002).
Power Calculation Under the Alternative Hypothesis
Under the alternative hypothesis $H_A: K'\beta \neq 0$, the F-statistic follows a non-central distribution $F(\nu_1, \nu_2, \lambda)$, where $\lambda$ is the non-centrality parameter that measures the departure from the null hypothesis $H_0$. The non-centrality parameter is given by:

$$\lambda = (K'\beta)^{\top}\left(K'CK\right)^{-1}(K'\beta)$$

Once the distribution of the F-statistic under $H_A$ is known, the power of the test can be calculated as the conditional probability of rejecting $H_0$ when $H_A$ is true:

$$\text{Power} = P\left(F > F_{\text{crit}} \mid H_A\right)$$

where $F_{\text{crit}}$ is the critical value of the F-statistic used to reject $H_0$, determined such that $P(F > F_{\text{crit}} \mid H_0) = \alpha$, where $\alpha$ is the type I error rate.

The determination of the degrees of freedom $\nu_1$ and $\nu_2$, as well as the non-centrality parameter $\lambda$, are critical steps in the power calculation.
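In R, this final step amounts to evaluating the non-central F-distribution; a minimal sketch with hypothetical values of the degrees of freedom and non-centrality parameter:
nu1 <- 3; nu2 <- 28; lambda <- 18; alpha <- 0.05           # hypothetical inputs
Fcrit <- qf(1 - alpha, df1 = nu1, df2 = nu2)               # critical value under H0
power <- 1 - pf(Fcrit, df1 = nu1, df2 = nu2, ncp = lambda) # P(F > Fcrit | HA)
power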
Generally, power analysis requires specifying the following components:
- The Design to be evaluated, which determines the design matrices $X$ (fixed effects) and $Z$ (random effects).
- The Treatment Effects, which determine $\beta$ (the fixed-effect coefficients).
- The Variance Components, which determine $G$ (the covariance matrix of the random effects) and $R$ (the covariance matrix of the residuals).
A key aspect of conducting a valid power analysis is obtaining reasonable estimates for the magnitude of the parameters that will be used in the model. This includes:
- Treatment Effects ($\beta$): The size of the treatment effect(s) you expect to detect. This can be obtained from previous studies, pilot experiments, or subject-matter expertise.
- Variance Components ($G$ and $R$): These represent the variability in the random effects and residual errors. Variance estimates are often obtained from:
- Pilot studies or preliminary data, which can provide initial estimates of variability in random effects (e.g., subject-to-subject variability or group-level variability).
- Literature on similar experiments, where variance components have been reported.
- Subject-matter expertise, where researchers provide estimates based on their knowledge of the system being studied.
Performing a power analysis with unrealistic parameter magnitudes can lead to incorrect conclusions, either overestimating the likelihood of detecting a treatment effect or requiring an unnecessarily large sample size.
References
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110–114.
Hrong-Tai Fai, A., & Cornelius, P. L. (1996). Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. Journal of Statistical Computation and Simulation, 54(4), 363–378.