Generates a design object for power analysis by specifying a model formula and data frame. This object is not a true experimental design as created by design generation procedures, where randomization and unit allocation are performed. Instead, it serves as an object containing all necessary information for power analysis, including design matrices, assumed values of model effects, and other necessary components.
Usage
mkdesign(
formula,
data,
beta = NULL,
means = NULL,
vcomp = NULL,
sigma2 = NULL,
correlation = NULL,
REML = TRUE,
template = FALSE,
...
)
Arguments
- formula
A right-hand-side formula specifying the model for testing treatment effects, with terms on the right of ~ , following
lme4::lmer()
syntax for random effects.- data
A data frame with all independent variables specified in the model, matching the design's structure.
- beta
One of the optional inputs for fixed effects. A vector of model coefficients where factor variable coefficients correspond to dummy variables created using "contr.treatment".
- means
One of the optional inputs for fixed effects. A vector of marginal or conditioned means (if factors have interactions). Regression coefficients are required for numerical variables. Either
beta
ormeans
must be provided, and their values must strictly follow a specific order. A template can be created to indicate the required input values and their order. See "Details" for more information.- vcomp
A vector of variance-covariance components for random effects, if present. The values must follow a strict order. See "Details".
- sigma2
error variance.
- correlation
Specifies correlation structures using nlme::corClasses functions. See "Details" for more information.
- REML
Specifies whether to use REML or ML estimates for variance-covariance parameters. Default is
TRUE
.- template
Default is
FALSE
. IfTRUE
, a template forbeta
,means
, andvcomp
is generated to indicate the required input order.- ...
Additional arguments passed to internal functions.
Details
data: A long-format data frame is required, as typically used in R for fitting linear models. This data frame can be created manually or with the help of design creation packages such as agricolae, crossdes, AlgDesign, or FrF2. It should include all independent variables specified in the model (e.g., treatments, blocks, subjects), which are generally determined during the experimental design phase. While the data frame may contain realizations of the response variable, these are not mandatory and will be ignored if present.
template: Templates are automatically generated when only the formula and data are supplied, or explicitly if
template = TRUE
. Templates serve as guides for specifying inputs:Template for
beta
: Represents the sequence of model coefficients.Template for
means
: Specifies the order of means (for categorical variables) and/or regression coefficients (for continuous variables), depending on the scenario:Categorical variables without interactions: Requires marginal means for each level of the categorical variable(s).
Interactions among categorical variables: Requires conditional (cell) means for all level combinations.
Numerical variables without interactions: Requires regression coefficients. The intercept must also be included if there are no categorical variables in the model.
Interactions among numerical variables: Requires regression coefficients for both main effects and interaction terms. The intercept must also be included if there are no categorical variables in the model.
Categorical-by-numerical interactions: Requires regression coefficients for the numerical variable at each level of the categorical variable, as well as marginal means for the levels of the categorical variable.
Note: For models containing only numerical variables, the inputs for
means
andbeta
are identical. See the "Examples" for illustrative scenarios.Template for
vcomp
: Represents a variance-covariance matrix, where integers indicate the order of variance components in the input vector.
correlation: Various correlation structures can be specified following the instructions from nlme::corClasses.
Note: In
nlme::corAR1()
andnlme::corARMA()
whenp=1
andq=0
, the time variable must be an integer. However, inpwr4exp
, this restriction has been released, factors are also supported.
Examples
# Using templates for specifying "means"
# Create an example data frame with four categorical variables (factors)
# and two numerical variables
df1 <- expand.grid(
fA = factor(1:2),
fB = factor(1:2),
fC = factor(1:3),
fD = factor(1:3),
subject = factor(1:10)
)
df1$x <- rnorm(nrow(df1)) # Numerical variable x
df1$z <- rnorm(nrow(df1)) # Numerical variable z
## Categorical variables without interactions
# Means of each level of fA and fB are required in sequence.
mkdesign(~ fA + fB, df1)$fixeff$means
#> fA1 fA2 fB1 fB2
#> 1 2 3 4
## Interactions among categorical variables
# Cell means for all combinations of levels of fA and fB are required.
mkdesign(~ fA * fB, df1)$fixeff$means
#> fA1:fB1 fA2:fB1 fA1:fB2 fA2:fB2
#> 1 2 3 4
## Numerical variables without and with interactions, identical to beta.
# Without interactions: Regression coefficients are required
mkdesign(~ x + z, df1)$fixeff$means
#> (Intercept) x z
#> 1 2 3
# With interactions: Coefficients for main effects and interaction terms are required.
mkdesign(~ x * z, df1)$fixeff$means
#> (Intercept) x z x:z
#> 1 2 3 4
## Categorical-by-numerical interactions
# Marginal means for each level of fA, and regression coefficients for x
# at each level of fA are required.
mkdesign(~ fA * x, df1)$fixeff$means
#> fA1 fA2 fA1:x fA2:x
#> 1 2 3 4
## Three factors with interactions:
# Cell means for all 12 combinations of the levels of fA, fB, and fC are required.
mkdesign(~ fA * fB * fC, df1)
#> $fixeff
#> $fixeff$beta
#> (Intercept) fA2 fB2 fC2 fC3 fA2:fB2
#> 1 2 3 4 5 6
#> fA2:fC2 fA2:fC3 fB2:fC2 fB2:fC3 fA2:fB2:fC2 fA2:fB2:fC3
#> 7 8 9 10 11 12
#>
#> $fixeff$means
#> fA1:fB1:fC1 fA2:fB1:fC1 fA1:fB2:fC1 fA2:fB2:fC1 fA1:fB1:fC2 fA2:fB1:fC2
#> 1 2 3 4 5 6
#> fA1:fB2:fC2 fA2:fB2:fC2 fA1:fB1:fC3 fA2:fB1:fC3 fA1:fB2:fC3 fA2:fB2:fC3
#> 7 8 9 10 11 12
#>
#>
#> $varcov
#> NULL
#>
# A design that mixes the above-mentioned scenarios:
# - Interactions among three categorical variables (fA, fB, fC)
# - A categorical-by-numerical interaction (fD * x)
# - Main effects for another numerical variable (z)
# The required inputs are:
# - Cell means for all combinations of levels of fA, fB, and fC
# - Means for each level of fD
# - Regression coefficients for x at each level of fD
# - Regression coefficients for z
mkdesign(~ fA * fB * fC + fD * x + z, df1)$fixeff$means
#> fD1 fD2 fD3 z fD1:x fD2:x
#> 1 2 3 4 5 6
#> fD3:x fA1:fB1:fC1 fA2:fB1:fC1 fA1:fB2:fC1 fA2:fB2:fC1 fA1:fB1:fC2
#> 7 8 9 10 11 12
#> fA2:fB1:fC2 fA1:fB2:fC2 fA2:fB2:fC2 fA1:fB1:fC3 fA2:fB1:fC3 fA1:fB2:fC3
#> 13 14 15 16 17 18
#> fA2:fB2:fC3
#> 19
# Using templates for specifying "vcomp"
# Assume df1 represents an RCBD with "subject" as a random blocking factor.
## Variance of the random effect "subject" (intercept) is required.
mkdesign(~ fA * fB * fC * fD + (1 | subject), df1)$varcov
#> $subject
#> (Intercept)
#> (Intercept) 1
#>
# Demonstration of templates for more complex random effects
## Note: This example is a demo and statistically incorrect for this data
## (no replicates under subject*fA). It only illustrates variance-covariance templates.
## Inputs required:
## - Variance of the random intercept (1st)
## - Covariance between the intercept and "fA2" (2nd)
## - Variance of "fA2" (3rd)
mkdesign(~ fA * fB * fC * fD + (1 + fA | subject), df1)$varcov
#> $subject
#> (Intercept) fA2
#> (Intercept) 1 2
#> fA2 2 3
#>
# Power analysis for repeated measures
## Create a data frame for a CRD with repeated measures
n_subject <- 6
n_trt <- 3
n_hour <- 8
trt <- c("CON", "TRT1", "TRT2")
df2 <- data.frame(
subject = as.factor(rep(seq_len(n_trt * n_subject), each = n_hour)), # Subject as a factor
hour = as.factor(rep(seq_len(n_hour), n_subject * n_trt)), # Hour as a factor
trt = rep(trt, each = n_subject * n_hour) # Treatment as a factor
)
## Templates
temp <- mkdesign(formula = ~ trt * hour, data = df2)
temp$fixeff$means # Fixed effects means template
#> trtCON:hour1 trtTRT1:hour1 trtTRT2:hour1 trtCON:hour2 trtTRT1:hour2
#> 1 2 3 4 5
#> trtTRT2:hour2 trtCON:hour3 trtTRT1:hour3 trtTRT2:hour3 trtCON:hour4
#> 6 7 8 9 10
#> trtTRT1:hour4 trtTRT2:hour4 trtCON:hour5 trtTRT1:hour5 trtTRT2:hour5
#> 11 12 13 14 15
#> trtCON:hour6 trtTRT1:hour6 trtTRT2:hour6 trtCON:hour7 trtTRT1:hour7
#> 16 17 18 19 20
#> trtTRT2:hour7 trtCON:hour8 trtTRT1:hour8 trtTRT2:hour8
#> 21 22 23 24
temp$vcov # Variance-covariance template
#> NULL
## Create a design object
# Assume repeated measures within a subject follow an AR1 process with a correlation of 0.6
design <- mkdesign(
formula = ~ trt * hour,
data = df2,
means = c(1, 2.50, 3.50,
1, 3.50, 4.54,
1, 3.98, 5.80,
1, 4.03, 5.40,
1, 3.68, 5.49,
1, 3.35, 4.71,
1, 3.02, 4.08,
1, 2.94, 3.78),
sigma2 = 2,
correlation = corAR1(value = 0.6, form = ~ hour | subject)
)
pwr.anova(design) # Perform power analysis
#> Power of type III F-test
#> NumDF DenDF sig.level power
#> trt 2 21.563 0.05 1.00000
#> hour 7 86.055 0.05 0.74687
#> trt:hour 14 86.055 0.05 0.38500
## When time is treated as a numeric variable
# Means of treatments and regression coefficients for hour at each treatment level are required
df2$hour <- as.integer(df2$hour)
mkdesign(formula = ~ trt * hour, data = df2)$fixeff$means
#> trtCON trtTRT1 trtTRT2 trtCON:hour trtTRT1:hour trtTRT2:hour
#> 1 2 3 4 5 6
## Polynomial terms of time in the model
mkdesign(formula = ~ trt + hour + I(hour^2) + trt:hour + trt:I(hour^2), data = df2)$fixeff$means
#> trtCON trtTRT1 trtTRT2 trtCON:hour
#> 1 2 3 4
#> trtTRT1:hour trtTRT2:hour trtCON:I(hour^2) trtTRT1:I(hour^2)
#> 5 6 7 8
#> trtTRT2:I(hour^2)
#> 9