Configuration¶
Methods for configuring effect sizes, variable types, correlations, and simulation parameters.
set_effects()¶
- MCPower.set_effects(effects_string)[source]¶
Set standardised effect sizes for predictors.
Effect sizes are expressed as standardised regression coefficients (beta weights). Each assignment maps an effect name to its size. Interaction effects use colon notation (e.g. x1:x2). For factor variables, specify effects for each dummy level with bracket notation. This setting is deferred until apply() is called.
- Parameters:
effects_string (str) – Comma-separated name=value pairs. Examples: "x1=0.5, x2=0.3, x1:x2=0.2", "treatment=0.4, cyl[2]=0.2, cyl[3]=0.5".
- Returns:
For method chaining.
- Return type:
self
- Raises:
TypeError – If effects_string is not a string.
ValueError – If effects_string is empty or contains invalid assignments (checked at apply time).
String Format¶
Comma-separated name=value pairs:
model.set_effects("x1=0.5, x2=0.3")
Interaction effects – use : notation:
model.set_effects("x1=0.5, x2=0.3, x1:x2=0.2")
Factor variables – assign different effects per level using bracket notation:
# Integer-indexed levels (no uploaded data or named levels)
model.set_effects("group[2]=0.4, group[3]=0.6")
# Named levels (after set_factor_levels or upload_data)
model.set_effects("group[drug_a]=0.4, group[drug_b]=0.6")
Updating Effects¶
After running an analysis, a new set_effects() updates (merges with) the previously applied effects:
model.set_effects("x1=0.5, x2=0.3")
model.find_power(sample_size=100)
model.set_effects("x2=0.4") # x1 remains 0.5, x2 is now 0.4
model.find_power(sample_size=100) # uses x1=0.5, x2=0.4
Examples¶
from mcpower import MCPower
model = MCPower("y = treatment + motivation + treatment:motivation")
model.set_simulations(400)
model.set_variable_type("treatment=binary")
model.set_effects("treatment=0.5, motivation=0.3, treatment:motivation=0.2")
model.find_power(sample_size=100)
Notes¶
Effect sizes are standardized – they represent the change in outcome (in SDs) per 1 SD change in the predictor.
For binary predictors, the effect size represents the difference between the two groups in standard deviation units (equivalent to Cohen’s d).
For factor variables, each dummy’s effect size represents the difference between that level and the reference level.
A common guideline: 0.2 = small, 0.5 = medium, 0.8 = large (Cohen’s conventions).
The method raises TypeError if the argument is not a string, and ValueError if it is empty or contains invalid assignments (the latter at apply time).
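If effect sizes from pilot data come as raw (unstandardised) coefficients, they can be converted before being passed to set_effects(). A small helper sketching the standard conversion beta = b × SD(x) / SD(y) (general statistics, not part of the MCPower API):

```python
# Converting an unstandardised regression slope to a standardised beta.
# This is the usual textbook formula, not an MCPower function.
def standardized_beta(b, sd_x, sd_y):
    """Standardised beta from a raw coefficient and the two SDs."""
    return b * sd_x / sd_y

# A raw slope of 2.0 with SD(x) = 1.5 and SD(y) = 6.0 gives beta = 0.5,
# a "medium" effect by Cohen's conventions.
beta = standardized_beta(2.0, 1.5, 6.0)
```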
See Also¶
Effect Sizes – Understanding and choosing effect sizes
set_factor_levels() – Define named factor levels for bracket notation
set_variable_type() – Declare binary/factor variables before setting effects
Tutorial: Your First Power Analysis – Complete walkthrough including effect sizes
set_variable_type()¶
- MCPower.set_variable_type(variable_types_string)[source]¶
Set distribution types for predictor variables.
Variables default to "normal" (standard Gaussian). Use this method to specify alternative distributions. This setting is deferred until apply() is called.
- Parameters:
variable_types_string (str) – Comma-separated name=type assignments. Supported types:
"normal" — standard normal (default).
"binary" or "binary(p)" — Bernoulli with proportion p (default 0.5).
"right_skewed" — positively skewed distribution.
"left_skewed" — negatively skewed distribution.
"high_kurtosis" — heavy-tailed (t-distribution, df=3).
"uniform" — uniform distribution.
"factor(k)" — categorical with k levels (creates k-1 dummy variables).
"factor(k, p1, p2, ...)" — factor with custom level proportions.
Example: "x1=binary, x2=right_skewed, x3=factor(3)".
- Returns:
For method chaining.
- Return type:
self
- Raises:
TypeError – If variable_types_string is not a string.
ValueError – If types are unrecognised or proportions invalid (checked at apply time).
Supported Types¶
| Type String | Description | Generated Distribution |
|---|---|---|
| normal | Standard normal (default) | N(0, 1) |
| binary | Binary variable with 50/50 split | Bernoulli(0.5) |
| binary(p) | Binary with custom proportion | Bernoulli(p), where 0 < p < 1 |
| factor(k) | Factor with k levels, equal proportions | k-1 dummy variables |
| factor(k, p1, p2, ...) | Factor with custom level proportions | k-1 dummies; proportions are normalized to sum to 1 |
| right_skewed | Right-skewed (heavy right tail) | Chi-squared-like transform |
| left_skewed | Left-skewed (heavy left tail) | Mirrored right-skew |
| high_kurtosis | Heavy-tailed (leptokurtic) | t-distribution (df=3) |
| uniform | Uniform distribution | U(0, 1) transformed |
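The shapes behind these type strings can be sketched with NumPy. This is an illustration of the distributions, not MCPower's exact internals; the library works with standardised predictors, so the sketch standardises too:

```python
import numpy as np

# Illustrative generators for a few of the types above (a sketch only).
rng = np.random.default_rng(0)
n = 100_000

normal = rng.standard_normal(n)                # N(0, 1)
binary = (rng.random(n) < 0.3).astype(float)   # Bernoulli(0.3)

# A chi-squared draw gives the heavy right tail; standardising keeps the
# skew while restoring mean 0 and SD 1.
right_skewed = rng.chisquare(df=2, size=n)
right_skewed = (right_skewed - right_skewed.mean()) / right_skewed.std()

skew = np.mean(right_skewed ** 3)              # clearly positive
```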
Examples¶
from mcpower import MCPower
# Basic type declarations
model = MCPower("y = treatment + condition + income")
model.set_simulations(400)
model.set_variable_type("treatment=binary, condition=factor(3), income=right_skewed")
model.set_effects("treatment=0.5, condition[2]=0.3, condition[3]=0.4, income=0.2")
model.find_power(sample_size=150)
Binary with custom proportion:
model.set_variable_type("treatment=binary(0.3)")  # 30% in treatment group
Factor with equal proportions:
model.set_variable_type("condition=factor(3)")  # 3 levels, ~33% each
Factor with custom proportions:
# 3 levels with proportions 20%, 50%, 30% (normalized to sum to 1)
model.set_variable_type("group=factor(3, 0.2, 0.5, 0.3)")
Updating types – calling again updates existing entries without clearing others:
model.set_variable_type("x1=binary, x2=right_skewed")
model.set_variable_type("x2=normal") # x1 remains binary, x2 is now normal
Notes¶
Factor variables create k-1 dummy variables (level 1 is the reference by default). After declaring a factor, use bracket notation in set_effects() to assign effects to each dummy.
When upload_data() is used, variable types are auto-detected and typically do not need to be set manually. Use set_variable_type() to override auto-detection.
Validation of types and proportions happens when find_power() or find_sample_size() is called.
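The k-1 dummy coding that bracket notation refers to can be made concrete with a short NumPy sketch (independent of MCPower):

```python
import numpy as np

# How a 3-level factor becomes k-1 = 2 dummy columns with level 1 as the
# reference. group[2] addresses the first dummy, group[3] the second.
levels = np.array([1, 2, 3, 1, 3, 2])  # simulated factor draws
dummies = np.column_stack([(levels == k).astype(float) for k in (2, 3)])
# Rows at the reference level (1) are all zeros.
```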
See Also¶
Variable Types – Full guide to variable types and distributions
set_effects() – Setting effect sizes for typed variables
set_factor_levels() – Define named factor levels
upload_data() – Automatic type detection from empirical data
set_correlations()¶
- MCPower.set_correlations(correlations_input)[source]¶
Set correlations between predictor variables.
Correlations are only defined for non-factor (continuous/binary) predictors. Factor dummies are generated independently.
This setting is deferred until apply() is called.
- Parameters:
correlations_input – Either a comma-separated string of pair-wise assignments (e.g. "x1:x2=0.3, x1:x3=-0.1") or a full NumPy correlation matrix whose dimensions match the number of non-factor predictors.
- Returns:
For method chaining.
- Return type:
self
- Raises:
TypeError – If correlations_input is not a string or ndarray.
ValueError – If the matrix is not positive semi-definite or has wrong dimensions (checked at apply time).
Input Formats¶
String format – full syntax:
model.set_correlations("corr(x1, x2)=0.3, corr(x1, x3)=-0.2")
String format – shorthand (the corr() wrapper is optional):
model.set_correlations("(x1, x2)=0.3, (x1, x3)=-0.2")
NumPy matrix – dimensions must match the number of non-factor predictors, in formula order:
import numpy as np
# For a model with predictors x1, x2, x3 (all continuous)
model.set_correlations(np.array([
[1.0, 0.3, -0.2],
[0.3, 1.0, 0.1],
[-0.2, 0.1, 1.0],
]))
Correlation Values¶
Valid range: -1 to 1 (exclusive of exact -1 and 1 for off-diagonal entries)
Diagonal entries must be 1.0 (for matrix input)
The matrix must be symmetric
The matrix must be positive semi-definite (PSD) – MCPower validates this and raises an error if the matrix is not PSD
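These conditions can be pre-checked before calling set_correlations(). A minimal sketch of the same checks using NumPy eigenvalues (is_valid_correlation_matrix is a hypothetical helper, not part of MCPower):

```python
import numpy as np

# A valid correlation matrix is square, symmetric, has a unit diagonal,
# and is positive semi-definite (no eigenvalue below -tolerance).
def is_valid_correlation_matrix(m, tol=1e-10):
    m = np.asarray(m, dtype=float)
    return (
        m.ndim == 2
        and m.shape[0] == m.shape[1]
        and np.allclose(m, m.T)
        and np.allclose(np.diag(m), 1.0)
        and np.linalg.eigvalsh(m).min() >= -tol
    )

ok = is_valid_correlation_matrix([[1.0, 0.3], [0.3, 1.0]])
# Pairwise values can each be in range while the matrix as a whole
# is not PSD:
bad = is_valid_correlation_matrix(
    [[1.0, 0.9, -0.9], [0.9, 1.0, 0.9], [-0.9, 0.9, 1.0]]
)
```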
Examples¶
from mcpower import MCPower
model = MCPower("y = x1 + x2 + x3")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3, x3=0.2")
model.set_correlations("(x1, x2)=0.4, (x2, x3)=0.2")
model.find_power(sample_size=100)
Notes¶
Factor variables cannot be correlated. Correlations are defined only between continuous and binary predictors.
Unspecified pairs default to zero correlation (independence).
When using upload_data() with preserve_correlation="partial", correlations are computed from the data and merged with any user-specified values. With preserve_correlation="strict" (the default), the full row-bootstrap approach preserves the empirical correlation structure automatically.
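An empirical correlation matrix can also be derived manually from pilot data and passed in as the matrix input. A sketch using numpy.corrcoef, with illustrative variable names:

```python
import numpy as np

# Deriving a correlation matrix from pilot data. Column order must match
# the order of the non-factor predictors in the formula.
rng = np.random.default_rng(1)
x1 = rng.standard_normal(500)
x2 = 0.5 * x1 + rng.standard_normal(500)  # correlated with x1
x3 = rng.standard_normal(500)             # independent of both

corr = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
# corr is symmetric with a unit diagonal; corr[0, 1] is Corr(x1, x2).
# model.set_correlations(corr)  # would feed it to MCPower
```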
See Also¶
Correlations – Full guide to predictor correlations
upload_data() – Automatic correlation preservation from empirical data
Tutorial: Your First Power Analysis – Adding correlations to a model
set_alpha()¶
- MCPower.set_alpha(alpha)[source]¶
Set the significance level for hypothesis testing.
- Parameters:
alpha (float) – Type-I error rate (0–0.25). Default is 0.05.
- Returns:
For method chaining.
- Return type:
self
- Raises:
ValueError – If alpha is outside the valid range.
Common Alpha Levels¶
| Alpha | Use Case |
|---|---|
| 0.05 | Standard threshold (default) |
| 0.01 | Stricter threshold, common in some fields |
| 0.005 | Proposed “redefine statistical significance” threshold |
| 0.10 | Exploratory research, pilot studies |
Examples¶
from mcpower import MCPower
# Use stricter significance threshold
model = MCPower("y = x1 + x2")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3")
model.set_alpha(0.01)
model.find_power(sample_size=100)
# Chained
model = (
MCPower("y = x1 + x2")
.set_effects("x1=0.5, x2=0.3")
.set_alpha(0.01)
)
set_power()¶
- MCPower.set_power(power)[source]¶
Set the target statistical power level.
Used by find_sample_size() to determine when power is sufficient.
- Parameters:
power (float) – Target power as a percentage (0–100). Default is 80.
- Returns:
For method chaining.
- Return type:
self
- Raises:
ValueError – If power is outside the valid range.
Common Power Targets¶
| Power | Use Case |
|---|---|
| 80% | Standard target (default). Accepted in most fields. |
| 90% | Higher confidence. Common for clinical trials and well-funded studies. |
| 95% | Very conservative. Requires substantially larger samples. |
Examples¶
from mcpower import MCPower
# Require 90% power instead of the default 80%
model = MCPower("y = x1 + x2")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3")
model.set_power(90)
model.find_sample_size(from_size=50, to_size=300, by=30)
set_seed()¶
- MCPower.set_seed(seed=None)[source]¶
Set random seed for reproducibility.
- Parameters:
seed (int | None) – Non-negative integer up to 3,000,000,000. Pass None to enable fully random seeding.
- Returns:
For method chaining.
- Return type:
self
- Raises:
TypeError – If seed is not an integer or None.
ValueError – If seed is negative or exceeds the maximum.
Examples¶
from mcpower import MCPower
model = MCPower("y = x1 + x2")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3")
# Reproducible results
model.set_seed(42)
model.find_power(sample_size=100) # Always produces the same output
# Random seeding (different results each run)
model.set_seed(None)
model.find_power(sample_size=100)
Notes¶
The C++ backend uses std::mt19937, while Python uses numpy.random. The same seed produces different random sequences across backends, but statistical properties (power estimates) are comparable.
The default seed is 2137. Change it if you want a different reproducible sequence.
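On the Python side, the reproducibility mechanism is ordinary NumPy seeding. A minimal illustration (not MCPower's internal code):

```python
import numpy as np

# The same seed yields an identical stream of draws, which is what makes
# repeated power analyses reproducible.
a = np.random.default_rng(42).standard_normal(5)
b = np.random.default_rng(42).standard_normal(5)
# a and b are element-for-element identical
```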
set_simulations()¶
- MCPower.set_simulations(n_simulations, model_type=None)[source]¶
Set the number of Monte Carlo simulations.
More simulations yield more precise power estimates at the cost of longer runtime. The default is 1600 for OLS and 800 for mixed models.
- Parameters:
n_simulations (int) – Number of Monte Carlo simulations per power estimate.
model_type (str | None) – "linear" to set the OLS count, "mixed" to set the mixed-model count, or None (default) to set both.
- Returns:
For method chaining.
- Return type:
self
- Raises:
ValueError – If n_simulations is not a positive integer or model_type is unrecognised.
Default Simulation Counts¶
| Model Type | Default Count |
|---|---|
| OLS (linear regression) | 1,600 |
| Mixed-effects models | 800 |
Mixed-effects models use fewer simulations by default because each simulation is more computationally expensive (LME fitting vs. OLS).
Precision vs. Runtime¶
| Simulations | Approximate SE of Power Estimate | Use Case |
|---|---|---|
| 400 | ~2.5% | Quick exploration |
| 800 | ~1.8% | Mixed-model default |
| 1,600 | ~1.2% | OLS default; good for most analyses |
| 5,000 | ~0.7% | High-precision estimates |
| 10,000 | ~0.5% | Publication-quality precision |
The standard error of a power estimate is approximately sqrt(p * (1 - p) / n_sims), where p is the true power; the SE column above shows the worst case, p = 0.5. At 80% true power the SE is slightly smaller, sqrt(0.8 * 0.2 / n_sims).
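A worked check of the formula in plain Python (power_se is a hypothetical helper, not an MCPower function):

```python
import math

# Monte Carlo standard error of a power estimate: sqrt(p * (1 - p) / n_sims).
def power_se(p, n_sims):
    return math.sqrt(p * (1.0 - p) / n_sims)

se_worst = power_se(0.5, 1600)  # 0.0125, i.e. the tabled ~1.2%
se_at_80 = power_se(0.8, 1600)  # 0.0100 at 80% true power
```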
Examples¶
from mcpower import MCPower
# Set both OLS and mixed to the same count
model = MCPower("y = x1 + x2")
model.set_simulations(3200)
model.set_effects("x1=0.5, x2=0.3")
model.find_power(sample_size=100)
# Set OLS and mixed independently
model = MCPower("satisfaction ~ treatment + (1|school)")
model.set_simulations(2000, model_type="linear")
model.set_simulations(1000, model_type="mixed")
# Method chaining
model = (
MCPower("y = x1 + x2")
.set_simulations(3200)
.set_effects("x1=0.5, x2=0.3")
)
set_parallel()¶
- MCPower.set_parallel(enable=True, n_cores=None)[source]¶
Enable or disable parallel processing.
Requires joblib to be installed. Falls back to sequential processing with a warning if joblib is unavailable.
Parallelization Modes¶
| Value | Behavior |
|---|---|
| True | Parallel processing for all analyses (OLS and mixed). |
| False | Sequential processing only. |
| (not called) | Parallel only for mixed-model analyses; OLS stays sequential. (Default) |
When to Use Each Mode¶
| Scenario | Recommended Mode | Reasoning |
|---|---|---|
| OLS with default 1,600 sims | Default (no call) | C++ backend is fast enough; parallel overhead not worthwhile. |
| OLS with 5,000+ sims | set_parallel(True) | High simulation count justifies parallelization overhead. |
| Mixed models | Default (no call) | LME fitting is expensive; parallel processing helps substantially. |
| Debugging / profiling | set_parallel(False) | Sequential execution is easier to reason about and profile. |
| Resource-constrained environment | set_parallel(False), or a low n_cores | Avoid saturating shared machines. |
Examples¶
from mcpower import MCPower
# Default: parallel for mixed models only
model = MCPower("y ~ treatment + (1|school)")
model.set_simulations(400)
model.set_cluster("school", ICC=0.2, n_clusters=20)
model.set_effects("treatment=0.5")
model.find_power(sample_size=1000) # Runs in parallel automatically
# Force parallel for OLS (useful with very high simulation counts)
model = MCPower("y = x1 + x2 + x3 + x4 + x5")
model.set_effects("x1=0.5, x2=0.3, x3=0.2, x4=0.1, x5=0.4")
model.set_simulations(10000)
model.set_parallel(True, n_cores=4)
model.find_power(sample_size=200)
# Disable parallelization entirely
model.set_parallel(False)
Notes on n_cores¶
When n_cores is None, MCPower uses half the available CPU cores (cpu_count // 2).
Setting n_cores=1 is equivalent to enable=False.
Using more cores than physically available provides no benefit and may hurt performance due to context-switching overhead.
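The default worker count follows directly from the documented rule. A sketch (treating one worker as the floor is an assumption for single-core machines):

```python
import os

# Default worker count when n_cores is None: half the available CPU cores.
available = os.cpu_count() or 2          # cpu_count() can return None
default_workers = max(1, available // 2) # assumed floor of one worker
```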
See Also¶
Performance & Backends – Runtime considerations
Mixed-Effects Models – Complete guide to mixed-effects support