Configuration

Methods for configuring effect sizes, variable types, correlations, and simulation parameters.


set_effects()

MCPower.set_effects(effects_string)[source]

Set standardised effect sizes for predictors.

Effect sizes are expressed as standardised regression coefficients (beta weights). Each assignment maps an effect name to its size. Interaction effects use : notation. For factor variables, specify effects for each dummy level with bracket notation.

This setting is deferred until apply() is called.

Parameters:

effects_string (str) – Comma-separated name=value pairs. Examples: "x1=0.5, x2=0.3, x1:x2=0.2", "treatment=0.4, cyl[2]=0.2, cyl[3]=0.5".

Returns:

For method chaining.

Return type:

self

Raises:
  • TypeError – If effects_string is not a string.

  • ValueError – If effects_string is empty or contains invalid assignments (checked at apply time).

String Format

Comma-separated name=value pairs:

model.set_effects("x1=0.5, x2=0.3")

Interaction effects – use : notation:

model.set_effects("x1=0.5, x2=0.3, x1:x2=0.2")

Factor variables – assign different effects per level using bracket notation:

# Integer-indexed levels (no uploaded data or named levels)
model.set_effects("group[2]=0.4, group[3]=0.6")

# Named levels (after set_factor_levels or upload_data)
model.set_effects("group[drug_a]=0.4, group[drug_b]=0.6")

Updating Effects

After running an analysis, a new set_effects() updates (merges with) the previously applied effects:

model.set_effects("x1=0.5, x2=0.3")
model.find_power(sample_size=100)
model.set_effects("x2=0.4")  # x1 remains 0.5, x2 is now 0.4
model.find_power(sample_size=100)  # uses x1=0.5, x2=0.4

Examples

from mcpower import MCPower

model = MCPower("y = treatment + motivation + treatment:motivation")
model.set_simulations(400)
model.set_variable_type("treatment=binary")
model.set_effects("treatment=0.5, motivation=0.3, treatment:motivation=0.2")
model.find_power(sample_size=100)

Notes

  • Effect sizes are standardized – they represent the change in outcome (in SDs) per 1 SD change in the predictor.

  • For binary predictors, the effect size represents the difference between the two groups in standard deviation units (equivalent to Cohen’s d).

  • For factor variables, each dummy’s effect size represents the difference between that level and the reference level.

  • A common guideline: 0.2 = small, 0.5 = medium, 0.8 = large (Cohen’s conventions).

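The binary-predictor interpretation above can be verified with a small numpy-only simulation (a sketch of the statistical claim, not MCPower itself): generating an outcome with a standardized effect of 0.5 on a 0/1 predictor yields a group difference of about 0.5 SD.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.binomial(1, 0.5, n)          # 0/1 treatment indicator
d = 0.5                              # standardized effect (Cohen's d)
y = d * x + rng.normal(0.0, 1.0, n)  # within-group SD is 1

# Group difference in standard deviation units
diff = y[x == 1].mean() - y[x == 0].mean()
print(round(diff, 1))                # close to 0.5
```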



set_variable_type()

MCPower.set_variable_type(variable_types_string)[source]

Set distribution types for predictor variables.

Variables default to "normal" (standard Gaussian). Use this method to specify alternative distributions.

This setting is deferred until apply() is called.

Parameters:

variable_types_string (str) –

Comma-separated name=type assignments. Supported types:

  • "normal" — standard normal (default).

  • "binary" or "(binary, p)" — Bernoulli with proportion p (default 0.5).

  • "right_skewed" — positively skewed distribution.

  • "left_skewed" — negatively skewed distribution.

  • "high_kurtosis" — heavy-tailed (t-distribution, df=3).

  • "uniform" — uniform distribution.

  • "(factor, k)" — categorical with k levels (creates k-1 dummy variables).

  • "(factor, p1, p2, ..., pk)" — factor with custom level proportions.

Example: "x1=binary, x2=right_skewed, x3=(factor,3)".

Returns:

For method chaining.

Return type:

self

Raises:
  • TypeError – If variable_types_string is not a string.

  • ValueError – If types are unrecognised or proportions invalid (checked at apply time).

Supported Types

Type String                  Description                                Generated Distribution
normal                       Standard normal (default)                  N(0, 1)
binary                       Binary variable with 50/50 split           Bernoulli(0.5)
(binary, p)                  Binary with custom proportion              Bernoulli(p), where 0 < p < 1
(factor, k)                  Factor with k levels, equal proportions    k-1 dummy variables
(factor, p1, p2, ..., pk)    Factor with custom level proportions       k-1 dummies; proportions are normalized to sum to 1
right_skewed                 Right-skewed (heavy right tail)            Chi-squared-like transform
left_skewed                  Left-skewed (heavy left tail)              Mirrored right-skew
high_kurtosis                Heavy-tailed (leptokurtic)                 t-distribution (df=3)
uniform                      Uniform distribution                       U(0, 1) transformed

Examples

from mcpower import MCPower

# Basic type declarations
model = MCPower("y = treatment + condition + income")
model.set_simulations(400)
model.set_variable_type("treatment=binary, condition=(factor,3), income=right_skewed")
model.set_effects("treatment=0.5, condition[2]=0.3, condition[3]=0.4, income=0.2")
model.find_power(sample_size=150)

Binary with custom proportion:

model.set_variable_type("treatment=(binary,0.3)")  # 30% in treatment group

Factor with equal proportions:

model.set_variable_type("condition=(factor,3)")  # 3 levels, ~33% each

Factor with custom proportions:

# 3 levels with proportions 20%, 50%, 30% (proportions are normalized to sum to 1)
model.set_variable_type("group=(factor,0.2,0.5,0.3)")

Updating types – calling again updates existing entries without clearing others:

model.set_variable_type("x1=binary, x2=right_skewed")
model.set_variable_type("x2=normal")  # x1 remains binary, x2 is now normal

Notes

  • Factor variables create k-1 dummy variables (level 1 is the reference by default). After declaring a factor, use bracket notation in set_effects() to assign effects to each dummy.

  • When upload_data() is used, variable types are auto-detected and typically do not need to be set manually. Use set_variable_type() to override auto-detection.

  • Validation of types and proportions happens when find_power() or find_sample_size() is called.
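The k-1 dummy expansion described in the first note can be sketched with plain numpy (an illustration of the general coding scheme, not MCPower internals):

```python
import numpy as np

rng = np.random.default_rng(1)
levels = rng.choice([1, 2, 3], size=12)        # a factor(3) variable

# Level 1 is the reference: one 0/1 indicator per non-reference level
dummies = np.column_stack([(levels == k).astype(int) for k in (2, 3)])

print(dummies.shape)       # (12, 2): k - 1 = 2 dummy columns
```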



set_correlations()

MCPower.set_correlations(correlations_input)[source]

Set correlations between predictor variables.

Correlations are only defined for non-factor (continuous/binary) predictors. Factor dummies are generated independently.

This setting is deferred until apply() is called.

Parameters:

correlations_input – Either a comma-separated string of pair-wise assignments (e.g. "x1:x2=0.3, x1:x3=-0.1") or a full NumPy correlation matrix whose dimensions match the number of non-factor predictors.

Returns:

For method chaining.

Return type:

self

Raises:
  • TypeError – If correlations_input is not a string or ndarray.

  • ValueError – If the matrix is not positive semi-definite or has wrong dimensions (checked at apply time).

Input Formats

String format – full syntax:

model.set_correlations("corr(x1, x2)=0.3, corr(x1, x3)=-0.2")

String format – shorthand (the corr() wrapper is optional):

model.set_correlations("(x1, x2)=0.3, (x1, x3)=-0.2")

NumPy matrix – dimensions must match the number of non-factor predictors, in formula order:

import numpy as np

# For a model with predictors x1, x2, x3 (all continuous)
model.set_correlations(np.array([
    [1.0, 0.3, -0.2],
    [0.3, 1.0,  0.1],
    [-0.2, 0.1, 1.0],
]))

Correlation Values

  • Valid range: -1 to 1 (exclusive of exact -1 and 1 for off-diagonal entries)

  • Diagonal entries must be 1.0 (for matrix input)

  • The matrix must be symmetric

  • The matrix must be positive semi-definite (PSD) – MCPower validates this and raises an error if the matrix is not PSD
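A correlation matrix can be pre-checked with numpy before handing it to MCPower (a standalone sketch, not the library's internal validator): symmetric, unit diagonal, and no negative eigenvalues.

```python
import numpy as np

def is_valid_corr(m, tol=1e-10):
    """Symmetric, unit diagonal, and positive semi-definite."""
    return bool(
        np.allclose(m, m.T)
        and np.allclose(np.diag(m), 1.0)
        and np.linalg.eigvalsh(m).min() >= -tol
    )

ok = np.array([[1.0, 0.3], [0.3, 1.0]])

# Pairwise values can be individually valid but jointly impossible:
bad = np.array([[ 1.0, 0.9, -0.9],
                [ 0.9, 1.0,  0.9],
                [-0.9, 0.9,  1.0]])

print(is_valid_corr(ok), is_valid_corr(bad))   # True False
```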

Examples

from mcpower import MCPower

model = MCPower("y = x1 + x2 + x3")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3, x3=0.2")
model.set_correlations("(x1, x2)=0.4, (x2, x3)=0.2")
model.find_power(sample_size=100)

Notes

  • Factor variables cannot be correlated. Correlations are defined only between continuous and binary predictors.

  • Unspecified pairs default to zero correlation (independence).

  • When using upload_data() with preserve_correlation="partial", correlations are computed from the data and merged with any user-specified values. With preserve_correlation="strict" (the default), the full row-bootstrap approach preserves the empirical correlation structure automatically.
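For intuition, here is how correlated predictors are commonly generated from a correlation matrix (the standard Cholesky approach; MCPower's internals may differ):

```python
import numpy as np

rng = np.random.default_rng(2)
corr = np.array([[1.0, 0.4, 0.0],
                 [0.4, 1.0, 0.2],
                 [0.0, 0.2, 1.0]])

z = rng.standard_normal((100_000, 3))    # independent N(0, 1) predictors
x = z @ np.linalg.cholesky(corr).T       # impose the correlation structure

observed = np.corrcoef(x, rowvar=False)
print(round(observed[0, 1], 2))          # close to 0.4
```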



set_alpha()

MCPower.set_alpha(alpha)[source]

Set the significance level for hypothesis testing.

Parameters:

alpha (float) – Type-I error rate (0–0.25). Default is 0.05.

Returns:

For method chaining.

Return type:

self

Raises:

ValueError – If alpha is outside the valid range.

Common Alpha Levels

Alpha    Use Case
0.05     Standard threshold (default)
0.01     Stricter threshold, common in some fields
0.005    Proposed “redefine statistical significance” threshold
0.10     Exploratory research, pilot studies

Examples

from mcpower import MCPower

# Use stricter significance threshold
model = MCPower("y = x1 + x2")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3")
model.set_alpha(0.01)
model.find_power(sample_size=100)
# Chained
model = (
    MCPower("y = x1 + x2")
    .set_effects("x1=0.5, x2=0.3")
    .set_alpha(0.01)
)

set_power()

MCPower.set_power(power)[source]

Set the target statistical power level.

Used by find_sample_size to determine when power is sufficient.

Parameters:

power (float) – Target power as a percentage (0–100). Default is 80.

Returns:

For method chaining.

Return type:

self

Raises:

ValueError – If power is outside the valid range.

Common Power Targets

Power    Use Case
80%      Standard target (default). Accepted in most fields.
90%      Higher confidence. Common for clinical trials and well-funded studies.
95%      Very conservative. Requires substantially larger samples.

Examples

from mcpower import MCPower

# Require 90% power instead of the default 80%
model = MCPower("y = x1 + x2")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3")
model.set_power(90)
model.find_sample_size(from_size=50, to_size=300, by=30)

set_seed()

MCPower.set_seed(seed=None)[source]

Set random seed for reproducibility.

Parameters:

seed (int | None) – Non-negative integer up to 3,000,000,000. Pass None to enable fully random seeding.

Returns:

For method chaining.

Return type:

self

Raises:
  • TypeError – If seed is not an integer or None.

  • ValueError – If seed is negative or exceeds the maximum.

Examples

from mcpower import MCPower

model = MCPower("y = x1 + x2")
model.set_simulations(400)
model.set_effects("x1=0.5, x2=0.3")

# Reproducible results
model.set_seed(42)
model.find_power(sample_size=100)  # Always produces the same output
# Random seeding (different results each run)
model.set_seed(None)
model.find_power(sample_size=100)

Notes

  • The C++ backend uses std::mt19937, while Python uses numpy.random. The same seed produces different random sequences across backends, but statistical properties (power estimates) are comparable.

  • The default seed is 2137. Change it if you want a different reproducible sequence.
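The seeding behaviour follows the usual RNG contract; a numpy-only illustration of what "same seed, same results" means (not MCPower's backend code):

```python
import numpy as np

a = np.random.default_rng(42).standard_normal(5)
b = np.random.default_rng(42).standard_normal(5)   # same seed -> same draws
c = np.random.default_rng(7).standard_normal(5)    # different seed

print(np.array_equal(a, b), np.array_equal(a, c))  # True False
```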


set_simulations()

MCPower.set_simulations(n_simulations, model_type=None)[source]

Set the number of Monte Carlo simulations.

More simulations yield more precise power estimates at the cost of longer runtime. The default is 1600 for OLS and 800 for mixed models.

Parameters:
  • n_simulations (int) – Number of simulations (positive integer).

  • model_type (str | None) – Which simulation count to update: None (default) sets both the OLS and mixed-model counts; "linear" sets only the OLS count; "mixed" sets only the mixed-model count.

Returns:

For method chaining.

Return type:

self

Raises:

ValueError – If n_simulations is not a positive integer or model_type is unrecognised.

Default Simulation Counts

Model Type                 Default Count
OLS (linear regression)    1,600
Mixed-effects models       800

Mixed-effects models use fewer simulations by default because each simulation is more computationally expensive (LME fitting vs. OLS).

Precision vs. Runtime

Simulations    Approx. SE of Power Estimate    Use Case
400            ~2.5%                           Quick exploration
800            ~1.8%                           Mixed-model default
1,600          ~1.2%                           OLS default; good for most analyses
5,000          ~0.7%                           High-precision estimates
10,000         ~0.5%                           Publication-quality precision

The standard error of a simulated power estimate is sqrt(p * (1 - p) / n_sims), where p is the true power. The table uses the worst case, p = 0.5, giving SE = sqrt(0.25 / n_sims); at 80% power the SE is sqrt(0.8 * 0.2 / n_sims), about 20% smaller.
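These SEs come from the binomial formula; a quick check reproduces the table:

```python
import math

def power_se(n_sims, p=0.5):
    """SE of a simulated power estimate: sqrt(p * (1 - p) / n_sims)."""
    return math.sqrt(p * (1 - p) / n_sims)

for n in (400, 800, 1600, 5000, 10000):
    print(n, f"{100 * power_se(n):.1f}%")     # matches the table rows

print(f"{100 * power_se(1600, p=0.8):.1f}%")  # at 80% power: 1.0%
```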

Examples

from mcpower import MCPower

# Set both OLS and mixed to the same count
model = MCPower("y = x1 + x2")
model.set_simulations(3200)
model.set_effects("x1=0.5, x2=0.3")
model.find_power(sample_size=100)
# Set OLS and mixed independently
model = MCPower("satisfaction ~ treatment + (1|school)")
model.set_simulations(2000, model_type="linear")
model.set_simulations(1000, model_type="mixed")

# Method chaining
model = (
    MCPower("y = x1 + x2")
    .set_simulations(3200)
    .set_effects("x1=0.5, x2=0.3")
)

set_parallel()

MCPower.set_parallel(enable=True, n_cores=None)[source]

Enable or disable parallel processing.

Requires joblib to be installed. Falls back to sequential processing with a warning if joblib is unavailable.

Parameters:
  • enable (bool | str) – Parallel mode: True for all analyses, False for sequential processing, or "mixedmodels" for mixed-model analyses only (the model's initial setting; the parameter itself defaults to True).

  • n_cores (int | None) – Number of CPU cores to use. Defaults to cpu_count // 2.

Returns:

For method chaining.

Return type:

self

Parallelization Modes

Value            Behavior
True             Parallel processing for all analyses (OLS and mixed).
False            Sequential processing only.
"mixedmodels"    Parallel only for mixed-model analyses; OLS stays sequential. (Default)

When to Use Each Mode

Scenario                            Recommended Mode           Reasoning
OLS with default 1,600 sims         "mixedmodels" (default)    C++ backend is fast enough; parallel overhead not worthwhile.
OLS with 5,000+ sims                True                       High simulation count justifies parallelization overhead.
Mixed models                        "mixedmodels" (default)    LME fitting is expensive; parallel processing helps substantially.
Debugging / profiling               False                      Sequential execution is easier to reason about and profile.
Resource-constrained environment    False or low n_cores       Avoid saturating shared machines.

Examples

from mcpower import MCPower

# Default: parallel for mixed models only
model = MCPower("y ~ treatment + (1|school)")
model.set_simulations(400)
model.set_cluster("school", ICC=0.2, n_clusters=20)
model.set_effects("treatment=0.5")
model.find_power(sample_size=1000)  # Runs in parallel automatically
# Force parallel for OLS (useful with very high simulation counts)
model = MCPower("y = x1 + x2 + x3 + x4 + x5")
model.set_effects("x1=0.5, x2=0.3, x3=0.2, x4=0.1, x5=0.4")
model.set_simulations(10000)
model.set_parallel(True, n_cores=4)
model.find_power(sample_size=200)
# Disable parallelization entirely
model.set_parallel(False)

Notes on n_cores

  • When n_cores is None, MCPower uses half the available CPU cores (cpu_count // 2).

  • Setting n_cores=1 is equivalent to enable=False.

  • Using more cores than physically available provides no benefit and may hurt performance due to context-switching overhead.

  • Requires joblib to be installed. If joblib is not available, MCPower falls back to sequential processing with a warning.
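The chunked-parallel pattern is easy to sketch with joblib (a hypothetical illustration of the general approach, not MCPower's implementation; run_chunk and its per-chunk logic are invented here):

```python
import numpy as np

def run_chunk(seed, n_sims):
    """Stand-in for a batch of simulations: count 'significant' results."""
    rng = np.random.default_rng(seed)
    return int((rng.random(n_sims) < 0.8).sum())  # pretend true power is 80%

try:
    from joblib import Parallel, delayed
    hits = Parallel(n_jobs=2)(delayed(run_chunk)(s, 400) for s in range(4))
except ImportError:
    hits = [run_chunk(s, 400) for s in range(4)]  # sequential fallback

power = 100 * sum(hits) / 1600
print(f"estimated power: {power:.1f}%")           # near 80%
```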
