Tutorial: Using Custom Scenarios

Goal: You want to customize the scenario analysis to match the specific assumption violations you expect in your field.


Table of Contents


Full Working Example

You are planning a study in educational psychology where heterogeneity in effect sizes is a major concern but distribution violations are unlikely (you have good measurement instruments). The default scenarios apply too much distribution noise and too little effect heterogeneity for your situation.

from mcpower import MCPower

# 1. Define the model
model = MCPower("achievement = teaching_method + motivation + ses")
model.set_simulations(400)
model.set_variable_type("teaching_method=binary")
model.set_effects("teaching_method=0.40, motivation=0.30, ses=0.20")

# 2. Customize scenarios for your field
model.set_scenario_configs({
    "realistic": {
        "heterogeneity": 0.35,            # More effect heterogeneity
        "distribution_change_prob": 0.10,  # Less distribution noise
    },
    "doomer": {
        "heterogeneity": 0.60,            # Much more effect heterogeneity
        "distribution_change_prob": 0.20,  # Still mild distribution noise
        "correlation_noise_sd": 0.3,       # Moderate correlation instability
    },
})

# 3. Find required sample sizes under custom scenarios
model.find_sample_size(
    target_test="teaching_method",
    from_size=50, to_size=400, by=40,
    scenarios=True,
)
Custom scenario configs set: realistic, doomer
Variable types: teaching_method=(binary,0.5)
Effects: teaching_method=0.4, motivation=0.3, ses=0.2
Model settings applied successfully

================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================

Uncorrected Sample Sizes:
Test                                     Optimistic   Realistic    Doomer      
-------------------------------------------------------------------------------
teaching_method                          250          210          250         
================================================================================

Step-by-Step Walkthrough

Lines 1-3: Model setup

model = MCPower("achievement = teaching_method + motivation + ses")
model.set_simulations(400)
model.set_variable_type("teaching_method=binary")
model.set_effects("teaching_method=0.40, motivation=0.30, ses=0.20")

A standard three-predictor model. Nothing changes about model setup when using custom scenarios.

Line 4: Customize scenario parameters

model.set_scenario_configs({
    "realistic": {
        "heterogeneity": 0.35,
        "distribution_change_prob": 0.10,
    },
    "doomer": {
        "heterogeneity": 0.60,
        "distribution_change_prob": 0.20,
        "correlation_noise_sd": 0.3,
    },
})

This is the key step. set_scenario_configs() takes a dictionary mapping scenario names to configuration overrides. The merge rules are:

  1. You only specify what you want to change. Any key you omit keeps its default value.

  2. Existing scenarios are updated. Here, "realistic" and "doomer" already exist, so only the specified keys are overridden.

  3. The optimistic scenario is never modified. It always uses your exact settings with zero perturbations.

For example, the realistic scenario above changes heterogeneity from 0.2 (default) to 0.35 and distribution_change_prob from 0.3 to 0.10, but keeps heteroskedasticity at 0.1, correlation_noise_sd at 0.2, and all other keys at their default values.

Line 5: Run with scenarios

model.find_sample_size(
    target_test="teaching_method",
    from_size=50, to_size=400, by=40,
    scenarios=True,
)
Variable types: teaching_method=(binary,0.5)
Effects: teaching_method=0.4, motivation=0.3, ses=0.2
Model settings applied successfully

================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================

Uncorrected Sample Sizes:
Test                                     Optimistic   Realistic    Doomer      
-------------------------------------------------------------------------------
teaching_method                          250          210          250         
================================================================================

The scenarios=True flag runs the analysis under all configured scenarios (optimistic, realistic, doomer, plus any custom scenarios you’ve added). This works the same way with find_power(). You can also pass a list of scenario names to run only specific ones — see Running Selected Scenarios.


Output Interpretation

Test                                     Optimistic   Realistic    Doomer
-------------------------------------------------------------------------------
teaching_method                          80           100          160
  • Optimistic (N=80): Under ideal conditions, 80 participants suffice.

  • Realistic (N=100): With your field-specific heterogeneity, you need 100 participants.

  • Doomer (N=160): Under worst-case conditions, you need 160 participants.

Recommendation: Plan for N=100 (the realistic estimate). If the study is high-stakes, plan for N=160 to be safe under worst-case conditions.


Adding Entirely New Scenarios

You are not limited to the three built-in scenarios. Add custom scenarios with any name:

model.set_scenario_configs({
    "field_specific": {
        "heterogeneity": 0.30,
        "heteroskedasticity": 0.05,
        "correlation_noise_sd": 0.15,
        "distribution_change_prob": 0.10,
    },
})

model.find_power(sample_size=150, target_test="treatment", scenarios=True)
# Runs: optimistic, realistic, doomer, AND field_specific
Custom scenario configs set: field_specific
Variable types: treatment=(binary,0.5)
Effects: treatment=0.4, covariate=0.25
Model settings applied successfully

================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================

Uncorrected Power:
Test                                     Optimistic   Realistic    Doomer       Field_Specific
----------------------------------------------------------------------------------------------
treatment                                66.8         65.2         64.5         64.2        
================================================================================

New scenarios inherit all keys from the optimistic baseline (all zeros), so you only specify the perturbations you want. The analysis now runs four scenarios instead of three.

Multiple custom scenarios

model.set_scenario_configs({
    "mild": {
        "heterogeneity": 0.10,
        "heteroskedasticity": 0.05,
    },
    "moderate": {
        "heterogeneity": 0.25,
        "heteroskedasticity": 0.15,
        "correlation_noise_sd": 0.20,
    },
    "severe": {
        "heterogeneity": 0.50,
        "heteroskedasticity": 0.30,
        "correlation_noise_sd": 0.40,
        "distribution_change_prob": 0.50,
    },
})

model.find_power(sample_size=200, scenarios=True)
# Runs: optimistic, realistic, doomer, mild, moderate, AND severe
Custom scenario configs set: mild, moderate, severe

================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================

Uncorrected Power:
Test                                     Optimistic   Realistic    Doomer       Mild         Moderate     Severe      
----------------------------------------------------------------------------------------------------------------------
overall                                  99.0         94.2         88.2         98.8         98.2         94.2        
treatment                                81.8         81.0         78.5         82.0         81.5         80.5        
covariate                                93.0         77.8         64.2         93.2         91.8         75.0        
================================================================================

Note that the built-in realistic and doomer scenarios still run alongside your custom ones.

Running Selected Scenarios

You don’t have to run all scenarios every time. Pass a list of scenario names (case-insensitive) to run only the ones you need:

# Run only optimistic and your custom "severe" scenario
model.find_power(sample_size=200, scenarios=["optimistic", "severe"])
================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================

Uncorrected Power:
Test                                     Optimistic   Severe      
------------------------------------------------------------------
overall                                  99.0         94.2        
treatment                                81.8         80.5        
covariate                                93.0         75.0        
================================================================================

This saves computation when you only want to compare specific scenarios. Unknown names raise a ValueError listing all available scenarios.


Custom Scenarios for Mixed Models

Mixed models have additional perturbation keys that control random effect and ICC violations. These keys are ignored for standard OLS models.

model = MCPower("score ~ treatment + (1|school)")
model.set_simulations(400)
model.set_cluster("school", ICC=0.15, n_clusters=25)
model.set_effects("treatment=0.40")

model.set_scenario_configs({
    "realistic": {
        "heterogeneity": 0.25,
        "icc_noise_sd": 0.20,               # More ICC instability
        "random_effect_dist": "normal",      # Keep random effects normal
    },
    "doomer": {
        "heterogeneity": 0.45,
        "icc_noise_sd": 0.40,               # Heavy ICC instability
        "random_effect_dist": "heavy_tailed",
        "random_effect_df": 3,               # Very heavy tails
        "residual_change_prob": 0.6,
        "residual_df": 4,
    },
})
model.find_power(sample_size=750, scenarios=True)

Mixed model configuration keys

Key

What It Does

Default (Realistic)

Default (Doomer)

icc_noise_sd

Jitters the ICC per simulation (multiplicative noise on tau-squared)

0.15

0.30

random_effect_dist

Distribution of random effects: "normal", "heavy_tailed", or "skewed"

"heavy_tailed"

"heavy_tailed"

random_effect_df

Degrees of freedom when random effects are non-normal

5

3

Lower random_effect_df means heavier tails (more extreme cluster deviations). These keys are silently ignored for models without cluster specifications.


Common Patterns

When effect sizes are well-established

If your field has strong meta-analytic estimates and the main concern is non-normality:

model.set_scenario_configs({
    "realistic": {
        "heterogeneity": 0.05,              # Effects are well-known
        "residual_change_prob": 0.5,         # But distributions may be non-normal
        "residual_df": 8,
    },
    "doomer": {
        "heterogeneity": 0.10,
        "residual_change_prob": 0.8,
        "residual_df": 4,
    },
})

When correlations are uncertain

If predictor correlations come from a small pilot study:

model.set_scenario_configs({
    "realistic": {
        "correlation_noise_sd": 0.35,        # More noise than default (0.2)
    },
    "doomer": {
        "correlation_noise_sd": 0.50,        # Much more noise than default (0.4)
    },
})

Disabling a specific perturbation

Set a key to its optimistic value (usually 0.0 or "normal") to disable it:

model.set_scenario_configs({
    "realistic": {
        "distribution_change_prob": 0.0,     # No distribution swaps
        "heteroskedasticity": 0.0,           # No heteroskedasticity
    },
})

Combine custom scenarios with uploaded data

Custom scenarios work alongside upload_data(). The scenario perturbations (correlation noise, distribution swaps) are applied on top of the data-driven generation:

import pandas as pd

data = pd.read_csv("pilot_data.csv")

model = MCPower("outcome = treatment + age + severity")
model.upload_data(data[["age", "severity"]])
model.set_variable_type("treatment=binary")
model.set_effects("treatment=0.50, age=0.20, severity=0.30")

model.set_scenario_configs({
    "realistic": {"heterogeneity": 0.30},
})

model.find_sample_size(
    target_test="treatment",
    from_size=50, to_size=300,
    scenarios=True,
)

Configuration Reference

All available configuration keys with their defaults:

Key

Type

Optimistic

Realistic

Doomer

Description

heterogeneity

float

0.0

0.2

0.4

SD of per-simulation effect size multiplier

heteroskedasticity

float

0.0

0.15

0.35

Correlation between predicted values and error SD

correlation_noise_sd

float

0.0

0.15

0.30

SD of noise added to predictor correlations

distribution_change_prob

float

0.0

0.5

0.8

Probability a variable’s distribution is swapped

new_distributions

list

["right_skewed", "left_skewed", "uniform"]

same

same

Pool of replacement distributions

residual_change_prob

float

0.0

0.5

0.8

Probability residuals use non-normal distribution

residual_df

int

10

8

5

Degrees of freedom for non-normal residuals

residual_dists

list

["heavy_tailed", "skewed"]

same

same

Pool of non-normal residual distributions

icc_noise_sd

float

0.0

0.15

0.30

SD of multiplicative noise on ICC (mixed models only)

random_effect_dist

str

"normal"

"heavy_tailed"

"heavy_tailed"

Random effect distribution (mixed models only)

random_effect_df

int

5

10

3

DF for non-normal random effects (mixed models only)


Next Steps