Tutorial: Finding the Right Sample Size¶
Goal: You want to find the minimum sample size needed for adequate power.
Full Working Example¶
You are designing an educational intervention study. A new teaching method is tested against the standard approach, while controlling for students’ prior achievement. You need to determine the minimum number of students to recruit.
from mcpower import MCPower
# 1. Define the model
model = MCPower("test_score = teaching_method + prior_achievement")
model.set_simulations(400)
# 2. Set variable types
model.set_variable_type("teaching_method=binary")
# 3. Set expected effect sizes
model.set_effects("teaching_method=0.40, prior_achievement=0.25")
# 4. Search for the minimum sample size
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=300,
by=28,
)
Variable types: teaching_method=(binary,0.5)
Effects: teaching_method=0.4, prior_achievement=0.25
Model settings applied successfully
================================================================================
SAMPLE SIZE ANALYSIS RESULTS
================================================================================
Sample Size Requirements:
Test Required N
-----------------------------------------------------
teaching_method 190
Step-by-Step Walkthrough¶
Steps 1-3: Model setup¶
model = MCPower("test_score = teaching_method + prior_achievement")
model.set_simulations(400)
model.set_variable_type("teaching_method=binary")
model.set_effects("teaching_method=0.40, prior_achievement=0.25")
This is the same setup as a basic power analysis (see Tutorial: Your First Power Analysis). We define a model with one binary predictor (teaching_method) and one continuous covariate (prior_achievement), then set their expected effect sizes.
teaching_method=0.40 – a small-to-medium effect for a binary predictor (between 0.20 small and 0.50 medium)
prior_achievement=0.25 – a medium effect for a continuous predictor
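These settings imply a simple data-generating model: a binary predictor and a standardized continuous covariate with effects of 0.40 and 0.25. A minimal sketch in plain NumPy (not MCPower code) that generates such data and recovers the effects:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000  # large n so the recovered coefficients are stable

teaching_method = rng.binomial(1, 0.5, n)    # binary predictor, p=0.5
prior_achievement = rng.standard_normal(n)   # standardized continuous covariate
noise = rng.standard_normal(n)

test_score = 0.40 * teaching_method + 0.25 * prior_achievement + noise

# Recover the effects with ordinary least squares
X = np.column_stack([np.ones(n), teaching_method, prior_achievement])
coefs, *_ = np.linalg.lstsq(X, test_score, rcond=None)
print(coefs[1], coefs[2])  # close to 0.40 and 0.25
```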
Step 4: Search for the minimum sample size¶
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=300,
by=28,
)
Instead of checking power at a single sample size, find_sample_size evaluates power across a search grid:
| Parameter | Value | Meaning |
|---|---|---|
| `target_test` | `"teaching_method"` | Find the sample size needed to detect this specific effect |
| `from_size` | `50` | Start searching at N=50 |
| `to_size` | `300` | Stop searching at N=300 |
| `by` | `28` | Evaluate power in steps of 28: N=50, 78, 106, …, 274 |
MCPower runs 400 simulations (as set with set_simulations) at each sample size in the grid and reports the smallest N where power first reaches the target (default: 80%).
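Conceptually, the loop can be sketched in plain NumPy (a simplified illustration, not MCPower's implementation: it drops the covariate and uses a two-sided z-test on the group difference):

```python
import numpy as np

def estimated_power(n, effect=0.40, sims=400, seed=0):
    """Fraction of simulated datasets in which the group difference is
    significant at alpha=0.05: a Monte Carlo power estimate at size n."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(sims):
        group = rng.binomial(1, 0.5, n)
        y = effect * group + rng.standard_normal(n)
        n1 = group.sum()
        n0 = n - n1
        if n0 < 2 or n1 < 2:
            continue  # degenerate draw, count as a miss
        diff = y[group == 1].mean() - y[group == 0].mean()
        se = np.sqrt(y[group == 1].var(ddof=1) / n1 + y[group == 0].var(ddof=1) / n0)
        if abs(diff / se) > 1.96:  # two-sided z-test at alpha=0.05
            hits += 1
    return hits / sims

# Power at each point of the search grid
for n in range(50, 301, 28):
    print(n, estimated_power(n))
```

MCPower's actual engine accounts for the full model (including the covariate), but the structure is the same: simulate, test, count rejections, repeat for each N.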
The search grid concept¶
The search grid is the set of sample sizes that MCPower evaluates:
N=50 → N=78 → N=106 → ... → N=190 (first ≥ 80%) → ... → N=274
MCPower evaluates every point in the grid. Smaller by values give a more precise answer but take longer. Larger by values run faster but might overshoot the true minimum.
Practical advice:
- Start with `by=10` or `by=25` for a rough estimate
- Narrow the range and reduce `by` for a precise answer
- The default grid (`from_size=30`, `to_size=200`, `by=5`) works well for many common designs
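The grid is just an evenly spaced sequence of candidate sample sizes, so the precision/speed tradeoff is easy to see directly (a plain-Python illustration):

```python
# Candidate sample sizes for from_size=50, to_size=300
grid_coarse = list(range(50, 301, 28))  # by=28: only 9 points to simulate
grid_fine = list(range(50, 301, 5))     # by=5: 51 points, far slower but precise
print(len(grid_coarse), len(grid_fine))
```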
Output Interpretation¶
Sample Size Requirements:
Test Required N
-----------------------------------------------------
teaching_method 190
| Column | Meaning |
|---|---|
| Test | The coefficient being tested |
| Required N | The smallest sample size in the grid where power meets the target |
Verdict: You need at least 190 students (about 95 per group) to detect the teaching method effect at alpha = 0.05.
If the search range is exhausted without reaching the target, increase to_size or reconsider your effect size assumptions.
Adding Scenario Analysis¶
Use scenarios=True to see how the required sample size changes under realistic conditions:
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=300,
by=28,
scenarios=True,
)
================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================
Uncorrected Sample Sizes:
Test Optimistic Realistic Doomer
-------------------------------------------------------------------------------
teaching_method 190 218 218
================================================================================
| Scenario | Min N | Interpretation |
|---|---|---|
| Optimistic | 190 | If everything goes perfectly |
| Realistic | 218 | Under mild violations of assumptions |
| Doomer | 218 | Under strong violations of assumptions |
Planning advice: Use the Realistic estimate (N=218) for your primary justification. Here the Doomer estimate is the same (N=218), so that budget also keeps the study robust against substantial assumption violations.
Raising the Target to 90% Power¶
The default target is 80%. To use 90% power instead:
model.set_power(90)
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=400,
by=40,
)
================================================================================
SAMPLE SIZE ANALYSIS RESULTS
================================================================================
Sample Size Requirements:
Test Required N
-----------------------------------------------------
teaching_method 250
Moving from 80% to 90% power typically requires roughly a third more participants (here, from 190 to 250).
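This jump matches the standard normal-approximation rule, where required N scales with (z₁₋α/₂ + z_power)². A quick back-of-the-envelope check (an approximation, not how MCPower computes it):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard normal quantile function

# Ratio of required N at 90% power vs 80% power, two-sided alpha=0.05
ratio = ((z(0.975) + z(0.90)) / (z(0.975) + z(0.80))) ** 2
print(round(ratio, 2))  # about 1.34: roughly a third more participants
```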
Common Variations¶
Detailed output with power curve¶
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=300,
by=28,
summary="long",
)
The summary="long" option prints a full table showing power at every sample size in the grid, and displays a power curve plot (when running in an environment that supports plotting). This is useful for seeing how power grows across the range, rather than just the single threshold crossing.
Test all effects¶
model.find_sample_size(
target_test="all",
from_size=50,
to_size=300,
by=28,
)
This reports the minimum N for each predictor and the overall F-test. Different effects may require different sample sizes.
Combine scenarios with corrections¶
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=400,
by=40,
scenarios=True,
correction="bonferroni",
)
When testing multiple effects with a correction for multiple comparisons, the required sample size increases. Scenario analysis then shows how much additional buffer you need for assumption violations.
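Under the same normal approximation, you can gauge how much a Bonferroni-corrected alpha inflates the required N before running any simulations (an illustration assuming the two tested coefficients in this model; MCPower estimates the exact figure by simulation):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard normal quantile function

alpha, m = 0.05, 2          # two tested coefficients
alpha_corr = alpha / m      # Bonferroni: 0.025 per test

# Inflation in required N at 80% power from the stricter alpha
inflation = ((z(1 - alpha_corr / 2) + z(0.80)) / (z(1 - alpha / 2) + z(0.80))) ** 2
print(round(inflation, 2))  # about 1.21: roughly 20% more participants
```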
Coarse-to-fine search¶
For efficient searching, start coarse and then zoom in:
# Step 1: Rough scan
model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=500,
by=50,
)
# Result: Min N ≈ 200
# Step 2: Precise scan around the rough estimate
model.find_sample_size(
target_test="teaching_method",
from_size=150,
to_size=250,
by=12,
)
# Result: Min N = 200
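The same two-pass logic can be written as a generic helper, shown here around a made-up power curve (a sketch, not MCPower's API):

```python
def first_crossing(power_at, from_size, to_size, by, target=0.80):
    """Return the first grid point where power_at(n) reaches target."""
    for n in range(from_size, to_size + 1, by):
        if power_at(n) >= target:
            return n
    return None  # search range exhausted: widen to_size

# A hypothetical, smoothly increasing power curve for demonstration
power_at = lambda n: 1 - 0.99 ** n

rough = first_crossing(power_at, 50, 500, by=50)          # Step 1: coarse scan
fine = first_crossing(power_at, rough - 50, rough, by=5)  # Step 2: zoom in
print(rough, fine)
```

The coarse pass overshoots the true minimum by up to one coarse step; the fine pass then recovers the precision at a fraction of the cost of scanning the whole range finely.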
Get results programmatically¶
result = model.find_sample_size(
target_test="teaching_method",
from_size=50,
to_size=300,
by=28,
return_results=True,
print_results=False,
)
min_n = result["results"]["first_achieved"]["teaching_method"]
print(f"Minimum sample size: {min_n}")
Next Steps¶
Tutorial: Your First Power Analysis – Check power at a specific sample size
Tutorial: Testing Interactions – Interactions typically require larger samples
Tutorial: Correlated Predictors – Predictor correlations affect sample size requirements
Effect Sizes – Guidelines for choosing appropriate effect sizes
Scenario Analysis – Understanding the three scenarios in depth
API Reference – Full parameter documentation for find_sample_size