Tutorial: Finding the Right Sample Size

Goal: You want to find the minimum sample size needed for adequate power.


Full Working Example

You are designing an educational intervention study. A new teaching method is tested against the standard approach, while controlling for students’ prior achievement. You need to determine the minimum number of students to recruit.

from mcpower import MCPower

# 1. Define the model
model = MCPower("test_score = teaching_method + prior_achievement")
model.set_simulations(400)

# 2. Set variable types
model.set_variable_type("teaching_method=binary")

# 3. Set expected effect sizes
model.set_effects("teaching_method=0.40, prior_achievement=0.25")

# 4. Search for the minimum sample size
model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=300,
    by=28,
)

Output:

Variable types: teaching_method=(binary,0.5)
Effects: teaching_method=0.4, prior_achievement=0.25
Model settings applied successfully
================================================================================
SAMPLE SIZE ANALYSIS RESULTS
================================================================================

Sample Size Requirements:
Test                                     Required N  
-----------------------------------------------------
teaching_method                          190         

Step-by-Step Walkthrough

Steps 1-3: Model setup

model = MCPower("test_score = teaching_method + prior_achievement")
model.set_simulations(400)
model.set_variable_type("teaching_method=binary")
model.set_effects("teaching_method=0.40, prior_achievement=0.25")

This is the same setup as a basic power analysis (see Tutorial: Your First Power Analysis). We define a model with one binary predictor (teaching_method) and one continuous covariate (prior_achievement), set the number of simulations to 400, and then set the expected effect sizes.

  • teaching_method=0.40 – a small-to-medium effect for a binary predictor (between 0.20 small and 0.50 medium)

  • prior_achievement=0.25 – a medium effect for a continuous predictor

Step 4: Search for the minimum sample size

model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=300,
    by=28,
)

Instead of checking power at a single sample size, find_sample_size evaluates power across a search grid:

Parameter     Value               Meaning
------------------------------------------------------------------------------------------
target_test   "teaching_method"   Find the sample size needed to detect this specific effect
from_size     50                  Start searching at N=50
to_size       300                 Stop searching at N=300
by            28                  Evaluate power every 28 participants: N=50, 78, 106, … (up to 300)

MCPower runs the configured number of simulations (400 here, set with set_simulations) at each sample size in the grid and reports the smallest N where power first reaches the target (default: 80%).
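Each power value is a proportion of significant results across simulations, so it carries Monte Carlo error. The following is a minimal sketch of that uncertainty, assuming the standard binomial standard-error formula. The function name `power_standard_error` is illustrative and not part of MCPower.

```python
import math


def power_standard_error(power: float, n_simulations: int) -> float:
    """Binomial standard error of a simulated power estimate."""
    return math.sqrt(power * (1.0 - power) / n_simulations)


# Compare the simulation counts mentioned in this tutorial.
for n_sims in (400, 1600):
    se = power_standard_error(0.80, n_sims)
    print(f"{n_sims} simulations: SE of a power estimate near 0.80 = {se:.3f}")
```

With 400 simulations the standard error near 80% power is about 0.02, so a grid point close to the threshold can land on either side of it from run to run. Quadrupling the number of simulations halves that error.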

The search grid concept

The search grid is the set of sample sizes that MCPower evaluates:

N=50 → N=78 → N=106 → ... → N=190 (first ≥ 80%) → ... (up to N=300)

MCPower evaluates every point in the grid. Smaller by values give a more precise answer but take longer. Larger by values run faster but might overshoot the true minimum.

Practical advice:

  • Start with by=10 or by=25 for a rough estimate

  • Narrow the range and reduce by for a precise answer

  • The default grid (from_size=30, to_size=200, by=5) works well for many common designs
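The trade-off between step size and precision can be sketched without running any simulations. The example below is a simplified stand-in: it replaces MCPower's simulation with a closed-form normal approximation for a two-group comparison and treats the 0.40 effect as a standardized mean difference, which is an assumption. The helper names `approx_power` and `first_n_reaching` are hypothetical, and the resulting numbers need not match MCPower's simulated output.

```python
from statistics import NormalDist


def approx_power(n_total: int, effect: float = 0.40, alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-group comparison with equal groups.

    Assumption: `effect` is a standardized mean difference. This is a
    stand-in for simulation, not MCPower's own calculation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # With equal groups, the standard error of the difference is sqrt(4 / n_total).
    noncentrality = effect * (n_total / 4.0) ** 0.5
    return z.cdf(noncentrality - z_crit)


def first_n_reaching(from_size: int, to_size: int, by: int, target: float = 0.80):
    """Return the first grid point whose power meets the target, else None."""
    for n in range(from_size, to_size + 1, by):
        if approx_power(n) >= target:
            return n
    return None


for step in (1, 10, 28):
    print(f"by={step}: first N reaching target = {first_n_reaching(50, 300, step)}")
```

Under this approximation the three grids return 197, 200, and 218. The coarser the step, the further the reported N can sit above the smallest N that would actually suffice.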


Output Interpretation

Sample Size Requirements:
Test                                     Required N
-----------------------------------------------------
teaching_method                          190

Column       Meaning
-------------------------------------------------------------------------------
Test         The coefficient being tested
Required N   The smallest sample size in the grid where power meets the target

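Verdict: You need at least 190 students (95 per group) to detect the teaching method effect with 80% power at alpha = 0.05. Because the grid moves in steps of 28, the true minimum may be somewhat lower; a finer grid around this value would narrow it down.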

If the search range is exhausted without reaching the target, increase to_size or reconsider your effect size assumptions.


Adding Scenario Analysis

Use scenarios=True to see how the required sample size changes under realistic conditions:

model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=300,
    by=28,
    scenarios=True,
)

Output:

================================================================================
SCENARIO-BASED MONTE CARLO POWER ANALYSIS RESULTS
================================================================================
================================================================================
SCENARIO SUMMARY
================================================================================

Uncorrected Sample Sizes:
Test                                     Optimistic   Realistic    Doomer      
-------------------------------------------------------------------------------
teaching_method                          190          218          218         
================================================================================

Scenario     Min N   Interpretation
------------------------------------------------------------
Optimistic   190     If everything goes perfectly
Realistic    218     Under mild violations of assumptions
Doomer       218     Under strong violations of assumptions

Planning advice: Use the Realistic estimate (N=218) for your primary justification. In this run the Doomer estimate is also N=218, so the same sample size remains adequate under substantial assumption violations, at least at this grid resolution.
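To express the scenario results as a recruitment buffer, the required N for each scenario can be compared with the Optimistic baseline. This is a small sketch using the values from the printed scenario summary above. The dictionary `scenario_n` is written by hand for illustration and is not an MCPower return value.

```python
# Values copied from the printed scenario summary in this tutorial.
scenario_n = {"Optimistic": 190, "Realistic": 218, "Doomer": 218}

baseline = scenario_n["Optimistic"]

# Relative increase in required N compared with the Optimistic scenario.
buffers = {name: (n - baseline) / baseline for name, n in scenario_n.items()}

for name, share in buffers.items():
    print(f"{name}: N={scenario_n[name]} ({share:+.1%} vs. Optimistic)")
```

Here the Realistic and Doomer scenarios both call for roughly 15% more participants than the Optimistic one.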


Raising the Target to 90% Power

The default target is 80%. To use 90% power instead:

model.set_power(90)

model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=400,
    by=40,
)

Output:

================================================================================
SAMPLE SIZE ANALYSIS RESULTS
================================================================================

Sample Size Requirements:
Test                                     Required N  
-----------------------------------------------------
teaching_method                          250         

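Moving from 80% to 90% power typically requires roughly 30-35% more participants. Here the requirement rises from 190 to 250, about 32% more, although the two searches use different grids.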
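The size of that increase can be approximated in closed form. The sketch below assumes a two-sided test and the usual normal approximation, in which the required N is proportional to the squared sum of the critical value and the power quantile. The function `relative_n` is illustrative and not part of MCPower; simulated results on a coarse grid will scatter around this figure.

```python
from statistics import NormalDist


def relative_n(power_new: float, power_old: float, alpha: float = 0.05) -> float:
    """Ratio of required sample sizes under a two-sided normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    new_term = z_alpha + z.inv_cdf(power_new)
    old_term = z_alpha + z.inv_cdf(power_old)
    return (new_term / old_term) ** 2


ratio = relative_n(0.90, 0.80)
print(f"N at 90% power / N at 80% power = {ratio:.2f}")
```

Under these assumptions the ratio is about 1.34, independent of the effect size.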


Common Variations

Detailed output with power curve

model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=300,
    by=28,
    summary="long",
)

The summary="long" option prints a full table showing power at every sample size in the grid, and displays a power curve plot (when running in an environment that supports plotting). This is useful for seeing how power grows across the range, rather than just the single threshold crossing.

Test all effects

model.find_sample_size(
    target_test="all",
    from_size=50,
    to_size=300,
    by=28,
)

This reports the minimum N for each predictor and the overall F-test. Different effects may require different sample sizes.

Combine scenarios with corrections

model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=400,
    by=40,
    scenarios=True,
    correction="bonferroni",
)

When testing multiple effects with a correction for multiple comparisons, the required sample size increases. Scenario analysis then shows how much additional buffer you need for assumption violations.
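To see why a correction raises the required sample size, consider what Bonferroni does to the significance threshold. The sketch below assumes two tests and a two-sided normal approximation. Both helper functions are hypothetical, and how MCPower counts the number of comparisons is not specified here.

```python
from statistics import NormalDist


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def relative_n_for_alpha(alpha_new: float, alpha_old: float, power: float = 0.80) -> float:
    """Approximate ratio of required N when alpha changes (normal approximation)."""
    z = NormalDist()
    z_power = z.inv_cdf(power)
    new_term = z.inv_cdf(1.0 - alpha_new / 2.0) + z_power
    old_term = z.inv_cdf(1.0 - alpha_old / 2.0) + z_power
    return (new_term / old_term) ** 2


adjusted = bonferroni_alpha(0.05, n_tests=2)  # assumed: two tests are corrected
inflation = relative_n_for_alpha(adjusted, 0.05)
print(f"Per-test alpha: {adjusted:.3f}, approximate N inflation: {inflation:.2f}x")
```

With two tests the per-test alpha drops to 0.025, which corresponds to roughly 21% more participants at 80% power under this approximation.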

Get results programmatically

result = model.find_sample_size(
    target_test="teaching_method",
    from_size=50,
    to_size=300,
    by=28,
    return_results=True,
    print_results=False,
)

min_n = result["results"]["first_achieved"]["teaching_method"]
print(f"Minimum sample size: {min_n}")
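Once the minimum N is available as a number, it can feed directly into planning code. The sketch below uses a hand-written stand-in dictionary that mirrors only the keys shown above, so it runs without MCPower. Treating a missing or None entry as "target not reached" is an assumption, not documented behaviour.

```python
import math

# Hypothetical stand-in for the returned structure; only the keys used in
# this tutorial are mirrored, and 190 is the value printed earlier.
result = {"results": {"first_achieved": {"teaching_method": 190}}}

min_n = result["results"]["first_achieved"].get("teaching_method")

if min_n is None:
    # Assumed convention: no value means the target was not reached.
    print("Target power not reached in the searched range; consider raising to_size.")
else:
    per_group = math.ceil(min_n / 2)  # equal allocation, rounded up
    print(f"Recruit at least {min_n} students ({per_group} per group)")
```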

Next Steps