---
jupytext:
  text_representation:
    format_name: myst
kernelspec:
  name: python3
---

# Tutorial: Your First Power Analysis

**Goal:** You want to check if your planned sample size provides enough statistical power.

```{code-cell} ipython3
:tags: [remove-input, remove-output]
import numpy as np
np.random.seed(42)
import warnings
warnings.filterwarnings("ignore", message="Low simulation")
```

---

## Table of Contents

- [Full Working Example](#full-working-example)
- [Step-by-Step Walkthrough](#step-by-step-walkthrough)
- [Output Interpretation](#output-interpretation)
- [Adding Scenario Analysis](#adding-scenario-analysis)
- [Common Variations](#common-variations)
- [Next Steps](#next-steps)

---

## Full Working Example

You are running a clinical trial to test whether a new drug improves patient recovery, while controlling for baseline health. You plan to recruit 100 participants and want to know if that is enough to detect the drug's effect.

```{code-cell} ipython3
:tags: [remove-stderr]
from mcpower import MCPower

# 1. Define the model
model = MCPower("recovery = drug + baseline_health")
model.set_simulations(400)

# 2. Declare the treatment as a binary variable (drug vs. placebo)
model.set_variable_type("drug=binary")

# 3. Set expected effect sizes (standardized coefficients)
model.set_effects("drug=0.50, baseline_health=0.25")

# 4. Check power at N=100 for the drug effect
model.find_power(sample_size=100, target_test="drug")
```

---

## Step-by-Step Walkthrough

### Line 1: Import MCPower

```{code-cell} ipython3
:tags: [remove-output]
from mcpower import MCPower
```

`MCPower` is the only class you need. It handles model definition, simulation, and analysis.

### Line 2: Define the model

```{code-cell} ipython3
:tags: [remove-output]
model = MCPower("recovery = drug + baseline_health")
model.set_simulations(400)
```

This creates a regression model where `recovery` is the outcome and `drug` and `baseline_health` are predictors. The formula uses R-style syntax -- you can also write `recovery ~ drug + baseline_health`.

### Line 3: Set variable types

```{code-cell} ipython3
:tags: [remove-output]
model.set_variable_type("drug=binary")
```

By default, MCPower treats all predictors as continuous (standard normal). Since `drug` is a 0/1 treatment assignment, we declare it as binary. `baseline_health` remains continuous.

### Line 4: Set effect sizes

```{code-cell} ipython3
:tags: [remove-output]
model.set_effects("drug=0.50, baseline_health=0.25")
```

Effect sizes are standardized regression coefficients (betas):

- **drug=0.50** -- receiving the drug increases recovery by 0.50 SD compared to placebo. This is a medium effect for a binary predictor.
- **baseline_health=0.25** -- each 1 SD increase in baseline health increases recovery by 0.25 SD. This is a medium effect for a continuous predictor.

### Line 5: Run the analysis

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.find_power(sample_size=100, target_test="drug")
```

MCPower runs 1,600 Monte Carlo simulations (the default for OLS models). In each simulation it:

1. Generates a dataset of 100 participants
2. Fits the regression model
3. Tests whether `drug` is statistically significant at alpha = 0.05

The proportion of simulations where `drug` is significant is the estimated power.

The `target_test="drug"` parameter focuses the output on the drug coefficient. Without it, MCPower reports power for every predictor plus the overall F-test.

---

## Output Interpretation

```
Test                                     Power    Target   Status
-------------------------------------------------------------------
drug                                     70.7     80       ✗
```

| Column | Meaning |
|---|---|
| **Test** | The coefficient being tested (here, `drug`) |
| **Power** | Percentage of 1,600 simulations where the test was significant. 70.7% means 70.7% of simulated datasets detected the drug effect. |
| **Target** | The target power level (default: 80%) |
| **Status** | `✓` if power meets the target; `✗` if it falls short |

**Verdict:** With N=100 and a medium effect (0.50), you have 70.7% power -- below the 80% threshold. You may need a larger sample size.

---

## Adding Scenario Analysis

The basic analysis assumes perfect conditions. In reality, effects might be slightly different, distributions might not be exactly normal, and variance might not be constant. Use `scenarios=True` to test robustness:

```{code-cell} ipython3
:tags: [remove-stderr]
model.find_power(sample_size=100, target_test="drug", scenarios=True)
```

| Scenario | What It Simulates |
|---|---|
| **Optimistic** | Your exact settings -- best case |
| **Realistic** | Mild effect variations, small assumption violations |
| **Doomer** | Larger variations, stronger violations -- worst case |

Here, none of the scenarios meet 80% power, and the **Doomer** scenario drops to 66.5%. If you want to be safe even under pessimistic conditions, consider increasing your sample size.

---

## Common Variations

### Test all effects at once

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.find_power(sample_size=100, target_test="all")
```

This reports power for `drug`, `baseline_health`, and the overall F-test.

### Test multiple specific effects

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.find_power(sample_size=100, target_test="drug, baseline_health")
```

### Test only the overall model

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.find_power(sample_size=100, target_test="overall")
```

The overall F-test checks whether the model as a whole explains significant variance. It is almost always more powerful than individual coefficient tests.

### Get detailed output

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.find_power(sample_size=100, target_test="drug", summary="long")
```

The `summary="long"` option prints additional detail about the model configuration and results.

### Get results as a Python dictionary

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
result = model.find_power(
    sample_size=100,
    target_test="drug",
    return_results=True,
    print_results=False,
)

power_drug = result["results"]["individual_powers"]["drug"]
print(f"Power for drug effect: {power_drug}%")
```

### Change the significance level

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.set_alpha(0.01)
model.find_power(sample_size=100, target_test="drug")
```

A stricter alpha (0.01 instead of 0.05) requires stronger evidence, so power will be lower.

### Set a reproducible seed

```{code-cell} ipython3
:tags: [remove-output, remove-stderr]
model.set_seed(42)
model.find_power(sample_size=100, target_test="drug")
```

Setting a seed ensures you get the same results each time you run the analysis.

---

## Next Steps

- **[Tutorial: Finding the Right Sample Size](sample-size.md)** -- Search for the minimum N that achieves your target power
- **[Tutorial: Testing Interactions](interactions.md)** -- Check if the effect of one variable depends on another
- **[Tutorial: Correlated Predictors](correlations.md)** -- Account for correlations between predictors
- **[Effect Sizes](../concepts/effect-sizes.md)** -- Guidelines for choosing appropriate effect sizes
- **[Variable Types](../concepts/variable-types.md)** -- Binary, factor, and distribution options
- **[Scenario Analysis](../concepts/scenario-analysis.md)** -- Understanding the three scenarios in depth
- **[API Reference](../api/index.md)** -- Full parameter documentation for `find_power` and all configuration methods