Statistical Analysis
Advanced statistical analysis automation and integration for data-driven decision making and research insights
Statistical Analysis is a community skill for data analysis using statistical methods, covering hypothesis testing, regression analysis, descriptive statistics, probability distributions, and experimental design for data-driven decision making.
What Is This?
Overview
Statistical Analysis provides guidance on applying statistical methods to extract insights from data. It covers hypothesis testing to evaluate claims against sample data, regression analysis to model relationships between variables, descriptive statistics to summarize datasets with measures of central tendency and dispersion, probability distributions to model data-generating processes, and experimental design to structure studies for causal inference. The skill helps analysts draw valid conclusions from data.
Who Should Use This
This skill serves data analysts performing exploratory and confirmatory analysis, researchers designing experiments and testing hypotheses, and product teams running A/B tests to measure feature impact.
Why Use It?
Problems It Solves
Drawing conclusions from data without statistical rigor leads to false discoveries. Small sample sizes produce unreliable estimates that fail to replicate. Multiple comparisons without correction inflate false positive rates. Confounding variables create spurious correlations that mislead decision making.
Core Highlights
Hypothesis tester evaluates statistical significance with proper corrections. Regression modeler fits and validates predictive relationships. Distribution fitter identifies data-generating processes. Experiment designer structures studies for causal inference.
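For the distribution-fitting highlight, a minimal sketch (assuming SciPy; the sample here is synthetic and purely illustrative) fits a normal distribution by maximum likelihood and checks the fit with a Kolmogorov-Smirnov test:

import numpy as np
from scipy import stats

# Synthetic sample standing in for observed data (illustrative only).
rng = np.random.default_rng(42)
sample = rng.normal(loc=5.0, scale=1.5, size=200)

# Fit a normal distribution by maximum likelihood.
mu, sigma = stats.norm.fit(sample)

# Kolmogorov-Smirnov test of the fitted distribution against the sample.
# Note: estimating parameters from the same data makes this p-value optimistic.
ks_stat, p_val = stats.kstest(sample, 'norm', args=(mu, sigma))
print(f'mu={mu:.3f}, sigma={sigma:.3f}, KS p={p_val:.3f}')

A high KS p-value fails to reject the fitted model; it does not prove the data is normal.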
How to Use It?
Basic Usage
import numpy as np
from scipy import stats

class StatAnalyzer:
    """Descriptive statistics and two-sample comparisons for a numeric sample."""

    def __init__(self, data: list):
        self.data = np.array(data)

    def describe(self) -> dict:
        # Summarize the sample: center, spread, size, and a 95% confidence interval.
        return {
            'mean': float(np.mean(self.data)),
            'median': float(np.median(self.data)),
            'std': float(np.std(self.data, ddof=1)),  # sample standard deviation
            'n': len(self.data),
            'ci_95': self.ci(0.95),
        }

    def ci(self, level: float) -> tuple:
        # Confidence interval for the mean based on the t-distribution.
        n = len(self.data)
        mean = np.mean(self.data)
        se = stats.sem(self.data)
        h = se * stats.t.ppf((1 + level) / 2, n - 1)
        return (round(mean - h, 3), round(mean + h, 3))

    def t_test(self, other: list, alpha: float = 0.05) -> dict:
        # Two-sample Student's t-test (assumes equal variances by default).
        t_stat, p_val = stats.ttest_ind(self.data, np.array(other))
        return {
            't_stat': round(t_stat, 4),
            'p_value': round(p_val, 4),
            'significant': p_val < alpha,
        }

control = [4.2, 3.8, 4.5, 4.1, 3.9, 4.3, 4.0]
treatment = [4.8, 5.1, 4.6, 5.0, 4.9, 4.7]

analyzer = StatAnalyzer(control)
desc = analyzer.describe()
print(f'Mean: {desc["mean"]}')

result = analyzer.t_test(treatment)
print(f'p={result["p_value"]}, sig={result["significant"]}')
Real-World Examples
import numpy as np
from scipy import stats

class ABTestAnalyzer:
    """Two-proportion z-test and sample-size planning for conversion A/B tests."""

    def __init__(self, control_conv: int, control_n: int,
                 treat_conv: int, treat_n: int):
        self.p_c = control_conv / control_n  # control conversion rate
        self.p_t = treat_conv / treat_n      # treatment conversion rate
        self.n_c = control_n
        self.n_t = treat_n

    def z_test(self) -> dict:
        # Two-proportion z-test using the pooled conversion rate.
        p_pool = ((self.p_c * self.n_c + self.p_t * self.n_t)
                  / (self.n_c + self.n_t))
        se = np.sqrt(p_pool * (1 - p_pool) * (1 / self.n_c + 1 / self.n_t))
        z = (self.p_t - self.p_c) / se
        p_val = 2 * (1 - stats.norm.cdf(abs(z)))  # two-tailed p-value
        return {
            'control_rate': round(self.p_c, 4),
            'treatment_rate': round(self.p_t, 4),
            'lift': round((self.p_t - self.p_c) / self.p_c * 100, 2),
            'z_stat': round(z, 4),
            'p_value': round(p_val, 4),
        }

    def sample_size(self, mde: float, alpha: float = 0.05,
                    power: float = 0.8) -> int:
        # Per-group sample size to detect an absolute lift of `mde`
        # at the given significance level and power.
        z_a = stats.norm.ppf(1 - alpha / 2)
        z_b = stats.norm.ppf(power)
        p = self.p_c
        n = (z_a + z_b) ** 2 * 2 * p * (1 - p) / mde ** 2
        return int(np.ceil(n))

ab = ABTestAnalyzer(control_conv=120, control_n=1000,
                    treat_conv=145, treat_n=1000)
result = ab.z_test()
print(f'Lift: {result["lift"]}%')

needed = ab.sample_size(mde=0.02)
print(f'Need {needed} per group')
Advanced Tips
Apply a Bonferroni correction to control the family-wise error rate, or a Benjamini-Hochberg correction to control the false discovery rate, when running multiple hypothesis tests. Run a power analysis before experiments to determine required sample sizes. Check assumptions such as normality and homoscedasticity before applying parametric tests.
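As a minimal sketch of these tips (assuming SciPy, with false_discovery_control available from SciPy 1.11 onward; the p-values and samples are illustrative), the snippet below adjusts a set of p-values both ways and runs basic assumption checks:

import numpy as np
from scipy import stats

# Illustrative p-values from five hypothesis tests.
p_values = np.array([0.001, 0.012, 0.034, 0.041, 0.20])

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
bonferroni = np.minimum(p_values * len(p_values), 1.0)

# Benjamini-Hochberg adjusted p-values (requires SciPy >= 1.11).
bh = stats.false_discovery_control(p_values, method='bh')
print('Bonferroni:', np.round(bonferroni, 3))
print('BH:', np.round(bh, 3))

# Assumption checks before a parametric test: normality and equal variances.
a = np.array([4.2, 3.8, 4.5, 4.1, 3.9, 4.3, 4.0])
b = np.array([4.8, 5.1, 4.6, 5.0, 4.9, 4.7])
print('Shapiro p:', stats.shapiro(a).pvalue)   # normality
print('Levene p:', stats.levene(a, b).pvalue)  # homoscedasticity

For power analysis, the sample_size method in the A/B example above shows the same calculation for proportions.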
When to Use It?
Use Cases
Run an A/B test comparing conversion rates between control and treatment groups with statistical significance. Fit a regression model predicting customer lifetime value from behavioral features. Calculate sample sizes needed to detect a target effect with adequate power.
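For the regression use case, a minimal sketch (assuming SciPy; the behavioral data is invented for illustration, and a real lifetime-value model would use more features) fits a one-variable least-squares line:

import numpy as np
from scipy import stats

# Illustrative data: sessions per month vs. customer lifetime value.
sessions = np.array([2, 5, 8, 12, 15, 20, 25, 30])
ltv = np.array([50, 95, 160, 230, 280, 370, 455, 540])

# Ordinary least squares fit of ltv on sessions.
fit = stats.linregress(sessions, ltv)
print(f'slope={fit.slope:.2f}, intercept={fit.intercept:.2f}')
print(f'R^2={fit.rvalue**2:.3f}, p={fit.pvalue:.4f}')

# Predict LTV for a new customer with 18 sessions.
print('Predicted LTV:', fit.intercept + fit.slope * 18)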
Related Topics
Statistics, hypothesis testing, A/B testing, regression, probability, experimental design, and data analysis.
Important Notes
Requirements
Python with NumPy and SciPy for statistical computations. Data in numeric format suitable for the chosen statistical methods. Understanding of the assumptions underlying each test.
Usage Recommendations
Do: check test assumptions before applying parametric methods, and switch to non-parametric alternatives when assumptions are violated (see the sketch after these notes). Report effect sizes alongside p-values to communicate practical significance. Pre-register hypotheses and analysis plans before running experiments.
Don't: interpret correlation as causation without controlled experimental design, keep collecting data after peeking at interim results (optional stopping inflates false positive rates), or use p-value thresholds as the sole criterion for decision making.
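A minimal sketch of these recommendations (assuming SciPy; the samples reuse the illustrative values from Basic Usage) pairs a non-parametric test with an effect size:

import numpy as np
from scipy import stats

a = np.array([4.2, 3.8, 4.5, 4.1, 3.9, 4.3, 4.0])
b = np.array([4.8, 5.1, 4.6, 5.0, 4.9, 4.7])

# Non-parametric alternative to the t-test when normality is doubtful.
u_stat, p_val = stats.mannwhitneyu(a, b, alternative='two-sided')
print(f'Mann-Whitney U={u_stat}, p={p_val:.4f}')

# Cohen's d: an effect size to report alongside the p-value.
pooled_sd = np.sqrt(((len(a) - 1) * np.var(a, ddof=1)
                     + (len(b) - 1) * np.var(b, ddof=1))
                    / (len(a) + len(b) - 2))
d = (np.mean(b) - np.mean(a)) / pooled_sd
print(f"Cohen's d={d:.2f}")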
Limitations
Statistical significance does not imply practical importance since large samples detect trivially small effects. Parametric tests produce unreliable results when assumptions are violated. Observational studies cannot establish causation regardless of sophistication.