Screenshot

Automate screenshot capture and integrate it into your pipelines

Screenshot is a community skill for capturing, processing, and managing screenshots programmatically in automation workflows, covering browser captures, desktop screenshots, annotation, comparison, and integration with testing and documentation pipelines.

What Is This?

Overview

Screenshot provides patterns for automating screenshot capture across browser and desktop environments. It covers full-page and element-specific browser captures, viewport configuration for responsive testing, image comparison for visual regression detection, annotation for documentation, and batch processing for generating screenshot sets across multiple pages or states. The skill integrates screenshot operations into CI, testing, and documentation generation workflows.

Who Should Use This

This skill serves QA engineers building visual regression test suites, documentation teams automating screenshot generation for user guides, and developers integrating screen capture into monitoring or reporting tools.

Why Use It?

Problems It Solves

Manual screenshot capture does not scale when documentation covers dozens of application states. Visual regression bugs go undetected without automated before-and-after comparison. Screenshots taken at different viewport sizes and device configurations require systematic automation to maintain consistency. Processing and annotating captured images by hand adds significant time to documentation workflows.

Core Highlights

Browser-based capture supports full-page, viewport-clipped, and element-specific screenshots through automation libraries. Viewport configuration generates responsive captures at mobile, tablet, and desktop dimensions from a single script. Image comparison detects pixel-level differences between baseline and current screenshots for visual regression alerts. Batch processing captures multiple pages and states in a single automated run with organized output directories.
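
As a rough illustration of batch processing with organized output directories, the sketch below expands a set of pages into one capture job per page and viewport. The names `VIEWPORTS`, `CaptureJob`, and `plan_batch`, and the `<root>/<page>/<viewport>.png` layout, are assumptions made for this example rather than part of the skill.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical viewport presets (width, height) for this example.
VIEWPORTS = {
    "desktop": (1920, 1080),
    "tablet": (768, 1024),
    "mobile": (375, 812),
}


@dataclass(frozen=True)
class CaptureJob:
    url: str
    width: int
    height: int
    output_path: str


def plan_batch(pages: dict, output_root: str = "screenshots") -> list:
    """Expand {page_name: url} into one job per page and viewport."""
    jobs = []
    for page_name, url in pages.items():
        for vp_name, (width, height) in VIEWPORTS.items():
            # One directory per page, one file per viewport
            path = PurePosixPath(output_root) / page_name / f"{vp_name}.png"
            jobs.append(CaptureJob(url, width, height, str(path)))
    return jobs
```

Each job carries everything a capture loop needs, so the planning step can be reviewed or tested without launching a browser.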

How to Use It?

Basic Usage

from dataclasses import dataclass
from pathlib import Path
from typing import Optional

from playwright.sync_api import sync_playwright

@dataclass
class CaptureConfig:
    url: str
    output_dir: str = "screenshots"
    viewports: Optional[list] = None  # None selects the default set below

    def __post_init__(self):
        if self.viewports is None:
            self.viewports = [
                {"width": 1920, "height": 1080, "name": "desktop"},
                {"width": 768, "height": 1024, "name": "tablet"},
                {"width": 375, "height": 812, "name": "mobile"}
            ]
        # parents=True lets output_dir be a nested path such as "out/screens"
        Path(self.output_dir).mkdir(parents=True, exist_ok=True)

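def capture_responsive(config: CaptureConfig):
    """Capture one full-page screenshot of config.url per configured viewport."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        for vp in config.viewports:
            # A fresh context per viewport keeps cookies and storage isolated
            context = browser.new_context(
                viewport={"width": vp["width"], "height": vp["height"]}
            )
            page = context.new_page()
            page.goto(config.url, wait_until="networkidle")
            path = f"{config.output_dir}/{vp['name']}.png"
            # full_page=True captures the whole scrollable page, not just the viewport
            page.screenshot(path=path, full_page=True)
            context.close()
        browser.close()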

Real-World Examples

from PIL import Image, ImageChops

class VisualDiff:
    def __init__(self, threshold: float = 0.01):
        # Largest fraction of changed pixels that still counts as a match
        self.threshold = threshold

    def compare(self, baseline_path: str, current_path: str) -> dict:
        # Convert to RGB so both images share a mode and every pixel is a 3-tuple
        baseline = Image.open(baseline_path).convert("RGB")
        current = Image.open(current_path).convert("RGB")
        if baseline.size != current.size:
            return {"match": False, "reason": "size mismatch"}
        diff = ImageChops.difference(baseline, current)
        pixels = list(diff.getdata())
        total = len(pixels)
        # A pixel counts as changed when its summed channel difference exceeds 30
        changed = sum(1 for p in pixels if sum(p) > 30)
        ratio = changed / total
        return {
            "match": ratio <= self.threshold,
            "diff_ratio": round(ratio, 4),
            "changed_pixels": changed,
            "total_pixels": total
        }

    def generate_diff_image(self, baseline: str, current: str, output: str):
        # RGB keeps the result visible: differencing two opaque RGBA images
        # yields an alpha of 0 everywhere, which saves as a transparent PNG
        b = Image.open(baseline).convert("RGB")
        c = Image.open(current).convert("RGB")
        diff = ImageChops.difference(b, c)
        diff.save(output)
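
The threshold arithmetic in `compare` can be checked without image files. The sketch below applies the same counting rule to plain lists of RGB tuples. The function name `diff_ratio` and the sample pixel data are made up for illustration.

```python
def diff_ratio(baseline, current, channel_tolerance=30):
    """Fraction of pixels whose summed per-channel difference exceeds the tolerance."""
    if not baseline or len(baseline) != len(current):
        raise ValueError("pixel lists must be non-empty and the same length")
    changed = 0
    for b, c in zip(baseline, current):
        delta = sum(abs(x - y) for x, y in zip(b, c))
        if delta > channel_tolerance:
            changed += 1
    return changed / len(baseline)


# 100 white pixels; the current capture has 2 black pixels and 1 barely shifted pixel
baseline = [(255, 255, 255)] * 100
current = [(0, 0, 0)] * 2 + [(250, 250, 250)] + [(255, 255, 255)] * 97

ratio = diff_ratio(baseline, current)
print(ratio)          # 0.02
print(ratio <= 0.01)  # False: 2% changed exceeds the 1% default threshold
```

The ratio scales with image size: at 1920 by 1080 (2,073,600 pixels), a 1% threshold tolerates up to 20,736 changed pixels, so a small but real regression such as a missing icon may still pass. A tighter threshold or element-level captures may suit such cases better.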

Advanced Tips

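Wait for the page to settle before capturing to avoid partial renders from lazy-loaded content. Network idle is a simple signal for this, though Playwright's documentation discourages relying on it in tests, and waiting for a specific element to become visible is usually more reliable. Use element-specific screenshots when only a component needs visual testing rather than capturing the full page. Store baseline images in version control with clear naming conventions that include viewport size and application state.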
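
A minimal sketch of an element-specific capture, assuming a Playwright sync `Page` is passed in. The helper name `capture_element`, the `demo` function, the example URL, and the file name are hypothetical. `page.locator`, `Locator.wait_for`, and `Locator.screenshot` are Playwright calls.

```python
def capture_element(page, selector: str, path: str, timeout_ms: int = 5000) -> str:
    """Screenshot a single element once it is visible.

    `page` is expected to be a Playwright sync Page; only `locator()` is used.
    """
    element = page.locator(selector)
    # Waiting on the element itself avoids capturing before it has rendered
    element.wait_for(state="visible", timeout=timeout_ms)
    element.screenshot(path=path)
    return path


def demo():
    # Imported here so the helper above stays importable without Playwright
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1280, "height": 720})
        page.goto("https://example.com")
        # File name encodes component, viewport, and state (an assumed convention)
        capture_element(page, "h1", "heading-1280x720-default.png")
        browser.close()
```

Calling `demo()` requires Playwright and an installed Chromium build. The helper itself only depends on the object it receives.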

When to Use It?

Use Cases

Generate documentation screenshots automatically when application UI changes are deployed. Build visual regression test suites that compare current renders against approved baselines. Capture responsive layout variations across device viewports for design review sessions.

Related Topics

Playwright browser automation, visual regression testing tools, image comparison algorithms, responsive design testing, and automated documentation generation.

Important Notes

Requirements

A browser automation library such as Playwright or Puppeteer for web captures. Python with Pillow (the maintained fork of PIL, still imported as PIL) for image comparison operations. Sufficient disk space for storing screenshot archives across multiple viewport configurations.

Usage Recommendations

Do: use consistent viewport sizes and wait conditions across all captures for reliable comparisons. Set appropriate pixel difference thresholds that catch real changes while ignoring anti-aliasing variations. Organize output directories by date and configuration for easy retrieval.

Don't: compare screenshots taken at different viewport sizes or rendering settings. Store large screenshot archives without compression or cleanup policies. Rely solely on pixel comparison without human review for screenshots flagged as different.
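
One possible cleanup policy is age-based pruning. The sketch below deletes PNG files older than a cutoff. The function name `prune_screenshots` and the 30-day default are assumptions for illustration. It should point at run output, not at the baseline directory.

```python
import time
from pathlib import Path


def prune_screenshots(root: str, max_age_days: float = 30, now: float = None) -> list:
    """Delete .png files under root whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for png in sorted(Path(root).rglob("*.png")):
        if png.is_file() and png.stat().st_mtime < cutoff:
            png.unlink()
            removed.append(str(png))
    return removed
```

The function returns the removed paths so a CI job can log what was deleted.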

Limitations

Font rendering differences across operating systems can produce false positives in pixel comparison. Dynamic content such as timestamps and ads changes between captures regardless of code changes. Full-page screenshots of long pages produce large files that increase storage and comparison processing time.
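
Dynamic regions can be hidden at capture time rather than filtered afterwards. The sketch below uses the `mask` and `animations` options of Playwright's `page.screenshot`. The helper name `capture_stable` and the selectors in the usage line are hypothetical.

```python
def capture_stable(page, path: str, dynamic_selectors=()) -> str:
    """Full-page capture with volatile regions masked and animations disabled.

    `page` is expected to be a Playwright sync Page.
    """
    # Masked elements are painted over with a solid box in the saved image
    masks = [page.locator(sel) for sel in dynamic_selectors]
    page.screenshot(path=path, full_page=True, mask=masks, animations="disabled")
    return path
```

A call such as `capture_stable(page, "home.png", [".timestamp", "#ad-slot"])` keeps those regions identical between runs. Masking does not address font rendering differences. Running baseline and current captures on the same operating system or container image is one way to reduce those.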