Introduction to Obvyr
The Testing Confidence Problem
Do your tests actually protect production? Or are you just assuming they do?
Most engineering teams deploy with hope, not evidence:
- Hope that flaky tests aren't hiding real bugs
- Hope that local environment matches CI behaviour
- Hope that your 3,000 tests are all providing value
- Hope that AI-generated code is backed by tests that actually validate it
Obvyr replaces hope with proof.
What Obvyr Does
Obvyr is a testing insights platform that proves test reliability through comprehensive data collection and pattern recognition. Instead of showing you point-in-time test results, Obvyr analyses patterns across thousands of test executions to reveal:
- Which tests are truly flaky vs. which are genuinely broken
- Where your environments diverge between local, CI, and production
- Which tests catch bugs vs. which ones just slow down CI
- Whether AI-generated tests actually validate behaviour or just assume it
You move from assumption-based testing ("we think our tests are good") to evidence-based testing ("we can prove our tests are reliable").
How Obvyr Organises Your Data
To prove test reliability, Obvyr collects and organises test execution data through a flexible hierarchy designed around how engineering teams actually work:
Organisations
Your organisation account represents your company or team. This is where billing is managed and where you control user access across all your projects.
Why it matters: Multi-tenant isolation ensures your test data is completely separate from other organisations, providing both security and clarity in pattern analysis.
Projects
Projects are logical groupings that make sense for your workflow. You might organise by:
- Codebase (one project per repository) - Compare test patterns across different repositories
- Service (frontend, API, mobile app) - Understand test reliability per service
- Team (platform team, product team) - Track team-specific testing practices
- Environment (staging, production) - Analyse environmental test differences
Why it matters: Flexible project organisation lets you analyse test patterns at the granularity that makes sense for your team, whether that's service-level, team-level, or environment-level insights.
Organise for Insights
There's no single "right" way. Use whatever structure helps you analyse your testing data most effectively. The goal is evidence-based insights, not rigid hierarchy.
CLI Agents
Within each project, you'll create CLI agents to collect data from specific types of testing activity. Each CLI agent has its own API key and wraps different commands to capture execution data.
Why it matters: Separate CLI agents for different test types (unit, integration, linting) let you analyse patterns specific to each quality check. You can identify which test types are flaky, which catch the most bugs, and which provide the best ROI.
Example setup:
Wyrd Tech (Organisation)
├── Obvyr API (Project)
│   ├── Typecheck (CLI Agent) - Tracks mypy execution patterns
│   ├── Lint (CLI Agent) - Monitors ruff/black reliability
│   └── Test (CLI Agent) - Analyses pytest behaviour
├── Obvyr CLI (Project)
│   ├── Typecheck (CLI Agent) - Mypy pattern tracking
│   └── Test (CLI Agent) - Pytest execution analysis
└── Obvyr UI (Project)
    ├── Lint (CLI Agent) - ESLint pattern monitoring
    └── Test (CLI Agent) - Vitest execution insights
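To make this concrete, here is a minimal sketch of the wrapped commands the agents in the example above might capture. It assumes every quality check is invoked through the same obvyr-prefix pattern shown in step 4 of the workflow below; the exact tool invocations are illustrative, and per-agent API key configuration is not shown.

```bash
# Obvyr API project (illustrative invocations, assuming the same wrapping
# pattern as the documented pytest example applies to every command)
obvyr mypy .            # Typecheck agent: mypy execution patterns
obvyr ruff check .      # Lint agent: ruff reliability data
obvyr pytest tests/     # Test agent: pytest behaviour

# Obvyr UI project
obvyr eslint src/       # Lint agent: ESLint pattern data
obvyr vitest run        # Test agent: Vitest execution data
```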
Observations
Every time you run a command wrapped by the Obvyr CLI, it creates an observation. This captures:
- Command output (stdout/stderr)
- Execution duration and timing
- User who ran the command
- Environment context and variables
- Test results and framework metadata
Why it matters: Individual observations are data points. Thousands of observations become patterns. Obvyr analyses these patterns to reveal:
- Flaky tests: Tests that fail inconsistently across observations
- Environment drift: Systematic differences between local and CI observations
- Test value: Which tests catch bugs vs. which never fail
- Performance trends: Tests getting slower over time
The Obvyr Difference
What Traditional Testing Shows You
- ✅ Test passed (but was it reliable or just lucky?)
- ❌ Test failed (but is it broken or flaky?)
- 📊 85% coverage (but does that coverage catch bugs?)
- ⏱️ 45-minute CI (but which tests provide value?)
What Obvyr Shows You
- ✅ "This test passed in 847/847 executions across all environments (100% reliable)"
- ❌ "This test failed in 23/150 executions, 91% correlated with CI runner 'ci-3' (environmental issue, not code issue)"
- 📊 "These 234 tests caught 94% of your bugs over 6 months (actual value, not assumed coverage)"
- ⏱️ "These 1,200 tests have never caught a bug and account for 63% of CI time (safe to remove)"
The Obvyr Workflow: From Setup to Insights
1. Set Up Your Structure
Create projects in the Obvyr dashboard that match how your team organises testing
Time: 2 minutes. Value: Clear data organisation for targeted insights
2. Create CLI Agents
Within each project, create CLI agents for different test types you want to monitor
Time: 3 minutes. Value: Separate pattern analysis for unit tests, integration tests, linting, type checking
3. Install and Configure the Obvyr CLI
Install the CLI and configure it with your CLI agent API keys
Time: 2 minutes. Value: Start capturing comprehensive test execution data
4. Wrap Your Commands
Replace `pytest tests/` with `obvyr pytest tests/` (same for any test command)
Time: 1 minute. Value: Zero workflow disruption, immediate data collection
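As a sketch, the change is just a prefix on the command you already run. The pytest line is taken from this step; applying the same pattern to other wrapped commands (such as a Vitest suite) is an assumption based on the example setup earlier on this page.

```bash
# Before: your existing test command
pytest tests/

# After: the same command, wrapped by the Obvyr CLI
obvyr pytest tests/

# Assumed to follow the same prefix pattern for other wrapped commands,
# for example a Vitest test run in a frontend project
obvyr vitest run
```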
5. Analyse the Insights
View patterns, trends, and evidence-based test reliability in the dashboard
Time: Ongoing. Value: Prove test reliability, identify flaky tests, optimise CI, prevent incidents
What's Next?
Ready to prove your tests are reliable?
- Why Obvyr? - Understand the full value proposition and what makes Obvyr different
- Problems Solved - See detailed scenarios of specific testing challenges Obvyr solves
- Getting Started - Set up your first project and start collecting evidence in 10 minutes
From Hope to Proof in 10 Minutes
Stop assuming your tests are reliable. Start proving it. Get started now.