vowl
vowl (vee-owl 🦉): a validation engine for Open Data Contract Standard (ODCS) data contracts. Define your validation rules once in a declarative YAML contract and get rich, actionable reports on your data's quality.
Key Features
- Extensible Check Engine: Ships with a SQL check engine out of the box, with an architecture designed to support custom check types beyond SQL.
- Auto-Generated Rules: Checks are automatically derived from contract metadata (`logicalType`, `logicalTypeOptions`, `required`, `unique`, `primaryKey`) and library metrics (`nullValues`, `missingValues`, `invalidValues`, `duplicateValues`, `rowCount`).
- Any DataFrame, Any Backend: Load any Narwhals-compatible DataFrame type (pandas, Polars, PySpark, etc.) or connect to 20+ backends via Ibis. SQL dialect translation is handled by SQLGlot.
- Server-Side Execution: SQL checks run server-side through Ibis without materialising tables on the client.
- Multi-Source Validation: Validate across tables in different source systems with cross-database joins.
- Declarative ODCS Contracts: Define validation rules in YAML following the Open Data Contract Standard.
- Flexible Filtering: Filter conditions with wildcard pattern matching, ideal for incremental validation of new data.
- Rich Reporting: Detailed summaries, row-level failure analysis, saveable reports, and a chainable `ValidationResult` API.
- No Silent Gaps: Unimplemented or unrecognised checks surface as `ERROR` rather than being quietly skipped, so nothing slips through the cracks.
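To make the auto-generated rules idea concrete, here is a minimal sketch of how `required` and `unique` flags could be turned into SQL checks. It is not vowl's implementation: `derive_checks`, the check names, the metadata dictionaries, and the PASS/FAIL labels are all hypothetical, and the queries run against an in-memory SQLite table using only the standard library.

```python
import sqlite3

# Hypothetical, simplified column metadata in the spirit of an ODCS schema.
columns = [
    {"name": "order_id", "required": True, "unique": True},
    {"name": "customer", "required": True},
]


def derive_checks(table, columns):
    """Return (check_name, sql) pairs; each query counts violations."""
    checks = []
    for col in columns:
        name = col["name"]
        if col.get("required"):
            # A required column must not contain NULLs.
            checks.append((
                f"{name}_not_null",
                f'SELECT COUNT(*) FROM "{table}" WHERE "{name}" IS NULL',
            ))
        if col.get("unique"):
            # A unique column must not contain repeated non-NULL values.
            checks.append((
                f"{name}_unique",
                f'SELECT COUNT(*) FROM (SELECT "{name}" FROM "{table}" '
                f'WHERE "{name}" IS NOT NULL '
                f'GROUP BY "{name}" HAVING COUNT(*) > 1) AS d',
            ))
    return checks


# Small illustrative table: one NULL customer and one repeated order_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "ann"), (2, None), (2, "bob")],
)

results = {}
for check_name, sql in derive_checks("orders", columns):
    violations = conn.execute(sql).fetchone()[0]
    results[check_name] = "PASS" if violations == 0 else "FAIL"
    print(check_name, results[check_name], violations)
```

Each query counts violations, so a result of zero means the check passes. vowl's own generated checks and report format may differ from this sketch.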
Quick Start
```python
import pandas as pd
from vowl import validate_data

df = pd.read_csv("data.csv")

result = validate_data("contract.yaml", df=df)
result.display_full_report()
```
Optional extras: vowl[spark], vowl[all].
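The Quick Start assumes that `contract.yaml` and `data.csv` already exist. The sketch below writes a small pair of files to experiment with, using only the standard library. The contract layout is an assumption based on ODCS v3 field names (`apiVersion`, `kind`, `schema`, `properties`, `quality`); the `orders` table, its columns, and the SQL rule are illustrative, and how vowl maps a DataFrame to a schema object is not confirmed here, so check the vowl and ODCS documentation before relying on it.

```python
import csv
from pathlib import Path

# ASSUMPTION: a minimal ODCS v3-style contract. Field names follow the ODCS
# standard as understood here; vowl may require or ignore some of them.
CONTRACT = """\
apiVersion: v3.1.0
kind: DataContract
id: orders-contract
version: 1.0.0
status: active
schema:
  - name: orders
    properties:
      - name: order_id
        logicalType: integer
        required: true
        unique: true
        primaryKey: true
      - name: amount
        logicalType: number
        required: true
        quality:
          - type: sql
            query: "SELECT COUNT(*) FROM orders WHERE amount < 0"
            mustBe: 0
"""

# Illustrative rows; the last one has a negative amount on purpose.
ROWS = [
    {"order_id": 1, "amount": 10.5},
    {"order_id": 2, "amount": 20.0},
    {"order_id": 3, "amount": -4.0},
]

contract_path = Path("contract.yaml")
data_path = Path("data.csv")

contract_path.write_text(CONTRACT, encoding="utf-8")

with data_path.open("w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=["order_id", "amount"])
    writer.writeheader()
    writer.writerows(ROWS)

print(f"wrote {contract_path} and {data_path}")
```

With both files in the working directory, the Quick Start snippet can be run unchanged.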
License
This project is licensed under the MIT License.