Setting up Custom Data Quality Checks for a Data Product

Data Quality System Overview

The Foundation backend provides a data quality system that lets users configure both automatic and custom data quality rules for their data products. The system supports:

  • Automatic Checks: Generated based on schema definitions (column types, constraints, etc.)

  • Custom Checks: User-defined rules using Great Expectations syntax

  • Quality Scoring: Weighted scoring system with configurable thresholds

  • Validation Execution: Automated quality checks and reporting

Key API Endpoints

1. Custom Expectation Management

Base URL: /api/data/data_product/quality/expectation/custom

Add Custom Expectation

POST /api/data/data_product/quality/expectation/custom?identifier={data_product_id}

Request Body (ExpectationItem):

{
  "type": "expect_column_values_to_be_between",
  "kwargs": {
    "column": "year",
    "min_value": 1980,
    "max_value": 2020
  },
  "meta": {
    "description": "Expect year values to be between 1980 and 2020"
  }
}
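As a rough sketch of how a client might call this endpoint, the Python snippet below builds the POST request without sending it. The path, the identifier query parameter, and the request body come from the example above. The host name, the data product identifier, the token, and the use of bearer-token authentication are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Path taken from the endpoint documented above.
CUSTOM_EXPECTATION_PATH = "/api/data/data_product/quality/expectation/custom"


def build_add_expectation_request(base_url: str, data_product_id: str,
                                  expectation: dict, token: str) -> Request:
    """Build (but do not send) the POST request that adds a custom expectation."""
    query = urlencode({"identifier": data_product_id})
    url = f"{base_url.rstrip('/')}{CUSTOM_EXPECTATION_PATH}?{query}"
    return Request(
        url,
        data=json.dumps(expectation).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; adjust to your deployment.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Same ExpectationItem as in the request body above.
expectation = {
    "type": "expect_column_values_to_be_between",
    "kwargs": {"column": "year", "min_value": 1980, "max_value": 2020},
    "meta": {"description": "Expect year values to be between 1980 and 2020"},
}

# Hypothetical host, data product identifier, and token.
request = build_add_expectation_request(
    "https://foundation.example.com", "my-data-product", expectation, token="<token>"
)
```

Once the host and token are real, the request can be sent with urllib.request.urlopen(request). Building the request separately from sending it keeps the URL and payload easy to inspect or log first.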

Update Custom Expectation

Delete Custom Expectation

2. Quality Configuration

Get Current Expectations

Update Quality Weights

Request Body:

Update Quality Thresholds

Request Body:

3. Quality Execution and Results

Run Quality Checks

Get Validation Results

Get Quality Overview

Custom Expectation Types

The system supports all Great Expectations expectation types. Here are common examples:
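As an illustration, the sketch below assembles one ExpectationItem-shaped payload for each category listed in this section. The expectation names and their keyword arguments are standard Great Expectations ones. The column names, descriptions, and the expectation_item helper are hypothetical.

```python
import json


def expectation_item(expectation_type: str, description: str, **kwargs) -> dict:
    """Assemble a dict with the type/kwargs/meta fields of an ExpectationItem."""
    return {
        "type": expectation_type,
        "kwargs": kwargs,
        "meta": {"description": description},
    }


# Column names below are placeholders; replace them with columns from your schema.
COMMON_EXPECTATIONS = {
    "column_value": expectation_item(
        "expect_column_values_to_be_in_set",
        "Status must be one of the known values",
        column="status",
        value_set=["active", "inactive"],
    ),
    "column_type": expectation_item(
        "expect_column_values_to_be_of_type",
        "Year must be stored as an integer",
        column="year",
        # Accepted type names depend on the execution engine (pandas, Spark, SQL).
        type_="int64",
    ),
    "uniqueness": expectation_item(
        "expect_column_values_to_be_unique",
        "Identifiers must not repeat",
        column="id",
    ),
    "null_value": expectation_item(
        "expect_column_values_to_not_be_null",
        "Identifiers must always be present",
        column="id",
    ),
    "regex_pattern": expectation_item(
        "expect_column_values_to_match_regex",
        "Email must contain a single @ separator",
        column="email",
        regex=r"^[^@\s]+@[^@\s]+$",
    ),
}

# Each value can be serialized and used as the body of the add-expectation request.
payloads = {name: json.dumps(item) for name, item in COMMON_EXPECTATIONS.items()}
```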

Column Value Expectations

Column Type Expectations

Uniqueness Expectations

Null Value Expectations

Regex Pattern Expectations

Complete Workflow

1. Configure Custom Expectations

2. Set Quality Weights (Optional)

3. Run Quality Checks

4. Review Results
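The four steps above can be sketched as a small orchestration function. This is only an outline of the call order. QualityClient and its method names are hypothetical and do not correspond to a published SDK, and the weights payload is passed through untouched because its shape is not specified in this section.

```python
from typing import Optional, Protocol


class QualityClient(Protocol):
    """Hypothetical client interface; method names are illustrative only."""

    def add_custom_expectation(self, data_product_id: str, expectation: dict) -> None: ...
    def update_weights(self, data_product_id: str, weights: dict) -> None: ...
    def run_checks(self, data_product_id: str) -> None: ...
    def get_validation_results(self, data_product_id: str) -> dict: ...


def run_quality_workflow(client: QualityClient, data_product_id: str,
                         expectations: list, weights: Optional[dict] = None) -> dict:
    """Run the documented steps in order and return the validation results."""
    # Step 1: configure custom expectations.
    for expectation in expectations:
        client.add_custom_expectation(data_product_id, expectation)
    # Step 2 (optional): set quality weights only when the caller provides them.
    if weights is not None:
        client.update_weights(data_product_id, weights)
    # Step 3: run the quality checks.
    client.run_checks(data_product_id)
    # Step 4: fetch the results for review.
    return client.get_validation_results(data_product_id)


class RecordingClient:
    """In-memory stand-in that records calls instead of contacting a server."""

    def __init__(self) -> None:
        self.calls = []

    def add_custom_expectation(self, data_product_id, expectation):
        self.calls.append(("add_custom_expectation", data_product_id))

    def update_weights(self, data_product_id, weights):
        self.calls.append(("update_weights", data_product_id))

    def run_checks(self, data_product_id):
        self.calls.append(("run_checks", data_product_id))

    def get_validation_results(self, data_product_id):
        self.calls.append(("get_validation_results", data_product_id))
        return {}


client = RecordingClient()
results = run_quality_workflow(
    client,
    "my-data-product",  # hypothetical identifier
    [{"type": "expect_column_values_to_not_be_null", "kwargs": {"column": "id"}}],
)
```

A real client would implement the same four methods on top of the HTTP endpoints; the recording stand-in is there so the ordering can be checked without a running backend.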

Authentication & Permissions

Quality management endpoints require one of the following permissions, depending on the operation:

  • Manage permissions for creating/updating/deleting expectations

  • Read permissions for viewing results

  • Browse permissions for quality overview

The system uses the IAM framework to control access to data products and their quality configurations.
