Monitoring Data Quality

Foundation provides continuous, automated data quality monitoring that ensures your data products maintain high standards of reliability and trustworthiness throughout their lifecycle.

How Data Quality Checks Work

Foundation continuously validates data against configured quality rules as it flows through the platform:

  • Rules execute automatically during data ingestion and transformation

  • Each data product can have multiple quality checkpoints

  • Validation occurs before data reaches downstream consumers

  • Failed validations can trigger alerts or halt data pipelines based on severity
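The severity-based behavior described above can be sketched in Python. This is an illustrative sketch only, not Foundation's implementation; the severity names and messages are assumptions.

```python
# Hedged sketch: a failed validation either raises an alert or halts
# the pipeline, depending on the rule's configured severity.
def handle_failure(rule_name: str, severity: str) -> str:
    if severity == "critical":
        # Critical failures stop the pipeline before data reaches consumers
        return f"HALT pipeline: {rule_name} failed"
    # Lower-severity failures only notify the data product owners
    return f"ALERT: {rule_name} failed"

print(handle_failure("null_check", "critical"))
print(handle_failure("format_check", "warning"))
```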

Comprehensive Rule Library

The platform offers an extensive set of pre-built quality rules that users can configure:

  • Column-level validations: Check that values match expected formats, ranges, or patterns

  • Cross-column rules: Validate relationships between multiple fields (e.g., end_date > start_date)

  • Dataset completeness: Ensure expected records are present and accounted for

  • Referential integrity: Verify foreign key relationships and data consistency across products

  • Business logic checks: Implement custom rules specific to your domain requirements
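To make the rule categories above concrete, here is a minimal Python sketch of a column-level check, a cross-column check, and a business-logic check applied to one record. The record fields and function names are illustrative assumptions, not Foundation's rule syntax.

```python
from datetime import date

# Hypothetical record -- Foundation's actual rule configuration differs;
# this only illustrates the kinds of checks the rule library covers.
record = {"email": "user@example.com",
          "start_date": date(2024, 1, 1),
          "end_date": date(2024, 6, 30),
          "amount": 150.0}

def check_email_format(rec):
    # Column-level validation: value matches an expected pattern
    return "@" in rec["email"] and "." in rec["email"].split("@")[-1]

def check_date_order(rec):
    # Cross-column rule: end_date must come after start_date
    return rec["end_date"] > rec["start_date"]

def check_amount_positive(rec):
    # Business-logic check: a domain-specific constraint
    return rec["amount"] > 0

results = {c.__name__: c(record)
           for c in (check_email_format, check_date_order, check_amount_positive)}
print(results)
```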

Default Data Quality Checks

Foundation automatically defines a series of data quality validations for every new data product to verify its basic health. These default tests do not include the kind of business-logic validation that only a subject matter expert can provide. If no custom checks are defined, all that Foundation can assure is that no rows have been deleted unintentionally, that no columns are lost in the process, that data types remain stable unless deliberately changed, and so on.
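The baseline checks described above can be sketched by comparing two snapshots of a data product. This is a hedged sketch of the idea, not Foundation's implementation; the snapshot structure is an assumption.

```python
# Illustrative snapshots of a data product before and after an update.
previous = {"rows": 1000, "dtypes": {"id": "int", "name": "string"}}
current  = {"rows": 1000, "dtypes": {"id": "int", "name": "string"}}

def baseline_checks(prev, curr):
    """Default health checks: row retention, column retention, dtype stability."""
    return {
        # No rows deleted unintentionally
        "no_rows_lost": curr["rows"] >= prev["rows"],
        # No columns lost in the process
        "no_columns_lost": set(prev["dtypes"]) <= set(curr["dtypes"]),
        # Data types unchanged for columns that existed before
        "dtypes_stable": all(curr["dtypes"].get(col) == dtype
                             for col, dtype in prev["dtypes"].items()),
    }

print(baseline_checks(previous, current))
```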

Integration with Data Contracts

Quality monitoring seamlessly integrates with Foundation's data contracts:

  • SLA enforcement: Monitor whether data products meet their promised quality levels

  • Automated documentation: Quality metrics are automatically included in data product metadata

  • Trust scores: Generate composite quality scores that factor into data product discovery
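One way to picture a composite trust score is a weighted average of per-dimension scores. Foundation's actual scoring formula is not documented here; the weights and dimension names below are assumptions for illustration.

```python
def trust_score(dimension_scores, weights=None):
    """Hypothetical composite score: weighted average of per-dimension scores."""
    # Default to equal weights when none are supplied
    weights = weights or {d: 1.0 for d in dimension_scores}
    total_weight = sum(weights[d] for d in dimension_scores)
    return sum(dimension_scores[d] * weights[d]
               for d in dimension_scores) / total_weight

scores = {"completeness": 0.98, "validity": 0.90, "consistency": 1.0}
print(round(trust_score(scores), 3))  # equal weights -> 0.96
```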

This comprehensive quality monitoring ensures that data consumers can trust the data products they use, while data producers maintain visibility into their data quality performance, creating a culture of data excellence across the organization.

Monitoring Data Quality through the Data Quality Hub

  1. Navigate to the Data Quality page in the Navigation Bar

This page gives you a 10,000-foot view of data quality across the data products you have access to. On the Overview page, you can observe:

  • Number of data products included in the dashboard

  • Last time the dashboard was updated

  • Average score for the data products that have data quality checks set up (and the number of data products excluded)

  • Number of data products with a Low data quality score (defined as below 90%)

  • Number of data products with Healthy data quality (defined as above 90% score)

  • Evolution of average score over the last 30 days

  • Distribution of data products across scores

  • Total rows over time (summed across all data products)

  • Total columns over time (summed across all data products)

  • Percentage of data products with low score for each dimension (defined as below 90%)

    • You can click on the number of data products with a low score (shown under the percentage) to list them in the Data Products tab
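The health thresholds above can be expressed as a small classifier. The 90% boundary is stated in the text, but whether exactly 90% counts as Healthy is not specified; this sketch assumes it does, and treats products with no checks configured as Unknown.

```python
def health_status(score):
    """Classify a data product's quality score per the Overview thresholds."""
    if score is None:
        # No data quality checks set up: excluded from the average
        return "Unknown"
    # Assumption: a score of exactly 90% counts as Healthy
    return "Healthy" if score >= 0.90 else "Low Score"

print([health_status(s) for s in (0.95, 0.82, None)])
```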

When you click on the "Data Products" tab, you see the list of all data products you have access to.

This view allows users to sort the data products by quality score and to filter them by Health status and by Data Quality Dimension issue.

For each data product, the user can monitor:

  • Number of rows

  • Number of columns

  • Last time it was updated

  • Quality score

  • Quality status (Healthy, Unknown, Low Score)

  • Quality score for each dimension

  • Description

  • Row count over time

  • Column count over time

You can click on "view more" to visit the Data Product page and review additional details.

You can also click on each dimension's box to review the quality checks that passed or failed.

Monitoring Data Quality through the Data Catalog

  1. Navigate to the Data Catalog page

  2. Search for the Data Product you want to monitor the data quality for

  3. Click on the Quality tab

You will be able to find:

  • Overall Score (a percentage based on how many data quality checks passed successfully)

  • Last time the data quality checks were run

  • Score per data quality dimension

  • Failed expectations for each dimension, if any.

N/A is shown when no expectations were defined for a particular dimension.
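Putting the two statements above together, a per-dimension score is the percentage of checks that passed, falling back to N/A when no expectations exist for that dimension. A minimal sketch:

```python
def dimension_score(passed: int, total: int) -> str:
    """Score for one dimension: percentage of checks passed, or N/A."""
    if total == 0:
        # No expectations defined for this dimension
        return "N/A"
    return f"{100 * passed / total:.0f}%"

print(dimension_score(9, 10))  # 9 of 10 checks passed -> "90%"
print(dimension_score(0, 0))   # no expectations defined -> "N/A"
```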

Monitoring Data Quality through the API

Foundation offers API endpoints to read a Data Product's profile and quality validation results:

/api/data/data_product/quality/validations
/api/data/data_product/
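A minimal sketch of calling the first endpoint from Python using only the standard library. The base URL, token, and authentication scheme are placeholders; substitute your Foundation host and credentials, and consult the API documentation for the exact request shape.

```python
from urllib import request

# Hypothetical host and token -- replace with your own values.
BASE_URL = "https://foundation.example.com"
TOKEN = "YOUR_API_TOKEN"

def quality_validations_request(base_url: str, token: str) -> request.Request:
    """Build a GET request for a data product's quality validations."""
    url = f"{base_url}/api/data/data_product/quality/validations"
    # Bearer auth is an assumption; check the Foundation API docs.
    return request.Request(url, headers={"Authorization": f"Bearer {token}"})

req = quality_validations_request(BASE_URL, TOKEN)
print(req.full_url)
# To execute against a live host: request.urlopen(req).read()
```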

Read the Using the Foundation APIs page to make the best use of these endpoints.
