Managing Master & Reference Data
Overview
Master Data Management (MDM) is essential for maintaining data quality, consistency, and governance across your organization. Foundation provides built-in capabilities to categorize and manage your most critical data assets through Master and Reference data product categories, enabling you to establish clear data ownership, lineage, and quality standards.
This guide explains how to implement effective master data management practices using Foundation's data product categorization features.
Understanding Master vs Reference Data
Master Data
Master data represents the core business entities that are shared across your organization and require consistent management. These are your critical business objects that multiple systems and processes depend on.
Common examples include:
Customer records (accounts, contacts, demographics)
Product information (SKUs, specifications, pricing)
Employee data (personnel records, organizational structure)
Asset information (equipment, facilities, inventory)
Supplier and vendor details
Characteristics:
High business value and impact
Shared across multiple domains and use cases
Requires strict governance and quality controls
Changes infrequently but needs careful management
Single source of truth for the business entity
Reference Data
Reference data consists of standardized lookup values, codes, and classifications that provide context and consistency across your data ecosystem. This data categorizes and validates other data.
Common examples include:
Country and currency codes
Status codes (order status, account status)
Product categories and hierarchies
Industry classifications
Units of measurement
Time zones and calendar data
Characteristics:
Relatively static and standardized
Used for validation and categorization
Often industry-standard or regulatory-defined
Lower volume but high reuse across systems
Provides consistent terminology
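To make the validation role of reference data concrete, here is a minimal sketch in Python. The country codes and record fields are illustrative assumptions, not part of Foundation:

```python
# Reference data in action: a standardized lookup set (ISO 3166-1
# alpha-2 country codes, shown here as a small illustrative sample)
# used to validate records in other datasets.
VALID_COUNTRY_CODES = {"US", "GB", "DE", "FR", "JP"}

def validate_country(record: dict) -> bool:
    """Return True if the record's country code exists in the reference set."""
    return record.get("country_code") in VALID_COUNTRY_CODES

orders = [
    {"order_id": 1, "country_code": "US"},
    {"order_id": 2, "country_code": "XX"},  # not in the reference set
]
invalid = [o for o in orders if not validate_country(o)]
print([o["order_id"] for o in invalid])  # [2]
```

Because every system validates against the same reference set, the codes stay consistent everywhere they are used.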
Why Categorize Data Products
Properly categorizing data products as Master or Reference types provides several governance and operational benefits:
Data Governance
Establishes clear ownership and accountability for critical data
Enables appropriate access controls and security policies
Supports compliance with data protection regulations
Creates audit trails for sensitive business entities
Data Quality
Focuses quality improvement efforts on high-value data products
Sets the expectation of stricter validation rules for master data
Reduces duplication and inconsistency through stronger quality rules
Facilitates data stewardship activities
Operational Efficiency
Helps users quickly identify authoritative data sources
Improves data discovery through meaningful categorization
Guides integration and architecture decisions
Supports impact analysis when changes are needed
Decision Intelligence
Ensures AI models use trusted, high-quality data
Provides context for analytics and reporting
Enables consistent business metrics across the organization
Creating Master and Reference Data Products in Foundation
Prerequisites
Before starting, ensure you have:
The necessary permissions to create and manage data products in Foundation
Existing data products created in Foundation
Understanding of your organization's critical business entities
Defined data governance policies and ownership structure
Step 1: Identify Critical Entities
Review your existing data products and identify which ones represent master or reference data:
For Master Data, ask:
Is this a core business entity used across multiple domains?
Does this data require strict governance and quality controls?
Would inconsistencies in this data significantly impact business operations?
Is this the authoritative source for this entity?
For Reference Data, ask:
Does this provide standardized codes or classifications?
Is this used primarily for validation or categorization?
Is this data relatively static and standardized?
Does this support data consistency across systems?
Step 2: Build the Master or Reference Data Product with the Right Transformations
To learn how to create data products, please visit Creating and Managing Data Products.
To learn how to use transformations, please visit Configuring Data Transformations through the UI or Using the API to implement Data Transformations.
Considerations for Master Data Products
When combining data from multiple sources, establish rules for which source takes precedence:
Source Priority: Specify which source is authoritative for each field
Example: Salesforce for customer contact info, SAP for billing address
Most Recent Wins: Use the latest updated value across sources
Most Complete Wins: Choose the record with fewest null values
Custom Business Rules: Apply specific logic (e.g., "Use ERP price unless promotional price exists in e-commerce")
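The precedence rules above can be sketched as field-level survivorship logic. This is a hedged illustration, not Foundation's API: the source names, field names, and record shape are assumptions.

```python
from datetime import datetime

# Field-level source priority, with "most recent wins" as the fallback
# for fields that have no explicit priority. Source and field names are
# illustrative (Salesforce for contact info, SAP for billing).
SOURCE_PRIORITY = {
    "email": ["salesforce", "sap"],
    "billing_address": ["sap", "salesforce"],
}

def merge_golden_record(records: list[dict]) -> dict:
    """Merge per-source records into a single golden record."""
    golden = {}
    fields = {f for r in records for f in r if f not in ("source", "updated_at")}
    for field in fields:
        candidates = [r for r in records if r.get(field) is not None]
        priority = SOURCE_PRIORITY.get(field)
        if priority:
            # Source priority: the first listed source with a value wins.
            candidates.sort(key=lambda r: priority.index(r["source"])
                            if r["source"] in priority else len(priority))
        else:
            # Most recent wins when no source priority is defined.
            candidates.sort(key=lambda r: r["updated_at"], reverse=True)
        if candidates:
            golden[field] = candidates[0][field]
    return golden

records = [
    {"source": "salesforce", "updated_at": datetime(2024, 5, 1),
     "email": "a@example.com", "billing_address": None},
    {"source": "sap", "updated_at": datetime(2024, 6, 1),
     "email": "old@example.com", "billing_address": "1 Main St"},
]
golden = merge_golden_record(records)
# golden["email"] comes from Salesforce, golden["billing_address"] from SAP
```

Keeping the priority map as data (rather than hard-coded branches) makes it easy to review with data stewards and to extend as new sources are added.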
Best Practice: Start with a minimal viable transformation pipeline and add complexity iteratively. Test thoroughly at each stage before adding more transformations.
Performance Tip: For large datasets, consider using incremental processing transformations that only process changed records rather than the full dataset on each refresh.
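A common way to implement incremental processing is a watermark: remember the newest update timestamp processed so far and skip everything older on the next refresh. The sketch below assumes records carry an updated_at field; the transform step is a placeholder.

```python
from datetime import datetime

def transform(record: dict) -> None:
    # Placeholder for the real transformation pipeline.
    record["processed"] = True

def process_increment(records: list[dict], watermark: datetime) -> datetime:
    """Apply the pipeline only to records changed since the last run."""
    changed = [r for r in records if r["updated_at"] > watermark]
    for record in changed:
        transform(record)
    # Advance the watermark to the newest record seen, or keep it unchanged.
    return max((r["updated_at"] for r in changed), default=watermark)
```

On each refresh, persist the returned watermark and pass it back in next time, so the cost of a run scales with the number of changed records rather than the full dataset.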
Step 3: Access Data Product Configuration
Please visit Managing Data Product Metadata to understand how to configure or edit a data product.
Step 4: Set the Data Product Category and other Metadata
In the Data Product Category field, select one of the following:
Master - For core business entities
Reference - For lookup values and classifications
Add supporting metadata.
Configure governance settings specific to master/reference data:
Data Quality Rules: Define completeness, accuracy, and validity checks
Update Frequency: Specify expected refresh schedules
Access Controls: Implement stricter permissions if needed
Click Save to apply the categorization
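The completeness and validity checks mentioned above might look like the following sketch. The field names and the simple regex are assumptions for illustration; Foundation's own quality-rule configuration may differ.

```python
import re

# A permissive email pattern: something@something.tld, no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def check_quality(record: dict) -> list[str]:
    """Return the names of any failed rules for a customer record."""
    failures = []
    # Completeness: the key identifier must be present.
    if not record.get("customer_id"):
        failures.append("completeness:customer_id")
    # Validity: if an email is present, it must be well-formed.
    if record.get("email") and not EMAIL_RE.match(record["email"]):
        failures.append("validity:email")
    return failures
```

Returning rule names rather than a single pass/fail flag makes it straightforward to aggregate failures into the quality metrics you monitor later.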
Step 5: Understand Data Lineage
For master and reference data products, thorough lineage documentation is critical:
Navigate to the Lineage tab of your data product
Review the automatically generated lineage graph showing:
All connected data sources
Source-aligned data products
Each transformation step applied
The final master data product
Review downstream dependencies:
Use the Lineage UI to identify which data products and applications consume this master data
Understand the impact radius for potential changes
Document the transformation pipeline:
List each transformation applied and why
Explain how data quality improves through the pipeline
Note any data loss or filtering that occurs
Read Exploring Data Lineage to learn more.
Best Practices
Start with High-Impact Entities
Focus your initial MDM efforts on the master data that has the highest business impact:
Begin with customer or product data if you're in retail
Start with asset or equipment data if you're in logistics or manufacturing
Prioritize employee data if you're implementing HR analytics
Establish Clear Ownership
For each master data product:
Assign an owner from the domain that knows the data best
Designate a data steward responsible for day-to-day quality
Document escalation paths for data issues
Create a RACI matrix for data governance activities
Design Transformation Pipelines for Maintainability
When building transformation pipelines for master data:
Use descriptive names for each transformation step
Document the business rationale for each transformation
Keep transformations modular and reusable
Test transformations independently before chaining
Version your transformation logic alongside the data product
Favor library transformations over custom code for maintainability
Implement Progressive Governance
Don't try to enforce perfect governance from day one:
Phase 1: Categorize and document existing master data
Phase 2: Implement basic quality rules and monitoring
Phase 3: Add approval workflows and access controls
Phase 4: Establish formal data contracts and SLAs
Phase 5: Continuously improve based on usage patterns
Monitor and Maintain
Master data management is an ongoing process:
Review quality metrics weekly or monthly
Conduct quarterly reviews of categorizations
Update documentation as business needs evolve
Gather feedback from data consumers
Track and resolve quality issues promptly
Refine transformation logic based on discovered data issues
Common Use Cases
Customer Master Data
Scenario: Creating a single customer view across CRM, ERP, and support systems
Implementation:
Connect data sources: Salesforce (CRM), SAP (ERP), Zendesk (Support)
Create source-aligned data products for each system
Build a "Customer Master" consumer-aligned data product
Apply transformations from the library:
Standardize customer names and addresses
Normalize phone numbers and email addresses
Deduplicate records using fuzzy matching on name + address
Merge records with golden record logic (Salesforce for contact info, SAP for billing)
Enrich with geography data (state, country from postal code)
Validate email addresses and flag invalid entries
Categorize the Data Product as Master Data
Configure data quality rules for:
Unique customer IDs
Valid email formats and phone numbers
Complete address information
Review the lineage graph showing the transformation pipeline
Implement access controls for PII compliance
Monitor usage and quality metrics
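The fuzzy-matching deduplication in step 4 can be sketched with the standard library. The 0.9 threshold and record fields are assumptions; production pipelines typically use dedicated matching libraries plus blocking keys to avoid comparing every pair.

```python
from difflib import SequenceMatcher

def similarity(a: dict, b: dict) -> float:
    """Compare two records on a combined name + address key."""
    key_a = f"{a['name']} {a['address']}".lower()
    key_b = f"{b['name']} {b['address']}".lower()
    return SequenceMatcher(None, key_a, key_b).ratio()

def deduplicate(records: list[dict], threshold: float = 0.9) -> list[dict]:
    """Keep the first record of each fuzzy-matched cluster."""
    unique: list[dict] = []
    for record in records:
        if not any(similarity(record, kept) >= threshold for kept in unique):
            unique.append(record)
    return unique

records = [
    {"name": "Acme Corp", "address": "123 Main St"},
    {"name": "ACME Corp.", "address": "123 Main St."},  # near-duplicate
    {"name": "Beta LLC", "address": "9 Oak Ave"},
]
deduped = deduplicate(records)  # the near-duplicate is dropped
```

Which record survives a cluster is itself a survivorship decision; here the first one wins, but you could instead keep the most complete record, as described earlier.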
Product Reference Data
Scenario: Maintaining standardized product categories and hierarchies
Implementation:
Connect to PIM system and e-commerce platform
Create source-aligned data products
Build "Product Categories" consumer-aligned data product
Apply transformations:
Standardize category names (title case, trim whitespace)
Build hierarchical relationships (parent-child)
Validate hierarchy completeness (no orphaned categories)
Add category descriptions from multiple sources
Categorize as Reference Data
Configure quality checks for:
Complete category hierarchies
Standardized naming conventions
No orphaned categories
Review the lineage graph showing the transformation pipeline
Grant broad read access, since reference data is meant to be reused widely across systems
Monitor usage and quality metrics
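The orphaned-category check used above is simple to express: every category's parent must exist in the dataset, and root categories have no parent. Field names here are assumptions.

```python
def find_orphans(categories: list[dict]) -> list[str]:
    """Return IDs of categories whose parent is missing from the dataset."""
    ids = {c["id"] for c in categories}
    return [c["id"] for c in categories
            if c["parent_id"] is not None and c["parent_id"] not in ids]

categories = [
    {"id": "electronics", "parent_id": None},          # root category
    {"id": "laptops", "parent_id": "electronics"},
    {"id": "gaming-laptops", "parent_id": "laptps"},   # typo: orphaned
]
print(find_orphans(categories))  # ['gaming-laptops']
```

Running this as a quality check on every refresh catches hierarchy breaks introduced upstream (renamed or deleted parent categories) before consumers see them.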
Employee Master Data
Scenario: Centralizing HR data for analytics and operations
Implementation:
Connect to HRIS (Workday), Active Directory, and Payroll system
Create source-aligned data products for each
Build "Employee Master" consumer-aligned data product
Apply transformations:
Standardize employee names
Deduplicate based on employee ID
Build organizational hierarchy from Active Directory
Join compensation data from payroll (with strict access controls)
Calculate tenure and other derived fields
Validate required fields (manager, department, hire date)
Categorize as Master Data with high sensitivity
Configure strict access controls and data masking for compensation fields
Implement quality rules for required fields
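The "tenure and other derived fields" step above can be sketched as a small derivation from the hire date. Field names are assumptions; the subtraction accounts for whether the anniversary has passed this year.

```python
from datetime import date

def add_tenure(employee: dict, as_of: date) -> dict:
    """Return a copy of the employee record with tenure in whole years."""
    hire = employee["hire_date"]
    # Subtract one year if the hire anniversary hasn't occurred yet.
    years = as_of.year - hire.year - ((as_of.month, as_of.day) < (hire.month, hire.day))
    return {**employee, "tenure_years": years}

emp = add_tenure({"employee_id": "E1", "hire_date": date(2019, 6, 15)},
                 as_of=date(2024, 6, 14))
# One day before the five-year anniversary: tenure_years is 4
```

Computing derived fields in the pipeline, rather than in each downstream report, is what keeps metrics like average tenure consistent across consumers.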