Data objects store raw data ingested into Foundation without any transformation. They function as SQL tables that hold the data exactly as it arrives from the source systems.
When to Use Data Objects
Raw data preservation: Store original data for audit, compliance, or historical purposes
Multiple product feeds: Make raw data available to multiple data products
Decoupled processing: Separate data ingestion from transformation processes
Source integrity: Maintain data exactly as received from external systems
Creating a Data Object
Step 1: Create Data Object
Endpoint: POST /api/data/data_object
{
  "entity": {
    "name": "Customer Transactions",
    "entity_type": "data_object",
    "label": "CTX",
    "description": "Raw customer transaction data from payment system"
  },
  "entity_info": {
    "owner": "[email protected]",
    "contact_ids": ["Data Object contact"],
    "links": ["example.com"]
  }
}
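As a rough illustration, the request for Step 1 could be assembled in Python as follows. This is a minimal sketch and not an official client: the host in `BASE_URL`, the helper name `build_create_data_object_request`, the bearer-token header, and the placeholder owner address are all assumptions made for the example. Only the endpoint path and the body fields come from the text above.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with your Foundation deployment's base URL.
BASE_URL = "https://foundation.example.com"


def build_create_data_object_request(name, label, description, owner,
                                     contact_ids, links, token=None):
    """Build (but do not send) the POST request that creates a data object."""
    payload = {
        "entity": {
            "name": name,
            "entity_type": "data_object",
            "label": label,
            "description": description,
        },
        "entity_info": {
            "owner": owner,
            "contact_ids": list(contact_ids),
            "links": list(links),
        },
    }
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: bearer-token auth; the source does not name the auth scheme.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{BASE_URL}/api/data/data_object",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_create_data_object_request(
    name="Customer Transactions",
    label="CTX",
    description="Raw customer transaction data from payment system",
    owner="owner@example.com",  # placeholder address for illustration
    contact_ids=["Data Object contact"],
    links=["example.com"],
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect or log before anything reaches the API.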
Step 2: Link to Data Source
Endpoint: POST /api/data/link/data_source/data_object
Parameters:
identifier: Data source identifier
child_identifier: Data object identifier
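The link call might look like the sketch below. It assumes that both parameters are passed in the query string, mirroring how `identifier` appears in the Step 3 endpoint; the source does not say whether they belong in the query or the body. `BASE_URL`, `build_link_request`, and the sample identifiers are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder host; replace with your Foundation deployment's base URL.
BASE_URL = "https://foundation.example.com"


def build_link_request(data_source_id, data_object_id):
    """Build (but do not send) the POST request linking a source to an object."""
    # Assumption: parameters are sent as query-string values.
    query = urllib.parse.urlencode({
        "identifier": data_source_id,        # the data source (parent)
        "child_identifier": data_object_id,  # the data object (child)
    })
    return urllib.request.Request(
        url=f"{BASE_URL}/api/data/link/data_source/data_object?{query}",
        method="POST",
    )


# Hypothetical identifiers, for illustration only.
link_request = build_link_request("source-123", "object-456")
# To actually send it: urllib.request.urlopen(link_request)
```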
Step 3: Configure Data Object
Endpoint: PUT /api/data/data_object/config?identifier={data_object_id}
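A sketch of the configuration call is shown here. The source gives the endpoint and the `identifier` query parameter but not the request body schema, so the body is treated as an opaque dictionary supplied by the caller. The keys in `example_config`, the `BASE_URL` host, and the helper name are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder host; replace with your Foundation deployment's base URL.
BASE_URL = "https://foundation.example.com"


def build_config_request(data_object_id, config):
    """Build (but do not send) the PUT request that configures a data object."""
    query = urllib.parse.urlencode({"identifier": data_object_id})
    return urllib.request.Request(
        url=f"{BASE_URL}/api/data/data_object/config?{query}",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical body: the real schema is not shown in this document.
example_config = {"resource_type": "csv", "path": "path/to/file.csv"}
config_request = build_config_request("object-456", example_config)
# To actually send it: urllib.request.urlopen(config_request)
```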
Python Functions
Configuration Options
Supported Resource Types
"csv" - Comma-separated values
"json" - JSON format
"parquet" - Parquet format
"avro" - Avro format
"jdbc" - Database tables
CSV Configuration Fields
path: File location in the data source
has_header: First row contains column names
delimiter: Value separator, for example "," or ";" or "|"
quote_char: Character for quoting values
escape_char: Character for escaping special characters
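To see what these fields control, the sketch below applies them to a small in-memory sample using Python's standard `csv` module. It only illustrates the meaning of each field; Foundation's own parser may behave differently. The `CsvConfig` class, its default values, and the `read_rows` helper are assumptions, since the source does not state defaults.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class CsvConfig:
    path: str                 # file location in the data source
    has_header: bool = True   # assumed default
    delimiter: str = ","      # assumed default
    quote_char: str = '"'     # assumed default
    escape_char: str = "\\"   # assumed default


def read_rows(text, config):
    """Parse CSV text according to the config fields."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=config.delimiter,
        quotechar=config.quote_char,
        escapechar=config.escape_char,
    )
    rows = list(reader)
    if config.has_header and rows:
        # The first row supplies the column names for the remaining rows.
        header, body = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in body]
    return rows


# A semicolon-delimited sample; the quoted value contains the delimiter itself.
sample = 'id;note\n1;"semi;colon inside quotes"\n2;plain\n'
config = CsvConfig(path="transactions.csv", delimiter=";")
records = read_rows(sample, config)
```

With `has_header` set, each record is keyed by column name, and the quote character keeps the embedded ";" inside a single value instead of splitting it.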