Auditing Logs

Foundation emits audit logs for every incoming event and also derives detailed logs for user API interactions and entity-level actions. These logs help with troubleshooting, governance, and analytics.

Approach

Foundation treats Audit Logs as data products. These data products have key characteristics:

  • Data Product Type: System as they are automatically generated by Foundation

  • Organization: Root

  • Sensitivity: These data products contain personal data and confidential information, but cannot be classified before the Classification and Sensitivity levels are defined, as part of the Data Governance set up.

  • Access: Only Admin roles can access them (as they are the only roles with access to the Root Organization)

  • Consumption: in order to check the logs, users can query them following the same processes as set in Querying Data inside Foundation

  • Storage: These data products are not stored in the Object Storage or using Iceberg (as the rest). They are stored in a database.

What gets logged

  • Raw event audit log

    • Data Product Name: _foundation_events_audit_log

    • Table for queries:

      processor.public."foundation_events_audit_log"
    • One row per event (upserted by span_id)

    • Contains full serialized event, message metadata, timestamps

  • User API interaction log

    • Data Product Name: _user_api_interaction_log

    • Table for queries:

      processor.public."user_api_interaction_log"
    • One row per API event

    • Contains HTTP method/path, query/body, response status, user/session context (IP, agent, org, roles)

  • Entity events log

    • Data Product Name: _entity_events_logs

    • Table for queries:

      processor.public."entity_events_logs"
    • One row per entity action event

    • Contains actor user_id, event_timestamp, message_type (action), entity_id, and entity_details

Update cadence

  • Raw audit: upserted on every event by span_id

  • Detailed logs: appended on every event; created_at is set server-side

  • End-to-end latency is the event processing time (near real-time)

Common use cases

  • Troubleshooting and replay: join detailed logs to raw event for exact payloads

  • Security and compliance: who did what to which entity and when

  • API analytics: path/method usage, response codes, error rates, session/role insights

  • Change timelines: per entity_id across time

How to query

  • Recent API calls by a user (last 7 days)

SELECT
  event_timestamp,
  request_http_method,
  request_path,
  response_status,
  user_ip_address,
  x_session_id
FROM processor.public."user_api_interaction_log"
WHERE user_id = $1
  AND event_timestamp >= NOW() - INTERVAL '7 days'
ORDER BY event_timestamp DESC
LIMIT 200;
  • Entity changes over a period

SELECT
  event_timestamp,
  user_id AS actor,
  message_type AS action,
  entity_id,
  entity_details
FROM processor.public."entity_events_logs"
WHERE entity_id = $1
  AND event_timestamp BETWEEN $2 AND $3
ORDER BY event_timestamp DESC;
  • Join detailed logs with the raw event by span_id

SELECT
  feal.event_timestamp,
  feal.message_type,
  feal.message_scope,
  eelog.user_id,
  eelog.entity_id,
  feal.event  -- full JSON
FROM processor.public."foundation_events_audit_log" feal
JOIN processor.public."entity_events_logs" eelog
  ON eelog.span_id = feal.span_id
WHERE eelog.entity_id = $1
ORDER BY feal.event_timestamp DESC
LIMIT 100;
  • API error-rate per path (last 24h)

SELECT
  request_path,
  COUNT(*) FILTER (WHERE response_status >= 500) AS server_errors,
  COUNT(*) FILTER (WHERE response_status BETWEEN 400 AND 499) AS client_errors,
  COUNT(*) AS total,
  ROUND(100.0 * COUNT(*) FILTER (WHERE response_status >= 400) / NULLIF(COUNT(*),0), 2) AS error_rate_pct
FROM processor.public."user_api_interaction_log"
WHERE event_timestamp >= NOW() - INTERVAL '24 hours'
GROUP BY request_path
ORDER BY error_rate_pct DESC NULLS LAST
LIMIT 50;
  • API usage by role (if roles provided)

SELECT
  (user_roles->>'primary') AS primary_role,
  COUNT(*) AS calls
FROM processor.public."user_api_interaction_log"
WHERE event_timestamp >= NOW() - INTERVAL '30 days'
GROUP BY 1
ORDER BY calls DESC;
  • Find slow or failing endpoints by session

SELECT
  x_session_id,
  COUNT(*) FILTER (WHERE response_status >= 500) AS server_errors,
  COUNT(*) FILTER (WHERE response_status BETWEEN 400 AND 499) AS client_errors,
  COUNT(*) AS total
FROM processor.public."user_api_interaction_log"
WHERE x_session_id IS NOT NULL
  AND event_timestamp >= NOW() - INTERVAL '7 days'
GROUP BY x_session_id
HAVING COUNT(*) FILTER (WHERE response_status >= 400) > 0
ORDER BY server_errors DESC, client_errors DESC, total DESC;

Last updated