Warehouse Writeback
Census can provide row-level details on the data you're syncing to a destination. With these logs, you can answer common questions like:
When was my data updated in the destination?
Why did the destination's API reject records that I tried to sync?
What is the most common reason that the destination's API rejects my data?
Which users were a member of this segment at this time?
Warehouse Writeback is available for Enterprise Plan accounts. If you would like logging enabled, please contact our team at support@getcensus.com.
Census can provide detailed logging for all data warehouse sources using the Advanced Sync Engine:
Snowflake
BigQuery
Redshift
PostgreSQL (version 13 or later is required)
Databricks
To enable Warehouse Writeback on a supported source connection:
Visit the Sources page.
Click + New Source to create a new source connection, or click Edit on an existing source connection.
Toggle on Warehouse Writeback.
Optionally, specify how long logs should be retained (7 days by default). Census will automatically clean up logs after the specified number of days.
Please note that Census only cleans up Warehouse Writeback tables according to the retention period for Active syncs. Census will not modify or delete logs for syncs that have been Disabled or Deleted.
That's it! Logs will start populating for all syncs in this connection on their subsequent runs.
Census exposes detailed logging information in a view called sync_log in your data warehouse. By warehouse, this view can be found as follows:
Snowflake: CENSUS.CENSUS.SYNC_LOG
BigQuery: census.sync_log
Redshift: census.sync_log
PostgreSQL: census.sync_log
Databricks: census.census.sync_log
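Once logs start populating, a quick way to confirm they are landing is to query the view directly. The sketch below assumes the census.sync_log location used by BigQuery, Redshift, and PostgreSQL; adjust the schema path for Snowflake or Databricks as shown above.

```sql
-- Peek at the most recent log rows, newest first
select
    sync_id,
    sync_run_id,
    record_identifier,
    operation,
    status,
    status_message,
    _census_logged_at
from census.sync_log
order by _census_logged_at desc
limit 100;
```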
sync_log View Schema
log_id
Unique identifier for the log
sync_id
Unique identifier for the sync configuration. You can find it in the URL of your sync configurations as follows:
https://app.getcensus.com/syncs/[sync_id]/overview
sync_run_id
Unique identifier for the sync run. Use this value to identify a particular occasion when Census sends data as specified for a given sync configuration.
record_identifier
The value of the identifier specified in your sync configuration, identifying which record in your source you are trying to send to a destination.
record_payload
The exact data that Census was attempting to send to a given destination. It is formatted as a JSON object.
batch_started_at
The time when the batch containing this data was sent to the destination.
batch_completed_at
The time when the batch containing this data completed.
operation
The operation performed by Census: 'upsert', 'update', 'create', or 'delete', depending on the sync behavior you specified.
status
Either 'succeeded' or 'rejected'
status_message
If the status is 'rejected', this field will contain the reason returned by the destination's API.
_census_logged_at
When Census loaded this log record into your data warehouse.
destination_id
Unique identifier for the destination (i.e. service) connection that this sync is writing to.
destination_object_id
Unique identifier for the destination (i.e. service) object that this sync is writing to.
source_id
Unique identifier for the source connection (e.g. your warehouse) that this sync is sending data from.
source_object_id
Unique identifier for the specific object in the source connection that this sync is sending data from. This could be a table, model, entity, or segment.
The destination_id, destination_object_id, source_id, and source_object_id columns are not yet supported when using Warehouse Writeback with Databricks.
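As an example of how these columns work together, a query along the lines of the sketch below (again assuming the census.sync_log location) surfaces the most common reasons the destination's API rejected records.

```sql
-- Most common rejection reasons returned by the destination's API
select
    status_message,
    count(*) as rejected_records
from census.sync_log
where status = 'rejected'
group by status_message
order by rejected_records desc
limit 20;
```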
In addition to row-level sync logs, Warehouse Writeback will create metadata tables about the source objects and destinations involved in syncs. These tables can be joined to the sync_log table on their id column to add additional context.
For example, imagine you have a mirror sync from a Segment to an ads destination like Google. The sync_log table will log attempts to send new records (i.e. those that entered the segment) to the destination. It will also log attempts to delete records (i.e. those that left the segment) from the destination. If you join those logs with the source objects table (described below), you can get full insight into who is entering and leaving which segments, by name, and when.
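A query along these lines (assuming the census.sync_log and census.source_objects locations described on this page) sketches that join, labeling each logged operation as a segment entry or exit.

```sql
-- Segment membership changes: creates/upserts entered, deletes left
select
    so.name as segment_name,
    sl.record_identifier,
    case when sl.operation = 'delete' then 'left segment'
         else 'entered segment'
    end as membership_change,
    sl.batch_completed_at
from census.sync_log sl
join census.source_objects so
    on sl.source_object_id = so.id
where so.type = 'DataWarehouse::FilterSegmentSource'
order by sl.batch_completed_at desc;
```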
Metadata Tables are refreshed every six hours, separately from sync run history. That means you may see a delay before records appear in metadata for syncs that use brand-new sources or destinations.
Source objects are tables, models, datasets, or segments. These are what you send data from during a sync. Continue reading the schema section below for more information.
Metadata tables for source objects can be found in the following tables, by warehouse:
Snowflake: CENSUS.CENSUS.SOURCE_OBJECTS
BigQuery: census.source_objects
Redshift: census.source_objects
PostgreSQL: census.source_objects
Databricks: not yet supported
id
Unique identifier for the source object. This joins to the source_object_id column in the sync_log table.
type
Type of data set. The options, with their meanings, are:
- DataWarehouse::FilterSegmentSource -> A segment
- DataWarehouse::Query -> A model
- DataWarehouse::BusinessObjectSource -> An entity
- DataWarehouse::Table -> A table
name
Name of the data set.
model_id
For a source object with type DataWarehouse::Query, this points to the SQL, Looker, or dbt model associated with it.
The model is what you see in the Census UI and is responsible for storing a SQL query, dbt reference, etc. The DataWarehouse::Query source object lives between the model and your source and is responsible for translating the model definition into rows and columns.
business_object_id
For a source object with type DataWarehouse::BusinessObjectSource, this points to the entity associated with it.
The entity is what you see in the Census UI and is what you configure to fit your business needs. The DataWarehouse::BusinessObjectSource source object lives between the entity and your source and is responsible for translating the entity definition into rows and columns.
filter_segment_id
For a source object with type DataWarehouse::FilterSegmentSource, this points to the segment associated with it.
The segment is what you see in the Census UI and is where you configure conditional logic to segment your data. The DataWarehouse::FilterSegmentSource source object lives between the segment and your source and is responsible for translating the segment definition into rows and columns.
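To see how these type values map to what you configure in Census, a small inventory query like the sketch below (assuming the census.source_objects location) labels each source object with a friendlier name.

```sql
-- Source objects with friendly type labels
select
    id,
    name,
    case type
        when 'DataWarehouse::FilterSegmentSource'  then 'Segment'
        when 'DataWarehouse::Query'                then 'Model'
        when 'DataWarehouse::BusinessObjectSource' then 'Entity'
        when 'DataWarehouse::Table'                then 'Table'
    end as friendly_type
from census.source_objects
order by friendly_type, name;
```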
Destinations are where you send data during a sync. An example is Salesforce.
Metadata tables for destinations can be found in the following tables, by warehouse:
Snowflake: CENSUS.CENSUS.DESTINATIONS
BigQuery: census.destinations
Redshift: census.destinations
PostgreSQL: census.destinations
Databricks: not yet supported
id
Unique identifier for the destination. This joins to the destination_id column in the sync_log table.
type
Type of the destination. This can be any of the various destinations we support, in the format <Destination name>::Connection
name
Name of the destination.
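For example, a query like the sketch below (assuming the census schema locations) joins the log to this table to report successes and rejections per destination by name rather than by numeric ID.

```sql
-- Succeeded vs. rejected records per destination
select
    d.name as destination_name,
    count(case when sl.status = 'succeeded' then 1 end) as succeeded_records,
    count(case when sl.status = 'rejected' then 1 end) as rejected_records
from census.sync_log sl
join census.destinations d
    on sl.destination_id = d.id
group by d.name
order by rejected_records desc;
```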
Destination objects are the specific objects within a destination that you send data to during a sync. An example is a Salesforce Contact.
Metadata tables for destination objects can be found in the following tables, by warehouse:
Snowflake: CENSUS.CENSUS.DESTINATION_OBJECTS
BigQuery: census.destination_objects
Redshift: census.destination_objects
PostgreSQL: census.destination_objects
Databricks: not yet supported
id
Unique identifier for the destination object. This joins to the destination_object_id column in the sync_log table.
type
Type of the destination object. This can be any of the various destination objects we support, in the format <Destination name>::ObjectTypes::<Destination object name>
name
Name of the destination object.
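Putting the metadata tables together, a sketch like the one below (assuming the census schema locations) produces a fully readable log line with the source object, destination, and destination object resolved by name.

```sql
-- Sync log with human-readable source and destination names
select
    sl.record_identifier,
    sl.operation,
    sl.status,
    so.name   as source_object,
    d.name    as destination,
    dobj.name as destination_object,
    sl.batch_completed_at
from census.sync_log sl
left join census.source_objects so        on sl.source_object_id      = so.id
left join census.destinations d           on sl.destination_id        = d.id
left join census.destination_objects dobj on sl.destination_object_id = dobj.id
order by sl.batch_completed_at desc
limit 100;
```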
Here are a few examples of SQL queries you can use as a starting point to explore your logged data once Warehouse Writeback has been enabled.
For example, you may want to see all rows that have been synced by a specific sync configuration ID. You can find it in the URL of your sync configurations as follows: https://app.getcensus.com/syncs/[sync_id]/overview. For this example, we'll look at Sync ID 1234.
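A query along those lines might look like the sketch below (assuming the census.sync_log location, and that sync IDs are stored as numbers).

```sql
-- All logged rows for sync configuration 1234
select *
from census.sync_log
where sync_id = 1234
order by _census_logged_at desc;
```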
The Sync History page also includes an ID for each sync run, which allows you to limit the query to that run. For example, if you'd like to see records that Census deleted on a recent sync run 5678, you could filter on both the sync run ID and the delete operation.
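One way to write that query, again assuming the census.sync_log location:

```sql
-- Records deleted by sync run 5678
select
    record_identifier,
    record_payload,
    status,
    status_message,
    batch_completed_at
from census.sync_log
where sync_run_id = 5678
  and operation = 'delete';
```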