This guide shows you how to use Census to connect your S3 account to your data warehouse. This type of connection syncs data from the warehouse to your S3 bucket by mirroring it into a CSV file.
S3 account: You'll need an S3 bucket that's ready for use, along with credentials for an AWS user that can write to that bucket. See AWS documentation for setting up an S3 bucket. The specific permissions Census may need are covered later in 🔐 Required permissions.
Credentials for your data warehouse: For details, see the guide for your specific technology.
After setting up your warehouse, your Connections page should look something like this: 👇
Connections page with data source and S3 service
Step 3: Create your model
When defining models, you'll write SQL queries to select the data you want to sync. This can be as simple as selecting everything in a specific database table or as complex as creating new calculated values.
From inside your Census account, navigate to the Models page.
Enter a name for your model. You'll use this to select the model later.
Enter your SQL query. If you want to test the query, use the Preview button.
Click Save Model.
Basic SQL query for a new model
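To make the two ends of that range concrete, here is a minimal sketch of a "select everything" model and a model with a calculated value. The `customers` table and its columns are hypothetical, and an in-memory SQLite database stands in for your warehouse, so your warehouse's SQL dialect may differ slightly.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
SIMPLE_MODEL = "SELECT * FROM customers"

CALCULATED_MODEL = """
SELECT
    id,
    email,
    total_spend_cents / 100.0 AS total_spend_dollars
FROM customers
"""


def preview(query, rows):
    """Run a model query against an in-memory SQLite table (a stand-in for a warehouse)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customers (id INTEGER, email TEXT, total_spend_cents INTEGER)"
        )
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
        cursor = conn.execute(query)
        columns = [description[0] for description in cursor.description]
        return columns, cursor.fetchall()
    finally:
        conn.close()


sample_rows = [(1, "a@example.com", 1250), (2, "b@example.com", 300)]
columns, result = preview(CALCULATED_MODEL, sample_rows)
```

Running a query locally like this plays a similar role to the Preview button: it lets you confirm the column names and values before you save the model.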
Step 4: Create your first sync
The sync will move data from your warehouse to your S3 bucket. In this step, you'll define how that will work.
From inside your Census account, navigate to the Syncs page and click Add Sync.
Under What data do you want to sync?, choose your data warehouse as the Connection and your model as the Source.
Under Where do you want to sync data to?, choose the name you assigned in Step 1 (we used S3) as the Connection. Enter the File Path for the CSV file where data will sync. The path can accept variables that will populate when the sync runs. See File Path Variables. Confirm the file path in the Template Preview field.
Under How should changes to the source be synced?, Mirror will be automatically selected. This is the only supported sync behavior for S3.
Under Which properties should be updated?, choose whether to sync only Selected Properties or Sync All Properties. Syncing all properties will automatically add new properties to the sync if the model or database table changes.
To test your sync without actually syncing data, click Run Test and verify the results.
Click Next. This will open the Confirm Details page where you can see a recap of your setup.
If you want to start a sync immediately, select the Run a sync now? checkbox.
Click Create Sync.
When configuring your sync, the page should look something like this: 👇
Example sync setup for S3
Step 5: Confirm the synced data
Once your sync is complete, it's time to check your data. Open your CSV file from the S3 bucket and check that the file was created or updated correctly.
If everything went well, that's it! You've started syncing data from your warehouse to S3! 🥳️
S3 bucket showing the new CSV file created by the Census sync
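If you would rather check the file programmatically than open it by hand, a sketch along these lines may help. It assumes you have already downloaded the object's bytes (for example with the AWS SDK) and that you know which columns your model should produce. The function name and the sample column names are hypothetical.

```python
import csv
import io


def check_csv(data, expected_columns):
    """Parse CSV bytes and report the header, the data row count, and any missing columns."""
    text = data.decode("utf-8")
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty: no header row found")
    row_count = sum(1 for _ in reader)
    missing = [column for column in expected_columns if column not in header]
    return {"header": header, "row_count": row_count, "missing": missing}


# In practice the bytes would come from your bucket, e.g. with boto3:
#   boto3.client("s3").get_object(Bucket=..., Key=...)["Body"].read()
sample = b"id,email\r\n1,a@example.com\r\n2,b@example.com\r\n"
report = check_csv(sample, ["id", "email"])
```

An empty `missing` list and a row count that matches your model's preview are reasonable signs that the sync wrote the file as expected.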
When defining the File Path for an S3 sync, you can use variables that will be set when the sync runs. This allows you to create and sync to new CSV files in the S3 bucket that reflect the date and time of the sync.
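The idea can be illustrated with a small sketch. Note the assumption: the placeholders below are Python's `strftime` codes, used only to show how a date-stamped path expands at run time. They are not Census's variable syntax, and the path itself is hypothetical.

```python
from datetime import datetime, timezone


def render_path(template, run_time):
    """Expand date and time placeholders in a path template (strftime codes, not Census variables)."""
    return run_time.strftime(template)


# Hypothetical sync run time and path template.
run_time = datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc)
path = render_path("exports/customers_%Y-%m-%d_%H%M.csv", run_time)
# path is "exports/customers_2024-03-05_1430.csv"
```

Because the rendered path changes with each run, every sync writes a new file instead of overwriting the previous one.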
Let us know if you want Census to support additional sync behaviors for S3.
💡 Things to know
We support syncing files up to 100GB. Files larger than 5GB may require additional permissions; see 🔐 Required permissions below.
Data arrives in one file to the designated S3 bucket and file path.
Files are written as a CSV with headers.
We highly recommend adding default server-side encryption to your S3 buckets. Census supports syncing to buckets with encryption policies as long as the bucket uses an AWS-provided key type, such as the Amazon S3 key (SSE-S3) or the AWS Key Management Service key (SSE-KMS). If the bucket uses SSE-KMS, make sure the IAM role credentials associated with the S3 connection have access to the AWS KMS key used for encryption. We do not support syncing to buckets that use a customer-provided encryption key.
Contact us if your use cases don't work with these limitations. We plan on addressing at least a few of these in the future!
🔐 Required permissions
For most S3 uploads, the only permission that we require is the s3:PutObject action.
For files larger than 5GB, Census uses S3's multipart upload, which requires additional permissions.
For more details on how multipart uploads use these permissions, see the S3 documentation.
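As a rough sketch, an IAM policy for the sync user could be assembled like this. Only `s3:PutObject` comes from this guide. The multipart actions shown are the ones AWS's multipart upload documentation describes, which is an assumption about what Census needs, so confirm the exact list before relying on it. The bucket name and function name are hypothetical.

```python
import json


def build_policy(bucket, include_multipart=False):
    """Build an IAM policy document that allows uploads to one bucket."""
    object_actions = ["s3:PutObject"]
    if include_multipart:
        # Assumed from AWS's multipart upload documentation, not from this guide.
        object_actions += ["s3:AbortMultipartUpload", "s3:ListMultipartUploadParts"]

    statements = [
        {
            "Effect": "Allow",
            "Action": object_actions,
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }
    ]
    if include_multipart:
        # This action applies to the bucket itself, not to the objects in it.
        statements.append(
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucketMultipartUploads"],
                "Resource": f"arn:aws:s3:::{bucket}",
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


policy = build_policy("my-census-bucket", include_multipart=True)
policy_json = json.dumps(policy, indent=2)
```

Splitting the object-level and bucket-level actions into separate statements keeps each `Resource` as narrow as the action allows.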