CSV Datasets

CSV datasets allow you to upload and use CSV (Comma-Separated Values) files directly in Census. This feature is particularly useful when you need to work with data that isn't stored in your data warehouse or when you want to quickly test a sync without setting up complex data infrastructure.

Key Benefits of CSV Datasets

  • No Data Warehouse Required - Use Census without connecting to a data warehouse

  • Quick Testing - Rapidly prototype and test syncs before implementing in production

  • Supplemental Data - Add data that may not exist in your primary data sources

  • Simple Collaboration - Share and collaborate on datasets using familiar CSV files

  • Easy Migration Path - Start with CSV files and seamlessly transition to basic datasets as you scale

Getting Started

To create your first CSV dataset:

  1. Prepare your CSV file with clean, well-structured data

  2. Navigate to the Datasets section in Census

  3. Click "New Dataset" and select "CSV Dataset"

  4. Upload your file and set the data type for each column as needed

  5. Click "Create Dataset" to finalize

Creating CSV Datasets

File Requirements

Census accepts CSV files with the following specifications:

  • File format: .csv (comma-separated values)

  • Maximum file size: 100MB

  • First row must contain column headers

Example of a valid CSV file:

id,first_name,last_name,email,company,title,created_at
1,John,Doe,john.doe@example.com,Acme Inc,CEO,2023-01-15
2,Jane,Smith,jane.smith@example.com,XYZ Corp,CTO,2023-02-20
3,Robert,Johnson,robert.j@example.com,123 Industries,VP Sales,2023-03-05
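Before uploading, it can help to check a file against the requirements above on your own machine. The following is a minimal sketch, not part of Census: `check_csv_requirements` is a hypothetical helper, the UTF-8 encoding is an assumption, and "100MB" is interpreted here as 100 × 1024 × 1024 bytes.

```python
import csv
import os

# Assumption: "100MB" is treated as 100 * 1024 * 1024 bytes.
MAX_BYTES = 100 * 1024 * 1024


def check_csv_requirements(path):
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []

    # Requirement: file format is .csv
    if not path.lower().endswith(".csv"):
        problems.append("file extension is not .csv")

    # Requirement: maximum file size
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 100MB")

    # Requirement: first row must contain column headers
    # (UTF-8 is an assumption about the file's encoding)
    with open(path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), None)
    if not header or any(not name.strip() for name in header):
        problems.append("first row must contain a header for every column")

    return problems
```

Calling `check_csv_requirements("contacts.csv")` on a file like the example above would return an empty list. A header row cannot be told apart from a data row automatically, so the last check only catches missing or blank header cells.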

Working with CSV Datasets

Data Types

During the upload process, you can manually set the data type for each column in your CSV dataset:

  • Text - For string values

  • Integer - For whole numbers without decimals

  • Float - For decimal numbers

  • Date - For calendar dates (e.g., 2023-01-15)

  • Timestamp - For specific points in time (e.g., 2023-01-15 12:00:00)

  • Boolean - For true/false values

Setting the correct data type for each column ensures proper handling of your data when using it in syncs and helps Census validate the data during import.
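To see which values might fail a chosen data type before you upload, you could run a local check along these lines. This is a sketch under stated assumptions: the parsers below are stand-ins for the six types listed above, not Census's actual validation rules, and the accepted boolean spellings and timestamp layout are guesses based on the examples in this section.

```python
from datetime import datetime


def _parse_bool(value):
    # Assumption: only "true"/"false" (any case) count as booleans.
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    raise ValueError(value)


# Hypothetical local parsers, one per column type named in this section.
# Note that strptime also accepts dates without zero padding (e.g. 2023-1-5).
PARSERS = {
    "Text": str,
    "Integer": int,
    "Float": float,
    "Date": lambda v: datetime.strptime(v, "%Y-%m-%d").date(),
    "Timestamp": lambda v: datetime.strptime(v, "%Y-%m-%d %H:%M:%S"),
    "Boolean": _parse_bool,
}


def find_type_errors(rows, column_types):
    """Return (row_number, column, value) for each value that fails its type."""
    errors = []
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(rows, start=2):
        for column, type_name in column_types.items():
            value = row[column]
            if value == "":
                continue  # assumption: empty cells are treated as missing
            try:
                PARSERS[type_name](value)
            except ValueError:
                errors.append((row_number, column, value))
    return errors
```

The `rows` argument can be any iterable of dictionaries, for example the output of `csv.DictReader`, and `column_types` maps each column name to one of the type names above, such as `{"id": "Integer", "created_at": "Date"}`.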

Refreshing Data

CSV datasets are static by default, but you can update them by:

  1. Navigating to the dataset details page

  2. Clicking "Update Dataset"

  3. Uploading a new CSV file

  4. Confirming the update

This action completely replaces the existing dataset with the new file.
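Because an update replaces the whole dataset, any row missing from the new file is no longer part of the dataset. If you keep a local copy of the previously uploaded file, a quick comparison can show what will change. The sketch below is illustrative only: `summarize_replacement` is a hypothetical helper, and it assumes both files are UTF-8 and share an identifier column named `id`.

```python
import csv


def read_ids(path, id_column):
    """Collect the set of identifier values found in one CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[id_column] for row in csv.DictReader(f)}


def summarize_replacement(current_path, new_path, id_column="id"):
    """Compare identifiers in the current file and the replacement file."""
    current_ids = read_ids(current_path, id_column)
    new_ids = read_ids(new_path, id_column)
    return {
        "removed": sorted(current_ids - new_ids),  # rows that will disappear
        "added": sorted(new_ids - current_ids),    # rows that are new
        "kept": len(current_ids & new_ids),        # rows present in both
    }
```

Identifiers are compared as text, so `"01"` and `"1"` count as different rows.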

Use Cases for CSV Datasets

Quick Prototyping

CSV datasets provide an excellent way to test and prototype your data workflows without making changes to your production data warehouse:

  • Test new sync configurations with sample data before implementing in your warehouse

  • Validate field mappings and transformations with controlled test data

  • Experiment with different data structures and formats to find the optimal approach

  • Create proof-of-concept syncs to demonstrate value before investing in full implementation

One-time Data Loads

For data that doesn't need regular updates, CSV datasets offer a straightforward solution:

  • Upload customer lists for one-time marketing campaigns or outreach

  • Import event attendees or webinar registrants for specific follow-up activities

  • Load historical data for backfilling systems or analytics

  • Import lead lists from trade shows or other offline events

  • Upload contest or promotion participants for special communications

Supplemental Reference Data

CSV datasets are perfect for reference data that complements your warehouse data:

  • Upload product catalogs or price lists that change infrequently

  • Import geographic or demographic reference data for segmentation

  • Add mapping tables for code translations or categorizations

  • Import industry benchmarks or standards for comparison

  • Upload postal code or region mappings for territory management

Data Enrichment

Enhance your existing data with additional information from external sources:

  • Combine CSV data with warehouse data using Census's data enrichment features

  • Upload third-party data that isn't available in your warehouse

  • Add manual classifications or segments created by business teams

  • Import scoring data or rankings from external systems

  • Add supplemental attributes for more precise targeting

Best Practices for CSV Datasets

CSV files are a simple but powerful way to bring data into Census. Here are some tips to help you work effectively with CSV datasets:

  • Include a unique identifier column whenever possible to make updates and syncs more reliable

  • Set types for your columns to ensure proper parsing and avoid errors

  • Format dates as YYYY-MM-DD, the format Census requires, so they are parsed correctly

  • Document the source and purpose of each CSV dataset for your team

  • Keep backups of your uploaded files in case you need to reference or restore them

  • Check your column headers to ensure they're clear, consistent, and don't contain special characters

  • Preview your data after upload to verify it was parsed correctly

Remember that CSV datasets are best for relatively static data or one-time imports. For data that changes frequently, consider setting up a basic dataset with automated refreshes instead.
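Several of the tips above can be checked automatically before an upload. The following sketch is one possible approach, not a Census tool: `lint_csv` is a hypothetical helper, the file is assumed to be UTF-8, and "special characters" is interpreted here as anything other than letters, digits, and underscores.

```python
import csv
import re
from collections import Counter
from datetime import datetime

# Assumption: headers made of letters, digits, and underscores are "clean".
HEADER_PATTERN = re.compile(r"[A-Za-z0-9_]+")
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(value):
    """True only for zero-padded, real calendar dates in YYYY-MM-DD form."""
    if not value or not DATE_PATTERN.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def lint_csv(path, id_column, date_columns=()):
    """Return warnings about headers, duplicate identifiers, and date formats."""
    warnings = []
    ids = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for name in reader.fieldnames or []:
            if not HEADER_PATTERN.fullmatch(name):
                warnings.append(f"header {name!r} contains special characters")
        # Row 1 is the header, so data rows start at 2.
        for row_number, row in enumerate(reader, start=2):
            ids[row.get(id_column)] += 1
            for column in date_columns:
                if not is_iso_date(row.get(column)):
                    warnings.append(f"row {row_number}: {column} is not YYYY-MM-DD")
    for value, count in ids.items():
        if count > 1:
            warnings.append(f"id {value!r} appears {count} times")
    return warnings
```

For the example file shown earlier, `lint_csv("contacts.csv", id_column="id", date_columns=["created_at"])` would be a reasonable call, where `contacts.csv` is a placeholder file name.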

Limitations

  • Maximum file size: 100MB

  • Not suitable for real-time data that changes frequently

  • Manual refresh process required for updates

  • Limited to tabular data formats


Last updated 2 months ago


For more information on working with datasets in Census, see our core concepts documentation.