LogoLogo
  • 🦩Overview
  • 💾Datasets
    • Overview
    • Core Concepts
      • Columns & Annotations
      • Type & Property Mappings
      • Relationships
    • Basic Datasets
      • dbt Integration
      • Sigma Integration
      • Looker Integration
    • SaaS Datasets
    • CSV Datasets
    • Streaming Datasets
    • Entity Resolution
    • AI Columns
      • AI Prompts Recipe Book
    • Enrichment Columns
      • Quick Start
      • HTTP Request Enrichments
    • Computed Columns
    • Version Control
  • 📫Syncs
    • Overview
    • Triggering & Scheduling
    • Retry Handling
    • Live Syncs
    • Audience Syncs
    • Observability
      • Current Sync Run Overview
      • Sync History
      • Sync Tracking
      • API Inspector
      • Sync Alerts
      • Observability Lake
      • Datadog Integration
      • Warehouse Writeback
      • Sync Lifecycle Webhooks
      • Sync Dry Runs
    • Structuring Data
      • Liquid Templates
      • Event Syncs
      • Arrays and Nested Objects
  • 👥Audience Hub
    • Overview
    • Creating Segments
      • Segment Priorities
      • Warehouse-Managed Audiences
    • Experiments and Analysis
      • Audience Match Rates
    • Activating Segments
    • Calculated Columns
    • Data Preparation
      • Profile Explorer
      • Exclusion Lists
  • 🧮Data Sources
    • Overview
    • Available Sources
      • Amazon Athena
      • Amazon Redshift
      • Amazon S3
      • Azure Synapse
      • ClickHouse
      • Confluent Cloud
      • Databricks
      • Elasticsearch
      • Kafka
      • Google AlloyDB
      • Google BigQuery
      • Google Cloud SQL for PostgreSQL
      • Google Pub/Sub
      • Google Sheets
      • Greenplum
      • HTTP Request
      • HubSpot
      • Materialize
      • Microsoft Fabric
      • MotherDuck
      • MySQL
      • PostgreSQL
      • Rockset
      • Salesforce
      • SingleStore
      • Snowflake
      • SQL Server
      • Trino
  • 🛫Destinations
    • Overview
    • Available Destinations
      • Accredible
      • ActiveCampaign
      • Adobe Target
      • Aha
      • Airship
      • Airtable
      • Algolia
      • Amazon Ads DSP (AMC)
      • Amazon DynamoDB
      • Amazon EventBridge
      • Amazon Pinpoint
      • Amazon Redshift
      • Amazon S3
      • Amplitude
      • Anaplan
      • Antavo
      • Appcues
      • Apollo
      • Asana
      • AskNicely
      • Attentive
      • Attio
      • Autopilot Journeys
      • Azure Blob Storage
      • Box
      • Bloomreach
      • Blackhawk
      • Braze
      • Brevo (formerly Sendinblue)
      • Campaign Monitor
      • Canny
      • Channable
      • Chargebee
      • Chargify
      • ChartMogul
      • ChatGPT Retrieval Plugin
      • Chattermill
      • ChurnZero
      • CJ Affiliate
      • CleverTap
      • ClickUp
      • Constant Contact
      • Courier
      • Criteo
      • Crowd.dev
      • Customer.io
      • Databricks
      • Delighted
      • Discord
      • Drift
      • Drip
      • Eagle Eye
      • Emarsys
      • Enterpret
      • Elasticsearch
      • Facebook Ads
      • Facebook Product Catalog
      • Freshdesk
      • Freshsales
      • Front
      • FullStory
      • Gainsight
      • GitHub
      • GitLab
      • Gladly
      • Google Ads
        • Customer Match Lists (Audiences)
        • Offline Conversions
      • Google AlloyDB
      • Google Analytics 4
      • Google BigQuery
      • Google Campaign Manager 360
      • Google Cloud Storage
      • Google Datastore
      • Google Display & Video 360
      • Google Drive
      • Google Search Ads 360
      • Google Sheets
      • Heap.io
      • Help Scout
      • HTTP Request
      • HubSpot
      • Impact
      • Insider
      • Insightly
      • Intercom
      • Iterable
      • Jira
      • Kafka
      • Kevel
      • Klaviyo
      • Kustomer
      • Labelbox
      • LaunchDarkly
      • LinkedIn
      • LiveIntent
      • Loops
      • Mailchimp
      • Mailchimp Transactional (Mandrill)
      • Mailgun
      • Marketo
      • Meilisearch
      • Microsoft Advertising
      • Microsoft Dynamics
      • Microsoft SQL Server
      • Microsoft Teams
      • Mixpanel
      • MoEngage
      • Mongo DB
      • mParticle
      • MySQL
      • NetSuite
      • Notion
      • OneSignal
      • Optimizely
      • Oracle Database
      • Oracle Eloqua
      • Oracle Fusion
      • Oracle Responsys
      • Orbit
      • Ortto
      • Outreach
      • Pardot
      • Partnerstack
      • Pendo
      • Pinterest
      • Pipedrive
      • Planhat
      • PostgreSQL
      • PostHog
      • Postscript
      • Productboard
      • Qualtrics
      • Radar
      • Reddit Ads
      • Rokt
      • RollWorks
      • Sailthru
      • Salesforce
      • Salesforce Commerce Cloud
      • Salesforce Marketing Cloud
      • Salesloft
      • Segment
      • SendGrid
      • Sense
      • SFTP
      • Shopify
      • Singular
      • Slack
      • Snapchat
      • Snowflake
      • Split
      • Sprig
      • Stripe
      • The Trade Desk
      • TikTok
      • Totango
      • Userflow
      • Userpilot
      • Vero Cloud
      • Vitally
      • Webhooks
      • Webflow
      • X Ads (formerly Twitter Ads)
      • Yahoo Ads (DSP)
      • Zendesk
      • Zoho CRM
      • Zuora
    • Custom & Partner Destinations
  • 📎Misc
    • Credits
    • Census Embedded
    • Data Storage
      • Census Store
        • Query Census Store from Snowflake
        • Query Census Store locally using DuckDB
      • General Object Storage
      • Bring Your Own Bucket
        • Bring your own S3 Bucket
        • Bring your own GCS Bucket
        • Bring your own Azure Bucket
    • Developers
      • GitLink
      • Dataset API
      • Custom Destination API
      • Management API
    • Security & Privacy
      • Login & SSO Settings
      • Workspaces
      • Role-based Access Controls
      • Network Access Controls
      • SIEM Log Forwarding
      • Secure Storage of Customer Credentials
      • Digital Markets Act (DMA) Consent for Ad Platforms
    • Health and Usage Reporting
      • Workspace Homepage
      • Product Usage Dashboard
      • Observability Toolkit
      • Alerts
    • FAQs
Powered by GitBook
On this page
  • Prerequisites
  • Step 1: Create Census Store Iceberg REST Catalog Credentials
  • Step 2: Install Iceberg Extension
  • Step 3: Attach catalog
  • Step 4: Query Iceberg Tables

Was this helpful?

  1. Misc
  2. Data Storage
  3. Census Store

Query Census Store locally using DuckDB

Query Census Store locally using DuckDB Apache Iceberg integration for Zero-ETL access to SaaS and CSV datasets.

PreviousQuery Census Store from SnowflakeNextGeneral Object Storage

Last updated 13 hours ago

Was this helpful?

Prerequisites

To query Census Store from DuckDB, you will need:

  • DuckDB version ≥ 1.3.0

  • A Census workspace with one or more SaaS or CSV datasets materialized in Census Store

Step 1: Create Census Store Iceberg REST Catalog Credentials

In Census, create OAuth2 client credentials that Snowflake will use to access your workspace's Iceberg catalog:

  1. Open Census and navigate to Workspace settings.

  2. Select the Census Store tab.

  3. Under Iceberg Catalog, click Create Client Credentials.

  4. Copy the newly-created client ID and secret. The secret is only visible when you first create a set of client credentials. If you lose the secret, you will need to create a new set of client credentials.

Step 2: Install Iceberg Extension

  1. Open DuckDB.

  2. Install Iceberg Extension and load it

    FORCE INSTALL iceberg FROM core_nightly;
    LOAD iceberg;

Step 3: Attach catalog

In DuckDB attach catalog integration to the Iceberg catalog for your workspace’s Managed Storage:

  1. Choose a unique name for your catalog integration and substitute it for <catalog integration name>

  2. Substitute your Census workspace’s Iceberg catalog endpoint for <catalog endpoint> . You can find your workspace’s Iceberg catalog endpoint on the Census Store tab of Workspace settings under Iceberg Catalog.

  3. Substitute your Census workspace’s Iceberg catalog name for <catalog name>. You can find your workspace’s Iceberg catalog endpoint on the Census Store tab of Workspace settings under Iceberg Catalog.

  4. Substitute the client ID and secret for the Iceberg catalog credentials you created in the previous step for <client id> and <client secret>.

Run the following SQL statement.

ATTACH '<catalog name>' as <catalog integration name> (
  TYPE ICEBERG,
  ENDPOINT '<catalog endpoint>',
  CLIENT_ID '<client id>',
  CLIENT_SECRET '<client secret>',
  OAUTH2_SCOPE 'PRINCIPAL_ROLE:ALL'
);

Run the following SQL statement to verify connectivity and get a list of tables:

SHOW ALL TABLES;

Step 4: Query Iceberg Tables

  1. Substitute the name of the catalog integration you created in the previous step for <catalog integration name>.

  2. Substitute the namespace and table name your Census Store dataset is materialized in for <dataset namespace> and <dataset table name> respectively. You can find these names on the dataset details page for your dataset in Census.

SELECT * FROM <catalog integration name>.<dataset namespace>.<dataset table name> LIMIT 5;
📎
Download