Databricks

This page describes how to sync data to your Databricks data warehouse.


Last updated 2 months ago


Databricks is a unified analytics platform that provides a collaborative environment for data engineering, data science, and machine learning. With Census, you can sync data into Databricks from any source we support.

Census supports a wide set of Databricks deployments, including:

  • Unity Catalog

  • SQL Warehouses (including Serverless)

  • All Databricks LTS versions up to and including 14.3. Newer versions typically work without issue.

Getting Started

In this guide, we will show you how to connect Census to Databricks as a destination.

If you are configuring Databricks as a source (to query data from Databricks and sync it elsewhere), that process is documented separately in Databricks as a Source.

Permissions

Census will require the following permissions on the tables you wish to sync to: SELECT, MODIFY.
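
As a rough illustration, these privileges could be granted with Databricks SQL `GRANT` statements. The sketch below only builds the statement text; the table names and the principal name `census-sync` are hypothetical placeholders, and the helper `grant_statement` is not part of Census or Databricks.

```python
def grant_statement(table: str, principal: str) -> str:
    """Build a GRANT statement for the two privileges listed above.

    `table` is a dotted name such as catalog.schema.table (hypothetical here).
    `principal` is the user, group, or service principal Census connects as.
    """
    # Reject backticks so the quoted identifiers cannot be broken out of.
    if "`" in table or "`" in principal:
        raise ValueError("names must not contain backticks")
    quoted_table = ".".join(f"`{part}`" for part in table.split("."))
    return f"GRANT SELECT, MODIFY ON TABLE {quoted_table} TO `{principal}`;"


if __name__ == "__main__":
    # Placeholder tables; replace with the tables you plan to sync to.
    for name in ["main.analytics.customers", "main.analytics.orders"]:
        print(grant_statement(name, "census-sync"))
```

You would run the resulting statements in a Databricks SQL editor or notebook as a user who is allowed to grant privileges on those tables.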

Configuring a new Databricks destination

  1. First, you'll need to select which form of access credentials to use: Service Principal (recommended, but a bit more work) or Personal Access Tokens.

    • If you're using a Service Principal, within your Databricks Account Console, go to the User management page and Service Principals tab.

      1. Create a new service principal with the Add service principal button. Give it a name you'll remember such as Census. You can also reuse an existing one.

      2. Once created, click Generate secret, which will create a new Client ID and Secret pair. Keep the pair somewhere safe, as you won't be able to access the secret again.

      3. Now you'll need to add the service principal as an admin on the specific Workspace you are connecting to. In your Databricks Account console, go to the Workspaces page. Click on the name of your workspace and go to the Permissions tab.

      4. Select your new service principal and mark them as admin on the workspace.

    • If you're using a Personal Access Token, you can create one for yourself. Alternatively, you may want to create a dedicated user account for Census to use, which helps with auditing and access control.

      1. You'll first need to navigate into the specific Workspace you are connecting to. In your Databricks Account console, go to the Workspaces page. Select the workspace you'd like Census to connect to and then click Open workspace in the top right.

      2. Click your Profile Icon in the top right and select Settings. Then click the Developer option in the left settings menu and click Manage next to Access Tokens. We recommend you create a new Access Token:

        • Name: Census (or some other details)

        • Lifetime: (clear the box) - This will prevent the token from expiring

  2. If you're not there already, go into your target workspace by visiting the Workspaces page and clicking the Open link next to it. Within your selected Workspace, select Compute from the left menu. Census can connect to a SQL Warehouse or an All Purpose Cluster. You can reuse an existing compute resource or create a new one here. Click on the compute resource you've decided to use.

    If you're connecting to an All Purpose Compute Cluster:

    • Service Principals cannot be connected to an All Purpose Cluster that is in Single User Access Mode.

    • All Purpose Clusters on a single node cannot be moved off Single User Access Mode without losing Unity Catalog access.

  3. You'll need to collect three credentials to connect to your compute:

    • Hostname

    • Port

    • HTTP Path

    For SQL Warehouses, switch to the Connection details tab.

    For All Purpose Clusters, in the Configuration tab, open the Advanced Options section at the bottom, then select the JDBC/ODBC section.

  4. Now you're ready to add the connection to Census. Visit the Destinations page in Census, and click New Destination, selecting Databricks from the menu.

    • Provide the connection credentials: Hostname, Port, HTTP Path

    • Select your credential type (Personal Access Token or Service Principal), and provide the corresponding Access Token, or Client ID & Secret.

    • Optionally, set the Catalog and Schema Allow lists. This will filter what Catalogs and Schemas appear in Census. Note that if you are using Unity Catalog, this filtering will apply across all catalogs.

  5. Click Connect.

Supported Objects and Sync Behaviors

| Object Name | Supported? | Sync Keys | Behaviors |
| --- | --- | --- | --- |
| Table | ✅ | Any columns that are integers or strings | Update or Create, Update Only, Add |

Learn more about all of our sync behaviors in our Syncs documentation.

Contact us if you want Census to support more Databricks objects and/or behaviors.

Allowed IP Addresses

If you're using Databricks's Allowed IPs network policy, you'll need to add the Census IP addresses to your list. You can find Census's set of IP addresses for your region in Regions & IP Addresses. Visit the Databricks Documentation for more details on how to specify these IPs as part of your network policy.

Need help connecting to Databricks?

You can send our support team an email at support@getcensus.com or start a conversation from the in-app chat.
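
Before reaching out, it can help to sanity-check the three connection details (Hostname, Port, HTTP Path) collected during setup. The following is a minimal sketch, not a Census or Databricks tool. It assumes that SQL Warehouse HTTP paths look like `/sql/1.0/warehouses/<id>` and All Purpose Cluster paths look like `sql/protocolv1/o/<workspace-id>/<cluster-id>`; confirm these shapes against the values shown in your own workspace. All names and sample values are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed HTTP path shapes; verify against the values shown in your workspace.
WAREHOUSE_PATH = re.compile(r"^/?sql/1\.0/(warehouses|endpoints)/[0-9A-Za-z]+$")
CLUSTER_PATH = re.compile(r"^/?sql/protocolv1/o/\d+/[0-9A-Za-z-]+$")


@dataclass
class ConnectionDetails:
    hostname: str
    port: int
    http_path: str


def compute_type(http_path: str) -> Optional[str]:
    """Guess the compute type from the HTTP path, or None if unrecognized."""
    if WAREHOUSE_PATH.match(http_path):
        return "sql warehouse"
    if CLUSTER_PATH.match(http_path):
        return "all-purpose cluster"
    return None


def check(details: ConnectionDetails) -> List[str]:
    """Return a list of human-readable problems; empty means nothing obvious."""
    problems = []
    if not details.hostname or "/" in details.hostname:
        problems.append("hostname should be a bare host name, with no scheme or path")
    if not 1 <= details.port <= 65535:
        problems.append("port should be between 1 and 65535")
    if compute_type(details.http_path) is None:
        problems.append("HTTP path does not match a known warehouse or cluster shape")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = ConnectionDetails(
        hostname="dbc-a1b2c3d4-e5f6.cloud.databricks.com",
        port=443,
        http_path="/sql/1.0/warehouses/abc123def456",
    )
    print(compute_type(sample.http_path), check(sample))
```

An empty problem list only means the values are well formed. It does not confirm that the credentials are valid or that the compute resource is reachable.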
