Databricks Connector

Configure the Databricks connector to enable AI Team members to manage clusters, execute queries, and monitor jobs in your Databricks workspace.

Overview

The Databricks connector enables AI Team members to interact with your Databricks analytics platform, providing capabilities for SQL query execution, cluster management, job monitoring, and notebook access. Once Databricks is connected to AI Team, AI teammates can execute analytical queries, track job execution status, analyze failures, and manage cluster resources.

The connector provides access to Delta Lake tables, SQL warehouses, cluster metrics, job execution details, and workspace notebooks, allowing AI teammates to help investigate data pipeline issues, optimize query performance, and monitor resource utilization.

Add the Databricks Connector

To add the Databricks connector, obtain a Personal Access Token from Databricks and configure authentication in Edge Delta.

Prerequisites

Before configuring the connector, ensure you have:

  • Databricks workspace with API access
  • Personal Access Token from Databricks
  • Appropriate workspace permissions for the operations you need
  • Network connectivity from Edge Delta to the workspace

Configuration Steps

  1. Navigate to AI Team > Connectors in the Edge Delta application
  2. Find the Databricks connector
  3. Click the connector card to open the configuration panel
  4. Enter your Databricks Token
  5. Click Save

The connector is now available for use by AI Team members who have been assigned this connector.

General Options

General tab configuration options

Databricks Token

Personal Access Token for authenticating with Databricks. To create a token, log into your Databricks workspace, navigate to User Settings > Developer > Access tokens, and generate a new token. The token should have appropriate permissions for workspace access, cluster management, SQL execution, and job monitoring based on your needs. See Databricks personal access tokens for detailed instructions.
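As a quick sanity check outside Edge Delta, you can exercise the token directly against the Databricks REST API, which accepts it as a standard Bearer token. A minimal sketch; the workspace URL and token below are hypothetical placeholders, and the request itself is not sent:

```python
# Sketch: build the auth header and an endpoint URL for Databricks REST calls.
# WORKSPACE_URL and TOKEN are illustrative placeholders -- substitute your own.

WORKSPACE_URL = "https://example.cloud.databricks.com"  # hypothetical workspace
TOKEN = "dapiXXXXXXXXXXXXXXXX"                          # personal access token

def auth_headers(token):
    """Databricks REST APIs accept the PAT as a standard Bearer token."""
    return {"Authorization": f"Bearer {token}"}

# For example, listing clusters (Clusters API) is a GET against:
clusters_url = f"{WORKSPACE_URL}/api/2.0/clusters/list"

print(auth_headers(TOKEN)["Authorization"])
print(clusters_url)
```

If a GET to that URL with these headers returns 403, the token is invalid, revoked, or lacks workspace access.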

Tools

Tools tab showing available Databricks operations

list_clusters

Lists all Databricks clusters in the workspace, showing cluster status, configuration, and resource utilization.

create_cluster

Creates a new Databricks cluster with specified configuration parameters.
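Under the hood this maps to the Databricks Clusters API (POST /api/2.0/clusters/create). A minimal sketch of the kind of configuration payload involved; the cluster name, Spark version, and node type are hypothetical values:

```python
# Sketch: a minimal cluster spec of the shape accepted by the Clusters API.
# All concrete values are illustrative, not recommendations.

def build_cluster_spec(name, spark_version, node_type_id, num_workers):
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Auto-terminate idle clusters to control cost.
        "autotermination_minutes": 60,
    }

spec = build_cluster_spec("etl-dev", "13.3.x-scala2.12", "i3.xlarge", 2)
```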

terminate_cluster

Terminates a running Databricks cluster to free up resources.

get_cluster

Retrieves detailed information about a specific Databricks cluster including status, configuration, and metrics.

start_cluster

Starts a previously terminated Databricks cluster.

list_jobs

Lists all Databricks jobs in the workspace with their current status and recent run history.

run_job

Triggers execution of a Databricks job.
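This corresponds to the Jobs API run-now endpoint (POST /api/2.1/jobs/run-now). A sketch of the trigger payload; the job ID and notebook parameters are hypothetical:

```python
# Sketch: payload for triggering a job run via the Jobs API run-now endpoint.
# job_id and notebook_params are illustrative values.

def build_run_now_payload(job_id, notebook_params=None):
    payload = {"job_id": job_id}
    if notebook_params:
        # Optional overrides passed to the job's notebook task.
        payload["notebook_params"] = notebook_params
    return payload

payload = build_run_now_payload(123, {"run_date": "2024-01-01"})
```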

list_notebooks

Lists notebooks in a specified workspace directory path.

list_files

Lists files and directories in a DBFS (Databricks File System) path.

execute_sql

Executes SQL statements against Databricks SQL warehouses or clusters.
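Against a SQL warehouse, this maps to the Databricks SQL Statement Execution API (POST /api/2.0/sql/statements). A minimal sketch of the request payload; the warehouse ID and query are hypothetical:

```python
# Sketch: payload for the SQL Statement Execution API.
# warehouse_id and the SQL text are illustrative values.

def build_sql_statement(warehouse_id, statement):
    return {
        "warehouse_id": warehouse_id,
        "statement": statement,
        # Wait up to 30s for an inline result before falling back to polling.
        "wait_timeout": "30s",
    }

req = build_sql_statement("abc123", "SELECT count(*) FROM sales.orders")
```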

How to Use the Databricks Connector

The Databricks connector integrates with AI Team, enabling AI teammates to interact with your data platform based on natural language queries. Once configured, AI teammates can execute SQL queries, monitor job status, manage clusters, and investigate data pipeline issues.

Use Case: Data Pipeline Investigation

When data pipelines fail or produce unexpected results, AI teammates can retrieve job run details, analyze error logs, and identify the root cause. For example, when asked “Why did the nightly ETL job fail?”, the AI can check job status, retrieve failure logs, and provide context about what went wrong.

Use Case: Query Execution

AI teammates can execute SQL queries against Delta Lake tables to retrieve data for analysis. When investigating incidents or answering questions about data patterns, the AI can query tables directly and provide insights based on the results.

Use Case: Cluster Management

AI teammates can monitor cluster status, check resource utilization, and manage cluster lifecycle. When investigating performance issues or optimizing costs, the AI can check which clusters are running, review their configurations, and provide recommendations.

Troubleshooting

Connection errors: Verify your Databricks token is valid and hasn’t been revoked. Check that the workspace URL is correct and accessible from Edge Delta.

Permission errors: Ensure your token has the required permissions for the operations AI teammates need to perform (workspace access, cluster management, SQL execution, job monitoring).

Query failures: Verify SQL warehouse or cluster availability. Check that tables exist and the token has access to query them.

Next Steps

For additional help, visit AI Team Support.