Enqurious logo
TM
Request a Demo
Back to blog
Guides & Tutorials

What Happens When Claude Meets Databricks?

What Happens When Claude Meets Databricks? blog cover image
databricks
Sayli Nikumbh

The Problem

Meet the data engineering team at GlobalMart, a mid-size e-commerce company running its entire analytics platform on Databricks. They build and maintain a medallion architecture: raw CSV files land in a volume, get ingested into Bronze Delta tables, cleaned in Silver, and aggregated into Gold for the business team to query.

The pipeline works. But building it is slow. Every sprint looks like this:

1..jpg

The Databricks Field Engineering team recently introduced the AI Dev Kit, an open-source toolkit that connects Claude Code, Gemini, Cursor, etc., directly to your Databricks workspace. GlobalMart's team decided to try it for their next pipeline sprint. This is what they learned.

Prerequisites

Before installing AI Dev Kit, make sure the following are in place:

  • Python 3.10+python --version

  • Databricks CLI v0.200+ — used once for initial workspace configuration

  • Claude Code (latest) — the CLI that runs AI Dev Kit. Requires a Claude Pro subscription.

  • Git — to clone the AI Dev Kit repository

  • Databricks workspace — with Unity Catalog and Serverless compute enabled

  • Databricks PAT token — Settings → Developer → Access Tokens → Generate New Token

Databricks CLI — one-time use only.

You only need it to run databricks configure --profile DEFAULT.

databricks configure --profile DEFAULT

This saves your workspace URL and token to ~/.databrickscfg. After that, the MCP server reads this file automatically, and you never need the CLI again.

Installing AI Dev Kit

AI Dev Kit is open source and maintained by Databricks Field Engineering. The full documentation and installation guide is available on GitHub:

https://github.com/databricks-solutions/ai-dev-kit

Step-by-step setup guide used in this post:

https://github.com/mentorskool-org/databricks-ai-dev-kit-workshops/blob/main/00_SETUP_GUIDE.md

Installation Steps

  • Step 1 - Run the installer

    On Windows, open PowerShell and run this single command:

    powershell powershell -ExecutionPolicy Bypass -Command "irm https://raw.githubusercontent.com/databricks-solutions/ai-dev-kit/main/install.ps1 | iex"

This one command does everything automatically:

  • Creates a Python virtual environment at ~/.ai-dev-kit/.venv/ and installs all dependencies

  • Copies the 37 skill files into ~/.claude/skills/

  • Creates ~/.claude/mcp.json and registers the Databricks MCP server in it

Screen recording of running the PowerShell install command. Show the full terminal output cloning, installing dependencies, copying skills, and writing mcp.json until the installer completes successfully.

sidevkitinstall.gif

2.jpg

After installation, two folders are created on your machine:

3.jpg

What is inside mcp.json? The setup script creates ~/.claude/mcp.json and writes the Databricks server registration into it. Here is what the file contains:

{
  "mcpServers": {
    "databricks": {
      "command": "~/.ai-dev-kit/.venv/python",
      "args": ["~/.ai-dev-kit/repo/databricks-mcp-server/run_server.py"]
    }
  }
}

This file has one job: tell Claude how to start the MCP server. When you type claude Claude reads this file and executes:

~/.ai-dev-kit/.venv/python run_server.py

That starts the MCP server as a background process. The key "databricks" is the name of the server; it is how Claude identifies which server to route Databricks tool calls to. If you ever needed to add a second MCP server (for a different tool), you would add another key alongside "databricks" in this same file.

You never edit mcp.json manually. The setup script writes it for you. The only reason to look at it is to understand how Claude knows where the MCP server lives, or to troubleshoot if the MCP connection stops working.

How It Works — The 4 Components

AI Dev Kit has four components that work together every time you type a prompt in Claude Code:

5.jpg

  • databricks-skills - 37 markdown files loaded into Claude's memory at startup. They teach Claude which tools to use, when to use them, and Databricks best practices.

  • databricks-mcp-server - A Python server running in the background on your machine. Receives tool call requests from Claude and routes them to tools-core.

  • databricks-tools-core - The Python library that contains the actual functions. It uses the Databricks SDK and your saved token to call Databricks REST APIs.

  • databricks-builder-app - A React + FastAPI app for building Databricks Apps. Not covered in this post.

6.jpg

What is MCP?

MCP stands for Model Context Protocol — an open standard created by Anthropic. It is the communication format that allows Claude to call external tools running on your machine.

Think of it as a pipe between two processes on your laptop:

Claude Code process  ←──── stdio pipe ────→  MCP Server process

  (your terminal)         JSON messages       (run_server.py)

When Claude wants to do something in Databricks, it sends a JSON message through this pipe. The MCP server receives it, calls the right Python function, and sends the result back.

How do you know MCP is working?

When you see manage_sql_warehouse (MCP) In Claude's output, the full chain is connected. If you see a Bash command like databricks warehouses list Instead, MCP is not running.

listcatalogs.gif

List catalogs went through MCP. No copy-paste, no manual execution. Claude called Databricks directly from the terminal, authenticated with your saved token.

Coming in Part 2

Now that the stack is set up and the connection is verified, it's time to build. In Part 2, GlobalMart's team uses AI Dev Kit to build their full Bronze → Silver → Gold pipeline:

  • Why Claude hallucinates without context — and how to fix it

  • Building Bronze, Silver, and Gold layers with natural language prompts

  • How to prevent Claude from accidentally dropping your tables

  • Generating column metadata using AI

Ready to Experience the Future of Data?

Discover how Enqurious helps deliver an end-to-end learning experience
Curious how we're reshaping the future of data? Watch our story unfold
Get Free Snowpro Core Certification Skill Path

You Might Also Like

6 Errors I Hit Connecting Databricks Apps to Genie AI blog cover image
Guides & Tutorials
June 3, 2026
6 Errors I Hit Connecting Databricks Apps to Genie AI

Six errors, 6 hours of debugging, and the permission checklist that finally made Databricks Apps + Genie work. The full lessons-learned guide.

Mansi AI & ML Engineer
Where Did My Claude Code Session Go? How to Find Any Lost Session blog cover image
Guides & Tutorials
June 2, 2026
Where Did My Claude Code Session Go? How to Find Any Lost Session

Your Claude Code session isn't lost. It's on disk, in a folder /resume isn't scanning. Here's how to find any session in 30 seconds, with the exact commands.

Mansi AI & ML Engineer
What is Scenario Based Learning for Data Teams? blog cover image
Guides & Tutorials
May 15, 2026
What is Scenario Based Learning for Data Teams?

Scenario based learning replaces tutorials with realistic operational scenarios where engineers develop the hands on judgment classroom instruction cannot produce. How it works and why it matters.

Mandar Sr. Data Analyst
Data Engineering Roadmap 2026: What Companies Actually Hire blog cover image
Guides & Tutorials
May 5, 2026
Data Engineering Roadmap 2026: What Companies Actually Hire

The 2026 data engineering roadmap. SQL, Python, cloud, Airflow, dbt, streaming. What companies actually hire for and how to build a portfolio that gets shortlisted.

Mandar Sr. Data Analyst
Medallion Architecture: Why Most Data Pipelines Break Without It blog cover image
Guides & Tutorials
April 30, 2026
Medallion Architecture: Why Most Data Pipelines Break Without It

Medallion Architecture splits your data pipeline into Bronze, Silver, and Gold layers so a small business change never forces a full rebuild. Here's why it works.

Divyanshi Data Engineer
An Advanced Git Tutorial: Lessons from a Real-World Versioning Crisis blog cover image
Guides & Tutorials
March 7, 2026
An Advanced Git Tutorial: Lessons from a Real-World Versioning Crisis

I was working on a large content repository on Windows, and I needed to version some new work — campaign assets, workshop content, LinkedIn job descriptions, and some file deletions. Simple enough, right? What followed was a two-day journey through some of Git's more obscure corners.

Amit Co-founder & CEO
Data Quality Explained: Challenges, Best Practices, and Complete 2026 Guide blog cover image
Guides & Tutorials
January 23, 2026
Data Quality Explained: Challenges, Best Practices, and Complete 2026 Guide

A complete beginner’s guide to data quality, covering key challenges, real-world examples, and best practices for building trustworthy data.

Divyanshi Data Engineer
Data Lakehouse Demystified: Unlocking Databricks’ Hidden Powers in 2025 blog cover image
Guides & Tutorials
December 29, 2025
Data Lakehouse Demystified: Unlocking Databricks’ Hidden Powers in 2025

Explore the power of Databricks Lakehouse, Delta tables, and modern data engineering practices to build reliable, scalable, and high-quality data pipelines."

Divyanshi Data Engineer
Data Doesn’t Wait Anymore: A Guide to Streaming with Azure Databricks blog cover image
Guides & Tutorials
December 15, 2025
Data Doesn’t Wait Anymore: A Guide to Streaming with Azure Databricks

Data doesn’t wait - and neither should your insights. This blog breaks down streaming vs batch processing and shows, step by step, how to process real-time data using Azure Databricks.

Divyanshi Data Engineer
Unity Catalog Just Leveled Up: Meet your Data’s New Bodyguards blog cover image
Guides & Tutorials
December 8, 2025
Unity Catalog Just Leveled Up: Meet your Data’s New Bodyguards

This blog talks about Databricks’ Unity Catalog upgrades -like Governed Tags, Automated Data Classification, and ABAC which make data governance smarter, faster, and more automated.

Divyanshi Data Engineer
"Yeh Dosti" of AI: Claude & Nano Banana as Jai & Veeru! blog cover image
Guides & Tutorials
December 6, 2025
"Yeh Dosti" of AI: Claude & Nano Banana as Jai & Veeru!

Tired of boring images? Meet the 'Jai & Veeru' of AI! See how combining Claude and Nano Banana Pro creates mind-blowing results for comics, diagrams, and more.

Burhanuddin DevOps Engineer
The Day I Discovered Databricks Connect  blog cover image
Guides & Tutorials
December 1, 2025
The Day I Discovered Databricks Connect

This blog walks you through how Databricks Connect completely transforms PySpark development workflow by letting us run Databricks-backed Spark code directly from your local IDE. From setup to debugging to best practices this Blog covers it all.

Divyanshi Data Engineer
Building Bronze Layer: Using COPY INTO in Databricks blog cover image
Guides & Tutorials
September 12, 2025
Building Bronze Layer: Using COPY INTO in Databricks

Master the bronze layer foundation of medallion architecture with COPY INTO - the command that handles incremental ingestion and schema evolution automatically. No more duplicate data, no more broken pipelines when new columns arrive. Your complete guide to production-ready raw data ingestion

Sayli Sr. Data Engineer
Understanding the Power Law Distribution blog cover image
Guides & Tutorials
January 3, 2025
Understanding the Power Law Distribution

This blog talks about the Power Law statistical distribution and how it explains content virality

Amit Co-founder & CEO
An L&D Strategy to achieve 100% Certification clearance blog cover image
Guides & Tutorials
December 6, 2023
An L&D Strategy to achieve 100% Certification clearance

An account of experience gained by Enqurious team as a result of guiding our key clients in achieving a 100% success rate at certifications

Amit Co-founder & CEO