
Databricks Workspace & UI Explained – A Beginner-Friendly Guide

If you’re new to Databricks, the first thing you’ll notice is the Workspace UI. At first glance, it can look overwhelming — notebooks, clusters, jobs, folders, and many menu options.

Don’t worry! This guide breaks down the UI into bite-sized pieces so you can go from "lost" to "launch" in minutes.

 

What Is a Databricks Workspace?


Think of the Databricks Workspace as your Unified Command Center. It’s a cloud-based environment where data scientists, engineers, and analysts collaborate.


The Google Drive Analogy: Just like Google Drive stores your Docs, Sheets, and Slides in one place, Databricks stores your Notebooks (the documents), Clusters (the compute power to run them), and Jobs (the schedule).


Databricks Workspace is your main working area where you:

  • Write code (notebooks)

  • Manage clusters

  • Schedule jobs

  • Collaborate with team members

UI Option   | What it does                                | Real-World Equivalent
------------|---------------------------------------------|---------------------------------------------
Workspace   | Stores your folders and notebooks.          | Your “My Documents” folder.
Compute     | Where you create and manage clusters.       | Turning on your computer’s engine.
Catalog     | Manages your data, tables, and permissions. | A digital library of all your spreadsheets.

 

Navigating the Sidebar: Your Command Center


When you first log in, the sidebar on the left is your map.



[Screenshot: the Azure Databricks workspace in dark mode. The left sidebar shows the Workspace, SQL, Data Engineering, and AI/ML sections; the main panel lists workspace files with a “Created at” column, plus Share and Create buttons at the top right.]

Here is a quick, one-line breakdown of what each tool does:


  • Workspace: The central hub where you organize and store your notebooks and files.

  • Recents: A quick-access list of the notebooks and folders you’ve worked on most recently.

  • Catalog: The data management layer where you explore databases, tables, and schemas.

  • Jobs & Pipelines: Where you automate your work by scheduling notebooks to run as workflows.

  • Compute: The "Engine Room" where you create and manage the clusters that run your code.

  • Marketplace: A place to discover and access third-party datasets and solution accelerators.

  • SQL Editor: A professional, tabbed workspace designed for writing, running, and sharing SQL queries. It features autocomplete to help you find table and column names quickly.

  • Queries: This is your library of saved SQL scripts. From here, you can manage permissions to collaborate with your team.

  • Dashboards: A tool to transform your query results into visual reports. You can combine multiple charts and text boxes into one page for easy sharing.

  • Genie: A "no-code" interface that uses Generative AI to let you ask questions about your data in plain English. It then converts your question into a SQL query automatically.

  • Alerts: These are automated monitors that watch your data. For example, you can set an alert to email you if "Total Sales" falls below a certain threshold.

  • Query History: A chronological log of every SQL statement run in the workspace. It’s a "debugging powerhouse" where you can see who ran what, how long it took, and if it failed.

  • SQL Warehouses: These are specialized compute resources (engines) optimized for SQL performance. Unlike standard clusters, they are designed to handle many concurrent users and "auto-stop" when not in use to save costs.

  • Playground: A "no-code" chat interface to test and compare different Large Language Models (LLMs) like Llama or GPT.

  • Agents: The home for "AI Agents"—specialized AI programs designed to perform specific tasks, like answering customer questions or querying your data.

  • Experiments: A digital lab notebook that automatically tracks your model versions, parameters, and accuracy scores using MLflow.

  • Features: A "Feature Store" where you save and share cleaned data specifically prepared for training machine learning models.

  • Models: A central registry to manage the lifecycle of your models (from "Development" to "Production").

  • Serving: Where you turn your model into a "REST API" (an endpoint) so other apps can send data and get predictions in real-time.


Workspace Structure Explained


The Workspace is more than just a folder; it’s a collaborative environment designed for teams.

It is typically divided into three main sections:


1. Users (Personal Space)

This is your private "My Documents." Each team member has their own folder (usually named after their email).

  • Best for: Drafts, experiments, and individual tasks.

  • Safety Tip: Use this for your daily work to avoid cluttering team projects.


2. Shared (Collaboration Space)

The Shared folder is like a "Public" drive. Everyone in your workspace can view and, depending on permissions, edit these notebooks.

  • Best for: Common utility scripts, team-wide projects, and production-ready code that everyone needs to access.


3. Repos (Git Integration)

This is the most professional way to work. Repos allow you to connect your Databricks Workspace directly to a Git provider like GitHub, GitLab, or Azure DevOps.

  • Best for: Version control. Instead of just saving a file, you "Commit" and "Push" your code. This is essential for enterprise data engineering.

Tip for beginners: Start working inside Users → your email folder to avoid accidental changes in shared folders.

 

Databricks Notebooks – Your Coding Playground

Notebooks are where you write and run code. Databricks notebooks support:

  • Python

  • SQL

  • Scala

  • R

You can even mix languages in one notebook!

To change the language of a specific cell, type one of these magic commands on the very first line of the cell:

  • %sql – Switches the cell to SQL.

  • %python – Switches the cell to Python.

  • %scala – Switches the cell to Scala.

  • %r – Switches the cell to R.

  • %md – Switches the cell to Markdown (used for writing text, headings, and images, like in this blog post!).


Keyboard Shortcuts for Speed

If you want to work like a pro, keep these in mind:

  • Shift + Enter: Runs the current cell and moves to the next one.

  • Ctrl + Enter: Runs the current cell and stays on the same one.

  • Alt + Enter: Runs the current cell and inserts a new empty cell below.


Your First Task: Creating a Folder & Running a Query


Step 1: Create a Folder

  1. Click on Workspace in the sidebar.

  2. Navigate to Users and click on your email folder.

  3. On the top right of the main panel, click the Create button and select Folder.

  4. Name it My_First_Project (or any name of your choice).


Step 2: Create a Notebook

  1. Open your new folder.

  2. Click Create again, but this time select Notebook.

  3. Name it First_Query_Notebook and ensure the default language is set to SQL or Python.


Step 3: Run Your First Query

Imagine you are a Data Engineer for a supermarket. You have a list of items and their prices. Your boss wants you to:

  1. Calculate the Total Price (Price × Quantity).

  2. Apply a 10% Discount for a holiday sale.

 

Step-by-Step Implementation in PySpark


In your Databricks Notebook, follow these steps:


1. Create the Data (The Table)


Initializing the Data: Creating a basic PySpark DataFrame with "Item," "Price," and "Quantity" columns.

 

2. Perform Multiplication (Calculate Total)


Now, we create a new column called Subtotal by multiplying Price and Quantity.

 

Calculating the Subtotal: Using the withColumn function to multiply "Price" by "Quantity" to create a new "Subtotal" column.

3. Perform Percentage Calculation (The Discount)


Finally, we calculate a 10% discount. To do this, we multiply the Subtotal by 0.90 (which is the same as subtracting 10%).

 

Applying a Discount: Calculating a final "Discounted_Price" by reducing the subtotal by 10% and selecting specific columns for the output.

Why notebooks are powerful

  • Run code cell by cell

  • Visualize data easily

  • Collaborate in real-time


Think of notebooks as a Jupyter Notebook and a SQL editor combined.

 

Why did we do it this way?


  • spark.createDataFrame: This is how we "upload" our raw list into the Spark engine so it can be processed fast.

  • withColumn: This is one of the most important PySpark commands. Think of it like adding a new column in Excel.

  • col("Price") * col("Quantity"): Notice we don't just write Price * Quantity as bare names. We use col() to tell Spark exactly which column we are talking about in the table.


Interview Tip:

  • If an interviewer asks, "How do you perform arithmetic operations in PySpark?", you can explain that we use the withColumn function combined with the col() function to perform transformations across the entire dataset at once, rather than one row at a time.

  • Question: What is the Databricks workspace?

    • Answer: The Databricks workspace is a collaborative environment where users create notebooks, manage clusters, schedule jobs, and organize data engineering workflows using an interactive UI.


This answer alone can clear many beginner-level interviews.

 

Quick Summary


  • Workspace = your main working environment

  • Notebooks = where you write and run code

  • UI navigation = easy once you know the basics
