Implementing Medallion Architecture using Databricks
9 Scenarios
3 Hours 55 Minutes
Intermediate
Industry
e-commerce
Skills
batch-etl
data-storage
data-wrangling
approach
data-understanding
data-modelling
data-quality
programming
code-versioning
git-version-control
problem-understanding
performance-tuning
cloud-management
Tools
databricks
azure
spark
sql
github
google-cloud
airflow
Learning Objectives
Design and implement an ETL/ELT pipeline using Delta Lake, following the Medallion Architecture.
Manage secure access and credentials in the ETL pipeline.
Automate deployment with CI/CD, ensuring seamless integration and testing.
Perform unit testing for data pipelines to ensure data accuracy and reliability.
Orchestrate the data pipeline using workflows.
Optimize ETL performance using Spark and Delta Lake best practices.
Design and implement incremental data loading across the Medallion Architecture.
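The objectives above center on the Medallion Architecture's Bronze (raw), Silver (validated), and Gold (business-aggregate) layers. As a minimal sketch of that layering idea in plain Python (illustrative only; the field names and sample records are hypothetical, and a real Databricks pipeline would use PySpark DataFrames written to Delta tables):

```python
# Bronze layer: raw records kept exactly as ingested (sample data is hypothetical)
raw_orders = [
    {"order_id": "1", "amount": "10.50", "country": "us"},
    {"order_id": "2", "amount": "bad", "country": "US"},
    {"order_id": "3", "amount": "7.25", "country": "de"},
]

def to_silver(bronze):
    """Silver layer: enforce types and standardize values, dropping bad rows."""
    silver = []
    for row in bronze:
        try:
            silver.append({
                "order_id": int(row["order_id"]),
                "amount": float(row["amount"]),
                "country": row["country"].upper(),
            })
        except ValueError:
            continue  # skip records that fail data-quality checks
    return silver

def to_gold(silver):
    """Gold layer: business-level aggregate (revenue per country)."""
    totals = {}
    for row in silver:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals

gold = to_gold(to_silver(raw_orders))
```

In a Delta Lake implementation, each layer would be persisted as its own Delta table so that downstream layers can be rebuilt or incrementally refreshed from the layer beneath.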
Overview
Prerequisites
- Familiarity with Databricks, PySpark & Python
- Knowledge of ETL/ELT Processes & Pipeline Management
- Basic Knowledge of CI/CD Pipelines
- Familiarity with Delta Lake
- Familiarity with Incremental Loading
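Since incremental loading is listed as a prerequisite, here is a minimal sketch of the high-watermark pattern behind it, in plain Python (field names and sample data are hypothetical; in Databricks this is typically done with Delta `MERGE`, Auto Loader, or Structured Streaming checkpoints):

```python
# Hypothetical source rows with an update timestamp used as the watermark column
source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-03"},
    {"id": 3, "updated_at": "2024-01-05"},
]

def incremental_load(source_rows, last_watermark):
    """Select only rows newer than the stored watermark, then advance it."""
    new_rows = [r for r in source_rows if r["updated_at"] > last_watermark]
    new_watermark = max(
        (r["updated_at"] for r in new_rows), default=last_watermark
    )
    return new_rows, new_watermark

batch, watermark = incremental_load(source, "2024-01-02")
```

On each run, only `batch` is processed into the next layer and `watermark` is persisted for the following run, which keeps reloads proportional to new data rather than the full table.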
