Design and Implement a Reliable ETL Pipeline for WeBank Using Databricks - Full Version

9 Scenarios
8 Hours 30 Minutes
Industry
banking
Skills
quality
data-understanding
data-storage
data-quality
batch-etl
cloud-management
data-wrangling
stream-etl
Tools
databricks
azure
sql
python

Learning Objectives

Implementing an end-to-end data engineering architecture
Building a data governance layer using Databricks
Designing a notification strategy based on the data in the Gold layer of the lake
Handling incremental data and varying schemas
Ensuring idempotency and quality of the pipeline
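
To make the idempotency and incremental-data objectives concrete, here is a minimal sketch in plain Python. It is not Databricks code and not the project's solution: the function `merge_batch` and the fields `customer_id` and `updated_at` are hypothetical names chosen for illustration. It shows one common idea, a keyed upsert, where replaying the same batch leaves the target unchanged and records with extra columns are still accepted.

```python
from typing import Dict, Iterable, Mapping


def merge_batch(
    target: Dict[str, dict],
    batch: Iterable[Mapping[str, object]],
    key: str = "customer_id",  # hypothetical business key
) -> Dict[str, dict]:
    """Upsert each record of `batch` into `target`, keyed by `key`.

    Assumes every record carries an ISO-formatted `updated_at` string
    (a hypothetical column) so that string comparison orders versions.
    """
    for record in batch:
        k = record[key]
        existing = target.get(k, {})
        # Apply the record only if it is at least as new as the stored row,
        # so replays and late, older records cannot overwrite newer data.
        if record["updated_at"] >= existing.get("updated_at", ""):
            # Merging dicts keeps old columns and adds any new ones,
            # which tolerates a schema that gains columns over time.
            target[k] = {**existing, **record}
    return target


table: Dict[str, dict] = {}
batch = [
    {"customer_id": "C1", "updated_at": "2024-01-01", "balance": 100},
    {"customer_id": "C2", "updated_at": "2024-01-02", "balance": 50, "segment": "retail"},
]
merge_batch(table, batch)
merge_batch(table, batch)  # replaying the same batch changes nothing
```

In a Delta Lake setting the same idea is usually expressed as a merge on a business key rather than a blind append, so that a rerun of a failed job does not duplicate rows.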

Overview

“What does it take to enable the Customer Success Managers of your banking institution to build better customer relations?”
WeBank is a traditional banking institution that has been operating for a very long time. The Customer Success team at WeBank has been a crucial part of maintaining and delivering business impact. The current systems and processes at WeBank have become a challenge, and the following are some of the problems:
  • The Customer Success team operates with old and outdated data
  • The Customer Success team has to rely heavily on the corporate wing for any data updates, which has led to missed opportunities
  • Non-availability of data has led to inappropriate customer targeting by the Success team, which has also reduced customer satisfaction scores
These and other problems ultimately affected the bank's core business products. With ever-changing technology and growing customer needs, the senior management of WeBank has decided to adopt a data strategy. This new strategy will help them drive efficient customer success backed by data-driven decision making.
The task of implementing the data strategy was given to the Data and AI wing of WeBank. After careful consideration, the Data+AI team came up with the following architecture and key benefits of the data pipeline solution implemented as part of the data strategy:

[Figure: proposed data pipeline architecture]

  • A reliable, centralized repository of data created from disparate data sources
  • Customer Success Managers will receive a weekly notification about the customers they should talk to
  • Discoverable data sources for Data Analysts and Data Scientists to build modern solutions

Congratulations! The senior management and the CDO were impressed by the data pipeline proposal and have tasked your team with building it for one region of customers.

Prerequisites

  • Comprehensive understanding of how ADLS and Azure SQL work
  • Knowledge of capturing streaming data using Azure Event Hubs
  • Ingesting and parsing data using Databricks
  • Knowledge of how Delta Live Tables and Unity Catalog work in Databricks