Enqurious logo
Back to masterclass

SQL Essentials - Level 1

e-commerce
sql
date-processing
text-processing
data-manipulation-language
filter
conditional-logic
group-by-aggregate
SQL Essentials - Level 1 masterclass poster
This masterclass includes
4 activities

Learning objective

  • Get familiar with SQL Data Definition language
  • Be able to SELECT data in multiple ways
  • Be able to filter data with WHERE clause
  • Be able to write conditional logic in SQL
  • Be able to write sub-queries in SQL
  • Be able to perform simple aggregations, text processing, and date processing on data

Overview

For any beginner in Data Analytics, Excel is a great tool to store, clean, process, wrangle, and visualize data. However, Excel also comes with a wide range of problems, which makes a difficult to use tool for real-world, complex data analysis scenarios. In this project, we'll ponder over how Databases help tackle these challenges and what are the different types of databases available. We'll also explore ways to analyze data with a powerful language SQL.

Story

Rajiv, the Business Analyst at GlobalMart has played a pivotal role in providing actionable insights to business by performing thorough data analysis for the Customer Success Team.

 

However, all of his analysis till date has primarily been based out of Excel. While he has increasingly become proficient at Excel, off late he's been struggling with several issues which has forced him to question if Excel still remains as the tool of choice for him?

 

Rajiv dropped a mail to Sharon to seek help on the issues he's been facing off late. Let's read the mail below : 


Hi Sharon, 

 

Hope you are doing great!

 

Thanks for being such a supportive mentor for me. I've learnt a lot about data analysis under you and have become confident in handling data seamlessly with Excel. However, off late, I've been facing issues which has made me question the very tool I loved so much till now. Here's a quick summary of the issues :

  • Since GlobalMart has entered the growth phase, the data associated with customers, orders, products, transactions etc. has also grown exponentially. The data I now get in Excel is of the tune of 50,000…..1,00,000….10,00,000 records. I once received an Excel file which once opened gave the following error message :

Excel cannot insert the sheets into the destination workbook because it contains too many rows or columns

  • Off late, the analysis requirements have become more and more complex with me requiring to lookup data from multiple workbooks, performing lot of medium to complex pivots and using a lot of text, date, logical, mathematical functions across the sheets. Keeping track of the formulas and functions across so many sheets has become a huge pain point
  • Given small data size, I could easily treat them for data quality issues. However, with millions of records across 10s of 100s of sheets, the quality issues have simply exploded and no amount of scrutiny or fixes has helped me to eliminate common quality issues like duplication, missing values, accuracy, consistency and integrity
  • The analysis prepared by me is shared across many stakeholders where each one of them may have some specific requirements of analysis on top of what is already provided. This has been causing a lot of repetitions from my side leading to boredom and at times frustrations due to ad hoc errors caused due to some data getting accidentally deleted or mistyped
  • With the growth of GlobalMart, it longer deals with only structured, tabular data. At times, I have to deal with semi-structured data like JSON which stores data in a hierarchical format rather than a a tabular one

With rising expectations, every increasing data volume, repeat work and falling quality due to accidental deletions or mistypes, I am really frustrated and am desperately looking for help. Looking for a free time slot from you to discuss these issues. 


Sharon agreed to help Rajiv. Let's spend some time going through their conversation and figure out what would could be done to tackle the new challenges Rajiv has been facing off late.