
Learning Objectives
Overview
GlobalMart, a global e-commerce retailer, operates multiple warehouses to store and dispatch inventory efficiently. The Operations Team faces several challenges in managing warehouse inventory data effectively:
- Semi-structured Data Format: The warehouse inventory data is available through an API endpoint, returning nested JSON responses that are difficult to process directly.
- Need for Automation: Currently, analysts manually extract key insights from the warehouse inventory reports, leading to errors and inefficiencies.
- Scalability Issues: Given the vast volume of data spanning thousands of products across multiple warehouses, processing it efficiently becomes complex. A structured, object-oriented approach is essential to manage and handle this data programmatically.
- Reusable and Maintainable Code: The current scripts are monolithic and unstructured, making it difficult to modify and maintain.
Your Task:
As a Data Engineer, your job is to fetch and analyze warehouse inventory data from GlobalMart’s API.
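To make the task concrete, here is a minimal sketch of flattening a nested inventory response into small, reusable objects. The payload shape, the field names (`warehouses`, `inventory`, `warehouse_id`, `product_id`, `quantity`), and the `InventoryRecord` class are all assumptions made for illustration; the real GlobalMart API response may be structured differently. In practice the payload would likely come from something like `requests.get(url, timeout=10).json()`, but the endpoint URL is not given here, so the sketch uses a hard-coded sample instead.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class InventoryRecord:
    """One flattened row: a single product held in a single warehouse.

    Hypothetical structure; field names are assumptions, not the real schema.
    """
    warehouse_id: str
    product_id: str
    quantity: int


def flatten_inventory(payload: dict[str, Any]) -> list[InventoryRecord]:
    """Turn a nested warehouse -> inventory response into a flat list of records."""
    records = []
    for warehouse in payload.get("warehouses", []):
        for item in warehouse.get("inventory", []):
            records.append(
                InventoryRecord(
                    warehouse_id=warehouse["warehouse_id"],
                    product_id=item["product_id"],
                    quantity=int(item["quantity"]),
                )
            )
    return records


# Hypothetical sample payload standing in for a real API response.
sample_payload = {
    "warehouses": [
        {
            "warehouse_id": "WH-1",
            "inventory": [
                {"product_id": "P-100", "quantity": 25},
                {"product_id": "P-200", "quantity": 0},
            ],
        },
        {
            "warehouse_id": "WH-2",
            "inventory": [
                {"product_id": "P-100", "quantity": 10},
            ],
        },
    ]
}

records = flatten_inventory(sample_payload)

# Aggregate stock per product across all warehouses.
total_by_product: dict[str, int] = {}
for record in records:
    total_by_product[record.product_id] = (
        total_by_product.get(record.product_id, 0) + record.quantity
    )
```

Keeping the flattening logic in one function and the row shape in one class means a change in the API response only needs to be handled in one place, which speaks to the maintainability concern listed above.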
Prerequisites
- Basic proficiency in Python (functions, loops, and error handling)
- Understanding of API requests using the requests library
- Experience working with JSON data (parsing, flattening, and processing)
- Familiarity with data lakes and working with files in cloud storage such as Azure Data Lake Storage (ADLS)
- Knowledge of libraries such as Pandas, NumPy, and Matplotlib for data manipulation and transformation