A Simple Guide to Data Literacy

70% of employees are expected to work heavily with data by 2025 - Forrester
85% of executives believe Data Literacy is going to be the most in-demand skill by 2030 - The Data Literacy Project
These numbers look intimidating, and their impact is not far away. However, for any modern data-driven organization, it boils down to 3 key questions:
What is Data Literacy by the way?
Why now?
How to start?
We at Enqurious have spent the past 3 years working with some of the world's largest analytics consulting companies, helping freshers and lateral hires get ready for deployment on production-grade projects faster and better.
The words faster and better may look pretty harmless, even encouraging. However, with data fast becoming the core competency that differentiates businesses, it has become quite evident that working with data is no longer going to be a niche skill.
The world of business witnessed its first transformational change when office work shifted from paper to software.
It is witnessing its next transformational change as office work shifts from software to data.
Irrespective of which department you belong to or which background you come from, techie or non-techie, data-driven decision making is fast taking center stage.
Companies like Amazon have proved time and again that digitizing and measuring each phase of the business pays off substantially in terms of strategic goals such as customer satisfaction, higher operational efficiency and lower financial risk.
So, let's agree on one thing first: it's becoming increasingly difficult to do business on a hunch alone. Data is taking center stage, so being data literate becomes the next imperative for organizations.
Great! So, let's address the 1st question:
What is Data Literacy by the way?
Gartner defines data literacy as the ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied, and the ability to describe the use case, application and resulting value.
Further, data literacy is an underlying component of digital dexterity — an employee’s ability and desire to use existing and emerging technologies to drive better business outcomes.
Simply put, if you deal with data in your day-to-day work, data literacy is the set of skills you need to ensure that data is leveraged for operational, financial and business success.
And no! Data Literacy is not about the SQLs, Pythons, Data Lakes, Cloud ETLs and AI/MLs of the world. Tools are the means to empower data-driven decision making. However, only a data-literate employee would know how best to use data for sound business decisions.
With the What of Data Literacy clear, let's focus on the 2nd question:
Why now?
Great, so let's move to the 3rd question, the final and most important one:
How to start?
Let's start with a disclaimer: Data Literacy in its truest sense is much broader than what we are going to present. We are only going to present the how from the perspective of a regular employee who needs to make data-driven decisions on a day-to-day basis using plain Excel sheets.
What we are not going to cover are the governance, cataloging, privacy and security aspects of Data Literacy.
So, let's start with the following illustration:
Let's decode:
What's the first thought that comes to your mind when you look at some data in Excel?
What is this data all about? The idea is to first understand the business context and the industry domain the data belongs to. The data can be about orders placed by customers in the e-commerce industry or claims made by customers in insurance. Without business context, data is simply a bunch of text and numbers. This comes with follow-up questions: what business processes exist, how are they measured, and are they digitized yet?
What does this bunch of numbers and text even mean?
Leave aside different businesses; even departments within the same organization interpret the same data differently. The same sales column is interpreted differently by the Sales team and the Finance team. Also, when a new request for a report comes in, an Analyst faces the next big hurdle: how do I even start if I don't know which data is where and what it means? This brings out the need for an org-wide, consistent data dictionary. Without a dictionary, we leave data interpretation to tribal knowledge, leaving open a wide, risky gap for erroneous outcomes.
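To make the idea of a data dictionary concrete, here is a minimal sketch in Python. The column names, definitions and owners are purely hypothetical, and a real dictionary would usually live in a shared sheet or catalog rather than in code. The point is only that every column has one agreed meaning and that undocumented columns are easy to spot.

```python
# Hypothetical data dictionary: one agreed definition per column.
# All names, definitions and owners below are illustrative assumptions.
data_dictionary = {
    "order_id": {
        "definition": "Unique identifier of a customer order",
        "type": "text",
        "owner": "Sales Operations",
    },
    "sales": {
        "definition": "Invoiced amount before tax, after discounts",
        "type": "number",
        "owner": "Finance",
    },
}

def undocumented_columns(columns, dictionary):
    """Return the columns of a sheet that have no entry in the dictionary."""
    return [column for column in columns if column not in dictionary]

# Header row of a hypothetical Excel sheet.
sheet_columns = ["order_id", "sales", "discount"]
missing = undocumented_columns(sheet_columns, data_dictionary)
print(missing)
```

Any column reported as missing is one whose meaning currently lives only in someone's head.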
The BI vs Statistics perspective of Data Literacy
Once the business context and meanings are taken care of, we arrive at a crossroads: should we analyze the data via the Business Intelligence approach or the Statistics approach? In our view, the answer lies in "and", not "or". Both perspectives contribute to a top-notch data analysis and an insightful story. Let's look at the two separately.
The BI Approach
The world of business intelligence breaks down any dataset into facts and dimensions. For example, in a sales report, sales amount is a fact, which can be measured and aggregated, while dimensions are slicers or perspectives that group or categorize data. Look at the following asks from business:
Prepare a category-wise, state-wise profit report
Prepare a ship mode wise report of deliveries done.
In the above two requirements, category, state and ship mode are dimensions which group the data, while profit and count of deliveries are facts. This is important to understand when creating pivots.
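The two asks above can be sketched as a pivot in a few lines of Python. This is only an illustration of what an Excel pivot table does: the rows, column names and values are made up, and the helper functions are hypothetical, not part of any tool.

```python
from collections import defaultdict

# Hypothetical order lines; column names and values are illustrative only.
orders = [
    {"category": "Furniture", "state": "Texas", "ship_mode": "Standard", "profit": 120.0},
    {"category": "Furniture", "state": "Texas", "ship_mode": "Express", "profit": 80.0},
    {"category": "Furniture", "state": "Ohio", "ship_mode": "Standard", "profit": -15.0},
    {"category": "Technology", "state": "Ohio", "ship_mode": "Express", "profit": 300.0},
]

def pivot_sum(rows, dimensions, fact):
    """Group rows by the given dimensions and sum the fact within each group."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[fact]
    return dict(totals)

def pivot_count(rows, dimensions):
    """Group rows by the given dimensions and count the rows in each group."""
    counts = defaultdict(int)
    for row in rows:
        counts[tuple(row[d] for d in dimensions)] += 1
    return dict(counts)

# Category-wise, state-wise profit report (dimensions: category, state; fact: profit).
profit_report = pivot_sum(orders, ["category", "state"], "profit")

# Ship-mode-wise count of deliveries (dimension: ship_mode; fact: row count).
delivery_report = pivot_count(orders, ["ship_mode"])

print(profit_report)
print(delivery_report)
```

Dimensions end up as the grouping key and facts as the aggregated value, which mirrors dragging fields into the Rows and Values areas of a pivot table.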
However, even before you prepare any report, it is important to understand two ideas:
Granularity of data - What's the lowest level at which your fact is available? Is it country level, country-state level or country-state-district-area level? The 3rd one is the finest grain, i.e. the most detailed level
Uniqueness of data - How to ensure that data is not duplicated? Are there keys which uniquely define a record?
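A quick way to test both ideas together is to state the grain as a set of key columns and check whether any key repeats. The sketch below assumes hypothetical rows that are meant to hold one record per country-state-district; the data and the helper are illustrative only.

```python
from collections import Counter

# Hypothetical rows; the intended grain is one row per country-state-district.
rows = [
    {"country": "India", "state": "Karnataka", "district": "Mysuru", "sales": 100},
    {"country": "India", "state": "Karnataka", "district": "Bengaluru Urban", "sales": 250},
    {"country": "India", "state": "Kerala", "district": "Ernakulam", "sales": 180},
    {"country": "India", "state": "Karnataka", "district": "Mysuru", "sales": 100},  # repeated row
]

def duplicate_keys(rows, key_columns):
    """Return the key values that appear on more than one row."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]

grain = ["country", "state", "district"]
duplicates = duplicate_keys(rows, grain)
is_unique_at_grain = not duplicates

print(duplicates)
print(is_unique_at_grain)
```

If the check reports duplicates, any sum built on top of these rows would double count, so the duplication needs to be resolved before preparing a report.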
Other than this, there are a few more thinking points for Analysts:
What new facts or metrics can be derived? Ex. Profit Margin, Return Rate etc.
What new dimensions can be created? Ex. Weekday/Weekend, Month-year etc.
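As a rough sketch of these thinking points, the snippet below derives a profit margin, a Weekday/Weekend flag and a month-year label from hypothetical order rows. The definitions used here (profit margin as profit divided by sales, return rate as the share of orders returned) are assumptions for illustration; your organization's data dictionary should have the final say.

```python
from datetime import date

# Hypothetical orders; column names and values are illustrative only.
orders = [
    {"order_date": date(2024, 1, 6), "sales": 200.0, "profit": 50.0, "returned": False},
    {"order_date": date(2024, 1, 8), "sales": 100.0, "profit": 10.0, "returned": True},
    {"order_date": date(2024, 2, 11), "sales": 400.0, "profit": 120.0, "returned": False},
]

def enrich(order):
    """Add one derived fact and two derived dimensions to an order row."""
    d = order["order_date"]
    return {
        **order,
        # Derived fact (assumed definition; sales is assumed to be non-zero).
        "profit_margin": order["profit"] / order["sales"],
        # Derived dimensions: weekday() returns 5 for Saturday and 6 for Sunday.
        "day_type": "Weekend" if d.weekday() >= 5 else "Weekday",
        "month_year": d.strftime("%Y-%m"),
    }

enriched = [enrich(order) for order in orders]

# Derived metric over the whole dataset (assumed definition: returned orders / all orders).
return_rate = sum(order["returned"] for order in orders) / len(orders)

for row in enriched:
    print(row["month_year"], row["day_type"], row["profit_margin"])
print(return_rate)
```

In Excel, the same result would typically come from calculated columns added before building the pivot.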
The Statistics Approach
The world of Statistics divides data into 4 types: Nominal, Ordinal, Interval and Ratio. Labels with no order, like color, are nominal. Ranks or education levels are ordinal. Temperature (in Celsius or Fahrenheit) is interval, while sales is ratio. Once the type is identified, Analysts are required to profile the data, which means doing the following:
Measure Central Tendencies : Mean, Median, Mode
Measure Dispersion : Range, IQR, Variance, Standard Deviation
Measure Shape : Skewness, Kurtosis
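The profiling steps listed above can be sketched with Python's standard `statistics` module. The sample values are made up, and the skewness and kurtosis here are simple population-moment versions chosen for illustration; Excel's SKEW and KURT functions apply sample corrections, so their results will differ slightly.

```python
import statistics

def profile(values):
    """Return central tendency, dispersion and shape measures for a numeric column."""
    n = len(values)
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    # Population central moments, used for the shape measures.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.pvariance(values),
        "std_dev": std,
        "skewness": m3 / std ** 3,             # > 0 means a longer right tail
        "excess_kurtosis": m4 / std ** 4 - 3,  # 0 for a normal distribution
    }

# Hypothetical column of order quantities.
quantities = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = profile(quantities)

for name, value in summary.items():
    print(name, value)
```

The function assumes a non-constant numeric column with at least two values, since the shape measures divide by the standard deviation.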
The above steps help understand critical aspects of data like:
If the mean keeps rising from one period to the next, there is an upward trend in the data
If the mean and median are far apart, the data is skewed, often because of outliers which need attention
If the distribution is sharply peaked (small spread), values are more predictable; if it is flat (large spread), variance is high and predictability is low
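The second and third points can be seen on a tiny, made-up example: adding a single extreme value pulls the mean away from the median and inflates the standard deviation. The numbers below are illustrative only.

```python
import statistics

# Hypothetical daily sales figures, without and with one extreme value.
typical = [90.0, 95.0, 100.0, 105.0, 110.0]
with_outlier = typical + [1000.0]

def mean_median_gap(values):
    """Difference between mean and median; a large gap hints at skew."""
    return statistics.mean(values) - statistics.median(values)

gap_typical = mean_median_gap(typical)
gap_with_outlier = mean_median_gap(with_outlier)

spread_typical = statistics.pstdev(typical)
spread_with_outlier = statistics.pstdev(with_outlier)

print(gap_typical, gap_with_outlier)
print(spread_typical, spread_with_outlier)
```

A gap near zero with a small spread suggests values cluster around the mean, while a large gap with a large spread is a signal to look for outliers before trusting the average.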
Finally, this leads to univariate, bivariate and multivariate analysis (not in scope of this article).
Gosh! Isn't that too much for an Analyst to consume just to be called Data Literate?
Well, Data Literacy is not simply a one-time course or training. It's a journey for anyone genuinely interested in building the skills that would make them highly desirable in the modern data-driven world.
85% of Chief Data Officers have placed Data Literacy skills in the top spot, yet a dismal 11% of employees use data for decision making.
So, the field is new, but it is one of the most exciting and future-defining ones for the knowledge-driven economy.
At Enqurious, we help organizations design and deliver a focused and highly effective Data Literacy program. Wish to see how you can build yours?
Connect with us here