Guides & Tutorials

Basics of Langchain

Large Language Model
GenAI
Langchain
Burhanuddin Nahargarwala, Jr. Data Engineer

I asked ChatGPT a business-specific question, and unfortunately, it couldn’t provide a satisfactory answer. Wondering why?

Let's pose a question to ChatGPT: "List all products with limited stocks."

Business_related_question_to_gpt.png

The response is vague and unsatisfactory. But why is that? And what options do we have besides ChatGPT? Let’s explore the possibilities, starting with understanding ChatGPT:

The GenAI Revolution and ChatGPT: A Prelude

In the dynamic realm of GenAI, one name stands out: ChatGPT. A familiar companion in our digital interactions, ChatGPT is where we turn with our many questions, expecting precise and informed answers. Its understanding of context is unparalleled, enabling it to seamlessly navigate complex conversations with users: it can effortlessly switch between topics, understand humor, and adapt its tone to match the preferences of its questioner. So don’t underestimate the power of the common GPT, I mean ChatGPT.

power_of_genai_meme.png

When we ask ChatGPT a question, the text is not passed to the model directly; it is first converted into tokens by a tokenizer.

Large Language Models receive text as input and generate text as output. However, being statistical models, they work much better with numbers than with text sequences. That’s why every input to the model is processed by a tokenizer before being used by the core model.

A token is a chunk of text consisting of a variable number of characters, so the tokenizer's main task is splitting the input into an array of tokens. Then, each token is mapped to a token index, which is the integer encoding of the original text chunk. Click here to see how the text gets converted into tokens.

tokenizer.png
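To make the splitting-and-mapping idea concrete, here is a toy whitespace-based tokenizer in plain Python. It is only an illustration: real LLM tokenizers (such as OpenAI's) use byte-pair encoding over subword chunks, not whitespace splitting.

```python
def toy_tokenize(text: str) -> list[int]:
    # Real LLM tokenizers use byte-pair encoding over subword chunks;
    # this toy version splits on whitespace purely to illustrate the
    # text -> token -> integer-index mapping described above.
    vocab: dict[str, int] = {}
    token_ids: list[int] = []
    for chunk in text.lower().split():
        if chunk not in vocab:
            vocab[chunk] = len(vocab)  # assign the next free integer index
        token_ids.append(vocab[chunk])
    return token_ids

print(toy_tokenize("to be or not to be"))  # prints [0, 1, 2, 3, 0, 1]
```

Notice how repeated chunks ("to", "be") map back to the same token index, which is exactly how the model sees recurring pieces of text as the same number.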

Challenges arise when the inquiries are not just general but deeply rooted in specific contexts, such as queries about an e-commerce business, real-time data, and so on.

Imagine you've launched an e-commerce venture. Your goal is to offer an unparalleled customer experience with instant query resolution. But can ChatGPT, a generalist in its intelligence, handle the business-specific questions that come its way?

We just saw that it’s not that effective. Why? ChatGPT, despite its brilliance, is a general-purpose model. It's designed for broad queries, not the context-specific questions critical to your business. This is where LangChain enters: a superhero with the powers to solve specialized queries 🤽.

Introduction to LangChain

LangChain is the bridge that connects ChatGPT's general intelligence with the specific needs of your context. It empowers ChatGPT to access external tools, from Wikipedia and search engines to databases, CSV files, and documents.

With LangChain, you can feed ChatGPT the context it needs. Imagine ChatGPT, now equipped with LangChain, being able to pull data from your business database or analyze customer trends from a CSV file. Suddenly, the answers become tailored, precise, and contextually aware.

Developers around the globe have harnessed the power of LangChain and Streamlit to build exceptional chatbots. Take a look at this remarkable example and see how LangChain elevates ChatGPT to new heights.

LangChain is not just a tool; it's a revolutionary framework transforming the way we interact with Large Language Models (LLMs). At its heart, LangChain allows us to orchestrate complex AI tasks with remarkable simplicity and efficiency. Let's unpack the key components that make LangChain a game-changer in the world of AI.

Imagine LangChain as a sophisticated puzzle, where each piece is a component that, when connected, forms a more powerful and complex AI application. This "chaining" is what gives LangChain its unique ability to handle advanced use cases involving LLMs. Chains may consist of multiple components from several modules:

  1. Prompt Templates: Reusable templates for different types of prompts, such as “chatbot”-style templates or ELI5 (Explain Like I'm 5) question answering.

  2. LLMs: Large language models such as GPT-3, Google PaLM 2, Llama 2, etc.

  3. Chains: Chains in LangChain execute a sequence of steps that involve LLMs.

  4. Agents: Agents use LLMs to decide which actions should be taken. Tools like web search or calculators can be used, all packaged into a logical loop of operations.

  5. Memory: Short-term and long-term memory for retaining context across interactions.
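To see how "chaining" composes these components, here is a minimal, self-contained sketch in plain Python (no LangChain required). The `fake_llm` function is a hypothetical stand-in for a real model call; the point is only to illustrate the template, model, and post-processing steps that LangChain chains formalize.

```python
def fill_template(template: str, **kwargs) -> str:
    # Prompt-template step: substitute variables into a prompt string.
    return template.format(**kwargs)

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call (e.g. OpenAI's API).
    return f"Answer to: {prompt}"

def chain(question: str) -> str:
    # Chain the components: template -> model -> post-processing.
    prompt = fill_template("Explain like I'm 5: {q}", q=question)
    raw = fake_llm(prompt)
    return raw.strip()

print(chain("What is a token?"))
# prints: Answer to: Explain like I'm 5: What is a token?
```

Each step's output feeds the next step's input; swapping any single piece (a different template, a different model) leaves the rest of the chain untouched, which is what makes the composition so flexible.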

LLMs:

In LangChain, there are broadly two types of large language models (LLMs). Let’s understand them via the image below:

Langchain types of LLM.jpg

  1. The first type is the LLM, which is designed to process and generate language based on string inputs and outputs: it takes a single string in and returns a single string out.

  2. ChatModels in LangChain are more complex and designed to handle conversational contexts. The input for a ChatModel is a list of ChatMessages, not just a single string. The output of a ChatModel is a single ChatMessage, which is the model's response to the ongoing conversation.


    A chat message has two required components:

    1. content: This is the content of the message.

    2. role: This is the role of the entity the chat message is coming from.
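This role/content structure mirrors the raw OpenAI chat format, where each message is a simple dictionary. A minimal sketch of what a conversation looks like at that level:

```python
# Each chat message pairs a role with its content; this is the same
# structure that LangChain's ChatMessage classes wrap.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is object oriented programming?"},
]

for m in messages:
    print(f'{m["role"]}: {m["content"]}')
```

The "system" role sets the assistant's behavior, the "user" role carries the human's question, and the model's reply comes back with the "assistant" role.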

Create an LLM object

Step 1: Install LangChain

First, you'll need to install the LangChain library. You can do this by running the following command in your Python environment:

!pip install langchain

Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs. You can go with other models also, such as HuggingFace, Llama, etc.

Step 2: Install OpenAI Python Package

Next, install the OpenAI Python package, which will allow you to interact with OpenAI's API:

!pip install openai==0.27.9

Step 3: Obtain OpenAI API Key

To use OpenAI's models, you'll need an API key. Follow these steps to get one:

  1. Create/Open an OpenAI account:

    • If you don't already have an OpenAI account, create one.

    • Visit OpenAI's website and sign up or log in.

  2. Generate an API Key:

    • Once logged in, navigate to the API section, where you can generate a new API key.

    • Follow the on-screen instructions to create a key.

  3. Free Credit for New Users:

    • As a new user, you'll typically receive free credits to get started. This allows you to experiment with the API without incurring an initial cost.

After setting up LangChain and obtaining your OpenAI API key, you can start using LangChain with OpenAI's LLMs.

Let’s import the OpenAI class from the langchain package:

from langchain.llms import OpenAI
import os

# First, set the OPENAI_API_KEY environment variable
os.environ["OPENAI_API_KEY"] = "sk-XXXX"

# Create LLM model
llm = OpenAI()
llm('Python is founded by whom and in which year?')

o/p: '\n\nPython was founded by Guido van Rossum in 1991.'

This is how it works: the input we passed is a single string, and the output we got back is a string.

The OpenAI class has two important parameters:

  1. temperature: Controls the randomness and creativity of the model’s output. Its value typically ranges from 0 to 1 (the OpenAI API accepts values up to 2).

    1. A low temperature (close to 0) results in more deterministic, predictable, and conservative outputs, suitable for tasks where consistency and accuracy matter more than creativity. Example: a temperature of 0.1 might be used for factual data retrieval or business analytics.

    2. A high temperature leads to more varied, random, and creative outputs, ideal for tasks requiring innovation. Example: a temperature of 0.9 might be used for creative story writing or generating unique ideas.

  2. model_name: Specifies which of OpenAI's LLMs you want to use. Different models have varying capabilities, sizes, and computational requirements.

# Create LLM model
llm = OpenAI(temperature=0.5, model_name='gpt-3.5-turbo')
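Under the hood, temperature rescales the model's next-token probability distribution before sampling. The following self-contained sketch of that math is an illustration of the idea, not LangChain code:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    # Divide logits by temperature before the softmax: a low temperature
    # sharpens the distribution (more deterministic output), a high
    # temperature flattens it (more varied, random output).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
low = softmax_with_temperature(logits, 0.1)   # top token dominates
high = softmax_with_temperature(logits, 1.0)  # probability mass spreads out
```

At temperature 0.1 the highest-scoring token receives nearly all of the probability, while at 1.0 the other tokens keep a realistic chance of being sampled, which is why higher temperatures feel more "creative".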

Similarly, we can create chat models. For that, import ChatOpenAI from langchain.chat_models:

# We can pass the list of messages, as
from langchain.chat_models import ChatOpenAI
from langchain.schema.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You're a helpful assistant. If you don't know the answer, reply: Sorry, I don't know the answer!"),
    HumanMessage(content='What is object oriented programming?')
]

# Create LLM chat model
chat_llm = ChatOpenAI()
chat_llm.invoke(messages)

o/p: AIMessage(content='Object-oriented programming (OOP) is a programming…)

In addition to passing a list of HumanMessage and SystemMessage, you can also pass a simple string to the ChatOpenAI model. This is useful for simpler interactions where you don't need the structure of HumanMessage or SystemMessage.

response = chat_llm.invoke("Who is the father of the computer?")
print(response)

o/p: AIMessage(content='The father of computer is considered to be Alan Turing, an English mathematician, logician, and computer scientist who laid the foundations for modern computing.')

Conclusion:

LangChain is a key tool that makes advanced AI models like ChatGPT more useful for specific tasks. We've looked at how LangChain helps these models understand and respond better in different situations, by using various settings and approaches. Going forward, we'll explore more about how LangChain works, including its use of templates, chains, and agents, to make AI interactions even smarter and more relevant. Essentially, LangChain is a powerful way to make AI more tailored and effective for our needs.
