Basics of LangChain

I asked ChatGPT a business-specific question, and unfortunately, it couldn’t provide a satisfactory answer. Wondering why?
Let's pose a question to ChatGPT: "List all products with limited stocks."
The response is vague and unsatisfactory. But why is that? And what alternatives to ChatGPT can help us? Let's explore the possibilities, starting with understanding ChatGPT:
The GenAI Revolution and ChatGPT: A Prelude
In the dynamic realm of GenAI, one name stands out: ChatGPT. A familiar companion in our digital interactions, ChatGPT is where we turn with our questions, expecting precise and informed answers. Its understanding of context is unparalleled, enabling it to seamlessly navigate complex conversations with users. It can effortlessly switch between topics, understand humor, and adapt its tone to match the preferences of its questioner. So don't underestimate the power of ChatGPT.
When we ask ChatGPT a question, the text is not passed to the model directly; it is first converted into tokens by a tokenizer.
Large Language Models receive a text as input and generate a text as output. However, being statistical models, they work much better with numbers than text sequences. That’s why every input to the model is processed by a tokenizer, before being used by the core model.
A token is a chunk of text consisting of a variable number of characters, so the tokenizer's main task is splitting the input into an array of tokens. Each token is then mapped to a token index, which is the integer encoding of the original text chunk. OpenAI's online tokenizer tool lets you see how text gets converted into tokens.
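To make this concrete, here is a deliberately naive toy tokenizer (an illustration only; real tokenizers such as the BPE-based ones behind GPT models split on subwords and use a fixed, pre-trained vocabulary): it splits text into chunks and maps each chunk to an integer index.

```python
# Toy tokenizer sketch: split text into chunks, then map each chunk to an
# integer token index. Real tokenizers (e.g. BPE) split on subwords and
# use a fixed, pre-trained vocabulary instead of building one on the fly.
def tokenize(text):
    chunks = text.lower().split()      # naive split on whitespace
    vocab = {}                         # chunk -> integer index
    for chunk in chunks:
        if chunk not in vocab:
            vocab[chunk] = len(vocab)  # assign the next free index
    return chunks, [vocab[c] for c in chunks]

chunks, token_ids = tokenize("List all products with limited stocks")
print(chunks)     # ['list', 'all', 'products', 'with', 'limited', 'stocks']
print(token_ids)  # [0, 1, 2, 3, 4, 5]
```

The model's core only ever sees the integer indices; the mapping back to text happens at the output side.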
Challenges arise when the inquiries are not just general but deeply rooted in a specific context, such as queries about an e-commerce business or real-time data.
Imagine you've launched an e-commerce venture. Your goal is to offer an unparalleled customer experience with instant query resolution. But can ChatGPT, a generalist in its intelligence, handle the business-specific questions that come its way?
We just saw that it can't do this well. Why? ChatGPT, despite its brilliance, is a general-purpose model. It's designed for broad queries, not the context-specific questions critical to your business. This is where LangChain enters, a superhero with the powers to solve specialized queries.
Introduction to LangChain
LangChain is the bridge that connects ChatGPT's general intelligence with the specific needs of your context. It empowers ChatGPT to access external tools, from Wikipedia and search engines to databases, CSV files, and documents.
With LangChain, you can feed ChatGPT the context it needs. Imagine ChatGPT, now equipped with LangChain, being able to pull data from your business database or analyze customer trends from a CSV file. Suddenly, the answers become tailored, precise, and contextually aware.
Developers around the globe have harnessed the power of LangChain and Streamlit to build exceptional chatbots. Take a look at this remarkable example and see how LangChain elevates ChatGPT to new heights.
LangChain is not just a tool; it's a revolutionary framework transforming the way we interact with Large Language Models (LLMs). At its heart, LangChain allows us to orchestrate complex AI tasks with remarkable simplicity and efficiency. Let's unpack the key components that make LangChain a game-changer in the world of AI.
Imagine LangChain as a sophisticated puzzle, where each piece is a component that, when connected, forms a more powerful and complex AI application. This "chaining" is what gives LangChain its unique ability to handle advanced use cases involving LLMs. Chains may consist of multiple components from several modules:
Prompt Templates: Reusable templates for different types of prompts, such as "chatbot"-style templates or ELI5 (Explain Like I'm 5) question answering.
LLMs: Large language models such as GPT-3, Google PaLM 2, Llama 2, etc.
Chains: Chains in LangChain execute a sequence of calls that involve LLMs.
Agents: Agents use LLMs to decide which actions to take. Tools like web search or calculators can be used, all packaged into a logical loop of operations.
Memory: Short-term and long-term memory.
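As a taste of the prompt-template idea in plain Python (LangChain's PromptTemplate class builds on the same idea, adding variable validation and composition), a template is just a reusable string with named slots filled in at call time:

```python
# A toy prompt template: a reusable string with named slots.
template = "You are a helpful e-commerce assistant. Explain like I'm 5: {question}"

# Fill the slot at call time to produce the final prompt sent to the LLM.
prompt = template.format(question="Why are some products low on stock?")
print(prompt)
```

The same template can be reused for every customer question, so the instructions around the question stay consistent.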
LLMs:
In LangChain, there are broadly two types of large language models (LLMs). Let's understand them via the given image:
The first is the LLM type, which is designed to process and generate language based on string inputs and outputs.
ChatModels in LangChain are more complex and designed to handle conversational contexts. The input for a ChatModel is a list of ChatMessages, not just a single string. The output of a ChatModel is a single ChatMessage, which is the model's response to the ongoing conversation.
A chat message has two required components:
content: The content of the message.
role: The role of the entity the message is coming from.
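In plain data terms, a conversation is simply a list of (role, content) pairs; a hypothetical sketch:

```python
# Each chat message pairs a role (system / user / assistant) with content.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "List all products with limited stocks."},
]

for m in messages:
    print(f"{m['role']}: {m['content']}")
```

LangChain wraps these pairs in classes such as SystemMessage and HumanMessage, but the underlying structure is the same.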
Create an LLM object
Step 1: Install LangChain
First, you'll need to install the LangChain library. You can do this by running the following command in your Python environment:
!pip install langchain
Using LangChain usually requires integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs. You can also go with other models, such as Hugging Face models, Llama 2, etc.
Step 2: Install OpenAI Python Package
Next, install the OpenAI Python package, which will allow you to interact with OpenAI's API:
!pip install openai==0.27.9
Step 3: Obtain OpenAI API Key
To use OpenAI's models, you'll need an API key. Follow these steps to get one:
Create/Open OpenAI account:
If you don't already have an OpenAI account, create one.
Visit OpenAI's website and sign up or log in.
Generate API Key:
Once logged in, navigate to the API section, where you can generate a new API key.
Follow the on-screen instructions to create a key.
Free Credit for New Users:
As a new user, you'll typically receive free credits to get started. This allows you to experiment with the API without incurring an initial cost.
After setting up LangChain and obtaining your OpenAI API key, you can start using LangChain with OpenAI's LLMs.
Let’s import the OpenAI class from the langchain package:
from langchain.llms import OpenAI
import os

# First, set the OPENAI_API_KEY environment variable
os.environ["OPENAI_API_KEY"] = "sk-XXXX"

# Create LLM model
llm = OpenAI()
llm('Python is founded by whom and in which year?')
o/p: '\n\nPython was founded by Guido van Rossum in 1991.'
This is how it works: the input we passed is a single string, and the output we got back is also a string.
OpenAI has two important parameters:
temperature: It controls the randomness and creativity of the model’s output. Its value ranges from 0 to 1.
A low temperature (close to 0) results in more deterministic, predictable, and conservative outputs, suitable for tasks where consistency and accuracy matter more than creativity. Example: a temperature of 0.1 might be used for factual data retrieval or business analytics.
High temperatures lead to more varied, random, and creative outputs. Ideal for tasks requiring innovation, such as creative writing, brainstorming, or generating diverse ideas. Example: A temperature of 0.9 might be used for creative story writing or generating unique ideas.
model_name: The model_name parameter allows you to specify which of OpenAI's LLMs you want to use. Different models have varying capabilities, sizes, and computational requirements.
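To build intuition for what temperature does under the hood, here is a simplified sketch (real decoders sample over vocabularies of tens of thousands of tokens): dividing the model's logits by the temperature before the softmax sharpens or flattens the sampling distribution.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Scale logits by 1/temperature: a low temperature sharpens the
    # distribution (deterministic), a high one flattens it (creative).
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                 # scores for three candidate tokens
print(softmax_with_temperature(logits, 0.1))  # almost all mass on the top token
print(softmax_with_temperature(logits, 1.5))  # mass spread much more evenly
```

With temperature 0.1 the top token is picked almost every time; with 1.5 the lower-scored tokens get a real chance, which is where the "creativity" comes from.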
# Create LLM model (gpt-3.5-turbo is a chat model; the plain OpenAI class
# expects a completion model such as gpt-3.5-turbo-instruct)
llm = OpenAI(temperature=0.5, model_name='gpt-3.5-turbo-instruct')
Similarly, we can create chat models; for that, import ChatOpenAI from langchain.chat_models:
from langchain.chat_models import ChatOpenAI
from langchain.schema.messages import HumanMessage, SystemMessage

# We can pass a list of messages
messages = [
    SystemMessage(content="You're a helpful assistant. If you don't know the answer, reply: Sorry, I don't know the answer!"),
    HumanMessage(content='What is object oriented programming?')
]

# Create LLM chat model
chat_llm = ChatOpenAI()
chat_llm.invoke(messages)
o/p: AIMessage(content='Object-oriented programming (OOP) is a programming…)
In addition to passing a list of HumanMessage and SystemMessage, you can also pass a simple string to the ChatOpenAI model. This is useful for simpler interactions where you don't need the structure of HumanMessage or SystemMessage.
response = chat_llm.invoke("Who is the father of computer?")
print(response)
o/p: AIMessage(content='The father of computer is considered to be Alan Turing, an English mathematician, logician, and computer scientist who laid the foundations for modern computing.')
Conclusion:
LangChain is a key tool that makes advanced AI models like ChatGPT more useful for specific tasks. We've looked at how LangChain helps these models understand and respond better in different situations, by using various settings and approaches. Going forward, we'll explore more about how LangChain works, including its use of templates, chains, and agents, to make AI interactions even smarter and more relevant. Essentially, LangChain is a powerful way to make AI more tailored and effective for our needs.