LangChain Framework

Intermediate

LangChain is a Python/JS framework providing abstractions for building LLM applications: chains, agents, RAG, memory. It offers a unified interface across providers and composable patterns via LCEL (LangChain Expression Language).

Key Facts

  • Unified API for OpenAI, Anthropic, Ollama, Google, and many other providers
  • LCEL pipe syntax (prompt | llm | parser) for composing chains
  • Includes document loaders, text splitters, vector store integrations, memory, and agent toolkits
  • LangSmith companion provides tracing, evaluation, and monitoring
  • For simple prompts, plain Python may suffice - LangChain adds value for complex pipelines

Core Components

Models

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_community.chat_models import ChatOllama

llm = ChatOpenAI(model="gpt-4", temperature=0)
response = llm.invoke("Hello")  # same interface for all providers

Prompts

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant specialized in {domain}"),
    ("human", "{question}")
])

chain = prompt | llm  # LCEL pipe syntax
response = chain.invoke({"domain": "finance", "question": "What is ROI?"})

Chains (LCEL)

from langchain_core.output_parsers import StrOutputParser

chain = prompt | llm | StrOutputParser()
result = chain.invoke({"domain": "legal", "question": "What is a tort?"})
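To make the pipe syntax less opaque, here is a toy illustration of how `|` can compose steps left to right through Python's `__or__` hook. It is a sketch of the idea only, not LangChain's implementation; `Step`, `fake_prompt`, `fake_llm`, and `fake_parser` are hypothetical names invented for this example.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Compose left to right: run self first, then feed the result to `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stages standing in for a prompt, a model, and an output parser.
fake_prompt = Step(lambda inputs: f"Explain {inputs['question']} for {inputs['domain']}")
fake_llm = Step(lambda text: {"content": text.upper()})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_prompt | fake_llm | fake_parser
toy_result = toy_chain.invoke({"domain": "legal", "question": "a tort"})
print(toy_result)
```

Each `|` returns a new composed step, which is why a finished chain exposes the same `invoke` method as its parts.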

Document Loaders

from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader

loader = PyPDFLoader("report.pdf")
docs = loader.load()

Text Splitters

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)
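The interaction between `chunk_size` and `chunk_overlap` can be pictured with a simplified sliding window. This sketch assumes plain fixed-width character windows; the real RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, so its chunks will differ. `split_with_overlap` is a hypothetical helper, not part of the library.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Fixed-size windows where each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(pieces)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbor, at the cost of storing some text twice.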

Vector Stores

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

Memory

from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(return_messages=True)
# Also: ConversationSummaryMemory, ConversationBufferWindowMemory

Advanced LCEL Patterns

RunnablePassthrough

Identity function in LangChain - passes input through unchanged. Used to route data in complex chain compositions:

from langchain_core.runnables import RunnablePassthrough

passthrough = RunnablePassthrough()
passthrough.invoke("hello")  # -> "hello"
passthrough.invoke([1, 2, 3])  # -> [1, 2, 3]

Primary use: carry original input alongside chain output for downstream processing.

Piping Chains Together

Feed the output of one chain into the prompt of another using LCEL pipe syntax:

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4")

# Chain 1: list tools for a profession
tools_prompt = ChatPromptTemplate.from_template(
    "List the 5 most essential tools for a {profession}."
)
tools_chain = tools_prompt | llm | StrOutputParser()

# Chain 2: strategies to master those tools
strategy_prompt = ChatPromptTemplate.from_template(
    "Given these tools: {tools}\nSuggest strategies to master them."
)
strategy_chain = strategy_prompt | llm | StrOutputParser()

# Pipe: output of chain 1 feeds into chain 2
full_chain = tools_chain | (lambda tools: {"tools": tools}) | strategy_chain
result = full_chain.invoke({"profession": "data engineer"})

RunnableParallel

Execute multiple runnables concurrently with the same input:

from langchain_core.runnables import RunnableParallel

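# Two independent chains that receive the same {topic} input
books_chain = ChatPromptTemplate.from_template("Recommend books for {topic}") | llm | StrOutputParser()
projects_chain = ChatPromptTemplate.from_template("Suggest projects for {topic}") | llm | StrOutputParser()

parallel = RunnableParallel(books=books_chain, projects=projects_chain)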
result = parallel.invoke({"topic": "machine learning"})
# result = {"books": "...", "projects": "..."}

# Feed parallel results into a downstream chain
time_prompt = ChatPromptTemplate.from_template(
    "Given books: {books} and projects: {projects}, estimate completion time."
)
full_chain = parallel | time_prompt | llm | StrOutputParser()

RunnableParallel runs its branches concurrently: invoke() dispatches them to a thread pool, while ainvoke() schedules them with asyncio. For chains with multiple independent LLM calls, total latency approaches that of the slowest branch rather than the sum of all branches.
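The latency benefit can be seen with a standard-library sketch. It assumes each branch is a blocking call of about 0.2 seconds; `slow_call` is a hypothetical placeholder for an LLM request, and the thread pool here only stands in for whatever scheduling the library performs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_call(label: str, delay: float = 0.2) -> str:
    """Simulates a blocking LLM request."""
    time.sleep(delay)
    return f"{label} done"


branches = {"books": "books", "projects": "projects"}

# Sequential: total time is roughly the sum of both delays.
start = time.perf_counter()
sequential_results = {name: slow_call(arg) for name, arg in branches.items()}
sequential_seconds = time.perf_counter() - start

# Concurrent: both calls wait at the same time, so total time is roughly one delay.
start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    futures = {name: pool.submit(slow_call, arg) for name, arg in branches.items()}
    parallel_results = {name: future.result() for name, future in futures.items()}
parallel_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, concurrent: {parallel_seconds:.2f}s")
```

The gain only applies when branches are independent; if one branch needs another's output, they have to be piped in sequence.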

MarkdownHeaderTextSplitter

Splits documents by markdown headers, preserving document structure in metadata:

from langchain_text_splitters import MarkdownHeaderTextSplitter

headers_to_split = [
    ("#", "h1"),
    ("##", "h2"),
    ("###", "h3"),
]
splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split)
chunks = splitter.split_text(markdown_text)
# Each chunk retains header hierarchy in metadata:
# chunks[0].metadata = {"h1": "Introduction", "h2": "Background"}

Use when documents have a meaningful header structure (documentation, wiki pages, reports). It produces more semantically coherent chunks than character-count splitting.
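To show how header hierarchy can end up in chunk metadata, here is a simplified pure-Python sketch. It is not the library's implementation: `split_by_headers` is a hypothetical helper that returns plain dicts and ignores edge cases such as `#` lines inside code fences.

```python
def split_by_headers(markdown_text, headers):
    """Toy header splitter returning [{"content": ..., "metadata": {...}}, ...]."""
    names = {marker: name for marker, name in headers}
    chunks = []
    active = {}  # marker -> title for the headers currently in scope
    body = []    # lines collected since the last header

    def flush():
        text = "\n".join(body).strip()
        if text:
            metadata = {names[marker]: title for marker, title in active.items()}
            chunks.append({"content": text, "metadata": metadata})
        body.clear()

    for line in markdown_text.splitlines():
        marker, _, title = line.partition(" ")
        if marker in names and title:
            flush()
            # A new header closes any header at the same or a deeper level.
            for existing in list(active):
                if len(existing) >= len(marker):
                    del active[existing]
            active[marker] = title.strip()
        else:
            body.append(line)
    flush()
    return chunks


doc = "# Introduction\nWelcome.\n## Background\nSome history.\n## Scope\nWhat is covered."
sections = split_by_headers(doc, [("#", "h1"), ("##", "h2"), ("###", "h3")])
for section in sections:
    print(section["metadata"], "->", section["content"])
```

Because every chunk carries its parent headers, a retriever can filter or display results by section without re-parsing the source document.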

Patterns

RAG Chain

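from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

# The document chain fills {context} with the retrieved documents;
# the retrieval chain passes the user query through as {input}
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only the following context:\n\n{context}"),
    ("human", "{input}")
])

chain = create_retrieval_chain(
    retriever,
    create_stuff_documents_chain(llm, rag_prompt)
)
result = chain.invoke({"input": "What is the company revenue?"})
print(result["answer"])
print(result["context"])  # retrieved documents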
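Conceptually, the chain above retrieves a few documents, "stuffs" them into one prompt, and returns the answer alongside the context. The following pure-Python sketch mirrors that flow under strong simplifying assumptions: a naive word-overlap retriever replaces the vector store, and `fake_llm` is a hypothetical stand-in for a model call. All function names are invented for illustration.

```python
import re


def retrieve(query, documents, k=2):
    """Naive retriever: rank documents by how many words they share with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))

    def overlap(doc):
        return len(query_words & set(re.findall(r"\w+", doc.lower())))

    return sorted(documents, key=overlap, reverse=True)[:k]


def stuff_prompt(question, context_docs):
    """Concatenate all retrieved documents into a single prompt."""
    context = "\n\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def toy_rag(question, documents, answer_fn):
    context_docs = retrieve(question, documents)
    answer = answer_fn(stuff_prompt(question, context_docs))
    # Same output keys as the retrieval chain: input, context, answer.
    return {"input": question, "context": context_docs, "answer": answer}


corpus = [
    "Company revenue was 10 million in 2023.",
    "Headquarters moved to Berlin.",
    "Revenue growth came from the company cloud unit.",
]


def fake_llm(prompt_text):
    # Hypothetical model: reports how much text it was given.
    return f"(model saw {len(prompt_text)} characters)"


toy_output = toy_rag("What is the company revenue?", corpus, fake_llm)
print(toy_output["context"])
```

Stuffing is the simplest combination strategy and works while the retrieved text fits in the model's context window.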

Conversational RAG

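from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# The chain reads history from "chat_history"; output_key tells the memory
# which output to store when source documents are returned as well
memory = ConversationBufferMemory(
    memory_key="chat_history", return_messages=True, output_key="answer"
)

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True
)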

LangChain Agents

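from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.tools import tool

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return web_search(query)  # web_search is a placeholder for your own search function

# The agent prompt needs an agent_scratchpad placeholder for intermediate tool calls
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

agent = create_openai_tools_agent(llm, [search_web], agent_prompt)
executor = AgentExecutor(agent=agent, tools=[search_web], verbose=True)
result = executor.invoke({"input": "Latest news about AI agents"})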

LangSmith Monitoring

Observability platform for LLM applications:

  • Tracing: full trace of chain/agent execution (inputs, outputs, latency per step)
  • Evaluation: run test datasets, measure quality
  • Monitoring: production metrics, error rates, token usage
  • Datasets: manage test/evaluation datasets

import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "lsv2_..."
# With these set, LangChain operations are traced automatically

When an agent fails in production, the LangSmith trace shows each model call and tool call with its inputs and outputs, along with the step where the failure occurred.

Gotchas

  • LangChain adds abstraction overhead - for simple prompt+response, use the provider SDK directly
  • LCEL pipe syntax is concise but can be hard to debug for complex chains
  • Version compatibility: LangChain evolves rapidly, and breaking changes between versions are common
  • Memory implementations have limitations - ConversationBufferMemory grows unbounded
  • verbose=True on AgentExecutor is essential for debugging but noisy in production
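The unbounded-memory gotcha is usually handled by keeping only recent turns, which is the idea behind ConversationBufferWindowMemory. The sketch below shows that idea with a standard-library `deque`; `WindowedHistory` is a hypothetical class for illustration, not the library's implementation.

```python
from collections import deque


class WindowedHistory:
    """Keeps only the most recent `max_turns` exchanges."""

    def __init__(self, max_turns: int):
        # One turn is a human message plus an AI message, hence 2 * max_turns entries.
        self.messages = deque(maxlen=2 * max_turns)

    def add_turn(self, human: str, ai: str) -> None:
        # Appending past maxlen silently drops the oldest entries.
        self.messages.append(("human", human))
        self.messages.append(("ai", ai))

    def as_list(self):
        return list(self.messages)


history = WindowedHistory(max_turns=2)
for i in range(5):
    history.add_turn(f"question {i}", f"answer {i}")

print(history.as_list())
```

A fixed window caps prompt size and cost, but anything older than the window is forgotten; summarizing older turns is the usual alternative when long-range context matters.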

See Also