# RAG with ChromaDB and LangChain Community

Use the Qualcomm AI Inference Suite with [ChromaDB](https://www.trychroma.com/) and [LangChain Community](https://reference.langchain.com/python/langchain-community/langchain_community) for retrieval-augmented generation (RAG). RAG is an information retrieval technique that supplements the prompt sent to a large language model (LLM) with passages retrieved from a specified set of documents, so that the model answers from those documents in preference to its static training data. This approach lets LLMs use domain-specific and up-to-date information instead of relying only on the model’s training data.

## Prerequisites

ChromaDB is an open-source vector database, and LangChain Community contains third-party integrations for building LLM-powered applications. Install both with pip:

    pip install chromadb langchain-community

## Code sample overview

The examples follow this sequence to set up the RAG system:

1. Import and configure: Import the necessary modules and configure the paths to the book directory and vector store database.
2. Define RAG core functions: Define functions to create the documents, split them into chunks, and store them in a vector database.
3. Ingest the documents: Generate embeddings for document chunks and move them to the vector store.
4. Query the documents: Display the relevant documents in response to a user’s question.

The examples on this page assume that there is a directory called `books` containing documents in plain text format. To follow along, create your own `books` directory and add one or more plain `.txt` files. If you use different files from the ones in the examples, your retrieved documents and responses will likely differ from those shown.
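If you do not have any text files at hand, a minimal sketch like the following creates a `books` directory with one plain-text file. The helper name `make_sample_books`, the file name `langchain_overview.txt`, and the sample text are illustrative assumptions and are not part of the original examples.

```python
from pathlib import Path


def make_sample_books(books_dir: str) -> list[str]:
    """Create books_dir (if needed) and write one small sample .txt file into it."""
    directory = Path(books_dir)
    directory.mkdir(parents=True, exist_ok=True)

    # Hypothetical sample content; replace it with your own documents.
    sample_text = (
        "LangChain is a framework for building LLM-powered applications.\n"
        "ChromaDB is an open-source vector database.\n"
    )
    (directory / "langchain_overview.txt").write_text(sample_text, encoding="utf-8")

    # Return the names of the .txt files that the later examples would pick up.
    return sorted(p.name for p in directory.iterdir() if p.suffix == ".txt")
```

Calling `make_sample_books("/path/to/my/books")` with the same path you assign to `books_dir` later on should give the ingestion step something to work with.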

## Import and configure models

Import modules for text processing and configure the paths to the book directory and vector store database.

    import os
    
    from langchain.text_splitter import CharacterTextSplitter
    from langchain_community.document_loaders import TextLoader
    from langchain_community.vectorstores import Chroma
    from langchain_core.messages import HumanMessage
    
    from imagine.langchain import ImagineChat, ImagineEmbeddings
    
    # Full path to our books directory
    books_dir = "/path/to/my/books"
    
    # Name of the vector store, and full path to the directory that will contain it.
    # The store itself is created at os.path.join(db_dir, store_name).
    store_name = "my_vector_db"
    db_dir = "/path/to/db"

## Define RAG core functions

There are three core functions to perform RAG:

- `create_documents`: Creates the documents from the directory containing the text files.
- `create_vector_store`: Creates the vector store from the documents.
- `query_vector_store`: Queries the vector store and retrieves the relevant documents.

The following examples declare each of the core functions.

1. Declare `create_documents` to create documents from the directory containing the text files.

    # Create documents from all the files in the directory
    def create_documents(books_dir):
        if not os.path.exists(books_dir):
            raise FileNotFoundError(f"The directory {books_dir} does not exist. Please check the path.")
    
        book_files = [f for f in os.listdir(books_dir) if f.endswith(".txt")]
        
        documents = []
        for book_file in book_files:
            file_path = os.path.join(books_dir, book_file)
            loader = TextLoader(file_path)
            book_docs = loader.load()
            for doc in book_docs:
                doc.metadata = {"source": book_file}
                documents.append(doc)
    
        text_splitter = CharacterTextSplitter(chunk_size=1024, chunk_overlap=0, separator='\n')
        docs = text_splitter.split_documents(documents)
        return docs

2. Declare `create_vector_store` to create the vector store and populate it with the document embeddings.

    # Create the vector store and populate it with the document embeddings
    def create_vector_store(docs, embeddings, store_name):
        persistent_directory = os.path.join(db_dir, store_name)
        print(f"Persistent directory: {persistent_directory}")
        if not os.path.exists(persistent_directory):
            print(f"\n--- Creating vector store {store_name} ---")
            Chroma.from_documents(docs, embeddings, persist_directory=persistent_directory)
            print(f"--- Finished creating vector store {store_name} ---")
        else:
            print(
                f"Vector store {store_name} already exists. No need to initialize.")

3. Declare `query_vector_store` to query the vector store and retrieve the relevant documents.

    # Query the vector store given the store name, query, and embedding function
    def query_vector_store(store_name, query, embedding_function, k = 2, threshold = 0.1):
        persistent_directory = os.path.join(db_dir, store_name)
        if os.path.exists(persistent_directory):
            db = Chroma(persist_directory=persistent_directory, embedding_function=embedding_function)
            
            retriever = db.as_retriever(
                search_type="similarity_score_threshold",
                search_kwargs={"k": k, "score_threshold": threshold},
            )
            
            relevant_docs = retriever.invoke(query)
            return relevant_docs
        else:
            print(f"Vector store {store_name} does not exist.")
            # Return an empty list so that callers can still iterate over the result
            return []
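The `similarity_score_threshold` search used in `query_vector_store` returns at most `k` documents whose relevance score meets the threshold. As a rough illustration of that idea only, the following sketch ranks toy vectors by cosine similarity and applies the same two limits. The function names and example vectors are assumptions made for this sketch, and the way Chroma computes and normalizes scores may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_above_threshold(query_vec, doc_vecs, k=2, threshold=0.1):
    """Return up to k (doc_id, score) pairs with score >= threshold, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Toy 2-dimensional "embeddings", made up for illustration.
doc_vecs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
results = top_k_above_threshold([1.0, 0.0], doc_vecs, k=2, threshold=0.1)
```

Here document `c` is orthogonal to the query, so it falls below the threshold and is dropped even though `k` would otherwise allow it. Raising the threshold makes retrieval stricter, and raising `k` only helps when enough documents clear the threshold.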

## Ingest the documents

Do the following to ingest the documents:

1. Break each document into chunks and print the number of chunks.

    # Parse documents and generate chunks
    docs = create_documents(books_dir)
    print(f"Number of document chunks: {len(docs)}")

    Number of document chunks: 121

2. Generate embeddings and create a vector store.

    # Generate embeddings and persist in the vector store
    embeddings_fn = ImagineEmbeddings()
    create_vector_store(docs, embeddings_fn, store_name)

    Persistent directory: /home/heyia/code/imagine-sdk-python/examples/langchain/rag/db/my_vector_db
    
    --- Creating vector store my_vector_db ---
    --- Finished creating vector store my_vector_db ---
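To give a feel for how splitting on newlines with a `chunk_size` limit turns documents into chunks, here is a simplified sketch that uses only the standard library. It is not LangChain's implementation: `split_into_chunks` is a hypothetical helper that greedily merges lines until the next one would exceed the limit, with no overlap between chunks.

```python
def split_into_chunks(text: str, chunk_size: int = 1024, separator: str = "\n") -> list[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty lines
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            # The next piece does not fit, so close the current chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "alpha\nbeta\ngamma\ndelta"
chunks = split_into_chunks(sample, chunk_size=11)
```

With a limit of 11 characters, the four lines in `sample` end up in two chunks of two lines each. The number of chunks reported during ingestion therefore depends on both the length of your files and the chosen `chunk_size`.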

## Query relevant documentation

Do the following to query the vector database for relevant documentation:

1. Define the user’s question (`query`) and retrieve the `relevant_docs` from the vector database.

    query = "How can I learn more about LangChain?"
    relevant_docs = query_vector_store(store_name, query, embeddings_fn, k = 3)

2. Display the content of the relevant documents, then combine the query and the documents into a single prompt (`combined_input`) for the LLM.

    print("\n--- Relevant Documents ---")
    for i, doc in enumerate(relevant_docs, 1):
        print(f"Document {i}:\n{doc.page_content}\n")
    
    combined_input = (
        "Here are some documents that might help answer the question: "
        + query
        + "\n\nRelevant Documents:\n"
        + "\n\n".join([doc.page_content for doc in relevant_docs])
        + "\n\nPlease provide an answer based only on the provided documents. If the answer is not found in the documents, respond with 'I'm not sure'."
    )

    --- Relevant Documents ---
    Document 1:
    LangChain: A Framework for LLM-Powered Applications
    LangChain is a powerful and flexible framework designed to simplify the development of applications that harness the capabilities of large language models (LLMs). It provides a wide range of tools, abstractions, and integrations that help developers build, customize, and optimize applications that leverage LLMs for tasks like text generation, question answering, summarization, chatbots, and more.
    Key Features and Benefits
    Modular Components: LangChain offers a variety of modular components (chains, agents, tools, prompts, memory, etc.) that can be easily combined and customized to build complex LLM-powered workflows.
    Data Integration: It seamlessly integrates with various data sources, enabling applications to access and process external information, enhancing the context and relevance of LLM responses.
    
    Document 2:
    Agent Frameworks: LangChain provides agent frameworks that allow LLMs to interact with their environment, make decisions, and take actions based on user input or specific goals.
    Memory Management: It includes memory components that enable applications to maintain context and track conversations, leading to more coherent and personalized interactions.
    Prompt Engineering: LangChain facilitates prompt engineering, the process of crafting effective prompts to elicit desired responses from LLMs, by offering templates and tools for experimentation.
    Chain Optimization: It provides mechanisms to evaluate and optimize chain performance, ensuring that applications deliver the best possible results.
    Use Cases
    LangChain empowers developers to create a wide array of applications, including:
    Chatbots and Conversational Agents: Build intelligent chatbots capable of understanding natural language and providing informative responses.
    
    Document 3:
    Don't Forget to Like and Subscribe!
    If you're looking for in-depth tutorials and insights into LangChain, CrewAI, and other AI technologies, be sure to check out the fantastic YouTube channel by Brandon Hancock:
    YouTube Channel: https://www.youtube.com/@bhancock_ai
    Don't forget to like and subscribe to his channel!!

3. Send the query and the relevant documents to the LLM to generate a response.

    # Send the combined prompt to the LLM to generate an answer
    model = ImagineChat(model="Llama-3-8B")
    messages = [HumanMessage(content=combined_input)]
    result = model.invoke(messages, max_tokens=2048, repetition_penalty=1.1, temperature=0.1, top_k=50, top_p=0.95)

4. Display the LLM’s generated response.

    print("\n--- Generated Response ---")
    print(result.content)

    --- Generated Response ---
    Based on the provided documents, here's what I've learned about LangChain:
    
    * LangChain is a framework designed to simplify the development of applications that use large language models (LLMs).
    * It provides a range of tools, abstractions, and integrations to help developers build, customize, and optimize LLM-powered applications.
    * Key features include modular components, data integration, agent frameworks, memory management, prompt engineering, and chain optimization.
    * Use cases for LangChain include building chatbots and conversational agents, as well as other applications such as text generation, question answering, and summarization.
    
    As for learning more about LangChain, the document mentions a YouTube channel by Brandon Hancock (@bhancock_ai) that provides in-depth tutorials and insights into LangChain and other AI technologies.
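Because a score threshold can filter out every document, it may be worth guarding against an empty result before calling the LLM. The following sketch wraps the prompt assembly shown above in a hypothetical `build_rag_prompt` helper that works on plain strings. The helper name and the early return are assumptions made for illustration and are not part of the SDK.

```python
from typing import Optional


def build_rag_prompt(query: str, passages: list[str]) -> Optional[str]:
    """Combine the question and retrieved passages into one prompt.

    Returns None when no passages were retrieved, so the caller can skip the LLM call.
    """
    if not passages:
        return None
    return (
        "Here are some documents that might help answer the question: "
        + query
        + "\n\nRelevant Documents:\n"
        + "\n\n".join(passages)
        + "\n\nPlease provide an answer based only on the provided documents. "
        + "If the answer is not found in the documents, respond with 'I'm not sure'."
    )


# Made-up question and passage, for illustration only.
prompt = build_rag_prompt("What is ChromaDB?", ["ChromaDB is an open-source vector database."])
```

With the objects from the examples, this could be called as `build_rag_prompt(query, [doc.page_content for doc in relevant_docs or []])`, and the LLM call skipped whenever the result is `None`.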

## Next steps

- Review [the basics of using LangChain with the Qualcomm AI Inference Suite SDK](https://docs.qualcomm.com/doc/80-88545-1/topic/3_0_langchain.html).

Last Published: Apr 17, 2026
