LangChain Simple RAG Project

This project demonstrates a basic Retrieval-Augmented Generation (RAG) pipeline. It loads documents from text files, web pages, and PDF files, indexes them in a vector store, and answers user queries using the retrieved chunks as context.

View Live Site here

The implementation can be found in rag/simplerag.ipynb and follows the steps outlined below:

Document Loading

Text Files: Loaded using TextLoader from langchain_community.document_loaders, with an example file located at rag/robotics.txt.
Web Content: Loaded from Wikipedia using WebBaseLoader.
PDF Files: Loaded using PyPDFLoader from the same library, with the PDF located at rag/robotics.pdf.
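
LangChain loaders return document objects that pair the extracted text with metadata such as the source path. The sketch below mimics that shape for a plain text file using only the standard library. `SimpleDocument` and `load_text_file` are hypothetical names for illustration, not LangChain APIs and not the notebook's code.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a loaded document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_file(path: str, encoding: str = "utf-8") -> list[SimpleDocument]:
    """Read one text file and wrap it in a single-document list."""
    text = Path(path).read_text(encoding=encoding)
    return [SimpleDocument(page_content=text, metadata={"source": str(path)})]
```

Calling `load_text_file("rag/robotics.txt")` would give a one-element list whose `metadata["source"]` records where the text came from, which is useful later when showing which file an answer was drawn from.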

Text Splitting

Used RecursiveCharacterTextSplitter from langchain.text_splitter to split documents into 1,000-character chunks with 200-character overlap.
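
To show what the chunk size and overlap settings mean, here is a simplified sketch of fixed-size chunking with overlap. It is not how RecursiveCharacterTextSplitter works internally (that splitter also tries to break on separators such as paragraphs and sentences); `split_with_overlap` is a hypothetical helper that only illustrates the sliding window.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Cut text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the values used in this project, each chunk starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. The overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.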

Embedding with HuggingFace

Tokenized and embedded document chunks using the sentence-transformers/all-MiniLM-L6-v2 model from HuggingFace.

Vector Store with ChromaDB

Stored the chunk embeddings in a ChromaDB vector store so that chunks similar to a query can be retrieved.
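
The idea behind a vector store can be sketched in a few lines: keep each chunk next to its embedding, then rank the chunks by how close their embeddings are to the query embedding. The toy below uses cosine similarity and hand-written vectors, both of which are assumptions for illustration. `ToyVectorStore` is hypothetical and does not reflect ChromaDB's index or its distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, text) pairs and ranks them by similarity."""

    def __init__(self) -> None:
        self._items: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        """Store one chunk together with its embedding."""
        self._items.append((vector, text))

    def search(self, query_vector: list[float], k: int = 2) -> list[str]:
        """Return the texts of the k stored vectors most similar to the query."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]
```

In the real pipeline the vectors come from the embedding model rather than being written by hand, and the store handles persistence and indexing.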

Document Chain with Prompt Template

Created a document chain with create_stuff_documents_chain from langchain.chains.combine_documents and the ChatGroq model. The chain uses a custom prompt that takes the user query as input and the relevant document chunks as context.
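
"Stuffing" means placing all retrieved chunks into a single prompt. A minimal sketch of that step, without any LLM call, might look like the following. The template wording and the `stuff_documents` helper are hypothetical and are not the notebook's prompt; the placeholder names `context` and `input` follow the defaults that the LangChain helpers expect.

```python
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "<context>\n{context}\n</context>\n"
    "Question: {input}"
)


def stuff_documents(chunks: list[str], question: str, template: str = PROMPT_TEMPLATE) -> str:
    """Join all chunks into one context string and fill the prompt template."""
    context = "\n\n".join(chunks)  # blank line between chunks keeps them visually separate
    return template.format(context=context, input=question)
```

Because every chunk goes into one prompt, this approach only works while the combined chunks fit in the model's context window, which is one reason the documents are split into bounded chunks first.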

Retrieval Chain with ChromaDB

Built a retriever using db.as_retriever() and constructed a retrieval chain via create_retrieval_chain from langchain.chains.retrieval.
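
Conceptually, a retrieval chain is a two-step composition: fetch the chunks relevant to the question, then pass the question and those chunks to the answering step. The sketch below shows that composition with stand-ins so it runs without a vector store or an LLM. `make_retrieval_chain`, `keyword_retriever`, and `echo_answer` are hypothetical names, and the keyword match is only a placeholder for embedding-based retrieval.

```python
from typing import Callable


def make_retrieval_chain(
    retrieve: Callable[[str], list[str]],
    answer: Callable[[str, list[str]], str],
) -> Callable[[str], dict]:
    """Compose a retriever and an answering step into one callable."""

    def chain(question: str) -> dict:
        context = retrieve(question)       # step 1: fetch relevant chunks
        reply = answer(question, context)  # step 2: generate from question + chunks
        return {"input": question, "context": context, "answer": reply}

    return chain


# Hypothetical stand-ins for the vector store retriever and the chat model.
CHUNKS = ["Robots use sensors to perceive.", "Bread needs yeast to rise."]


def keyword_retriever(question: str) -> list[str]:
    """Return chunks that share at least one lowercase word with the question."""
    words = set(question.lower().replace("?", "").split())
    return [c for c in CHUNKS if words & set(c.lower().replace(".", "").split())]


def echo_answer(question: str, context: list[str]) -> str:
    """Fake model: reports how many chunks it was given."""
    return f"Answered from {len(context)} chunk(s)."
```

The returned dictionary mirrors the `input`, `context`, and `answer` keys that a chain built with create_retrieval_chain produces, so the retrieved chunks can be inspected alongside the answer.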

Setup

Create virtual environment: python -m venv venv
Activate virtual environment: call venv\Scripts\activate.bat in cmd (Windows), or source venv/bin/activate on Linux/macOS
Install dependencies: pip install -r requirements.txt
Create the environment variables LANGCHAIN_API_KEY and GROQ_API_KEY. You can get your LangChain API key from here, and your Groq API key from here.
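
Before running the notebook, it can help to confirm that both keys are visible to Python. The following is a small standard-library sketch; `missing_keys` is a hypothetical helper and is not part of the project.

```python
import os

REQUIRED_KEYS = ("LANGCHAIN_API_KEY", "GROQ_API_KEY")


def missing_keys(env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_keys()` with no arguments checks the current process environment; an empty list means both keys are set.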

Libraries

Langchain
Langchain Groq
Streamlit
Python-Dotenv
PyPDF
bs4
chromadb
transformers
torch

Contact

LinkedIn: Natan Asrat
Gmail: nathanyilmaasrat@gmail.com
Telegram: Natan Asrat
Youtube: Natville
