RAGPipe
v0.1.0
Retrieval-Augmented Generation

How RAG Works

A technical walkthrough. Raw documents flow through extraction, chunking, embedding, storage, and retrieval — in 3 lines of code.

$ pip install ragpipe-ai
2+1 · Functions + Pipeline
0 · Config Needed
0.71s · For 7K Chunks
SEC.02 — Pipeline Flow

The Architecture

Five stages stacked vertically. Data flows top → bottom through extraction, chunking, embedding, storage, and retrieval.

01 · Extract

Read raw text from files, git repos, or web pages. 3 source types.

.py .md .html .txt
FileSource("./docs") · GitSource("https://...") · WebSource("https://...")
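The file side of extraction can be sketched as a directory walker. This is a minimal illustration, not RAGPipe's actual FileSource implementation; the `read_files` helper and its extension list are assumptions:

```python
from pathlib import Path

# Extensions the extraction stage accepts (per the badges above).
TEXT_EXTENSIONS = {".py", ".md", ".html", ".txt"}

def read_files(root: str) -> dict[str, str]:
    """Walk a directory tree and map each matching file path
    to its raw text content."""
    docs = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in TEXT_EXTENSIONS:
            docs[str(path)] = path.read_text(encoding="utf-8", errors="ignore")
    return docs
```

Git and web sources would plug in the same way: anything that yields (path, raw text) pairs can feed the chunker.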
02 · Chunk

Split into 512-char pieces with 64-char overlap. 3 chunking strategies.

RecursiveChunker(chunk_size=512, chunk_overlap=64)
1 document → 3 overlapping chunks
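The split can be sketched as a plain sliding character window. `chunk_text` is a hypothetical stand-in, not RAGPipe's RecursiveChunker (which, going by the name, likely splits on separators recursively rather than at fixed offsets):

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> list[str]:
    """Split text into fixed-size windows, each overlapping the
    previous one by `chunk_overlap` characters so context at the
    boundaries is never lost."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

With the defaults, a ~1 KB document yields exactly the three overlapping chunks shown above.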
03 · Embed

Convert text → float vectors. Dimensions vary by backend: 768 (Ollama), 1536 (OpenAI), 384 (sentence-transformers).

AutoEmbed() # auto-detects Ollama → OpenAI → sentence-transformers
[0.12, -0.34, 0.87, 0.05, -0.62, ...] — each chunk becomes a vector
04 · Store

Persist vectors + text + metadata. 3 sink backends.

Qdrant Pinecone JSON
JSONSink("./index.json") · QdrantSink("collection") · PineconeSink("index")
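For the JSON backend, persisting can be as simple as dumping records that bundle vector, text, and metadata. A minimal sketch of what a JSONSink-style store might hold; the record layout and function names are assumptions, not RAGPipe's on-disk format:

```python
import json
from pathlib import Path

def save_index(path: str, records: list[dict]) -> None:
    """Write chunk records to disk. Each record carries the embedding
    vector, the original chunk text, and metadata such as the source file."""
    Path(path).write_text(json.dumps(records, indent=2))

def load_index(path: str) -> list[dict]:
    """Read the records back for querying."""
    return json.loads(Path(path).read_text())
```

A JSON file is fine for small corpora; Qdrant or Pinecone take over when the index outgrows a linear scan.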
05 · Query

Embed the query, compute cosine similarity against stored vectors, return top-K matches.

ragpipe.query("How does auth work?", top_k=5)
Top-5 results: 0.89 · 0.85 · 0.82 · 0.78 · 0.74
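Under the hood, the query step reduces to a nearest-neighbour scan by cosine similarity. A from-scratch sketch of that idea; `cosine` and `top_k` are illustrative helpers, not RAGPipe's API:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """cos(A,B) = A · B / (|A| × |B|)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], index: list[tuple[list[float], str]], k: int = 5):
    """index: list of (vector, chunk_text) pairs.
    Returns the k best matches, highest similarity first."""
    scored = [(cosine(query_vec, vec), text) for vec, text in index]
    return sorted(scored, reverse=True)[:k]
```

Real backends replace the linear scan with an ANN index, but the ranking criterion is the same.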
AutoEmbed Fallback Chain
1. Ollama (local, 768-dim) → 2. OpenAI (cloud, 1536-dim) → 3. sentence-transformers (local, 384-dim)

Zero configuration. Tries each backend in order and uses the first available. Ollama and sentence-transformers work locally with no API key.
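The chain is a simple try-in-order pattern. A conceptual sketch only; the probe callables stand in for whatever availability checks AutoEmbed actually performs:

```python
def first_available(backends):
    """backends: list of (name, probe, dim) tuples, in priority order.
    Return the first backend whose probe succeeds, treating any
    exception (e.g. a connection error) as 'not available'."""
    for name, probe, dim in backends:
        try:
            if probe():
                return name, dim
        except Exception:
            continue
    raise RuntimeError("no embedding backend available")
```

Because the last link is a local model, the chain can always terminate without an API key.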

SEC.03 — Vector Retrieval

Embedding Space

Each document chunk becomes a point in high-dimensional vector space. Chunks about similar topics cluster together geometrically. When you query, RAGPipe finds the nearest points by cosine similarity.

Algorithm: Cosine Similarity
cos(A,B) = A · B / (|A| × |B|)
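The formula is easy to verify by hand with small 2-d vectors (illustrative numbers only):

```python
import math

A = [3.0, 4.0]   # |A| = 5
B = [4.0, 3.0]   # |B| = 5
dot = sum(a * b for a, b in zip(A, B))            # 3·4 + 4·3 = 24
cos_ab = dot / (math.hypot(*A) * math.hypot(*B))  # 24 / (5 × 5) = 0.96
```

Identical directions score 1.0, orthogonal ones 0.0; these two nearly-aligned vectors land at 0.96.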
Query Time: 0.20s (keyword search, top-5)
Index Time: 0.71s (2,267 files, 7,388 chunks)
[Figure: t-SNE projection of the embedding space. Code (.py) chunks from src/, docs (.md) chunks from docs/, and config (*.yaml) chunks cluster separately; the query "auth flow?" lands nearest its top matches (SIM 0.89, 0.85, 0.82).]
SEC.04 — Implementation

The Comparison

Same result. Different complexity.

✗ LangChain
40 lines · 5+ pkgs

from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

loader = WebBaseLoader(web_paths=(...))
docs = loader.load()

splits = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

vector_store = InMemoryVectorStore(OpenAIEmbeddings())
vector_store.add_documents(splits)

# ... then wire up retriever + LLM chain
# ... 20+ more lines to query

13× less code
✓ RAGPipe
3 lines · 1 pkg

import ragpipe

# Ingest anything
ragpipe.ingest("./docs")

# Query your data
results = ragpipe.query("What is the refund policy?")

# Or via CLI
$ ragpipe ingest ./docs
$ ragpipe query "refund policy?"

Feature             RAGPipe    LangChain    LlamaIndex
Basic RAG           3 lines    40 lines     5 lines
Packages            1          5+           2-3
CLI                 ✓          separate     separate
YAML pipelines      ✓          ✗            ✗
Git hooks           ✓          ✗            ✗
Zero-config embed   ✓          ✗            partial
REST API server     ✓          separate     ✗
Document loaders    3          160+         300+
SEC.05 — Execute

Ready to Pipeline?

One install. Zero config. Any data source.

> pip install ragpipe-ai