LLMs · AI Agents · RAG · 2026-05-28

RAG Chunking: Sliding Window Strategy

Sliding window chunking ignores paragraph and sentence boundaries entirely. Instead it moves a fixed-size window forward by a configurable stride — creating dense, overlapping chunks that preserve context across every split.

This is Part 16 of the AI Agents series. Parts 13–15 covered fixed-size, sentence-based, and recursive character splitting. This post covers the fourth main chunking strategy: sliding window.


1. How it differs from previous strategies

Every strategy so far tried to find a good place to split:

  • Fixed-size: split every N characters
  • Sentence-based: split at sentence boundaries
  • Recursive: split at paragraph → sentence → word → character, in that priority order

Sliding window takes a different approach entirely. It doesn’t look for split points. Instead, it moves a fixed-size window forward by a fixed number of words or characters (the stride), creating chunks that overlap heavily with their neighbors.

Two parameters control everything:

  • Window size — how many words (or characters) each chunk contains
  • Stride — how many words (or characters) to advance before starting the next chunk

2. The mechanics

With window_size=5, stride=3 over the text:

RAG enhances LLMs by retrieving external data. This process reduces hallucinations.
| Chunk | Words |
|---|---|
| 1 | RAG enhances LLMs by retrieving |
| 2 | by retrieving external data This |
| 3 | data This process reduces hallucinations |

The window starts at word 1, takes 5 words, then moves forward 3. Because stride (3) < window size (5), the last 2 words of each chunk appear again at the start of the next one. That repetition is the overlap — and it’s what makes the difference for context preservation.

If stride equals window size, there’s zero overlap — identical to fixed-size word chunking. If stride is 1, every chunk overlaps with the next by window_size - 1 words — maximum density, maximum redundancy.
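
The overlap arithmetic above can be checked in a few lines of Python. This is a minimal sketch rather than the implementation from the next section: the helper names `windows` and `shared_words` are made up for illustration, and for simplicity only full-size windows are produced.

```python
def windows(words: list[str], window_size: int, stride: int) -> list[list[str]]:
    """Return every full window of `window_size` words, advancing by `stride`.

    Hypothetical helper for illustration; a trailing partial window is ignored.
    """
    last_start = len(words) - window_size
    return [words[i:i + window_size] for i in range(0, last_start + 1, stride)]


def shared_words(prev: list[str], nxt: list[str]) -> int:
    """Count how many words at the end of `prev` repeat at the start of `nxt`."""
    for k in range(min(len(prev), len(nxt)), 0, -1):
        if prev[-k:] == nxt[:k]:
            return k
    return 0


words = (
    "RAG enhances LLMs by retrieving external data. "
    "This process reduces hallucinations."
).split()

# stride == window size, stride < window size, and stride == 1
for stride in (5, 3, 1):
    chunks = windows(words, window_size=5, stride=stride)
    overlaps = {shared_words(a, b) for a, b in zip(chunks, chunks[1:])}
    print(f"stride={stride}: {len(chunks)} chunks, overlap={overlaps}")
```

For this 11-word text the three strides give 2, 3, and 7 chunks, with 0, 2, and 4 shared words between neighbors respectively, which matches window_size - stride in each case.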


3. Python implementation

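def sliding_window_chunks(text: str, window_size: int, stride: int) -> list[str]:
    """Split text into overlapping word chunks of at most window_size words."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        # Take up to window_size words starting at the current position.
        chunk = words[start:start + window_size]
        chunks.append(" ".join(chunk))
        # Stop once the window has reached the end of the text; otherwise the
        # loop would emit extra fragments already contained in this chunk.
        if start + window_size >= len(words):
            break
        start += stride
    return chunks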


# Example from the video
text = "The quick brown fox jumps over the lazy dog"

chunks = sliding_window_chunks(text, window_size=8, stride=4)
for i, chunk in enumerate(chunks):
    print(f"[{i}] {chunk}")

Output:

[0] The quick brown fox jumps over the lazy
[1] jumps over the lazy dog

Chunk 1 starts at word 5 (stride=4 moves past the first 4 words). The last chunk takes whatever words remain — it won’t be a full window if the text runs out.


4. Stride controls the overlap

The relationship between stride and overlap is direct:

$$\text{overlap (words)} = \text{window_size} - \text{stride}$$

| window_size | stride | overlap |
|---|---|---|
| 8 | 8 | 0 (no overlap) |
| 8 | 6 | 2 words |
| 8 | 4 | 4 words (50%) |
| 8 | 2 | 6 words (75%) |
| 8 | 1 | 7 words (maximum density) |

A stride of 1 preserves the most context but also produces the most chunks: for a 1000-word document with window size 8 and stride 1, you get 993 chunks (1000 - 8 + 1), nearly one per word. That’s expensive to index and query.

Rule of thumb: start with stride at roughly 50–60% of window size. Adjust based on retrieval quality and index size tradeoffs.
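
To make the index-size side of that tradeoff concrete, the chunk count can be estimated before chunking anything. The sketch below assumes the window stops as soon as it covers the last word, so no trailing fragments are emitted that the previous chunk already contains. `chunk_count` is an illustrative helper name, not part of any library.

```python
import math


def chunk_count(n_words: int, window_size: int, stride: int) -> int:
    """Estimate how many chunks a sliding window yields.

    Assumes the window stops as soon as it covers the last word, so the
    final chunk may be shorter than `window_size` but is never a pure
    repeat of the previous one. Illustrative helper, not a library call.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if n_words == 0:
        return 0
    if n_words <= window_size:
        return 1
    # One chunk at position 0, plus one per stride needed to reach the end.
    return math.ceil((n_words - window_size) / stride) + 1


# Sweep the strides from the table above for a 1000-word document.
for stride in (8, 6, 4, 2, 1):
    overlap = 8 - stride
    print(f"stride={stride} overlap={overlap} chunks={chunk_count(1000, 8, stride)}")
```

Halving the stride roughly doubles the number of chunks: 125 at stride 8, 249 at stride 4, 497 at stride 2, and 993 at stride 1.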


5. Character-based vs word-based

The implementation above uses words. For embedding models with token/character limits, you may want character-based sliding:

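def sliding_window_char_chunks(text: str, window_size: int, stride: int) -> list[str]:
    """Split text into overlapping chunks of at most window_size characters."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + window_size])
        # Stop once the window reaches the end to avoid redundant tail fragments.
        if start + window_size >= len(text):
            break
        start += stride
    return chunks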


chunks = sliding_window_char_chunks(
    "The quick brown fox jumps over the lazy dog",
    window_size=20,
    stride=10
)

for i, chunk in enumerate(chunks):
    print(f"[{i}] '{chunk}'")

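Character-based gives precise control over chunk length in characters, which helps when you’re working close to an embedding model’s input limit. Keep in mind that those limits are usually counted in tokens, not characters or bytes, so leave some headroom, and that a character window can cut a word in half. Word-based is more readable and easier to reason about.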
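
When the limit that matters is measured in tokens, the same window logic can run over a list of token IDs instead of words or characters. The sketch below is generic over any sequence. The token IDs are made-up integers standing in for the output of whichever tokenizer your embedding model uses (an assumption, since this post does not name one), and `sliding_window_seq` is a hypothetical helper.

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")


def sliding_window_seq(
    items: Sequence[T], window_size: int, stride: int
) -> list[Sequence[T]]:
    """Slide a window over any sequence (token IDs, words, characters)."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    chunks = []
    start = 0
    while start < len(items):
        chunks.append(items[start:start + window_size])
        if start + window_size >= len(items):
            break  # this window already reaches the end
        start += stride
    return chunks


# Placeholder token IDs; in practice these would come from your model's tokenizer.
token_ids = [101, 7, 42, 42, 9, 300, 15, 8, 77, 102]

token_chunks = sliding_window_seq(token_ids, window_size=4, stride=2)
for chunk in token_chunks:
    print(chunk)
```

Every chunk holds at most `window_size` tokens, so the count can be checked directly against the model's limit. Each chunk of IDs would then be decoded back to text before it is stored.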


6. Integrating with ChromaDB

import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(name="sliding_window")

document = """
Nerchuko was founded in 2024 to make data science education accessible.
The platform offers structured learning paths in Python, SQL, statistics, and machine learning.
Learners can track their progress and take assessments after each module.
The AI Agents series is one of the flagship courses on the platform.
"""

chunks = sliding_window_chunks(document.strip(), window_size=15, stride=8)

collection.upsert(
    documents=chunks,
    ids=[f"sw_{i}" for i in range(len(chunks))]
)

results = collection.query(
    query_texts=["What courses does Nerchuko offer?"],
    n_results=3
)

for doc, dist in zip(results["documents"][0], results["distances"][0]):
    print(f"[{dist:.4f}] {doc}")
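
Because neighboring chunks share text, the top results for a query often overlap each other. One way to prepare for that is to record where each chunk starts and ends. The sketch below is an assumption layered on top of the approach in this post: `sliding_window_with_offsets` is a hypothetical variant that returns word offsets alongside each chunk, in a shape that could be passed to the `metadatas` argument of `collection.upsert`.

```python
def sliding_window_with_offsets(
    text: str, window_size: int, stride: int
) -> tuple[list[str], list[dict[str, int]]]:
    """Return chunks plus one metadata dict per chunk with word offsets.

    Hypothetical helper: `start_word` is inclusive, `end_word` is exclusive.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    words = text.split()
    chunks: list[str] = []
    metadatas: list[dict[str, int]] = []
    start = 0
    while start < len(words):
        end = min(start + window_size, len(words))
        chunks.append(" ".join(words[start:end]))
        metadatas.append({"start_word": start, "end_word": end})
        if end == len(words):
            break  # the window has reached the end of the text
        start += stride
    return chunks, metadatas


offset_chunks, offset_meta = sliding_window_with_offsets(
    "The quick brown fox jumps over the lazy dog", window_size=8, stride=4
)
for chunk, meta in zip(offset_chunks, offset_meta):
    print(meta, chunk)

# With the collection from the example above, the offsets could be stored via:
# collection.upsert(
#     documents=offset_chunks,
#     metadatas=offset_meta,
#     ids=[f"sw_{i}" for i in range(len(offset_chunks))],
# )
```

With the offsets stored, two retrieved chunks whose word ranges intersect can be recognized as overlapping and merged or deduplicated before being passed to the LLM.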

7. When to use sliding window

| Scenario | Sliding window? |
|---|---|
| Narrative text, novels, articles | Good fit: no hard topic boundaries |
| Dense technical docs with clear sections | Worse than recursive splitting |
| Semantic search across continuous prose | Strong fit |
| FAQ / policy documents | Overkill: sentence chunking is simpler |
| Small documents with short answers | Avoid: too many overlapping chunks |

Sliding window shines when there are no natural structural boundaries to exploit and context flows continuously across the text. For structured documents (handbooks with sections, code with functions, papers with headings), recursive splitting or sentence-based chunking is usually better.


8. Chunking strategies: full comparison

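| Strategy | Context preserved | Handles structure | Overlap control | Complexity |
|---|---|---|---|---|
| Fixed-size | Poor | No | Manual | Trivial |
| Sentence-based | Good | Partial | None | Low |
| Recursive character | Very good | Yes | Yes | Medium |
| Sliding window | Good | No | Precise | Low |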

No single strategy is universally best. Match the strategy to the document type, and when in doubt, test retrieval quality empirically with a set of representative questions.
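
One lightweight way to run that kind of empirical check is a hit-rate test: for each question, see whether any of the top-k retrieved chunks contains a phrase you expect in a good answer. The sketch below is an illustration only. `keyword_retrieve` is a toy stand-in for a real vector search (it ranks chunks by shared lowercase words), and the sample chunks, questions, and expected phrases are made up.

```python
from typing import Callable

# (question, phrase expected somewhere in a good retrieved chunk)
Case = tuple[str, str]
Retriever = Callable[[str, list[str], int], list[str]]


def keyword_retrieve(question: str, chunks: list[str], k: int) -> list[str]:
    """Toy stand-in for vector search: rank chunks by shared lowercase words."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def hit_rate(
    chunks: list[str], cases: list[Case], retrieve: Retriever, k: int = 3
) -> float:
    """Fraction of questions whose top-k chunks contain the expected phrase."""
    if not cases:
        raise ValueError("need at least one test case")
    hits = 0
    for question, expected in cases:
        top = retrieve(question, chunks, k)
        if any(expected.lower() in chunk.lower() for chunk in top):
            hits += 1
    return hits / len(cases)


# Made-up chunks and questions, purely for illustration.
sample_chunks = [
    "Nerchuko was founded in 2024",
    "The platform offers learning paths in Python and SQL",
    "Learners can track their progress",
]
sample_cases = [
    ("When was Nerchuko founded?", "2024"),
    ("What learning paths does the platform offer?", "Python"),
]

score = hit_rate(sample_chunks, sample_cases, keyword_retrieve, k=1)
print(f"hit rate @1: {score:.2f}")
```

Running the same cases against the chunks produced by each strategy (or by different window and stride settings) gives a rough, comparable number. In a real setup the retriever would wrap your vector store's query call rather than keyword matching.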


What’s next

Part 17 covers semantic chunking, which groups sentences by meaning rather than size and is often the most accurate strategy. It uses embeddings and a similarity threshold to decide where one topic ends and the next begins.

Full video walkthrough is embedded above.
