Context Window
A context window is the amount of text an AI model can consider at once — the working memory that limits how much brand, data, and campaign context you can supply.
Published 2026-06-23
A context window is the maximum amount of text that an AI language model can consider in a single interaction. It is measured in tokens, where a token is roughly three-quarters of an English word. The window functions as the model's working memory: everything the model "knows" about your specific task (your prompt, pasted documents, conversation history, retrieved data) must fit inside it, along with the response the model generates. By 2026, mainstream models offer windows from around 200,000 tokens to over a million, or roughly 300 to more than 1,500 pages at about 500 words per page.
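To make the sizing arithmetic concrete, here is a minimal sketch that turns the three-quarters-of-a-word rule of thumb into a quick estimate. The 500 words per page figure and the function names are assumptions for illustration. Real tokenizers differ by model and language, so treat the results as ballpark numbers only.

```python
# Rough sizing only: real tokenizers vary by model and language.
WORDS_PER_TOKEN = 0.75  # rule of thumb: one token is about 3/4 of an English word
WORDS_PER_PAGE = 500    # assumption: a dense, single-spaced page


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a text from its whitespace word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def tokens_to_pages(tokens: int) -> float:
    """Convert a token count into an approximate page count."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE


def fits(text: str, window_tokens: int, reserve_for_output: int = 0) -> bool:
    """Check whether a text, plus room reserved for the reply, fits the window."""
    return estimate_tokens(text) + reserve_for_output <= window_tokens
```

Under these assumptions, tokens_to_pages(200_000) gives 300 pages and tokens_to_pages(1_000_000) gives 1,500 pages. The reserve_for_output argument is there because the model's reply also has to fit inside the window.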
Why it matters
The context window defines what's practical. Large windows let marketers paste an entire brand guideline, a quarter's worth of campaign data, or the full text of a small website into one request, tasks that required chunking and summarizing just a few years ago. But the window also explains common frustrations: why a long chat "forgets" early instructions (they fell outside the window or were compressed into a summary), why an agent's behavior drifts over long sessions, and why costs climb with input size (most APIs price per token, so filling the window on every request is expensive at scale). It also matters for GEO (generative engine optimization): AI systems reading your website work under the same context constraints, which is part of why concise, well-structured pages, and pointers like llms.txt, get used more effectively than sprawling ones.
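The cost point can be shown with a few lines of arithmetic. This is a minimal sketch, and the price of 3.00 per million input tokens is a hypothetical figure chosen for illustration, not any provider's actual rate. The function name is also an assumption.

```python
def input_cost(tokens_per_request: int, requests: int, price_per_million: float) -> float:
    """Total input cost for a batch of requests under per-token pricing."""
    return tokens_per_request * requests * price_per_million / 1_000_000


# Hypothetical price per million input tokens, for illustration only.
PRICE = 3.00

# Pasting a 150,000-token brand book into each of 1,000 requests...
stuffed = input_cost(150_000, 1_000, PRICE)

# ...versus sending a distilled 5,000-token brief each time.
distilled = input_cost(5_000, 1_000, PRICE)
```

At that hypothetical price, the stuffed workflow costs 450.00 and the distilled one costs 15.00. Whatever the real rate, the ratio holds: thirty times the input is thirty times the input bill.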
How it's used
Practically, marketers manage context rather than maximize it. Relevance beats volume: a tight 500-word brief with three voice examples usually outperforms a dumped 50-page brand book, and models attend better to information at the start and end of a very long context than to information in the middle (the "lost in the middle" effect, still measurable in 2026). Teams building workflows use retrieval-augmented generation (RAG) to fetch only the relevant slices of a large knowledge base into the window for each task, and they carry state between sessions with running summaries or learnings files instead of endless chat threads. When outputs degrade in a long conversation, the usual fix is to start a fresh session with a distilled summary of what matters.
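The two habits described above, fetching only what is relevant and keeping the most important material away from the middle, can be sketched as a small context builder. This is an illustrative sketch, not a standard RAG implementation: the Chunk type, the relevance scores (assumed to come from some retriever), and all function names are hypothetical, and the token estimate reuses the three-quarters-of-a-word rule of thumb.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # hypothetical relevance score from a retriever; higher is better


def estimate_tokens(text: str) -> int:
    # Rough rule of thumb: one token is about 0.75 English words.
    return round(len(text.split()) / 0.75)


def select_chunks(chunks: list[Chunk], budget_tokens: int) -> list[Chunk]:
    """Greedily keep the highest-scoring chunks that fit the token budget."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(chunk.text)
        if used + cost <= budget_tokens:
            selected.append(chunk)
            used += cost
    return selected


def order_for_long_context(ranked: list[Chunk]) -> list[Chunk]:
    """Alternate ranked chunks between front and back, so the strongest
    chunks sit at the edges and the weakest end up in the middle."""
    front, back = [], []
    for i, chunk in enumerate(ranked):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


def build_context(brief: str, chunks: list[Chunk], budget_tokens: int) -> str:
    """Put the brief first, then the selected chunks, within the budget."""
    remaining = budget_tokens - estimate_tokens(brief)
    picked = order_for_long_context(select_chunks(chunks, remaining))
    return "\n\n".join([brief] + [c.text for c in picked])
```

The greedy selection is a simplification. A production workflow would count tokens with the model's own tokenizer and might also deduplicate overlapping chunks. The same builder works for carrying state between sessions: pass the running summary as the brief when starting a fresh conversation.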