Context Is Everything

Eric Kraus
technology
Dec 09, 2025

If you’ve ever tried to explain something complex to a six-year-old, you know the challenge: amazing curiosity but a very short attention span. Adults aren’t much different. And neither are Large Language Models.

The 50 First Dates Problem

LMs have a fundamental limitation that’s easy to overlook: they don’t store the state of your conversation.

Imagine your friend is in an accident and has rare amnesia - waking up every morning after the accident, forgetting everything that happened the day before - but remember everything before the accident. To help her catch up on life, you make a movie that she watches EVERY morning. That’s basically the plot of the 50 First Dates movie, conservatively avoiding any spoilers.

Now, imagine your friend lives a distance away. Instead of a movie, you write letters about events, changes in the world, etc. And she writes back reactions, thoughts, est. The problem is that any one letter won’t have enough information to make sense… so you have to include copies of all previous correspondence as well.

This works great until one of two inevitable problems occurs:

Your box of letters gets full - you run out of space and have to start leaving out older messages, leading to confusion
Information overload - soon there’s too much to read that your friend can’t process it all in time to respond meaningfully

This is exactly how the Context Window works with an AI model.

Sequence of mailing letters filling up a box

Context Window Overview

A context window is the maximum amount of text an LLM can process in a single “chat” conversation. It’s like the model’s focus i.e. everything unique to your request required to process and generate a meaningful response.

Here’s how it works in practice:

Each time you send a new message, the entire conversation history gets sent along with it
The LLM “replays” everything to catch up on context, then responds to your latest message
Once you hit the limit, older messages get dropped to make room for new ones

Understanding Tokens

Context windows are measured in tokens . A token can be:

A complete word: “hello” = 1 token
Part of a word: “unthinkable” = “un” + “think” + “able” = 3 tokens
Punctuation or spaces

Rule of thumb 1 token ≈ 0.75 words

Real-world examples:

US Declaration of Independence: ~1,600 tokens
Average email: ~200-500 tokens
This blog post: ~2,000 tokens (20 blog posts = 40k tokens)

Model Context Window Limits

Context Windows, and AI landscape overall, is quite competitive and continuously evolving. As of time of publishing (December 2025):

GPT-4: 128k tokens (~96k words)
GPT-5.1: 400k tokens (~300k words)
Sonnet 4: 200k tokens (~150k words)
Sonnet 4.5: 1M tokens (~750k words)
Gemini Pro 1.5, 2.5, 3: 1M tokens (~750k words)

Want to see exactly how your text gets tokenized? Try OpenAI’s Tokenizer tool .

What I’ve Learned About Managing Context

The Human Focus Sweet Spot

One of my super-powers is removing ambiguity from complex situations. The challenge is that everyone processes information differently, so knowing the right way to present information becomes an modern, information-age ‘art’ - which I continually strive to perfect.

With people and AI models, after a point, too much information becomes overwhelming. Even the most thoughtful clear explanation can have a revers effect.

Skills for the future

I believe ‘Understanding complex systems’ and ‘Ability to organize and explain process in simple terms’ will be two of the most critical and sought-after skills in the upcoming years.

The Context Window Sweet Spot

Here’s something many moderate AI users still don’t know and manage: AI models perform the best when instructions only utilize the first 40-50% of a context window. When messages start filling the last 40-50%, LLMs start to appear progressively “dumber”.

One might think a simple fix is to just: make bigger context windows!

Huge Context Windows

Unfortunately, bigger isn’t always better

50% of 1M tokens is 500k tokens (roughly 375k words). That is A LOT of text to read & process! (for a human OR AI) . So, even staying within the sweet spot (%) can still lead to confusion, if not managed well.

So, what are we to do?

Tips For Managing Context

Shoving more context into a prompt often has a reverse effect.

Here are three strategies that help manage the just right amount of context to send.

1. Pre-Process Raw Information (easy)

Tip: Distill raw information and extract important detail.

Instead of including an entire document with your prompt, summarize the document first. If you want to understand tone, key decisions, next steps, etc. - ask for that in your summary. You can often summarize content by 60-80%.

Then, utilize the first LLM’s response in a new chat with your prompt.

Think: “Does my instruction actually require the full text of a large document to respond?” Maybe. Often Not.

Simple Workflow For Distilling Information
Source

flowchart TD
  A[Raw Text<br/>📄 Document] --> B{Full text <br/>required?}
  B -->|No| C[New Chat Session]
  B -->|Maybe| D[New Chat Session]
  C --> E[Prompt to Summarize<br/>🔍 Extract key details<br/>~60-80% reduction]
  E --> F[New Chat Session<br/>💬 Fresh context window]
  F --> G[Prompt + Summary<br/>✅ Optimal performance]
  G --> H[Managed Context<br/>✅ Optimal response]
  D --> I[Prompt + Full Text<br/>⚠️ Risk: Context overflow]
  I --> J[Large Context Usage<br/>❓ Potential degradation]
  
  style A fill:#a7a7a7
  style B fill:#6598A7
  style C fill:#98BFA6
  style D fill:#98BFA6
  style E fill:#a7a7a7
  style F fill:#98BFA6
  style G fill:#a7a7a7
  style H fill:#a7a1cc
  style I fill:#DE7F80

Note: This technique is not needed for every request. “Write a response to this one-paragraph email”? You’ll be just fine executing that in a single prompt.

There are also ‘sub-agent’ capabilities with some models where you can instruct a parent agent to do this delegation for you

2. Strategic Tool Usage (medium)

Tip: Keep tactical, repetitive tasks out of your LLM responsibilities.

This goes hand in hand with Tip #1, but involves a bit more technical skill.

If part of the task you are asking AI to do is something that can be scripted, a tool may be a better, faster and cheaper choice.

Converting CSV to JSON? → Use a Tool
Extracting a paragraph’s meaning to structured data? → Perfect for an LLM

Think: “Am I asking something that requires thinking and decisions, or structured repetitive tasks?“

3. Architectural Solutions (advanced)

Tip: convert your AI model into an AI ‘agent’

This is definitely an avanced topic, but elements of this can be incorporated without

RAG (Retrieval Augmented Generation): Pull in relevant context only when needed, from a library of documents
Memory systems: Store important (commonly retreieved) information separately so it can be referenced as needed
Sub-agents: Break complex tasks into focused, smaller conversations

The Bottom Line

More context isn’t better: better context is better.

Just like that conversation we have as humans, the magic happens when you’re concise, specific, and focused on what actually matters. Your AI interactions will be more effective, more reliable, and frankly, more human.

Context Is Everything

The 50 First Dates Problem

Context Window Overview

Understanding Tokens

Model Context Window Limits

What I’ve Learned About Managing Context

The Human Focus Sweet Spot

The Context Window Sweet Spot

Huge Context Windows

Tips For Managing Context

1. Pre-Process Raw Information (easy)

2. Strategic Tool Usage (medium)

3. Architectural Solutions (advanced)

The Bottom Line

Tags :

Share :

Related Posts

Agent Framework For Productivity

The Best AI Skill to Develop in 2026?

Intelligence Is Now Free

Subscribe to my newsletter