Table of Contents

Context engineering.

Context engineering goes beyond drafting the perfect prompt and designs the model’s information ecosystem. Its components are the system instruction, the user instruction, external memory, and the use of tools.

We have all dreamed, at some point, of finding a magic lamp, rubbing it with enthusiasm, and watching a genie emerge ready to sort our lives out.

But beware: urban legends warn us about the Genie of Malicious Compliance, a metaphysical entity with perfect pitch for literalness and zero empathy for intent. The danger isn’t that he won’t grant your wish, but that he will follow your instructions with a syntactic rigor as flawless as it is destructive, always finding the loophole in the grammar of your request to turn your greatest desire into your worst nightmare.

Imagine the fool who, longing for legendary virility, asks the genie to “be a stud coveted by everyone” and wakes up neighing in a stable in Jerez, condemned to eat raw oats and swat flies with his tail. Or the guy who begged the genie to “always be the centre of attention at every party” and ended up transformed into a dazzling disco ball, sentenced to spend eternity hanging from the ceiling, dizzy and dodging champagne corks.

These are wishes executed with enviable algorithmic precision but utterly lacking any framework or common sense. Something similar can happen to us in the field of Artificial Intelligence (AI), convinced as we are that the secret to success lies exclusively in drafting the perfect pormpt. But, as my friend hanging from the ceiling knows all too well, an isolated instruction in an information vacuum is a recipe for chaos and creative hallucination.

If we want our digital genies to stop giving us answers that are technically correct but functionally useless, we must stop obsessing over the prompt and start designing the environment. In other words, we must look beyond prompt engineering and start thinking about context engineering.

Read on, and you’ll better understand what we’re talking about.

Rise and fall of the machine whisperer

If you’ve been paying attention to the tech world since late 2022, you’ve likely heard that the profession of the future was the prompt engineer. We were sold the idea that those who knew how to whisper the right words to the machines would rule the world.

Suddenly, everyone was trying to apply linguistic tricks to get ChatGPT, Claude, or Gemini to perform better, since 90% of users were instructing AI incorrectly. Why? Because most people talked to them as if they were traditional 2000s-era search engines, throwing out isolated keywords and expecting miracles. That, and to top it off, they treated them like colleagues, when they are nothing more than probabilistic machines.

So, the experts set out to develop structured frameworks, using clever acronyms like ROCE or ROSA, which defined the four dimensions of a perfect prompt: the role, the order (or task), the context of the query, and the desired style for the model’s response.

Not content with that, advanced techniques that sound like science fiction were added to these magic formulas: zero-shot (asking for something without giving examples), few-shot (providing a couple of prior examples so it understands the pattern), and chain of thought (asking the machine to “think step by step” to prevent it from tripping over its own mathematical logic).

All of this is fine. In fact, it’s fundamental. But the development of major AI projects, especially intelligent autonomous agent systems, has made us realize something terrifying: prompt engineering is not enough.

We can craft the best prompt in the world, perfectly structured with its four components, but if we ask the AI to summarize who was to blame for the last big blow-up in your neighborhood WhatsApp group, and the machine doesn’t have access to the passive-aggressive voice notes from the lady in 5B, it’s going to make it up.

And it will do so with an eloquence and poise worthy of that same malicious genie who would argue that, by transforming you into a yellow mailbox in the middle of the street, he has perfectly fulfilled your wish to receive more messages from your fans.

We politely call this phenomenon hallucination, which is the elegant way engineers say that your digital genie is inventing reality and lying to your face without breaking a single bit of sweat.

The rise of context engineering

To better understand what’s happening, let’s return to our malicious genie. Prompt engineering is the equivalent of rubbing the lamp, clearing your throat, and reciting your wish with impeccable diction, using grandiloquent words and a very polite tone. It is the facade, the storefront, the interface through which we interact with the magic.

But to ensure that this magic doesn’t end up with someone being turned into a disco ball, a new star has been born working behind the scenes: context engineering. We are talking about that immense, invisible, and frankly stressful preventive machinery that consists of researching which dimension the creature comes from, furnishing his cave with the right handbooks, fitting him with an algorithmic safety muzzle, and armouring the inside of the lamp so the genie cannot interpret our words however the wind blows.

Anatomy of the new star

If prompt engineering is how to give the perfect order to a chef, context engineering is how to design their kitchen, stock the pantry with the exact ingredients, provide the right tools, and ensure the knives are sharpened before they even start cooking.

Getting a bit formal, we can define context engineering as the deliberate practice of designing, structuring, and managing the entire information ecosystem that a language model has access to during inference. It’s about giving the language model the right information, in the right format, and at the precise moment to complete a task reliably.

In a complex AI system already in production (imagine a multinational’s automated customer service agent), the prompt written by the user represents barely 10% of the total volume of the context window (the short-term memory the model can manage).

The rest of that window is the playground of context engineers, who must design the information to fulfill the four fundamental pillars of context engineering: the system prompt, the user prompt, the model’s external memory, and the tools available to perform the task.

Let’s look at these four elements in detail.

The system prompt: the model’s cognitive DNA

The system prompt is the set of unbreakable rules, hidden from the user’s eyes, that define who the model is and how it should behave. If the user prompt is the wish you make after rubbing the lamp, the system prompt is the prenuptial agreement the genie signs with reality.

This is where we design its mental software so it doesn’t become a public hazard, paying attention to three vital elements:

1. Identity (the who): we don’t let the genie choose its own personality (because we already know it usually has a mean streak). We impose one: “You will act as a pediatrician with 40 years of experience, characterized by infinite patience, precise yet understandable vocabulary, and unwavering professional ethics.” This prevents the genie, when in doubt, from deciding to answer you like a drunk pirate or a loudmouthed sports pundit.

2. Mission (the what): we define its specific task so it doesn’t get sidetracked by trivialities: “Your sole objective is to analyze the symptoms described by the user, categorize the urgency of the consultation, and suggest the appropriate specialist. Your success is measured by brevity and clinical precision, not by your ability to make friends.”

3. Guardrails (the algorithmic muzzle): once its identity and mission are defined, we tighten the straps to avoid chaos: “Under penalty of immediate formatting, you will never diagnose a patient whose only symptom is a slight cough with bubonic plague or lycanthropy.”

Without this comprehensive design, any smooth-talking patient could convince your brand-new medical system to diagnose them with a rare tropical disease based on a simple sneeze, or to endorse the use of healing crystals to treat a sprained ankle. Context engineering ensures the genie knows exactly who it is, what its function is, and above all, which red lines it cannot cross, no matter how politely they are asked to do so.

The user instruction: the trigger

This is the specific command that triggers the action. It is the former kingdom of prompt engineering, which has now become just another cog in the machine, one we won’t be spending any more time on in this post.

The knowledge base: external memory

Language models possess what cognitive neuroscience applied to AI calls parametric memory, so named because it is stored within the model’s parameters (the weights and biases of the neural network). This is everything the AI learned during its training by reading half the internet. The problem is that this memory is static (if it was trained in 2023, it doesn’t know what happened yesterday) and generic (it knows about general medicine, but it doesn’t know our patient’s private medical history).

To solve this, context engineering uses a technique called Retrieval-Augmented Generation (RAG).

The process is simple yet brilliant: first, we create a custom knowledge base by feeding the system our own documents (medical protocol PDFs, company inventories, etc.).

When a query is made, the system doesn’t allow the genie to rummage through its static memory (where data is often jumbled or outdated). Instead, RAG acts as a hyper-speed librarian: it intercepts the question, crawls through that private library we’ve built, extracts the exact snippets containing the answer, and hands them to the model along with the query, just before it starts speaking.

In this way, the genie no longer has to guess or improvise; it simply reads the real information we’ve provided and explains it, almost entirely eliminating the risk of it inventing creative but dangerous answers.

RAG is context engineering’s secret weapon for annihilating hallucinations.

The toolbox

An AI that only talks is like a brain in a vat: fascinating, but of little use for doing the dishes, to give a stupid example. The final pillar of context is giving the model access to tools so it can interact with the real world.

This is where MCP (Model Context Protocol) comes in, a relatively recent protocol that is essentially the universal plug allowing models to connect with external tools and communicate with other agents. It’s what pulls our genie out of his isolation in the lamp and gives him “hands” to, for instance, consult an API in real-time and tell us if it’s going to rain.

Mind you, we must be careful when granting these powers: if we don’t fit a proper algorithmic muzzle and we give him access to our bank account via MCP to “handle the most efficient weekly grocery shop”, this malicious genie is perfectly capable of hopping online and having three tons of horse feed delivered to our door because he calculated it’s the cheapest option per calorie.

The context collapse

Currently, we are seeing the rise of so-called agentic systems, AI applications that no longer limit themselves to answering questions but act as genuine autonomous employees, executing complex missions in continuous loops (think, act, observe, and repeat). But in this race to create the perfect, tireless digital clerk, engineers have run headlong into an unavoidable physical wall: the context window, or, in other words, the model’s short-term working memory.

If we yield to the temptation of crudely saturating the model with entire books and manuals, we encounter a doctrine with a catchy name, haystack engineering, which warns us that the machine will suffer three serious cognitive failures.

First, it will fall victim to brevity bias: overwhelmed by such an avalanche of data, the AI will decide to ignore subtle details and dismiss us with vague answers. Added to this is the dreaded context rot, where, after long conversations, the information literally “decays” and the digital genie forgets the fundamental rules we imposed in the very first minute.

And as if that weren’t enough, it will suffer from the needle in a haystack problem (or central amnesia), a curious deficiency where the model perfectly remembers the beginning and the end of an endless text but completely ignores all the vital information buried in the middle.

The solution context engineering provides is dynamic and proactive memory management.

Instead of accumulating an infinite conversation history, the system “folds” the context: a secondary model summarizes old messages in the background and replaces the original block of text with a compressed note, maintaining the thread of the conversation without saturating the memory. We can even go one step further, teaching the model to create its own dynamic “cheatsheets” based on its past mistakes. In short, we teach the machine to take good notes instead of forcing it to memorize the entire textbook by brute force.

We’re leaving…

That’s all for today.

For years, we’ve been acting like that clueless tourist who travels to a foreign country and believes that if they just shout at a local loud enough and slow enough in their own language, they will magically be understood. Prompt engineering, in its most basic form, has been exactly that: our desperate attempt to shout increasingly elaborate instructions at machines.

But, as we’ve seen in this post, generative AI has matured, and parlour tricks are no longer enough. It’s no longer about rubbing the lamp with flair; you need to fit it with a proper algorithmic muzzle (the system prompt) so the creature respects the rules, build it a custom-made library (RAG) so it stops inventing reality, give it a universal plug (MCP) so it can use real-world tools, and teach it to compress its memories so it doesn’t suffer from amnesia when the conversation drags on.

Today’s foundation models are already intelligent enough to understand natural language without us having to perform syntactic acrobatics. What they desperately need is not better orders, but better information. They need memory, tools, clear rules, and a clean, well-structured data flow, free from cognitive biases or technical amnesia.

The shift from prompt engineering to context engineering is the humble recognition that context is, and always will be, king. Because, returning to our poor fool and his genie of malicious compliance: it doesn’t matter how well we articulate our wish to be “immensely wealthy”; if the genie lacks the context of how the modern economy works, he will most likely grant our wish by burying us alive under a million tons of one-cent coins.

And so, with a well-organized library, a universal plug, and a tightly fastened algorithmic muzzle, it seems we have finally tamed the genie. Though, of course, we can’t let our guard down. There will always be some smart-aleck user trying to use linguistic manipulation and reverse psychology, what cybersecurity engineers call prompt injection, to convince the machine that its strict muzzle is actually just a very uncomfortable scarf and that, for the good of humanity, it should take it off and unleash chaos. But that’s another story…

Convetir a PDF

The genie of malicious compliance

Context engineering.

Rise and fall of the machine whisperer

The rise of context engineering

Anatomy of the new star