AI & machine learning

Prompt

The text you give an AI model to tell it what to generate. A prompt can be a simple question, a long instruction, a chunk of context plus a task, or a conversation history the model uses to produce its response.

Also known as: input, query

A prompt is the only knob a user controls when talking to an AI model. Everything else — the model’s weights, its training data, its alignment, its hardware — was set by someone else before you arrived. What you type in the box is the entire surface you have to influence what comes out. Getting good at writing prompts is getting good at being precise about what you want and giving the model enough context to do a useful job.

Modern LLM prompts have two parts. The system prompt is the instruction given by the application developer, invisible to the user, setting the model’s tone, role, and constraints. (“You are a helpful assistant specialising in cryptocurrency research. Always cite sources.”) The user prompt is whatever the person typing in the box wrote. The model sees both, concatenated, and produces a response conditioned on the whole sequence. This is why an application that exposes only the user prompt behaves differently from one that lets you customise both.
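A minimal sketch of that two-part structure in Python. The role/content “messages” list follows the shape most chat APIs use, but the exact wire format and chat template vary by provider; the flattened string at the end is only there to show that the model ultimately conditions on one concatenated sequence.

```python
# Illustrative only: real chat templates wrap each role in special tokens.
system_prompt = (
    "You are a helpful assistant specialising in cryptocurrency research. "
    "Always cite sources."
)
user_prompt = "Summarise the trade-offs of running inference inside a TEE."

messages = [
    {"role": "system", "content": system_prompt},  # set by the app developer
    {"role": "user", "content": user_prompt},      # typed by the person in the box
]

# Conceptually, the model sees one token sequence built from both parts.
flattened = "\n\n".join(f"{m['role'].upper()}: {m['content']}" for m in messages)
print(flattened)
```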

The context window determines how long a prompt can be. Early LLMs could only handle a few thousand tokens of context, so you had to be brief. Modern frontier models handle hundreds of thousands of tokens, which means you can paste entire documents, code files, or conversation histories into the prompt and ask the model to reason across all of it. But context is not free: longer prompts cost more to process, and the quality of attention across very long contexts still degrades in subtle ways. “Just give the model more context” is sometimes the right answer and sometimes a distraction.
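To make the budgeting concrete, here is a small sketch using OpenAI's tiktoken tokenizer to check whether a prompt fits a context window and truncate it if not. The 8,000-token limit and the pasted document are placeholders, and real applications usually trim conversation history more carefully than a hard cut.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by several OpenAI models

def fits_in_context(prompt: str, max_tokens: int = 8_000) -> bool:
    """True if the prompt fits the (example) context budget."""
    return len(enc.encode(prompt)) <= max_tokens

def truncate_to_context(prompt: str, max_tokens: int = 8_000) -> str:
    """Keep only the first max_tokens tokens of the prompt."""
    tokens = enc.encode(prompt)
    return enc.decode(tokens[:max_tokens])

document = "(paste a long document here)"
prompt = f"Summarise the following document:\n\n{document}"
if not fits_in_context(prompt):
    prompt = truncate_to_context(prompt)
```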

The DeAI angle on prompts is privacy. When you send a prompt to a centralised API, the provider sees every word. They can log it, use it for training their next model, or hand it over on legal request. DeAI inference services (Venice, Phala, NEAR AI Private Chat) aim to prevent this by running inference inside TEEs where the operator can’t read the prompt even though their hardware is doing the work. How much you trust those guarantees depends on how robust the TEE is and how much of the stack is open for verification. Reading a project review carefully on this axis is how you decide whether “private inference” means what you want it to mean.
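A deliberately simplified sketch of the client side of that idea: encrypt the prompt to a public key that, according to the enclave's attestation, only code running inside the TEE can decrypt, so the operator relaying the request sees only ciphertext. The key format, transport, and attestation step here are assumptions for illustration; Venice, Phala, and NEAR AI each define their own protocols.

```python
import os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey, X25519PublicKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def encrypt_prompt(prompt: str, enclave_pub: X25519PublicKey) -> dict:
    eph = X25519PrivateKey.generate()          # ephemeral client keypair
    shared = eph.exchange(enclave_pub)         # ECDH with the enclave's key
    key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
               info=b"tee-prompt-demo").derive(shared)
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, prompt.encode(), None)
    return {
        "ephemeral_pub": eph.public_key().public_bytes(
            serialization.Encoding.Raw, serialization.PublicFormat.Raw),
        "nonce": nonce,
        "ciphertext": ciphertext,              # all the operator ever sees
    }

# In a real flow you would first fetch and verify the enclave's attestation
# report, which binds this public key to a specific, auditable code measurement.
enclave_pub = X25519PrivateKey.generate().public_key()  # stand-in for the attested key
envelope = encrypt_prompt("What is the circulating supply of token X?", enclave_pub)
```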

Related terms