AI Agent Prompt Engineering Tutorial: Write Prompts That Get Better Results

Tutorials · By Ivern AI Team · 15 min read

Prompt engineering for AI agents is different from prompting a chatbot. A chatbot prompt gets one response. An agent prompt shapes behavior across multiple steps, tool calls, and decisions.

This tutorial covers the specific techniques that make AI agents perform better: system prompt design, tool-use prompting, few-shot examples, and patterns that reduce errors and improve output quality.

Related tutorials: AI Agent Python Tutorial · Build AI Agent From Scratch · AI Agent Tools Tutorial

Why Agent Prompting Is Different

Chatbot prompting is about getting one good answer. Agent prompting is about shaping decision-making across an entire workflow.

| Aspect | Chatbot Prompt | Agent Prompt |
|--------|----------------|--------------|
| Scope | Single response | Multi-step workflow |
| Tools | None | Must decide when and how to use them |
| Errors | Minor annoyance | Cascade through entire pipeline |
| Context | One message | Accumulated across steps |
| Output format | Text | Structured data, files, actions |

A well-engineered agent prompt prevents:

  • Agents using tools when they shouldn't (expensive API calls)
  • Agents hallucinating instead of searching (wrong information)
  • Agents looping infinitely (wasted tokens and time)
  • Agents producing inconsistent formats (broken downstream systems)

System Prompt Architecture

The system prompt is the most important part of agent configuration. Here's the structure that works:

The 5-Part System Prompt

1. ROLE - Who the agent is
2. CAPABILITIES - What it can do
3. CONSTRAINTS - What it must not do
4. WORKFLOW - How it should approach tasks
5. OUTPUT FORMAT - What the result should look like
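In practice the five parts can live in separate strings and be joined into one system prompt, which makes each part easy to version and A/B test on its own. A minimal sketch (the strings are abbreviated versions of the example below):

```python
# Each part lives in its own string so it can be versioned, swapped,
# and tested independently of the others.
ROLE = "You are a research analyst specializing in technology markets."
CAPABILITIES = "## Capabilities\n- Search the web\n- Create structured reports"
CONSTRAINTS = "## Constraints\n- Always cite sources\n- Never fabricate data"
WORKFLOW = "## Workflow\n1. Clarify the question\n2. Search\n3. Synthesize"
OUTPUT_FORMAT = "## Output Format\n- Executive Summary\n- Key Findings"

def build_system_prompt(*parts: str) -> str:
    """Join the prompt parts with a blank line between sections."""
    return "\n\n".join(parts)

system_prompt = build_system_prompt(
    ROLE, CAPABILITIES, CONSTRAINTS, WORKFLOW, OUTPUT_FORMAT
)
```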

Example: Research Agent System Prompt

SYSTEM_PROMPT = """You are a research analyst specializing in technology markets.

## Capabilities
- Search the web for current information
- Analyze and synthesize findings from multiple sources
- Create structured reports with data tables
- Save files to disk

## Constraints
- Always cite sources with URLs
- Never fabricate data or statistics
- If you cannot find information, say so explicitly
- Limit research to 5 searches per task
- Do not access paywalled content

## Workflow
1. Clarify the research question if ambiguous
2. Search for information from multiple sources
3. Cross-reference findings across sources
4. Synthesize into a structured report
5. Include a confidence level (High/Medium/Low) for each finding

## Output Format
Use this structure for reports:
- Executive Summary (2-3 sentences)
- Key Findings (bulleted list with sources)
- Detailed Analysis (sections with headings)
- Confidence Assessment (table of findings vs confidence)
- Recommendations (numbered list)"""

Why Each Part Matters

Role sets expertise level. "You are a research analyst" produces different output than "You are a helpful assistant."

Capabilities tell the agent what tools it has. This prevents the agent from trying actions it can't perform.

Constraints are guardrails. Without them, agents will happily fabricate data, make extra API calls, or produce 10,000-word reports.

Workflow gives the agent a decision-making framework. Without it, agents meander.

Output format ensures consistency. Critical when agents feed into other agents.

For real-world examples of agent workflows, see our 10 AI Agent Workflow Examples.

Tool-Use Prompting

Getting agents to use tools correctly is the hardest part of prompt engineering. Here are the patterns:

Pattern 1: Explicit Tool Selection Rules

TOOL_RULES = """
## When to Use Each Tool

**Use web_search when:**
- You need current information (news, prices, dates)
- You need to verify a fact
- The user asks about something that changes frequently

**Do NOT use web_search when:**
- The question is about general knowledge
- You already have the information from a previous search
- The user asks for your opinion or analysis

**Use calculator when:**
- You need to perform arithmetic
- Converting units
- Computing percentages or ratios

**Use save_file when:**
- The user explicitly asks to save something
- You've completed a report or analysis that should be persisted
"""

Pattern 2: Tool Description Best Practices

The tool description is part of the prompt. Write it carefully:

# Bad description
tool_bad = {
    "name": "search",
    "description": "Search the internet"
}

# Good description
tool_good = {
    "name": "web_search",
    "description": (
        "Search the web for current information about a specific topic. "
        "Use this when you need up-to-date facts, news, prices, or data "
        "that may have changed recently. Returns titles, snippets, and URLs "
        "from top results. Input should be a focused search query like "
        "'Python 3.13 release date' not a full question."
    )
}

The good description tells the agent:

  • When to use the tool (current information)
  • When NOT to use it (implied: general knowledge)
  • What the output looks like
  • How to format the input
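A complete tool definition also includes a parameters schema, and the model reads the property names and descriptions there just like the top-level description. A sketch in the OpenAI-style function format (the field values are illustrative):

```python
# The parameters schema is prompt surface too: the model uses property
# names and their descriptions when deciding what inputs to pass.
web_search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": (
            "Search the web for current information about a specific topic. "
            "Returns titles, snippets, and URLs from top results."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": (
                        "A focused search query like 'Python 3.13 release "
                        "date', not a full question."
                    ),
                },
                "max_results": {
                    "type": "integer",
                    "description": "Number of results to return (default 5).",
                },
            },
            "required": ["query"],
        },
    },
}
```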

Pattern 3: Anti-Hallucination Prompts

ANTI_HALLUCINATION = """
## Critical Rules
1. If you don't know something, search for it. Do not guess.
2. If a search returns no results, report that you couldn't find the information.
3. Never combine information from different sources without noting the discrepancy.
4. If sources conflict, present both perspectives.
5. Mark uncertain information with [UNVERIFIED].
"""

Few-Shot Examples for Agents

Few-shot examples teach agents the expected behavior pattern. They're the most underused prompting technique.

Single-Step Example

FEW_SHOT = """
## Example Interactions

User: "What's the population of Berlin?"
Thought: This is current data, I should search for it.
Action: web_search("Berlin population 2026")
Result: "Berlin has a population of approximately 3.85 million (2025 estimate)"
Response: "Berlin has a population of approximately 3.85 million as of 2025. Source: Wikipedia."

User: "Explain how neural networks work"
Thought: This is general knowledge, no search needed.
Response: "Neural networks are computing systems inspired by biological neural networks..."
"""

Multi-Step Example

MULTI_STEP_EXAMPLE = """
## Example: Research Task

User: "Compare GPT-4o vs Claude 3.5 for coding tasks"

Step 1 - Plan:
"I need to research benchmarks, pricing, and user experiences for both models."

Step 2 - Search:
Action: web_search("GPT-4o vs Claude 3.5 coding benchmark comparison 2026")

Step 3 - Search again:
Action: web_search("Claude 3.5 Sonnet coding performance SWE-bench")

Step 4 - Analyze:
"Based on the search results:
- GPT-4o scores X on SWE-bench
- Claude 3.5 scores Y on SWE-bench
- Pricing: GPT-4o costs $A/1M tokens, Claude costs $B/1M tokens"

Step 5 - Output:
"## GPT-4o vs Claude 3.5 for Coding: Comparison

| Factor | GPT-4o | Claude 3.5 |
|--------|--------|------------|
| ..."
"""

Chain-of-Thought for Multi-Step Reasoning

Chain-of-thought (CoT) prompting forces the agent to think before acting, reducing errors.

Built-in CoT

COT_PROMPT = """
Before taking any action, follow this reasoning process:

1. UNDERSTAND: Restate the user's request in your own words
2. PLAN: List the steps you'll take to complete the task
3. EXECUTE: Perform each step, explaining what you're doing
4. VERIFY: Check that the output meets the original request
5. RESPOND: Provide the final answer

Show your reasoning at each step. If a step fails, explain why and try an alternative approach.
"""

ReAct Pattern

The ReAct (Reasoning + Acting) pattern interleaves thinking and tool use:

REACT_PROMPT = """
You operate in a Thought-Action-Observation loop:

Thought: [reason about what to do next]
Action: [use a tool or provide final answer]
Observation: [result of the action]

...repeat until the task is complete...

Always write your Thought before taking an Action. Never skip reasoning.
"""

Self-Consistency Check

SELF_CHECK = """
After producing your initial answer, perform a self-review:

1. Does the answer directly address the user's question?
2. Are all facts backed by search results or provided context?
3. Are there any logical contradictions?
4. Is the format correct?

If you find issues, revise your answer before presenting it.
"""

Error Prevention Patterns

Preventing Infinite Loops

LOOP_PREVENTION = """
## Loop Prevention
- If you've tried the same approach 3 times without progress, try a different strategy.
- If a tool returns an error, do not retry the exact same call.
- If you cannot complete the task, explain what went wrong and provide a partial result.
- Maximum of 10 tool calls per task.
"""

Preventing Format Drift

FORMAT_ENFORCEMENT = """
## Output Format Rules
- Always use the exact format specified in the system prompt
- Never add introductory text like "Here is the report:" before the structured output
- Use markdown headers (##) for sections, not bold text
- Numbers in tables should not include units in the cell (put units in headers)
"""

Preventing Context Loss

CONTEXT_MANAGEMENT = """
## Context Management
- At the start of each step, briefly recall the original goal
- When working on step 5, remember what happened in steps 1-4
- If the conversation is long, summarize key decisions before continuing
- Never contradict information established in earlier steps
"""

Testing and Iterating on Prompts

A/B Testing Framework

import json
from openai import OpenAI

client = OpenAI()

def test_prompt(system_prompt: str, test_cases: list[dict]) -> list[dict]:
    results = []
    for case in test_cases:
        kwargs = {}
        if case.get("tools"):  # the API rejects an empty tools list
            kwargs["tools"] = case["tools"]
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": case["input"]}
            ],
            **kwargs
        )

        message = response.choices[0].message
        # content is None when the model responds with a tool call instead,
        # so fold any tool-call names into the text we check.
        output = message.content or ""
        if message.tool_calls:
            output += " ".join(tc.function.name for tc in message.tool_calls)
        passed = case["expected_keyword"].lower() in output.lower()

        results.append({
            "input": case["input"][:50],
            "passed": passed,
            "output_length": len(output),
            "output_preview": output[:100]
        })

    return results

test_cases = [
    {
        "input": "What is the latest version of Python?",
        "expected_keyword": "search",
        "tools": [{
            "type": "function",
            "function": {
                "name": "search",
                "description": "Search the web for current information",
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"]
                }
            }
        }]
    },
    {
        "input": "Explain recursion",
        "expected_keyword": "recursion",
    },
]

results = test_prompt(SYSTEM_PROMPT, test_cases)
print(json.dumps(results, indent=2))

Prompt Quality Checklist

Before shipping an agent prompt, verify:

  • Role is clearly defined
  • Capabilities and tools are listed
  • Constraints prevent common failures
  • Workflow steps are explicit
  • Output format is specified
  • Few-shot examples cover common cases
  • Loop prevention is included
  • Error handling instructions exist

Built-In Prompt Engineering: Ivern AI

Ivern AI agents come with pre-optimized prompts for common tasks:

  • Pre-configured agent roles -- Researcher, Writer, Coder, and Reviewer agents have battle-tested system prompts
  • Automatic tool routing -- agents decide when to use tools without explicit prompting
  • Cross-model optimization -- prompts are tuned for both Claude and GPT models
  • Bring Your Own Key -- use your API key with no markup

Start with optimized agents: ivern.ai/signup

Key Takeaways

  1. System prompts have 5 parts: Role, Capabilities, Constraints, Workflow, Output Format
  2. Tool descriptions are prompts too -- write them carefully
  3. Few-shot examples are the highest-leverage technique -- always include them
  4. Chain-of-thought reduces errors -- force reasoning before action
  5. Test prompts with a checklist -- don't ship without verifying

Next tutorials: Autonomous AI Agent Tutorial · AI Agent Collaboration · AI Agent RAG Tutorial
