Prompt engineering is the art and science of crafting inputs that produce desired outputs from language models. The same model can generate brilliant insights or nonsense depending on how you ask. Prompt engineering is not tricking the model or finding magic words. It is communicating intent clearly, providing appropriate context, and structuring output expectations.

I have spent thousands of hours refining prompts for classification, extraction, generation, and reasoning tasks. I have learned that small changes in phrasing produce dramatically different results. I have seen chain-of-thought prompting transform unreliable reasoning into consistent logic. I have struggled with output format compliance and developed patterns that enforce structure. If you are looking for ready-to-use prompts for business operations, see our collection of prompt engineering templates for small business. This guide covers the patterns that work: fundamental prompting approaches from zero-shot to few-shot, chain-of-thought reasoning for complex problems, structured output enforcement, role prompting for specialized behavior, and systematic optimization strategies.

The Fundamentals

The Anatomy of a Prompt

System prompt: Sets behavior, constraints, and persona.

User message: The specific task or question.

Context: Background information needed to complete the task.

Instructions: How to approach the task.

Output format: Expected structure for the response.

[SYSTEM]
You are a senior software engineer reviewing code for security vulnerabilities.
Be thorough but concise. Focus on SQL injection, XSS, and authentication issues.

[USER]
Review the following Python function for security issues:

def get_user(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}"
    return db.execute(query)

Provide your findings as:
1. Vulnerability name
2. Severity (Critical/High/Medium/Low)
3. Explanation
4. Recommended fix

Zero-Shot Prompting

The model completes the task with no examples.

Classify the sentiment of this product review:

"This phone has amazing battery life and the camera is incredible. 
Best purchase I've made this year."

Sentiment:

When to use:

  • Simple, well-defined tasks
  • Model has strong training on the task
  • Speed is important

Limitations:

  • May misunderstand format expectations
  • Less consistent than few-shot
  • Struggles with novel complex tasks

Few-Shot Prompting

Provide examples of desired input-output pairs.

Classify the sentiment of these product reviews:

Review: "Terrible quality, broke after one day. Waste of money."
Sentiment: Negative

Review: "Decent product, not great but worth the price."
Sentiment: Neutral

Review: "Absolutely love it! Exceeded all expectations."
Sentiment: Positive

Review: "Shipping was fast but the item doesn't match description."
Sentiment:

Guidelines:

  • Include 2-5 examples for most tasks
  • Show variety in inputs
  • Format examples exactly as desired output
  • Place examples immediately before the target task

Example selection:

  • Choose diverse, representative examples
  • Include edge cases
  • Ensure examples are correct (models amplify errors)

Abstract visualization of a single, raw prompt input passing through structured example filters to focus the output

Chain-of-Thought Prompting

Basic Pattern

Encourage step-by-step reasoning.

Step-by-step logic pathways showing a sequence of connected nodes representing chain-of-thought reasoning

Solve this math problem step by step:

A store has 50 apples. They sell 15 in the morning and get a delivery 
of 30 more in the afternoon. How many apples do they have at the end 
of the day?

Let's solve this step by step:
1. Start with initial apples: 50
2. Subtract morning sales: 50 - 15 = 35
3. Add afternoon delivery: 35 + 30 = 65
4. Final count: 65 apples

Answer: 65 apples

Self-Consistency

Generate multiple reasoning paths and take the most common answer.

answers = []
for _ in range(5):
    response = model.generate(
        prompt,
        temperature=0.7  # Add randomness
    )
    answers.append(parse_answer(response))

# Take majority vote
final_answer = most_common(answers)

Tree of Thoughts

Trace multiple reasoning branches.

Problem: Find the shortest path from A to D in this graph.

Approach 1: Via B
- A to B: 5 units
- B to D: 8 units
- Total: 13 units

Approach 2: Via C
- A to C: 3 units
- C to D: 7 units
- Total: 10 units

Comparison: Approach 2 (10 units) is shorter than Approach 1 (13 units).

Best path: A → C → D (10 units)

Structured Output Patterns

JSON Mode

Enforce JSON output (supported by GPT-4, Claude 3).

Loose unstructured data particles organizing themselves into structured geometric cubes representing structured output formatting

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": """Extract the following information from this text:

"John Smith is a software engineer at Google with 5 years of experience.
He lives in San Francisco and specializes in machine learning."

Return as JSON with fields:
- name: string
- role: string
- company: string
- years_experience: number
- location: string
- specializations: array of strings"""
    }],
    response_format={"type": "json_object"}  # Force JSON
)

# Parse result
import json
data = json.loads(response.choices[0].message.content)

Schema Enforcement

Provide explicit type information.

Extract information from the invoice and return valid JSON:

Required format:
{
  "invoice_number": string,  // Format: INV-YYYY-NNNN
  "date": string,            // ISO 8601 date
  "total": number,           // Decimal, 2 places
  "items": [
    {
      "description": string,
      "quantity": integer,
      "unit_price": number
    }
  ],
  "vendor": {
    "name": string,
    "tax_id": string  // Optional
  }
}

Invoice text:
[invoice content here]

Function Calling / Tool Use

Define output as function parameters.

tools = [
    {
        "type": "function",
        "function": {
            "name": "extract_meeting_info",
            "description": "Extract meeting details from text",
            "parameters": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "date": {"type": "string", "format": "date"},
                    "time": {"type": "string", "format": "time"},
                    "attendees": {
                        "type": "array",
                        "items": {"type": "string"}
                    },
                    "agenda_items": {
                        "type": "array",
                        "items": {"type": "string"}
                    }
                },
                "required": ["title", "date", "time"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": "Meeting: Project kickoff tomorrow at 2pm with Alice, Bob, and Carol. "
                   "We'll discuss timeline and resource allocation."
    }],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "extract_meeting_info"}}
)

# Extract structured data
args = json.loads(
    response.choices[0].message.tool_calls[0].function.arguments
)

Role Prompting

Persona Assignment

Set expertise and perspective.

A single light beam splitting through a triangular prism into multiple specialized color bands representing different agent personas

You are a senior DevOps engineer with 10 years of experience managing 
AWS infrastructure. You are reviewing a Terraform configuration for 
security and best practices. Be thorough and mention specific AWS 
services and features.

Multi-Persona Debate

Have the model argue from multiple perspectives.

Analyze this architecture decision from three perspectives:

1. As a security engineer: Focus on data protection and access control
2. As a performance engineer: Focus on scalability and latency
3. As a cost engineer: Focus on resource efficiency and pricing

For each perspective, provide:
- Main concerns
- Potential issues
- Recommendations

Architecture: [description here]

Advanced Techniques

ReAct (Reasoning + Acting)

Combine reasoning with tool use.

You can use these tools:
- search(query): Search documentation
- calculate(expression): Perform calculations
- get_weather(location): Get weather data

Question: What is the population of Tokyo divided by the population of 
Osaka? Consider only the city proper, not metropolitan area.

Thought: I need to find the populations of both cities.
Action: search("population of Tokyo city proper 2026")
Observation: Tokyo (city proper) has a population of 14.09 million

Thought: Now I need Osaka's population.
Action: search("population of Osaka city proper 2026")
Observation: Osaka (city proper) has a population of 2.75 million

Thought: Now I can calculate the ratio.
Action: calculate("14.09 / 2.75")
Observation: 5.123636...

Answer: The population of Tokyo is approximately 5.12 times that of Osaka.

Prompt Chaining

Break complex tasks into sequential prompts.

def analyze_document(document):
    # Step 1: Summarize
    summary = llm.generate(f"Summarize this document in 3 sentences:\n{document}")
    
    # Step 2: Extract key points
    key_points = llm.generate(
        f"Based on this summary, extract 5 key points:\n{summary}",
        output_format="bullet_list"
    )
    
    # Step 3: Generate action items
    actions = llm.generate(
        f"Based on these key points, suggest action items:\n{key_points}"
    )
    
    return {
        'summary': summary,
        'key_points': key_points,
        'actions': actions
    }

Retrieval-Augmented Generation

Ground prompts in external knowledge. For a detailed guide on building the infrastructure for semantic search and document ingestion, read our article on building a RAG system for your business.

def answer_with_context(question, knowledge_base):
    # Retrieve relevant documents
    relevant_docs = knowledge_base.search(question, k=3)
    context = "\n\n".join([doc.text for doc in relevant_docs])
    
    prompt = f"""Answer the question based on the provided context.

Context:
{context}

Question: {question}

If the context does not contain the answer, say "I don't have 
sufficient information to answer this question."

Answer:"""
    
    return llm.generate(prompt)

Prompt Optimization

A/B Testing

Compare prompt variations systematically. When running large-scale tests, you should monitor token costs and API latency. For practical strategies on reducing model fees during development and production, see our guide to optimizing LLM token costs.

A split visual pathway comparing Path A and Path B with Path B ending in a glowing green optimized point

prompts = [
    "Classify the sentiment:",
    "Is this review positive, negative, or neutral?",
    "Rate the sentiment from -1 (very negative) to 1 (very positive):"
]

results = {}
for prompt in prompts:
    correct = 0
    for example in test_set:
        response = llm.generate(f"{prompt}\n\n{example.text}")
        if parse_sentiment(response) == example.label:
            correct += 1
    
    accuracy = correct / len(test_set)
    results[prompt] = accuracy

best_prompt = max(results, key=results.get)

Prompt Versioning

Track prompt changes and performance.

# prompts.yaml
classification_v1:
  template: "Classify: {text}"
  accuracy: 0.78
  
classification_v2:
  template: |
    Classify the following text as positive, negative, or neutral.
    
    Text: {text}
    Classification:
  accuracy: 0.85
  improved: true

Temperature and Sampling

Temperature:

  • 0.0: Deterministic, best for structured output
  • 0.7: Balanced creativity
  • 1.0: Maximum creativity

Top-p: Alternative to temperature, consider tokens until cumulative probability exceeds threshold.

# Consistent classification
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    temperature=0.0  # Deterministic
)

# Creative generation
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    temperature=0.9  # Creative
)

Domain-Specific Patterns

Code Generation

Write a Python function that [description].

Requirements:
- Include type hints
- Add docstring with examples
- Handle edge cases
- Follow PEP 8 style
- Include unit tests as comments

Function signature: def function_name(param: type) -> return_type:

Data Extraction

Extract structured data from this unstructured text.

Text: [unstructured text]

Extract:
1. Person names
2. Organization names
3. Dates
4. Monetary amounts
5. Email addresses

Return as JSON array with fields: type, value, context (surrounding text)

Summarization

Summarize the following article for a technical audience.

Constraints:
- Maximum 200 words
- Include 3 key takeaways as bullet points
- Maintain original tone
- Preserve technical accuracy

Article:
[article text]

Common Pitfalls

Pitfall 1: Vague Instructions

“Analyze this text” → “Extract named entities and their relationships”

Pitfall 2: No Output Format

No format specified → Inconsistent structure

Pitfall 3: Leading Questions

“Don’t you think X is bad?” → Biased responses

Pitfall 4: Too Much Context

Unrelated information dilutes focus

Pitfall 5: Assuming Knowledge

Acronyms without definitions

Pitfall 6: Not Testing Edge Cases

Works on common cases, fails on unusual inputs

Pitfall 7: Ignoring Security Vulnerabilities

Passing untrusted user input directly into prompt templates exposes your application to prompt injection attacks, allowing users to override system constraints. Check our guide on preventing prompt injection attacks to learn how to secure your pipelines.

Conclusion

Prompt engineering is a skill developed through practice and measurement. Start with clear, specific instructions. Provide examples for complex tasks. Use chain-of-thought for reasoning. Enforce structure through formatting instructions or function calling.

Test prompts systematically. A/B test variations. Version your prompts. Measure accuracy against labeled data.

The goal is not clever tricks but clear communication of intent to the model. The model wants to help; your job is to explain what you need.

Invest in prompt engineering. Good prompts make the difference between unreliable demos and production-ready AI features.


Further Reading