AI Agents Interview Prep

What is an AI agent?medium

Type: conceptual
Topic: ai-agent
Frequency: common
Tags: ai, agent

Answer

An AI agent uses a model to decide actions toward a goal.

Explanation

Agents often combine LLM reasoning with tools, memory, planning, and feedback loops to complete tasks beyond a single response.

Follow-upHow is an agent different from a chatbot?

What is tool calling?medium

Type: conceptual
Topic: tool-calling
Frequency: common
Tags: tool, calling

Answer

Tool calling lets a model request external functions or APIs.

Explanation

Tools can retrieve data, run calculations, search, write files, or trigger workflows. The system decides which tool calls are allowed.

Follow-upHow do you validate tool arguments?

What is agent memory?hard

Type: conceptual
Topic: agent-memory
Frequency: common
Tags: agent, memory

Answer

Memory stores useful state or history for future decisions.

Explanation

Short-term memory may live in context, while long-term memory may use databases or vector stores. Memory must be curated to avoid noise.

Follow-upWhat privacy risks does memory create?

What are guardrails?medium

Type: conceptual
Topic: guardrails
Frequency: common
Tags: guardrails

Answer

Guardrails constrain behavior to keep agents safe and reliable.

Explanation

They include tool permissions, input validation, output checks, human approval, policy filters, and execution limits.

Follow-upWhen should an agent ask for human approval?

How do you evaluate an AI agent?medium

Type: conceptual
Topic: evaluate-ai-agent
Frequency: common
Tags: evaluate, ai, agent

Answer

Measure task success, safety, cost, latency, tool accuracy, and recovery from errors.

Explanation

Agent evaluation often needs multi-step test cases because failures can happen in planning, tool selection, execution, or final response.

Follow-upWhy are single-turn evals not enough for agents?

What is a planning loop in an AI agent?medium

Type: conceptual
Topic: planning-loop
Frequency: common
Tags: planning, agent-loop, tools

Answer

It is a cycle where the agent decides a next step, acts, observes the result, and updates its plan.

Explanation

Planning loops let agents solve multi-step tasks, but they need limits, state tracking, tool validation, and stopping conditions to avoid wasted work or unsafe actions.

Follow-upWhat signals should stop an agent loop?

When should an AI agent ask for human approval?hard

Type: scenario
Topic: human-in-the-loop
Frequency: common
Tags: human-approval, safety, workflow

Answer

It should ask before high-impact, irreversible, expensive, or low-confidence actions.

Explanation

Human approval is useful for payments, deleting data, sending external messages, changing permissions, production deployments, or actions with legal or safety implications.

Follow-upHow do you design approval without making the agent unusable?

How do you design tool permissions for an AI agent?hard

Type: scenario
Topic: tool-permissions
Frequency: common
Tags: tools, permissions, security

Answer

Give the agent the minimum tools and scopes needed for the task.

Explanation

Use allowlists, scoped credentials, argument validation, audit logs, dry-run modes, and separate read-only from write-capable tools to reduce blast radius.

Follow-upWhy is a read-only tool safer than a write-capable tool?

Explain function calling / tool use. How does it differ from RAG?medium

Type: conceptual
Topic: function-calling-tool-use-how-does-it-differ-from-rag
Frequency: common
Tags: ai-agents, explain, function, calling, tool, use

Answer

Tool use: model dynamically decides to call an external function mid-reasoning, gets the result, continues.

Explanation

Tool use: model dynamically decides to call an external function mid-reasoning, gets the result, continues. RAG: retrieval is a preprocessing step — fetch relevant docs, inject into context, then generate. Tool use is dynamic and chainable; RAG is a single retrieval step. In a fund document processing system, agents use tool use to call financial data APIs and Step Functions for orchestration.

Follow-upWhen would you choose one approach over the other?

What is structured output / JSON mode? How do you enforce schema compliance?medium

Type: conceptual
Topic: is-structured-output-json-mode-how-do-you-enforce-schema-c
Frequency: common
Tags: ai-agents, what, structured, output, json, mode

Answer

Structured output forces the model to return valid JSON matching a schema.

Explanation

Structured output forces the model to return valid JSON matching a schema. Approaches: (1) Prompt instruction + few-shot. (2) JSON mode in API (guarantees valid JSON, not schema). (3) Tool use / function calling — model must produce arguments matching the tool's JSON schema. (4) Pydantic parsing with retry loop: if parse fails, send error back with correction instruction. Layer all three in production.

Follow-upCan you give a production example?

How do you measure latency, token cost, and throughput in a multi-agent pipeline?hard

Type: conceptual
Topic: do-you-measure-latency-token-cost-and-throughput-in-a-mult
Frequency: common
Tags: ai-agents, how, you, measure, latency, token

Answer

Instrument with OpenTelemetry (as in an agent platform): trace spans per agent call, LLM invocation, tool call.

Explanation

Instrument with OpenTelemetry (as in an agent platform): trace spans per agent call, LLM invocation, tool call. Metrics: end-to-end latency, per-step latency, token count (input/output), cost per run. CloudWatch for Step Functions execution time. Set token budget per agent, log overruns. Use batching and Bedrock prompt caching to reduce cost on repeated document patterns.

Follow-upCan you give a production example?

What is OpenTelemetry in an agent platform?medium

Type: conceptual
Topic: what-is-opentelemetry-in-an-agent-platform
Frequency: common
Tags: ai-agents, what, opentelemetry, and, how, are

Answer

OpenTelemetry (OTel) is an open-source observability framework for distributed tracing, metrics, and logs.

Explanation

OpenTelemetry (OTel) is an open-source observability framework for distributed tracing, metrics, and logs. In an agent platform: instrument each agent run as a trace with spans for LLM calls, tool executions, and memory operations. Capture attributes: model, token count, latency, tool name, success/failure. Export to a backend (Jaeger, Grafana Tempo) for full visibility into agent execution.

Follow-upCan you give a production example?

How do you build an eval framework for a multi-agent system?hard

Type: scenario
Topic: do-you-build-an-eval-framework-for-a-multi-agent-system
Frequency: common
Tags: ai-agents, how, you, build, eval, framework

Answer

Unit tests per agent with mocked tool responses and deterministic LLM outputs.

Explanation

Unit tests per agent with mocked tool responses and deterministic LLM outputs. Integration tests: full pipeline on golden test cases, compare final output to expected. Per-agent metrics: task completion rate, tool call accuracy, hallucination rate. System-level: end-to-end latency, cost per run, human escalation rate. Use OTel traces to replay failed runs for debugging. Gate deploys on regression test pass rates.

Follow-upCan you give a production example?

What is the ReAct pattern? How does it work in Strands?medium

Type: conceptual
Topic: is-the-react-pattern-how-does-it-work-in-strands
Frequency: common
Tags: ai-agents, what, the, react, pattern, how

Answer

ReAct (Reasoning + Acting) interleaves thought and action: model reasons (Thought), calls a tool (Action), observes the result (Observation), repeats until done.

Explanation

ReAct (Reasoning + Acting) interleaves thought and action: model reasons (Thought), calls a tool (Action), observes the result (Observation), repeats until done. In Strands: the agent loop manages this cycle — model generates a response, if it includes a tool call, Strands executes it and feeds the result back, until the model outputs a final response with no tool call.

Follow-upCan you give a production example?

How do you design multi-agent orchestration for document processing?hard

Type: scenario
Topic: how-do-you-design-multi-agent-orchestration-for-document-p
Frequency: common
Tags: ai-agents, how, did, you, design, the

Answer

Orchestrator agent delegates to: parsing agent (extract raw text from PDF), classification agent (identify document type/section), metadata enrichment agent (extract structured financial fields), and validation agent (sc

Explanation

Orchestrator agent delegates to: parsing agent (extract raw text from PDF), classification agent (identify document type/section), metadata enrichment agent (extract structured financial fields), and validation agent (schema compliance). Step Functions orchestrates the state machine — each agent is a Lambda function. EventBridge triggers pipeline on S3 uploads. Results written to DynamoDB.

Follow-upWhat tradeoffs did you consider in that implementation?

What is HITL in agentic workflows? How did you implement it?medium

Type: scenario
Topic: is-hitl-in-agentic-workflows-how-did-you-implement-it
Frequency: common
Tags: ai-agents, what, hitl, agentic, workflows, how

Answer

HITL pauses the agent at a decision point for human review before proceeding — used for high-stakes actions.

Explanation

HITL pauses the agent at a decision point for human review before proceeding — used for high-stakes actions. In an enterprise AI platform: Step Functions Wait for Callback pattern — agent sends a task token to a review queue (SQS/SNS), human approves via UI, UI calls SendTaskSuccess/Failure with token, agent resumes. In a resume screening system: low-confidence candidates route to HITL before final scoring.

Follow-upWhat tradeoffs did you consider in that implementation?

What agent memory types should a production agent support?medium

Type: conceptual
Topic: what-agent-memory-types-should-a-production-agent-support
Frequency: common
Tags: ai-agents, explain, agent, memory, types, which

Answer

In-context: current session scratch pad. Episodic: records of past interactions.

Explanation

In-context: current session scratch pad. Episodic: records of past interactions. Semantic: long-term factual knowledge. Procedural: learned skills/tools. an agent platform focuses on explicit memory management: in-context (current run state), episodic (stored in DB, retrieved on demand), and tool memory (registered tools with descriptions). Unlike LangChain's implicit memory — everything is explicit and inspectable.

Follow-upCan you give a production example?

Difference between a tool call and a subagent call?medium

Type: conceptual
Topic: between-a-tool-call-and-a-subagent-call
Frequency: common
Tags: ai-agents, difference, between, tool, call, and

Answer

Tool call: agent invokes a deterministic function (API, DB query, calculator) — takes inputs, returns outputs, no reasoning.

Explanation

Tool call: agent invokes a deterministic function (API, DB query, calculator) — takes inputs, returns outputs, no reasoning. Subagent call: agent delegates to another agent with its own LLM, system prompt, memory, and tools. Subagent can reason and make multi-step decisions. Use tools for simple deterministic actions; subagents for complex stateful subtasks that require reasoning.

Follow-upWhen would you choose one approach over the other?

How do you handle agent failures and retries in Step Functions?medium

Type: scenario
Topic: do-you-handle-agent-failures-and-retries-in-step-functions
Frequency: common
Tags: ai-agents, how, you, handle, agent, failures

Answer

Step Functions has built-in retry/catch: configure attempts, backoff rate, and interval per state.

Explanation

Step Functions has built-in retry/catch: configure attempts, backoff rate, and interval per state. Catch specific exceptions (LLM timeout, schema failure), route to error handler. Retry transient failures (API rate limits) with exponential backoff. For logical failures: route to HITL or fallback agent. Dead-letter queue for unrecoverable failures. All transitions logged in CloudWatch.

Follow-upCan you give a production example?

Difference between event-driven and ad-hoc agent execution?hard

Type: conceptual
Topic: difference-between-event-driven-and-ad-hoc-agent-execution
Frequency: common
Tags: ai-agents, difference, between, event, driven, and

Answer

Event-driven: triggered by an external event (S3 upload → EventBridge → Step Functions) — fully automated.

Explanation

Event-driven: triggered by an external event (S3 upload → EventBridge → Step Functions) — fully automated. Used in a fund document processing system (new filing → auto-process). Ad-hoc: triggered on demand by a user or API call (user submits a contract in a document extraction pipeline). Same agent logic, different trigger mechanisms routed through API Gateway or EventBridge rules.

Follow-upWhen would you choose one approach over the other?

How do you prevent infinite loops or runaway tool calls in an autonomous agent?medium

Type: scenario
Topic: do-you-prevent-infinite-loops-or-runaway-tool-calls-in-an
Frequency: common
Tags: ai-agents, how, you, prevent, infinite, loops

Answer

(1) Max iterations / max tool calls limit per run — hard stop.

Explanation

(1) Max iterations / max tool calls limit per run — hard stop. (2) Step budget: track tokens + calls remaining, instruct model to wrap up when low. (3) Loop detection: if the same tool is called with the same args twice, break. (4) Step Functions execution timeout at state machine level. (5) Tool call validator: reject calls not matching expected schema. an agent platform: RunConfig exposes max_steps and max_tokens as explicit constraints.

Follow-upCan you give a production example?

What is Pydantic-based schema enforcement in an agentic pipeline?medium

Type: conceptual
Topic: is-pydantic-based-schema-enforcement-in-an-agentic-pipelin
Frequency: common
Tags: ai-agents, what, pydantic, based, schema, enforcement

Answer

Define expected output structure as a Pydantic model. After each LLM call, parse the response — Pydantic validates types, required fields, and constraints automatically.

Explanation

Define expected output structure as a Pydantic model. After each LLM call, parse the response — Pydantic validates types, required fields, and constraints automatically. On ValidationError: catch it, format it clearly, send back to the model with a correction instruction (self-healing loop). In a document extraction pipeline: caught 100% of structural errors before they hit downstream systems, eliminating silent data corruption.

Follow-upCan you give a production example?

How does LangGraph's state graph differ from LangChain's sequential chain?hard

Type: conceptual
Topic: does-langgraph-s-state-graph-differ-from-langchain-s-seque
Frequency: common
Tags: ai-agents, how, does, langgraph, state, graph

Answer

LangChain chains are linear: A → B → C. Can't loop or branch.

Explanation

LangChain chains are linear: A → B → C. Can't loop or branch. LangGraph models workflows as a directed graph with explicit state: nodes are functions/agents, edges define transitions (conditional or fixed). Supports cycles (loop back to previous steps), branching (route based on state), and state persistence for long-running tasks. Used in an enterprise AI platform for complex multi-step workflows with HITL branching.

Follow-upWhen would you choose one approach over the other?

How do you manage shared state between agents in a multi-agent workflow?hard

Type: conceptual
Topic: do-you-manage-shared-state-between-agents-in-a-multi-agent
Frequency: common
Tags: ai-agents, how, you, manage, shared, state

Answer

(1) Pass state explicitly — orchestrator collects outputs and injects relevant parts into the next agent's prompt.

Explanation

(1) Pass state explicitly — orchestrator collects outputs and injects relevant parts into the next agent's prompt. (2) Shared store — agents read/write to a central state object (LangGraph StateGraph, DynamoDB, or in-memory dict). (3) Message bus — agents publish events, others subscribe. In a fund document processing system: Step Functions passes execution state between Lambda agents; DynamoDB stores intermediate results.

Follow-upCan you give a production example?

What is the role of EventBridge in an agentic platform?medium

Type: conceptual
Topic: what-is-the-role-of-eventbridge-in-an-agentic-platform
Frequency: common
Tags: ai-agents, what, the, role, eventbridge, your

Answer

EventBridge is a serverless event bus. S3 file uploads emit events → EventBridge rule matches on object type → triggers Step Functions or Lambda.

Explanation

EventBridge is a serverless event bus. S3 file uploads emit events → EventBridge rule matches on object type → triggers Step Functions or Lambda. Decouples producers (data sources) from consumers (agents). Supports event filtering, scheduling (nightly batch jobs), and cross-account routing. Adding a new agent doesn't require changing the data source — just add a new EventBridge rule.

Follow-upCan you give a production example?

How do you do audit logging for LLM agent decisions in a regulated environment?hard

Type: conceptual
Topic: do-you-do-audit-logging-for-llm-agent-decisions-in-a-regul
Frequency: common
Tags: ai-agents, how, you, audit, logging, for

Answer

Every LLM call logs: input (system prompt + messages), output, model ID, timestamp, token count, latency, run ID.

Explanation

Every LLM call logs: input (system prompt + messages), output, model ID, timestamp, token count, latency, run ID. Stored immutably in S3 with object lock. Structured as JSON for queryability via Athena. Agent-level: log each tool call (name, args, result) and reasoning steps. Correlation ID traces a request across all agents. Also log which human approved any HITL decision.

Follow-upCan you give a production example?

What is MCP and how can agents use it for tool integrations?hard

Type: conceptual
Topic: what-is-mcp-and-how-can-agents-use-it-for-tool-integration
Frequency: common
Tags: ai-agents, what, mcp, model, context, protocol

Answer

MCP is an open protocol by Anthropic standardizing how LLM applications connect to external tools and data sources.

Explanation

MCP is an open protocol by Anthropic standardizing how LLM applications connect to external tools and data sources. Defines a client-server model: the LLM client discovers and calls tools exposed by an MCP server with consistent schemas. an agent platform uses MCP for external messaging and tool integrations — agents connect to MCP servers (Telegram, Slack, APIs) without custom integration code per tool.

Follow-upCan you give a production example?

How do you configure and score candidate evaluations?medium

Type: scenario
Topic: how-do-you-configure-and-score-candidate-evaluations
Frequency: common
Tags: ai-agents, how, you, configure, and, score

Answer

Job templates define required skills, experience levels, and custom screening questions with configurable weights (e.g., Python: 30%, LLM experience: 40%, communication: 30%).

Explanation

Job templates define required skills, experience levels, and custom screening questions with configurable weights (e.g., Python: 30%, LLM experience: 40%, communication: 30%). The agentic pipeline extracts structured candidate data, scores each dimension using an LLM evaluator against the rubric, computes weighted total. Configurable thresholds route candidates to auto-pass, HITL review, or auto-reject. Integrates with Workday for status updates.

Follow-upCan you give a production example?

What is the difference between system prompt and runtime instruction?medium

Type: conceptual
Topic: is-the-difference-between-system-prompt-vs-runtime-instruc
Frequency: common
Tags: ai-agents, what, the, difference, between, system

Answer

System prompt: static, set at agent initialization — defines persona, capabilities, constraints, output format.

Explanation

System prompt: static, set at agent initialization — defines persona, capabilities, constraints, output format. Doesn't change per run. Runtime instruction: dynamic, passed per invocation — the specific task for this run. Separating them allows: (1) Reuse the same agent for multiple tasks. (2) Cache the system prompt token cost. (3) Cleaner API — callers only pass the task, not re-specify the agent's full context.

Follow-upWhen would you choose one approach over the other?

How do you handle token budget management across a long multi-agent conversation?hard

Type: scenario
Topic: do-you-handle-token-budget-management-across-a-long-multi
Frequency: common
Tags: ai-agents, how, you, handle, token, budget

Answer

Track token usage cumulatively. When approaching limit: (1) Summarize older turns, replace with summary (rolling context compression).

Explanation

Track token usage cumulatively. When approaching limit: (1) Summarize older turns, replace with summary (rolling context compression). (2) Evict least-relevant messages by importance scoring. (3) Move completed context to external memory (DynamoDB), fetch back when needed. (4) Use Bedrock prompt caching to avoid re-processing stable system prompt on every turn. Hard limits per agent via RunConfig.

Follow-upCan you give a production example?

What is the difference between a planner, executor, and critic agent?medium

Type: conceptual
Topic: is-the-difference-between-a-planner-executor-and-critic-ag
Frequency: common
Tags: ai-agents, what, the, difference, between, planner

Answer

Planner: decomposes a high-level goal into subtasks, creates an execution plan.

Explanation

Planner: decomposes a high-level goal into subtasks, creates an execution plan. Doesn't execute. Executor: carries out individual subtasks — calls tools, writes output. No high-level planning. Critic: reviews executor output against the goal — identifies errors or missing steps, feeds back to planner or executor for correction. This pattern is used in AutoGen and similar frameworks for higher-quality autonomous task completion.

Follow-upWhen would you choose one approach over the other?

How do you implement explicit memory management differently from LangChain?medium

Type: scenario
Topic: how-do-you-implement-explicit-memory-management-differentl
Frequency: common
Tags: ai-agents, nxagent, how, you, implement, explicit

Answer

LangChain's memory is implicit — ConversationBufferMemory automatically appends everything.

Explanation

LangChain's memory is implicit — ConversationBufferMemory automatically appends everything. an agent platform makes memory explicit: you define what gets stored (agent_result.memory_write), what gets retrieved (memory.fetch(query)), and when memory is cleared. Memory is typed (episodic vs semantic), stored in a pluggable backend (in-memory for tests, DynamoDB for production), and retrieved via semantic search.

Follow-upWhen would you choose one approach over the other?

How do you test a multi-agent pipeline end-to-end?hard

Type: conceptual
Topic: do-you-test-a-multi-agent-pipeline-end-to-end
Frequency: common
Tags: ai-agents, how, you, test, multi, agent

Answer

(1) Unit tests: each agent in isolation with mocked tool responses and deterministic LLM outputs.

Explanation

(1) Unit tests: each agent in isolation with mocked tool responses and deterministic LLM outputs. (2) Integration tests: full pipeline with real LLM calls on golden dataset. (3) Contract tests: verify each agent's input/output schema stays stable (Pydantic). (4) Load tests: N parallel executions, verify Step Functions handles concurrency. (5) Chaos tests: inject failures (tool timeout, LLM error), verify retry/fallback logic. Gate deploys on all passing.

Follow-upCan you give a production example?

AI Agents Interview Questions

What is an AI agent?medium

What is tool calling?medium

What is agent memory?hard

What are guardrails?medium

How do you evaluate an AI agent?medium

What is a planning loop in an AI agent?medium

When should an AI agent ask for human approval?hard

How do you design tool permissions for an AI agent?hard

Explain function calling / tool use. How does it differ from RAG?medium

What is structured output / JSON mode? How do you enforce schema compliance?medium

How do you measure latency, token cost, and throughput in a multi-agent pipeline?hard

What is OpenTelemetry in an agent platform?medium

How do you build an eval framework for a multi-agent system?hard

What is the ReAct pattern? How does it work in Strands?medium

How do you design multi-agent orchestration for document processing?hard

What is HITL in agentic workflows? How did you implement it?medium

What agent memory types should a production agent support?medium

Difference between a tool call and a subagent call?medium

How do you handle agent failures and retries in Step Functions?medium

Difference between event-driven and ad-hoc agent execution?hard

How do you prevent infinite loops or runaway tool calls in an autonomous agent?medium

What is Pydantic-based schema enforcement in an agentic pipeline?medium

How does LangGraph's state graph differ from LangChain's sequential chain?hard

How do you manage shared state between agents in a multi-agent workflow?hard

What is the role of EventBridge in an agentic platform?medium

How do you do audit logging for LLM agent decisions in a regulated environment?hard

What is MCP and how can agents use it for tool integrations?hard

How do you configure and score candidate evaluations?medium

What is the difference between system prompt and runtime instruction?medium

How do you handle token budget management across a long multi-agent conversation?hard

What is the difference between a planner, executor, and critic agent?medium

How do you implement explicit memory management differently from LangChain?medium

How do you test a multi-agent pipeline end-to-end?hard