How to Build Secure AI Applications: An SDLC Checklist.
This checklist walks through five phases of the software development lifecycle (from design and threat modelling through to runtime monitoring) with concrete checks at each stage for teams building with LLMs, RAG pipelines, MCP-connected tooling, and AI agents. Use it as a build guide, a pre-deployment audit, or a brief for your penetration testing engagement.
When I talk to SaaS teams about securing their AI features, the conversation almost always starts with the model. Can it be jailbroken?
Can it leak data?
Those are real questions. But they are not the whole picture.
The bigger risk is the application around the model, what it can access, retrieve, trigger, and expose.
Your AI features are not chat boxes. They process customer data, retrieve internal documents, call APIs, update records, and make decisions that affect real users.
So the real question isn't whether the model is well-behaved. It's: what can the model reach, what can it reveal, what can it trigger, and what happens when an attacker influences any part of that chain?
I built this checklist to answer that. It organizes testing by when in the build process each check belongs, and ties every item back to the five layers of a real AI-powered product:
- LLM API: the prompt in, the response out
- RAG: retrieval from your internal data sources
- MCP: tools the model can invoke
- Agents: multi-step, tool-using, goal-seeking workflows
- AI-assisted dev tooling: code your team ships with Copilot, Cursor, Claude Code, etc.
Whatever layer you're testing, you're really testing for the same three failures: untrusted input reaches a privileged context, output is trusted too early, and authorization is checked in the wrong place. Keep those in mind as you work through the phases below.
Phase 1: Design & Threat Modelling
Before a line of code ships. The goal is to map trust boundaries; every arrow between layers crosses one.
Diagram the full stack end-to-end. Identify every point where data crosses a trust boundary (user → prompt, document → context, model → tool, agent → agen
Phase 2, Development
While the feature is being built. Validate that secure handling is built in, not bolted on.
Phase 3, Pre-Deployment Testing
Adversarial testing before go-live. This is where you actually try to break it.
Phase 4, Deployment
Validate that controls assumed in design are actually live in production config.
Phase 5, Runtime & Monitoring
After launch. AI features and their data sources change continuously.
The Common Thread
If you take nothing else from this, test for these three things everywhere:
- Did untrusted input reach a privileged context? A user prompt reaching the system layer, a planted document reaching the model context, a user-level request reaching admin-scoped tool credentials.
- Was output trusted too early? Model output driving actions or rendering without validation; generated code executed without review; retrieved content used without authorization checks.
- Was authorization checked in the wrong place? UI-layer controls that miss the retrieval layer; session-level checks that don't apply at tool invocation; user-level assumptions that ignore service-credential scope.
Underneath, these are the same vulnerability classes you already know: broken access control, unsafe output handling, parameter injection, over-trusted third-party components. The stack is more complex and more connected than what we're used to, but the risks aren't harder to understand. They're just harder to test and reproduce. That's the part I'd push you not to skip.
Test the full stack, not just the model.
.avif)


