Intervyou — 2023–2025
Case study
The work

How do you design AI behaviour for a product when no UX patterns exist yet?

Intervyou is an AI-powered, candidate-centric hiring platform built to fix a process that has a 50% failure rate. I led product design end to end — from problem framing through to shipping the MVP — with a particular focus on how AI was integrated into the experience.

Role: Founder & Lead Product Designer
Timeframe: 2023–2025
Industry: HR Tech · Recruitment
Status: Live, closed pilot
01 Research
02 Workshops
03 Prompt mapping
04 Prototyping
05 User testing
06 Pilot validation
The tension

The only major business process that runs at a 50% failure rate — and is accepted as normal.

Hiring is the only major business process that runs at a 50% failure rate and gets accepted as normal. Companies globally lose around $400 billion a year to bad hires. The interview format itself — high-pressure, conversational, 1:1 — is well documented in occupational psychology research as a poor predictor of on-the-job performance. It selects for one thing: the ability to interview.

The candidates who lose are the ones whose strengths don’t show up in a 45-minute conversation under pressure. Introverts. Anxious candidates. Neurodivergent candidates. Anyone who needs more than one channel to demonstrate what they can actually do. That’s roughly half the population, screened out before the work has even started.

The product brief was clear: build a hiring platform that uses AI to fix this. The design brief was much harder. AI wasn’t a single feature. It had to be woven through role creation, candidate scoring, interview question generation, async video evaluation, and the AI-led interview experience itself. And in 2023, when we started, established UX patterns for AI at that depth simply did not exist.

When the technology has no precedents, what is your design process for deciding how it should behave?
The framing

Three decisions shaped everything downstream.

No process diagram. The interesting work was not the sequence — it was the three moves below. Each one set up the constraints I’d use to make every later decision.

Move 01
Invert the brief.

Most recruitment platforms optimise for the recruiter — filter, rank, eliminate. I re-framed Intervyou as candidate-centric. Better signal comes from giving more candidates more ways to show their work, not from filtering harder at the top of the funnel.

Move 02
Treat AI as behaviour, not feature.

A button labelled “Generate with AI” is a feature. What the AI says, what it refuses to say, how it explains itself, when it defers to a human, and how it surfaces uncertainty — that is behaviour. Behaviour has to be designed before pixels.

Move 03
Trust before capability.

It would have been easier to lead with what the AI could do. Instead, I led with how the AI would earn the user’s confidence over time. That meant explainable outputs, assistive language, and humans always in control of decisions.

Three design moments

Where the design work actually happened.

Moment 01
Inverting the brief — designing for the candidate, not the recruiter.
Moment 02
Designing AI behaviour without precedents.
Moment 03
Designing for trust — explainability as the load-bearing pattern.
01
Design moment

Inverting the brief — designing for the candidate.

Two cross-functional workshops with engineering and product mapped the candidate journey end to end and identified where signal was being lost. From there, I built the entire interview model around a multi-stage, multi-format process: written responses, async video, simulated scenarios, and role-specific scenario questions.

No single channel decides anything. A candidate who freezes on video can still demonstrate their thinking in writing. A candidate who writes poorly can show their reasoning verbally. Each stage is a separate piece of evidence, not a gate.
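
A minimal sketch of the "evidence, not a gate" idea described above. The field names, the 0.0 to 1.0 signal scale, and the example values are all hypothetical and are not taken from the shipped product. The point it illustrates is that every stage is kept and summarised, and a weak stage never removes a candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageEvidence:
    stage: str     # hypothetical stage name, e.g. "written" or "async_video"
    signal: float  # hypothetical 0.0-1.0 strength of evidence
    note: str      # the reasoning a reviewer would see next to the signal


def summarise(evidence: list[StageEvidence]) -> dict:
    """Order stages by signal strength without dropping any of them."""
    ordered = sorted(evidence, key=lambda e: e.signal, reverse=True)
    return {
        "strongest": ordered[0].stage if ordered else None,
        "weakest": ordered[-1].stage if ordered else None,
        "stages": [e.stage for e in ordered],  # every stage is still present
    }


# Illustrative candidate who freezes on video but reasons well in writing.
candidate = [
    StageEvidence("async_video", 0.35, "Froze on the first question"),
    StageEvidence("written", 0.85, "Clear, structured reasoning"),
    StageEvidence("scenario", 0.70, "Sound prioritisation"),
]
summary = summarise(candidate)
```

A gate-based design would instead return early at the first stage below a threshold. Here the low video signal simply appears alongside the stronger written one.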

The structure was validated through paper-prototype walk-throughs with candidates drawn from a beta tester pool before any high-fidelity work began.

The recruiter-side benefit is real and counterintuitive: more viable candidates make it to the decision stage, which means the recruiter is choosing between strong options rather than narrowing under uncertainty.

Hiring agent overlay: candidate analysis with stage-by-stage signals, links to full evaluations, and approved next-step actions.
Fig. 01 Hiring agent · recruiter-side overlay
Candidate-side application flow — async video step.
Fig. 02 Async video question
Candidate-side application flow — scenario step.
Fig. 03 Scenario-based question
02
Design moment

Designing AI behaviour without precedents.

Across the product, AI was doing six distinct jobs: generating roles, generating interview questions, scoring written responses, evaluating async video, surfacing candidate recommendations, and conducting AI interviews. Each one had its own failure modes and its own trust problem. There were no precedents I could lift from established design systems. We were building the precedent.

Rather than start with screens, I started with prompt maps. For every AI feature, I led a workshop with engineering and architecture to define explicitly what the AI was allowed to do, what it was not allowed to do, what its inputs were, what its outputs looked like, what tone it spoke in, and what it should do when uncertain. These weren’t ideation documents. They were specs that engineering built against.
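
One way a prompt map could be written down so that engineering can build against it is as a small typed record. This is a sketch under assumptions: the `PromptMap` class, its field names, and the example values are illustrative and do not reproduce the actual Intervyou specs. It only mirrors the six questions listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptMap:
    feature: str
    allowed: tuple[str, ...]        # what the AI may do
    not_allowed: tuple[str, ...]    # what the AI must not do
    inputs: tuple[str, ...]         # what the AI receives
    output_fields: tuple[str, ...]  # what every output must contain
    tone: str                       # how the AI speaks
    when_uncertain: str             # the agreed fallback behaviour

    def missing_fields(self, output: dict) -> list[str]:
        """Return the agreed output fields that a draft output lacks."""
        return [name for name in self.output_fields if name not in output]


# Hypothetical map for one of the six AI jobs.
question_generation = PromptMap(
    feature="interview_question_generation",
    allowed=("suggest questions tied to role criteria",),
    not_allowed=("make a hiring decision",),
    inputs=("role_description", "evaluation_criteria"),
    output_fields=("questions", "criteria_covered", "reasoning"),
    tone="assistive",
    when_uncertain="say so and defer to the recruiter",
)

draft = {"questions": ["..."], "reasoning": "..."}
missing = question_generation.missing_fields(draft)
```

Writing the map as data rather than prose means the same record can be reviewed in a workshop and checked in a test.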

Three rules emerged that became the foundation of the system:

  1. Assistive, never authoritative. AI surfaces, explains, and suggests. Humans always decide.
  2. Predictable. The same input produces the same shape of output every time. No surprises.
  3. Explainable by default. Every AI output ships with the reasoning, sources or criteria behind it. If we can’t explain it, we don’t surface it.
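
The three rules can be read as checks that run on an AI output before it reaches the interface. The sketch below assumes a hypothetical output shape (`suggestion`, `reasoning`, `criteria`) and a hypothetical list of authoritative words. Neither comes from the product, and a real check would need far more than keyword matching.

```python
import re

# Hypothetical values, chosen only to illustrate the three rules.
REQUIRED_KEYS = ("suggestion", "reasoning", "criteria")
AUTHORITATIVE_WORDS = {"hire", "reject"}


def check_output(output: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the output passes."""
    problems = []

    # Rule 2, predictable: every output has the same shape.
    for key in REQUIRED_KEYS:
        if key not in output:
            problems.append(f"missing field: {key}")

    # Rule 3, explainable by default: no reasoning, no surfacing.
    if not str(output.get("reasoning", "")).strip():
        problems.append("no reasoning provided")

    # Rule 1, assistive: the suggestion must not read as a decision.
    words = set(re.findall(r"[a-z']+", str(output.get("suggestion", "")).lower()))
    for word in sorted(words & AUTHORITATIVE_WORDS):
        problems.append(f"authoritative wording: {word}")

    return problems


ok = {
    "suggestion": "Consider this candidate for the next stage",
    "reasoning": "Strong written response on prioritisation",
    "criteria": ["prioritisation"],
}
bad = {"suggestion": "Hire this candidate", "reasoning": "", "criteria": []}

ok_problems = check_output(ok)
bad_problems = check_output(bad)
```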

I prototyped each AI interaction with rapid throwaway flows — including using AI itself to stress-test edge cases — before any UI was built. That meant we caught failure modes (hallucinated candidate names in summaries, overconfident scoring) at the behaviour layer, where they were cheap to fix, instead of at the UI layer where they would have shipped.

The strategic AI–UX collaboration board: cross-functional working session output.
Fig. 04 Strategic AI–UX collaboration board
A prompt map — inputs, boundaries, escalation paths.
Fig. 05 Prompt map detail
Principle
from the work
Trust scales with explainability, not accuracy. A 70% AI that shows its work beats a 95% AI that doesn’t.
03
Design moment

Designing for trust — explainability as the load-bearing pattern.

AI candidate scoring is opaque by default. The system gives you “92% match” and expects you to trust it. No senior hiring manager will. And they shouldn’t.

Every AI output in the product had to carry its reasoning with it. The candidate evaluation screen doesn’t say “92% match” — it shows you which stage the candidate performed strongly in, which response drove that signal, and what the gaps are. The recommendation copy is deliberately language-controlled: “consider,” not “hire.” The hiring manager can override at any point, and the AI never makes a final decision. The AI interviewer always discloses that it is an AI, and adapts its questions in real time based on candidate responses to reduce pressure without losing structure.
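
As an illustration of an output that carries its reasoning with it, the sketch below assembles the evaluation text from the three pieces named above (strongest stage, driving response, gaps) and closes with assistive wording. The function name, parameters, and copy are hypothetical. Placing the reasoning ahead of the recommendation is an ordering choice of this sketch.

```python
def render_evaluation(strongest_stage: str, driving_response: str, gaps: list[str]) -> list[str]:
    """Build the evaluation copy line by line, reasoning first."""
    lines = [
        f"Strongest stage: {strongest_stage}",
        f"Driving response: {driving_response}",
        "Gaps: " + (", ".join(gaps) if gaps else "none identified"),
    ]
    # Language-controlled recommendation: "consider", and the human decides.
    lines.append("Consider for the next stage. The decision stays with the hiring manager.")
    return lines


evaluation = render_evaluation(
    strongest_stage="written",
    driving_response="Prioritisation scenario answer",
    gaps=["limited evidence of stakeholder communication"],
)
```

No percentage appears anywhere in the rendered lines. The reader sees where the signal came from before seeing what the AI suggests.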

I tested the evaluation and recommendation surfaces in moderated sessions with hiring managers drawn from pilot teams, and the AI interviewer in unmoderated sessions with candidate participants. Two findings drove late-stage redesigns: hiring managers wanted reasoning surfaced before the recommendation, not after, and candidates trusted the AI more when it acknowledged uncertainty than when it sounded confident. Both insights ended up in the shipped product.

The microcopy work mattered as much as the IA work. The difference between “score,” “match,” “fit,” and “signal” — that’s not a copy decision, that’s a trust decision. Most of my time in the final design phase was spent on language.

Candidate evaluation screen: reasoning surfaced before recommendation, with stage-by-stage explainability.
Fig. 07 Candidate evaluation · explainability detail
Stage breakdown: AI-recommended candidates with scored reasoning surfaced before recruiter action.
Fig. 08 Stage analytics · AI recommendations
Principle
from the work
Treat AI as a behaviour to be designed, not a feature to be added.
Outcomes

What the pilot actually changed.

From the closed pilot with early teams. Each number is tied to a specific design decision — not a vanity metric.

Headline metric
60%

Reduction in time-to-hire across pilot teams. Driven by parallel multi-stage evaluation replacing serial interview rounds.

Role creation
95%

Time saved building a role — ~1 hour collapsed to ~5 minutes.

Candidate alignment
72%

Better candidate-to-role alignment reported by hiring managers.

UI clarity
87%

Of users praised interface simplicity in pilot feedback.

Would recommend
92%

Of pilot users would recommend Intervyou to a peer.

Reflection

What I’d do differently.

Where I underweighted the work.

Language. I treated microcopy as a polish-phase concern and found out the hard way that on an AI product, language is where trust gets won or lost. The word “score” implies finality. The word “signal” implies an input to a decision. Pick the wrong one and the entire product feels different. Next time, language design starts at the same time as IA.

Where I’d push earlier.

Prompt maps. On Intervyou, prompt mapping ran in parallel with screen design, and each kept having to catch up with the other. Next time, prompt maps lead screen design by at least a sprint. Knowing the behaviour first makes the screens fall out naturally; designing the screens first means retrofitting the behaviour.

Where this thinking goes next

Behaviour before pixels. Every AI product, from now on.

Intervyou was the proving ground for a way of working with AI that I now bring to any product where a model is making suggestions on behalf of a user. Recruitment was the first place to apply it. It is far from the last.