Console Sessions

Enterprise tool · AI agentic platform

Syllable’s AI agentic platform, launched in January 2025, enables healthcare enterprises to build and manage voice-based AI agents at scale. As part of the platform’s initial rollout, I led the design of Sessions—a tool built to help internal QA analysts, engineers, and enterprise clients evaluate AI agent performance across millions of conversations.

Due to NDA restrictions, I can only share non-confidential takeaways. Feel free to reach out to learn more.

My Role
Owned the UX design of the Sessions QA experience, including user research, information architecture, and the interaction model for reviewing, debugging, and reporting on AI performance.

Period
Oct–Nov 2024

Skills
User research, competitive research, feature mapping, product thinking, information architecture, UI/UX design, cross-functional collaboration

Syllable AI agentic platform

Launched in January 2025

Designing a tool to evaluate AI voice agent interactions

As we prepared to launch the platform, one core design challenge emerged:

How can teams effectively QA millions of LLM-powered conversations, each with complex, real-time AI agent actions?

Traditional call review tools fell short. These new AI agents handled natural dialogue, triggered APIs, and navigated unpredictable human behavior—demanding new ways of surfacing what happened in a call and whether the AI behaved as intended.

Approach

Our first step was understanding what different users needed to see during the QA process. Through stakeholder interviews, we identified three main personas:

  • QA Analysts, who reviewed transcripts and listened for failures

  • Engineers, who needed access to tool/API events

  • Enterprise Clients, who needed readable AI summaries and reviewable transcripts

We also identified the key insights that shaped the foundation of our design:

  1. Users struggled to quickly assess what happened in a call without listening end-to-end.

  2. Engineers needed transparency into which tools/API calls were triggered and when.

  3. Enterprise clients lacked a clear, scalable way to report and track issues.

IA and navigation

Designing a review tool for LLM-powered conversations required us to rethink how multiple layers of information could be structured and surfaced clearly. These sessions are rich with context: transcripts, real-time API calls, evaluation metadata, and QA markers. Our goal was to create an interface where these dimensions coexist meaningfully—without overwhelming the users.

We began by identifying the four critical information types that needed to be presented (sketched as a rough data model after this list):

  1. A navigable call list for cross-session context

  2. The call transcript with conversational flow

  3. Real-time system actions like API and tool invocations

  4. Debugging signals and QA evaluation inputs
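
To make these layers concrete, here is a rough sketch of how a single session might be modeled. This is a minimal illustration, not Syllable's actual schema; all field names and shapes are assumptions.

```typescript
// Hypothetical data model for a single reviewable session.
// Field names and shapes are illustrative, not the production schema.

interface Session {
  id: string;
  startedAt: string;              // ISO timestamp
  summary: string;                // AI-generated call summary
  transcript: TranscriptTurn[];   // conversational flow
  toolEvents: ToolEvent[];        // real-time system actions
  qa: QaAnnotation[];             // debugging signals and evaluation inputs
}

interface TranscriptTurn {
  speaker: "agent" | "caller";
  text: string;
  startMs: number;                // offset into the call audio
}

interface ToolEvent {
  name: string;                   // e.g. "schedule_appointment"
  triggeredAtMs: number;          // lines up with the transcript timeline
  request: unknown;
  response: unknown;
  status: "success" | "error";
}

interface QaAnnotation {
  rating: number;                 // e.g. a point on the smiley scale
  category: string;               // structured issue category
  comment?: string;
  author: string;
}
```

Keeping transcript turns and tool events on a shared timeline is what lets the UI tie system logic to specific points in the conversation.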

To match these with user needs, we mapped mental models across the three primary personas. After multiple explorations, we landed on a modular, three-panel layout to support these diverse workflows:

  • Primary Panel: Transcript with AI-generated summaries and timeline markers for easy scanning

  • Secondary Panel: Tool Inspector displaying real-time API activity and system context

  • Tertiary Panel: Debug Panel for tagging, scoring, and structured feedback capture


This role-sensitive IA let each user focus on their critical tasks without losing access to supporting information. The layout ensured clarity while supporting progressive disclosure: users could go deep when needed or stay high-level when moving fast.

Additionally, to improve in-session and between-call efficiency, we:

  • Designed sticky session navigation, making it easy to move between calls

  • Added a smart playbar with markers for silences, agent turns, and jump points

  • Made API/tool events clickable, tying system logic directly to conversation flow (see the sketch below)
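
As a rough illustration of how playbar markers and clickable tool events could hang together, here is a TypeScript sketch. It assumes hypothetical turn and tool-call shapes mirroring the session model above; none of this is the production implementation.

```typescript
// Hypothetical sketch: derive playbar markers from a session timeline.
// All type and function names are assumptions made for illustration.

interface Turn { speaker: "agent" | "caller"; text: string; startMs: number }
interface ToolCall { name: string; triggeredAtMs: number }

type Marker =
  | { kind: "agentTurn"; atMs: number }
  | { kind: "toolEvent"; atMs: number; name: string }
  | { kind: "silence"; atMs: number; durationMs: number };

function buildMarkers(turns: Turn[], toolCalls: ToolCall[], silenceThresholdMs = 3000): Marker[] {
  const markers: Marker[] = [];

  turns.forEach((turn, i) => {
    if (turn.speaker === "agent") {
      markers.push({ kind: "agentTurn", atMs: turn.startMs });
    }
    // Rough silence heuristic: a long gap between turn starts
    // (turn end times aren't modeled in this sketch).
    const next = turns[i + 1];
    if (next && next.startMs - turn.startMs > silenceThresholdMs) {
      markers.push({ kind: "silence", atMs: turn.startMs, durationMs: next.startMs - turn.startMs });
    }
  });

  for (const call of toolCalls) {
    markers.push({ kind: "toolEvent", atMs: call.triggeredAtMs, name: call.name });
  }

  return markers.sort((a, b) => a.atMs - b.atMs);
}

// Clicking a tool event jumps to the nearest preceding transcript turn.
function turnForToolCall(turns: Turn[], call: ToolCall): Turn | undefined {
  return [...turns].reverse().find((turn) => turn.startMs <= call.triggeredAtMs);
}
```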

Reporting issues with ease

Previously, issue reporting was manual—often through email, disconnected from the session itself.

We redesigned it as a contextual slide-in debug panel that supports in-the-moment reporting with minimal friction.

Users can:

  • Rate AI agent performance using a simple, expressive smiley scale

  • Select from structured issue categories with smart pre-filled options

  • Add optional comments for clarification

To encourage consistent reporting, we introduced a lightweight, expressive rating system using smiley icons. This design decision lowered the barrier for quick input—especially useful for non-technical users who might otherwise hesitate to provide feedback.

We also implemented smart pre-filling of common issue types for quick reporting. Together, these decisions created a structured yet accessible feedback loop between users and the engineering team, helping us identify recurring issues and close the QA loop faster.
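
For illustration, a structured report filed from the debug panel might look something like the sketch below. The category names, pre-fill map, and field names are hypothetical assumptions, not the shipped schema.

```typescript
// Hypothetical shape of a report filed from the slide-in debug panel.

type SmileyRating = "negative" | "neutral" | "positive";

interface IssueReport {
  sessionId: string;
  rating: SmileyRating;
  category: string;          // structured issue category
  details: string;           // pre-filled, editable description
  comment?: string;          // optional clarification
  reportedBy: string;
}

// Smart pre-fill: common issue types map to a ready-made description,
// so reviewers only need to confirm or tweak it.
const PREFILL: Record<string, string> = {
  "missed-intent": "Agent did not recognize the caller's request.",
  "wrong-tool-call": "Agent triggered the wrong tool or API.",
  "awkward-handoff": "Transfer to a human agent was delayed or unclear.",
};

function draftReport(sessionId: string, category: string, rating: SmileyRating): IssueReport {
  return {
    sessionId,
    rating,
    category,
    details: PREFILL[category] ?? "",
    reportedBy: "current-user", // placeholder; would come from auth context
  };
}
```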

Impact

Sessions became a core pillar of Syllable’s platform, enabling faster, clearer, and more collaborative AI evaluation.

Results:

  • ⚡ 2× faster QA cycles through streamlined navigation and in-call surfacing.

  • 📊 12M+ calls reviewed across internal and client teams.

  • 🔍 Wider adoption by enterprise partners using Sessions for audits and training.

  • 🤝 Improved cross-functional workflow between QA, Engineering, and Customer Success.

Next

GritWell – 2019 Internship