Console Sessions

Enterprise tool · AI agentic platform

Syllable AI agentic platform

Launched in January 2025

Syllable’s AI agentic platform, launched in January 2025, enables healthcare enterprises to build and manage voice-based AI agents at scale.

As part of the platform’s initial rollout, I led the design of Sessions—a QA tool that empowers internal analysts, engineers, and enterprise clients to evaluate AI agent performance across millions of conversations.

I designed both:

  • Sessions V1 — a manual QA tool to support our immediate launch needs.

  • North Star Vision — a long-term vision leveraging AI automation to scale the QA process.
    I collaborated closely with data scientists to validate the potential of LLM-generated insights, paving the way for the platform's future evolution.

Due to NDA, only non-confidential insights are shared here. Feel free to reach out if you'd like to learn more.

My Role

  • Owned UX design for the Sessions QA experience, from user research to IA and interaction design.

  • Led conceptualization of the North Star vision for AI-powered automated QA.

  • Collaborated cross-functionally with product managers, engineers, and data scientists to validate LLM responses and AI feature opportunities.

Period
Oct–Nov 2024

Skills
User Research · UI/UX design · Product Thinking · Information Architecture · Cross-functional Collaboration · Visionary Thinking for AI Automation

Designing a tool to evaluate LLM-powered AI agent interactions

As we prepared to launch the platform, one core design challenge emerged:


“How can teams effectively QA millions of LLM-powered conversations, each with complex, real-time AI agent actions?”


Traditional call review tools fell short. These new AI agents handled natural dialogue, triggered APIs, and navigated unpredictable human behavior, demanding new ways of surfacing what happened in a call and whether the agent behaved as intended.

Approach

Identifying role-specific needs

Our first step was understanding what different users needed to see during the QA process. Through stakeholder interviews, we identified three main personas:

  • QA Analysts: Needed to review transcripts and detect conversation failures.

  • Engineers: Required visibility into tool/API events and system actions.

  • Enterprise Clients: Wanted a clear, high-level summary and quick transcript review.

We also uncovered critical insights:

  • Users struggled to quickly assess what happened in a call without listening end-to-end.

  • Engineers needed transparency into which tools/API calls were triggered and when.

  • Enterprise clients lacked a clear, structured way to report and track issues.

Designing IA and navigation for complex AI reviews

Designing a review tool for LLM-powered conversations required us to rethink how multiple layers of information could be structured and surfaced clearly. These sessions are rich with context: transcripts, real-time API calls, evaluation metadata, and QA markers. The goal was to create an interface where these dimensions coexist meaningfully—without overwhelming the users.

I began by identifying the four critical information types that needed to be presented:

  1. A navigable call list for cross-session context

  2. The call transcript with a playbar

  3. Real-time API and tool invocation logs

  4. A reporting panel for manually labeling AI performance

To match these with user needs, I mapped mental models across the three primary personas. After multiple explorations, we landed on a modular, three-panel layout to support these diverse workflows:

  • Primary Panel: Transcript with AI-generated summaries and timeline markers for easy scanning

  • Secondary Panel: Tool Inspector displaying real-time API activity and system context

  • Tertiary Panel: Reporting Panel for labeling, issue filing, and structured feedback capture

This role-sensitive IA gave each user the ability to focus on their critical tasks without losing access to supporting information. The layout ensured clarity while supporting progressive disclosure: users could go deep when needed, or stay high-level when moving fast.

Additionally, to improve in-session and between-call efficiency, I:

  • Designed sticky session navigation, making it easy to move between calls

  • Added a smart playbar with markers for silences, agent turns, and jump points

  • Made tool events traceable, tying system logic directly to the conversation flow
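To make this structure concrete, here is a minimal TypeScript sketch of the kind of session data the three panels draw from. All names (Session, TranscriptTurn, ToolEvent, TimelineMarker) are hypothetical; they only illustrate how the transcript, tool events, and playbar markers described above could fit together, not Syllable's actual schema.

```typescript
// Hypothetical data model behind the three-panel session view (illustrative only).

interface TranscriptTurn {
  speaker: "agent" | "caller";
  text: string;
  startMs: number;           // offset into the call audio, used by the playbar
  endMs: number;
}

interface ToolEvent {
  name: string;              // API or tool the agent invoked
  invokedAtMs: number;       // ties the event back to the transcript timeline
  status: "success" | "error";
  payloadSummary?: string;   // shown in the Tool Inspector panel
}

interface TimelineMarker {
  kind: "silence" | "agent_turn" | "jump_point";
  atMs: number;
}

interface Session {
  id: string;
  summary: string;           // AI-generated summary shown in the primary panel
  transcript: TranscriptTurn[];
  toolEvents: ToolEvent[];   // secondary panel: real-time API activity
  markers: TimelineMarker[]; // smart playbar markers
}
```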

Reporting issues with ease

Previously, issue reporting was manual—often through email, disconnected from the session itself. I designed a contextual slide-in debug panel to support in-the-moment reporting with minimal friction.

Users can:

  • Rate AI agent performance using a simple, expressive smiley scale

  • Select from structured issue categories with smart pre-filled options

  • Add optional comments for clarification

The smiley scale was a deliberate choice to encourage consistent reporting: it lowered the barrier for quick input, especially for non-technical users who might otherwise hesitate to provide feedback.

Smart pre-filling of common issue types made reports faster to file. Together, these decisions created a structured yet accessible feedback loop between users and the engineering team, helping us identify recurring issues and close the QA loop faster.
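For illustration, below is a hedged sketch of the structured report such a panel could capture. The field names are hypothetical, not the platform's real schema; they simply mirror the rating, category, and comment inputs described above.

```typescript
// Hypothetical shape of a report filed from the slide-in debug panel (illustrative only).

type Rating = "negative" | "neutral" | "positive"; // the smiley scale

interface IssueReport {
  sessionId: string;
  rating: Rating;
  category: string;             // chosen from pre-filled issue categories
  transcriptTurnIndex?: number; // optional anchor to the moment in the call
  comment?: string;             // optional free-text clarification
  reportedBy: string;
  reportedAt: string;           // ISO timestamp
}
```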

Leading the vision for AI-powered QA in Sessions

Beyond V1, I led the creation of a North Star vision for Sessions: automating the QA process using AI itself.

In partnership with data science, we validated the feasibility of using LLMs to automatically surface call insights, detect failures, and generate evaluation signals—transforming a manual, labor-intensive workflow into a scalable, intelligent system.
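As an illustration of this direction (not the actual implementation), the sketch below shows how a session transcript could be handed to an LLM to produce structured QA signals. `callLLM`, `AutoQAResult`, and the prompt wording are all hypothetical placeholders.

```typescript
// Conceptual sketch of automated QA: asking an LLM for structured evaluation
// signals on a call. `callLLM` stands in for whatever model API the platform
// uses; the result schema is illustrative.

interface AutoQAResult {
  summary: string;
  failures: { description: string; transcriptTurnIndex: number }[];
  overallScore: number; // e.g. 0–1 confidence that the call met its goal
}

declare function callLLM(prompt: string): Promise<string>;

async function autoReviewSession(transcriptText: string): Promise<AutoQAResult> {
  const prompt = [
    "You are a QA reviewer for voice AI agent calls.",
    "Summarize the call, list any agent failures with the turn index where they occur,",
    "and give an overall score between 0 and 1. Respond as JSON matching",
    '{ "summary": string, "failures": [...], "overallScore": number }.',
    "",
    transcriptText,
  ].join("\n");

  // In practice the JSON contract would need validation; omitted in this sketch.
  return JSON.parse(await callLLM(prompt)) as AutoQAResult;
}
```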

This North Star vision set the direction for the platform’s next generation of quality assurance, fundamentally changing how we think about AI performance evaluation at scale. To bring this future closer to reality, I designed early interaction models and UX foundations.

AI Content Indicators
Subtle visual icons marking AI-generated insights within the call review interface.

Task-Based AI Navigation
Seamless integration of AI tools into the user's natural workflow within Console, based on specific review tasks.

Onboarding Screens
Contextual guidance to help users understand and adopt AI features.

Feedback Modules
Lightweight, modular components enabling users to review and respond to various types of AI-generated outputs.

Impact

After launch, Sessions V1 became a core pillar of Syllable’s platform, enabling faster, clearer, and more collaborative AI evaluation.

⚡ 2× faster QA cycles through streamlined navigation and in-call issue surfacing.

📊 12M+ calls reviewed across internal and client teams.

🔍 Wider adoption by enterprise partners using Sessions for audits and training.

🤝 Improved cross-functional workflow between QA, Engineering, and Customer Success.