
Project Overview
Context
Optimizely is a B2B SaaS company that provides digital experience optimization tools. Its Feature Experimentation (FX) platform helps businesses run A/B tests without modifying source code.
However, non-technical users found the platform unintuitive and difficult to navigate, which risked slowing adoption.
As part of a client-sponsored project through UW HCDE, I worked with a 5-person team to investigate these challenges. Partnering closely with Optimizely’s product manager, we aligned on business goals and identified usability barriers in the A/B test setup flow.
Our goal was to uncover where non-technical users struggled most and deliver actionable design recommendations to improve satisfaction and drive adoption.
My Contribution
Research leadership: Designed the test protocol, moderated usability sessions, and synthesized findings.
Design exploration: Ideated and illustrated solutions based on research insights to support new user adoption and satisfaction.
Client collaboration: Acted as the main point of contact and presented the final report to stakeholders.
Client
Optimizely
Role
UX Researcher & Designer
Timeline
8 Weeks (2025)
Team
4 Researchers
1 Product Manager
Skills
User Test Plan & Kit Development
Remote Usability Test Moderation
Data Analysis (Quant + Qual)
Client Communication
Impact
14 issues identified
From moderated usability tests and affinity mapping, we uncovered 14 issues across global and feature-level UX, categorized into 5 areas of improvement.
Prioritized insights & designs
We delivered clear, prioritized insights and design recommendations backed by data. The product team plans to use them in future redesigns.
Design Exploration Preview
Problem
A core product, but low satisfaction
While FX is one of Optimizely's most-used products, it was consistently described as confusing, especially by less technical users such as product managers and marketers. Many struggled to navigate the tool and launch A/B tests with confidence.
400+
users
FX is one of the most-used screens in the product.
90%
usability pain points
In a recent product satisfaction survey, 90% of feedback on FX focused on poor usability.
Project Goal
Uncover friction points and improve adoption
Our goal was to identify specific usability pain points and propose clear design fixes to improve adoption, reduce support needs, and help users feel more confident using FX.
Defining the research scope
After aligning with the client on the business needs and problems, I facilitated the scoping discussion, narrowing our focus to usability issues in the FX interface, especially for non-technical users. We organized our research around the following key questions:
Key research questions
How intuitive is FX for setting up and managing A/B tests?
What challenges and frustrations do users face when navigating and using FX?
How well do users understand the purpose and functionality of the FX interface while running A/B tests?
Understanding the Users
Our research target
We focused on non-technical users with little to no FX experience but some familiarity with A/B testing. This profile reflects FX’s target audience—business roles expected to run tests without engineering help. Prior A/B testing knowledge helped us ensure issues we uncovered were due to the interface, not domain unfamiliarity.
Understanding the product
Familiarizing with the FX platform
To prepare for the user study, we onboarded ourselves through a tool walkthrough and an interaction-mapping exercise.
FX Interaction Map
As a non-technical first-time user, I faced a key obstacle during onboarding: the FX interface felt complex, and I lacked domain knowledge in A/B testing. Concepts like “flags,” “variants,” and “variables” were unfamiliar. Even understanding how a typical A/B test runs wasn’t straightforward.
To overcome this, I reviewed developer documentation, initiated team discussions, and asked clarifying questions in client meetings.
This fast-tracked my understanding of both the platform and its terminology. More importantly, it helped me build empathy for our target users, many of whom would face similar hurdles. It also shaped how I approached task design later in the study.
Plan and Conduct the Study
Design broad, scenario-driven tasks to reflect real workflows
I collaborated with the PM to design scenario-driven tasks that mimicked real-world use. Because FX has no standard “happy path,” I avoided prescriptive task flows. Instead, I made tasks broad enough for users to explore naturally, as they would in real use.
Conduct remote usability test
Moderated usability tests with qualitative + quantitative data
We ran remote moderated usability tests with 7 participants, ranging from product managers to content strategists. I moderated 3 sessions.
Each session followed a consistent flow and combined both quantitative metrics and qualitative observations. This helped us uncover pain points, measure severity, and understand user behavior in context.
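To illustrate the kind of quantitative signal we collected alongside the qualitative notes, the TypeScript sketch below aggregates per-task results into a completion rate and average time on task. The task names and numbers are placeholders, not study data.

```typescript
// Hypothetical per-task session results, used only to illustrate how
// quantitative metrics can be summarized alongside qualitative notes.
interface TaskResult {
  task: string;
  completed: boolean;
  timeOnTaskSec: number;
}

const results: TaskResult[] = [
  { task: "Create an A/B test", completed: true, timeOnTaskSec: 310 },
  { task: "Create an A/B test", completed: false, timeOnTaskSec: 540 },
  { task: "Adjust traffic allocation", completed: true, timeOnTaskSec: 120 },
];

// Summarize completion rate and average time for one task.
function summarize(task: string) {
  const rows = results.filter((r) => r.task === task);
  const completionRate = rows.filter((r) => r.completed).length / rows.length;
  const avgTimeSec = rows.reduce((sum, r) => sum + r.timeOnTaskSec, 0) / rows.length;
  return { completionRate, avgTimeSec };
}

console.log(summarize("Create an A/B test"));
// -> { completionRate: 0.5, avgTimeSec: 425 }
```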
Research Findings
Research Findings & Prioritization
14 challenges identified, prioritized by user severity and product impact
Through the research, we identified 14 usability issues spanning both global and feature-specific aspects. They fall into five categories: 1) information hierarchy, 2) functionality, 3) status visibility, 4) affordance, and 5) terminology.
I then rated each finding on two dimensions, helping Optimizely prioritize its efforts across the issues:
Severity → captures the immediate pain for the user during tasks (how hard it is to succeed).
Impact → captures the longer-term product and business consequences (trust, misuse, adoption).
Severity x Impact Prioritization Matrix
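To show how the two dimensions combine, here is a minimal TypeScript sketch that buckets findings into priority tiers from their severity and impact ratings. The issue names echo findings described in this study, but the scores and tier labels are hypothetical, not the actual study ratings.

```typescript
// Minimal sketch of severity x impact prioritization.
// Ratings and tier names are hypothetical, for illustration only.
type Rating = 1 | 2 | 3; // 1 = low, 2 = medium, 3 = high

interface Finding {
  issue: string;
  severity: Rating; // immediate pain for the user during tasks
  impact: Rating;   // longer-term product and business consequences
}

const findings: Finding[] = [
  { issue: "Unexplained terminology (e.g. 'flag')", severity: 3, impact: 3 },
  { issue: "Environment label easy to overlook", severity: 2, impact: 3 },
  { issue: "Concluded tests mixed with active ones", severity: 2, impact: 2 },
];

// Map each finding onto a quadrant of the matrix.
function tier(f: Finding): string {
  if (f.severity >= 3 && f.impact >= 3) return "Fix first";
  if (f.severity >= 3 || f.impact >= 3) return "Plan next";
  return "Monitor";
}

for (const f of findings) {
  console.log(`${tier(f).padEnd(9)} | ${f.issue}`);
}
```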
Our research surfaced system-level challenges and opportunities. Below are some of the design solutions I explored to address them. They show how targeted improvements can reduce friction across major flows, support onboarding, and improve usability for both new and returning users.
1. Make system language accessible.
Across the platform, users struggled with abstract system terms like “flag” and “rule”. Without context, they hesitated or asked for clarification before proceeding. This lack of clarity slowed setup and created avoidable errors for non-technical users.
Hoverhelp tooltip, with documentation links embedded.
I proposed a low-effort, short-term solution: surfacing definitions directly in the product. Instead of inline explanatory text, which would clutter the UI, I added hoverhelp tooltips. This gives clarity when needed while keeping the layout clean for advanced users.
Current design
❌ Technical jargon with no explanation.
❌ Users must leave the workflow to check documentation.
New design
✅ Terms explained in context with hoverhelp tooltips.
✅ Keeps layout clean while supporting novices when needed.
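To make the hoverhelp pattern concrete, here is a minimal React/TypeScript sketch: the definition appears on hover or focus, with a link out to the documentation. The component name, copy, and URL are illustrative placeholders, not Optimizely's implementation.

```tsx
import React, { useState } from "react";

interface HoverHelpProps {
  term: string;       // the system term shown in the UI, e.g. "Flag"
  definition: string; // short plain-language explanation
  docsUrl: string;    // link into the relevant documentation
}

// Hoverhelp tooltip: shows a definition on hover/focus,
// with an embedded link for users who want the full docs.
export function HoverHelp({ term, definition, docsUrl }: HoverHelpProps) {
  const [open, setOpen] = useState(false);

  return (
    <span
      tabIndex={0}
      onMouseEnter={() => setOpen(true)}
      onMouseLeave={() => setOpen(false)}
      onFocus={() => setOpen(true)}
      onBlur={() => setOpen(false)}
      style={{ position: "relative", borderBottom: "1px dotted" }}
    >
      {term}
      {open && (
        <span role="tooltip" style={{ position: "absolute", top: "1.5em", padding: 8, background: "#fff" }}>
          {definition}{" "}
          <a href={docsUrl} target="_blank" rel="noreferrer">
            Learn more
          </a>
        </span>
      )}
    </span>
  );
}

// Hypothetical usage:
// <HoverHelp
//   term="Flag"
//   definition="A switch that turns a feature on or off for a group of users."
//   docsUrl="https://example.com/docs/flags"  // placeholder URL
// />
```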
2. Strengthen hierarchy and visual clarity.
Weak hierarchy and subtle cues caused users to miss critical context. For example, environment labels were overlooked, leading to misconfigurations. Active and concluded tests were mixed together, creating clutter. CTAs and status indicators also lacked visual emphasis.
Restructured hierarchy – clearer sections and emphasized indicators.
In the redesigned ruleset panel, I separated active and concluded experiments into distinct sections, made the environment label more prominent at the top, and emphasized CTAs and status chips. These changes reduce clutter, prevent environment errors, and support faster scanning.
Current ruleset panel design
❌ Active and concluded experiments mixed together, creating clutter.
❌ Environment labels subtle and easily overlooked.
❌ CTAs and status indicators visually understated.
New ruleset panel design
✅ Environment context is prominent at the top and less likely to be missed.
✅ CTAs and status chips emphasized for quick recognition.
✅ Active and concluded sections are clearly separated, making navigation easier.
3. Provide robust onboarding guidance to lower the learning curve.
First-time and non-technical users struggled with multi-step workflows such as creating experiments or interpreting traffic controls. The existing onboarding relied on static checklists and documentation, which failed to explain interactions in context or prevent early mistakes.
Blank state prompts paired with a guided walkthrough.
To better support first-time and non-technical users, I designed two complementary onboarding approaches. A blank state prompt in the empty panel would guide users toward their first action. An interactive learn-by-doing walkthrough would let them practice setting up a dummy experiment step by step. Each tooltip anchors to real controls, explains the action, and only progresses once completed. Together, these approaches reduce early errors, clarify key concepts, and build user confidence faster.
Current onboarding design
❌ Static checklists offered little real guidance.
❌ Reading documentation felt effortful and disconnected from the task.
New onboarding design
✅ Interactive walkthrough clarifies concepts, lowers learning curve, and accelerates first success.
✅ Blank state prompt directs users to their first key action towards their goal.
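A simplified TypeScript sketch of the walkthrough logic is shown below. The step names, selectors, and completion checks are hypothetical, but it illustrates the core idea: each tooltip anchors to a real control and the walkthrough advances only once the user has actually completed the step.

```typescript
// Sketch of a learn-by-doing walkthrough. Steps, selectors, and
// completion checks are hypothetical placeholders.
interface WalkthroughStep {
  anchorSelector: string;    // control the tooltip points at
  instruction: string;       // what the user should do
  isComplete: () => boolean; // checked against real product state
}

const steps: WalkthroughStep[] = [
  {
    anchorSelector: "#create-flag-button",
    instruction: "Create a flag to hold your experiment.",
    isComplete: () => document.querySelectorAll(".flag-row").length > 0,
  },
  {
    anchorSelector: "#add-variation-button",
    instruction: "Add a variation to compare against the original.",
    isComplete: () => document.querySelectorAll(".variation-row").length >= 2,
  },
];

let current = 0;

// Call whenever product state changes (e.g. after a save).
function advanceIfReady(showTooltip: (step: WalkthroughStep) => void) {
  while (current < steps.length && steps[current].isComplete()) {
    current += 1; // only progress once the step's action is done
  }
  if (current < steps.length) {
    showTooltip(steps[current]);
  }
}
```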
Next Steps
Internal Review & Design Implementation
Based on the prioritized findings, the Optimizely PM and Design team will evaluate solutions in light of engineering effort and business goals. The proposed designs serve as references and a solid foundation for implementation, to be adapted and expanded through internal review.
Reflection
1. Always run pilot tests when possible.
Running a pilot test with a peer (who matched the target participant profile) surfaced critical issues early. It helped us refine task flow, adjust timing, and clarify instructions. If resources allowed, I would also pilot with a real user or client to ensure the study design directly supports research goals.
2. Align more actively with the client on recruitment.
In this study, the client managed recruitment, and some participants only partially matched our criteria. For future projects, I would be more involved in recruitment to ensure consistency and participant quality. This would strengthen the reliability of findings.
3. Plan session logistics with more buffer.
Some sessions were scheduled last-minute, leaving little time to refine tasks or align expectations with the client. This led to issues, such as discovering task gaps during live sessions. In the future, I would build in more time between tests, review tasks jointly with the client, and validate them in a pilot before full rollout.