VW Usability Test

Analytics – Research – User Testing

The Goal

In this project, I was responsible for testing new UI concepts for Volkswagen’s compare tool.

The goal was to evaluate if the new features and optimizations developed by the team were intuitive and user-friendly.

The Challenge

One of the key challenges in this project was designing a study that collects both detailed feedback and valid, measurable data using the available tools. Since the company's testing tool was limited to a maximum of 100 participants, ensuring reliable results was difficult.

Deliverables

  • Setting up a click test study built around task-specific testing of the prototypes

  • Analyzing heatmaps and success rates to identify user interaction patterns

  • Providing recommendations to project owners based on findings

My Role

  • Designing and conducting the study, and reporting the results

  • Delivering measurable data

  • Analyzing both qualitative and quantitative data to inform design improvements

The Approach

Pain Point

The team had generated several new UI concepts aimed at improving the compare tool’s functionality. However, they needed to verify if their designs addressed user needs effectively.


The key questions were:

  • Could users intuitively find the features they needed (e.g. add/remove car models)?

  • How did different interface designs impact usability and task success?

  • Were there clear user preferences between variations of the tool?


The study needed a method that could evaluate ease of use, track task success, and provide space for user feedback. Choosing the right approach was key to getting useful results with the tools I had available.

Solution

I designed a click test study using the company's online tool to evaluate the team's prototypes. This method was chosen because it allowed for task-specific tracking, heatmap generation, and quantitative success rates.

The study used static prototypes. After each task, follow-up questions were asked to capture participants' subjective experiences, ensuring a comprehensive analysis.

Task 1 – Removing a Model

Due to confidentiality, only part of the prototype is shown

A/B Test – Variant A

Due to confidentiality, only part of the prototype is shown

A/B Test – Variant B

Due to confidentiality, only part of the prototype is shown

Task 4 – Alternative Concept

Due to confidentiality, only part of the prototype is shown

Click Test Setup

Tasks tested:

  • Task 1: Removing a car model from the compare tool.

  • Tasks 2 & 3 (A/B test of different CTAs): Comparing two variations with different placements of the button that adds a car model to the compare tool. Each participant tested only version A or version B to avoid cross-learning between tasks.

  • Task 4: A new concept for an alternative approach to the compare tool.

Follow-up questions after each task:

  • Task difficulty ratings (using a Likert scale).

  • Participants’ reasoning behind their ratings.

  • Whether the process of adding/removing cars matched user expectations.

  • How clearly users could distinguish the compare features from the rest of the interface.

Data from heatmaps, success rates, and follow-up responses provided both quantitative and qualitative insights.
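To turn raw clicks into success rates and pair them with the follow-up ratings, each click has to be checked against the task's target region and then aggregated per task. The snippet below is a minimal sketch of that aggregation, assuming a hypothetical CSV export; the file name, column names, and target coordinates are placeholders and do not reflect the tool's actual export format.

```python
# Minimal sketch of the success-rate aggregation, assuming a hypothetical
# CSV export with one row per participant click:
# participant_id, task, click_x, click_y, likert_rating
# Target regions and file name are illustrative placeholders only.
import pandas as pd

# Hypothetical bounding boxes (x_min, y_min, x_max, y_max) for the correct click area
# (only two of the tasks are shown for brevity)
TARGETS = {
    "task_1": (120, 340, 180, 380),
    "task_4": (640, 90, 720, 130),
}

def is_hit(row, targets=TARGETS):
    """Return True if the participant's click landed inside the task's target region."""
    box = targets.get(row["task"])
    if box is None:
        return False
    x_min, y_min, x_max, y_max = box
    return x_min <= row["click_x"] <= x_max and y_min <= row["click_y"] <= y_max

df = pd.read_csv("clicktest_export.csv")      # hypothetical export file
df["success"] = df.apply(is_hit, axis=1)

# Success rate, median difficulty rating, and participant count per task
summary = df.groupby("task").agg(
    success_rate=("success", "mean"),
    median_rating=("likert_rating", "median"),
    n=("participant_id", "nunique"),
)
print(summary)
```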

Overview of study setup

Challenges

After the first round of testing, results were unexpectedly contradictory. The task success rates were low, but the majority of participants rated the tasks as “easy.” We suspected the issue resulted from the Likert scale being arranged from “easy” to “difficult” (instead of the more conventional “difficult” to “easy”). Because participants are often rushing through multiple studies a day, they could have misunderstood the Likert scale.

To test this hypothesis, I repeated the study with reversed scales. Surprisingly, the results remained consistent, confirming that the initial findings were accurate.
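One way to check that the reversed-scale rerun really produced the same picture as the first round is to compare the two rating distributions statistically. The sketch below applies a Mann-Whitney U test (suitable for ordinal Likert data from independent groups) to hypothetical, re-coded ratings; the values and the choice of test are illustrative, not the study's actual analysis.

```python
# Consistency check between the two study rounds, using hypothetical
# Likert ratings (1 = very easy ... 5 = very difficult).
from scipy.stats import mannwhitneyu

round_1 = [1, 2, 1, 1, 3, 2, 1, 2, 1, 1]   # original scale order (placeholder values)
round_2 = [1, 1, 2, 1, 2, 1, 3, 1, 2, 1]   # reversed scale, re-coded to the same direction

stat, p_value = mannwhitneyu(round_1, round_2, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")  # a large p-value indicates similar distributions
```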

Data Insights

  • Tasks 2 & 3 (A/B test of different CTAs): Version A (85% success) outperformed Version B (45% success); a quick significance check for a difference of this size is sketched below.

  • Task Success Rates: Most tasks had low success rates, indicating the need for refinement before implementation. The exception was Task 4 (95% success), which showed promise for a near-term rollout of the prototype.

  • Interface Clarity: Feedback revealed some confusion about which elements belonged to the compare tool, pointing to areas that needed clearer visual distinction.

Results changed for confidentiality
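Even with only around 50 participants per variant after the A/B split, a difference as large as the one reported above can be checked for statistical significance. The sketch below uses Fisher's exact test on assumed group sizes and the anonymized rates; it illustrates the kind of check that can back such a comparison rather than the study's actual analysis.

```python
# Significance check for the A/B result (85% vs. 45% success),
# assuming roughly 50 participants per variant (the real split is not shown here).
from scipy.stats import fisher_exact

n_a, n_b = 50, 50                       # assumed group sizes after the A/B split
success_a = round(0.85 * n_a)           # ~43 successes in variant A
success_b = round(0.45 * n_b)           # ~22 successes in variant B

table = [
    [success_a, n_a - success_a],       # variant A: successes, failures
    [success_b, n_b - success_b],       # variant B: successes, failures
]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
# With these assumed counts, the difference between the variants is
# statistically significant despite the modest sample size.
```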

Heatmap results

Task 1

Due to confidentiality, only part of the heatmap is shown

A/B Test – Version A

Due to confidentiality, only part of the heatmap is shown

A/B Test – Version B

Due to confidentiality, only part of the heatmap is shown

Task 4

Due to confidentiality, only part of the heatmap is shown


The Conclusion

  • While the study provided useful insights, the 100-participant limit constrained how reliable the results could be. For simple A/B tests this sample size was sufficient, but click tests often require larger groups to account for variability in user behavior (a rough sample-size calculation is sketched below).

  • Nonetheless, the research highlighted clear opportunities for improvement. The team gained actionable recommendations, such as refining CTA placement and redesigning unclear interface elements.
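To put the 100-participant cap in perspective, a standard two-proportion power calculation shows how the required group size grows as the expected difference between variants shrinks. The sketch below uses statsmodels with illustrative effect sizes; it is a rough back-of-the-envelope check, not part of the original study.

```python
# Rough sample-size check: how many participants per group are needed to
# detect a given difference in success rates at 80% power and alpha = 0.05?
# The success-rate pairs below are illustrative assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

analysis = NormalIndPower()

for p1, p2 in [(0.85, 0.45), (0.65, 0.50)]:   # large vs. moderate difference
    effect = proportion_effectsize(p1, p2)     # Cohen's h for two proportions
    n = analysis.solve_power(effect_size=effect, alpha=0.05, power=0.8, ratio=1.0)
    print(f"{p1:.0%} vs {p2:.0%}: ~{round(n)} participants per group")
```

With assumptions like these, a large gap between variants stays detectable within a 100-participant pool, while a moderate one quickly calls for more participants than the tool allows.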