Song Upload Tool
Challenge
Only 60% of songs uploaded to the Smule music library included metadata tags such as genre and language. Incomplete metadata was hampering Smule’s ability to deliver a personalized consumption experience.
My Role
I was responsible for evaluating usability challenges in this tool and updating it to enable and encourage metadata input.
Outcome
92% of the songs uploaded now have at least one metadata tag 🎉
Final Product
Context
Smule is a karaoke app with over 40 million users worldwide. The songs on the app are arranged and uploaded by the users themselves, through the song upload tool. More metadata tags associated with uploaded songs means better personalized recommendations to users, hence the interest in improving this tool.
Goals
100% of songs uploaded have at least one language tag.
At least one metadata tag describing the song’s mood/emotion.
At least one relevant genre tag per song.
Usability Testing
I tried the song upload tool myself, and my first attempt took about 10 minutes just to input tags and all the metadata required to identify a song. That excludes the time it would take to further arrange the song, add lyrics, select parts, etc. It felt more like work than fun, so I ran usability tests to investigate what other aspects of the tool were causing friction.
My test subjects - all first-time users of the tool - faced several usability issues. Those relevant to our goals were that:
5 / 5
users preferred to skip non-mandatory input fields.
4 / 5
users didn’t know how to determine the Genre or Mood/Emotion of a song, and so left those fields blank.
3 / 5
users didn’t realize the input fields would convert to tags.
The Opportunity
The solution was not design-only: we also had to think about what kind of data would be useful to collect in the form of tags. I reframed the problems into:
Generate a finite list of Genre titles that users can search and select from.
Impress upon users through the use of better design and copy, the importance of inputting tags.
Consolidate screens and modify layout to highlight required fields.
What Didn’t Work
One concept I struggled to design for was this: how could I help our users pick relevant tags to describe the mood or emotion a song evokes? I worked with a domain expert - a music researcher - to understand and explore the different ways in which mood could be construed and categorized. I also consulted the machine-learning team to understand how they use metadata to personalize recommendations. From a user’s perspective, I had two main takeaways:
Tags cannot be captured in a single word
I asked testers to recommend “mood” tags for Britney Spears’ Baby One More Time. I got: Party, High School, 90s, Pop, Teenage, Crush and Girls. Then I showed them the song upload tool, where a list of adjectives - generated by the music researcher and the machine-learning team - awaited them. The users tested had some strong feelings about that list:
“...tags seem prescriptive…”
“What’s the difference between ‘joyous’ and ‘elated’? How do I know which to choose?”
“What if I wanted to include a mood that was more than 1 word? Like chill music, or hip-hop. To me, that’s mood.”
“This feels more like a GRE/SAT Vocab list.”
That last one ☝️ made it clear to us that machine learning needed to work with users’ unstructured input, instead of prescribing to users how to think.
Tags need to align with more commonplace expressions
The second takeaway was that when it comes to music, mood is often a mix of an emotion, a moment in time, and a place (or all three at once!) captured as a phrase - e.g. friday night vibes, disco music, chill beats, etc. It became apparent to me that tags for mood in particular need to align with more commonplace expressions. So I iterated on that concept:
Tag and layout suggestions for Mood.
Tag and layout suggestions for Occasion.
4/5 users tested felt this layout aligned better with how they would categorize a song they uploaded. They also felt the presented options reduced cognitive load, considering the main task at hand is not tag input but arranging a song for other users to sing to.
The above explorations did not ship as part of this project. However, they were crucial in convincing the machine-learning team to modify how they collect and store unstructured tag data, which was implemented as part of this project.
What Worked
Marking required input fields
In the old design, it wasn’t clear which fields were required and which weren’t until the user tapped “Next”. I consolidated all the data input screens into one and made it clear which fields were mandatory.
Providing users with a purpose
In the old design, users didn’t have enough context on why they needed to add metadata. In the redesign, I led with that purpose as the header (to make your song easier to find). Metadata is also highlighted as a tag at the point of input - 5/5 users tested said this made it clear that these were tags. I also included tooltips to guide users on how to add tags.
Pre-defined Dropdown List
Finite dropdown lists were generated for Language and Genre - this led to 92% of real-world users adding at least one Language and at least one Genre tag. It brought much-needed structure to our data where it mattered.
Future-proofing Metadata Input
As a musician myself, I know genres and styles in music are constantly evolving and being reinterpreted. I wanted to future-proof this tool for those circumstances, so I designed for the ability to add user-defined fields, which would trigger an event in the backend so the new value could be verified, de-duped, and added to the list.
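The backend flow described above (verify, de-dupe, add to the list) could be sketched roughly as follows. This is a minimal illustration, not Smule’s actual implementation - all names here (`GenreList`, `submit_user_defined`, `approve`) are hypothetical:

```python
class GenreList:
    """Hypothetical sketch: handle user-defined genre values by de-duping
    against the approved list, or queueing new values for verification."""

    def __init__(self, approved):
        # Approved genres keyed by a normalized form, so near-duplicates
        # like "Hip-Hop" and "hip hop" resolve to the same entry.
        self._approved = {self._normalize(g): g for g in approved}
        self.pending_review = []  # user-defined values awaiting verification

    @staticmethod
    def _normalize(value):
        # Collapse case, hyphens and extra whitespace for comparison.
        return " ".join(value.lower().replace("-", " ").split())

    def submit_user_defined(self, value):
        """Return the canonical genre if it already exists (de-dupe),
        otherwise queue the value for human verification and return None."""
        key = self._normalize(value)
        if key in self._approved:
            return self._approved[key]
        if key not in (self._normalize(p) for p in self.pending_review):
            self.pending_review.append(value)
        return None

    def approve(self, value):
        # After verification, promote a pending value into the list.
        self._approved[self._normalize(value)] = value
        self.pending_review = [
            p for p in self.pending_review
            if self._normalize(p) != self._normalize(value)
        ]


genres = GenreList(["Pop", "Hip Hop", "R&B"])
print(genres.submit_user_defined("hip-hop"))   # de-duped to "Hip Hop"
print(genres.submit_user_defined("Hyperpop"))  # None: queued for review
genres.approve("Hyperpop")
print(genres.submit_user_defined("hyperpop"))  # now returns "Hyperpop"
```

The key design choice is normalizing before comparing, so the event fires only for genuinely new values rather than spelling variants of existing ones.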
Final Thoughts
How can I make it fun?
I interviewed power users of Smule, and many of them treat this tool as purely functional - they were so used to it that they knew exactly how to work their way around it. Had one of the company’s growth objectives been to expand its collection of songs, I would have revisited this tool in its entirety. But for now, here’s my upgraded interpretation of a fun, immersive song arrangement experience.