Semantically Meaningful Attributes from Co-listen Embeddings for Playlist Exploration and Expansion

Ayush Patwari; Nicholas Kong; Jun Wang; Ullas Gargi; Michele Covell; Aren Jansen

Keywords: Applications, Music recommendation and playlist generation, MIR tasks, Automatic classification, Musical features and properties, Musical affect, emotion, and mood

Abstract: Audio embeddings of musical similarity are often used for music recommendations and autoplay discovery. These embeddings are typically learned by training a deep neural network on co-listen data so that it provides consistent triplet-loss distances. Instead of directly using these co-listen–based embeddings, we explore making recommendations based on a second, smaller embedding space of human-intelligible musical attributes. To do this, we use the co-listen–based audio embeddings as inputs to small attribute classifiers, trained on a small hand-labeled dataset. These classifiers map from the original embedding space to a new interpretable attribute coordinate system that provides a more useful distance measure for downstream applications. The attributes and attribute embeddings allow us to provide a search interface and more intelligible recommendations for music curators. We examine the relative performance of these two embedding spaces (the co-listen–audio embedding and the attribute embedding) for the mathematical separation of thematic playlists. We also report on the usefulness of recommendations from the attribute-embedding space to human curators for automatically extending thematic playlists.
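
The mapping from co-listen embeddings to an attribute coordinate system can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the embeddings and labels are synthetic, the two attributes are hypothetical, and plain logistic regression stands in for the "small attribute classifiers", whose architecture the abstract does not specify. The helper names `train_attribute_classifier`, `attribute_embedding`, and `nearest` are likewise invented here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_attribute_classifier(X, y, lr=0.5, steps=500):
    """Fit one logistic-regression classifier by batch gradient descent.

    Assumption: logistic regression is a stand-in for the paper's
    unspecified small classifiers.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(X @ w + b) - y          # d(log-loss)/d(logit)
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def attribute_embedding(X, classifiers):
    """Stack per-attribute probabilities into one vector per track."""
    return np.stack([sigmoid(X @ w + b) for w, b in classifiers], axis=1)


def nearest(A, seed, k=5):
    """Indices of the k tracks closest to `seed` in attribute space."""
    d = np.linalg.norm(A - A[seed], axis=1)
    d[seed] = np.inf                           # exclude the seed itself
    return np.argsort(d)[:k]


rng = np.random.default_rng(0)

# Synthetic stand-in for co-listen audio embeddings (200 tracks, 16 dims).
X = rng.normal(size=(200, 16))

# Synthetic stand-in for hand labels on two hypothetical attributes.
labels = np.stack(
    [X[:, 0] > 0, X[:, 1] + X[:, 2] > 0], axis=1
).astype(float)

# One small classifier per attribute; their outputs form the new space.
classifiers = [
    train_attribute_classifier(X, labels[:, k])
    for k in range(labels.shape[1])
]
A = attribute_embedding(X, classifiers)

# Candidate tracks for extending a playlist seeded with track 0.
candidates = nearest(A, seed=0, k=5)
```

In this sketch each coordinate of `A` is the predicted probability of one attribute, so a distance in `A` can be read as "differs in attribute k" rather than as an opaque offset in the original embedding.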