Ultra-light Deep MIR by Trimming Lottery Tickets

Philippe Esling; Théis Bazin; Adrien Bitton; Tristan J. J. Carsault; Ninon Devis

4-11 - Ultra-light Deep MIR by Trimming Lottery Tickets

Keywords: Domain knowledge, Machine learning/Artificial intelligence for music, Applications, Music retrieval systems, Representations of music, Evaluation, datasets, and reproducibility, MIR tasks, Automatic classification

Abstract: Current state-of-the-art results in Music Information Retrieval are largely dominated by deep learning approaches. These provide unprecedented accuracy across all discriminative tasks. However, the consistently overlooked downside of these models is their stunningly massive complexity, which seems concomitantly crucial to their success. In this paper, we address this issue by developing a new approach based on the recent lottery ticket hypothesis. We modify the original lottery approach to explicitly remove parameters, through structured trimming of entire units, instead of simply masking individual weights. This yields models that are effectively lighter in terms of size, memory, and number of operations. We show that our proposal can remove up to 95% of a model's parameters without loss of accuracy, leading to ultra-light deep MIR models. We confirm the surprising result that, at smaller compression ratios (removing up to 90% of the network), lighter models consistently outperform their heavier counterparts. We exhibit these results on a large array of MIR tasks including audio classification, pitch recognition, chord extraction, drum transcription, and onset estimation. The resulting ultra-light deep models for MIR can run on CPU, and can even fit on embedded devices with minimal degradation of accuracy.
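
The distinction the abstract draws between masking individual weights and trimming entire units can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the two-layer shapes, the function names `mask_weights` and `trim_units`, and the ranking of units by the L2 norm of their incoming weights are all assumptions made for illustration. It shows only that masking leaves the tensor shapes unchanged, whereas trimming produces genuinely smaller matrices.

```python
import numpy as np

def mask_weights(W, fraction):
    """Unstructured masking: zero the smallest-magnitude weights.
    The matrix keeps its shape, so storage and compute are unchanged."""
    k = int(fraction * W.size)
    if k == 0:
        return W.copy()
    threshold = np.sort(np.abs(W), axis=None)[k - 1]
    return np.where(np.abs(W) <= threshold, 0.0, W)

def trim_units(W1, b1, W2, fraction):
    """Structured trimming: drop whole hidden units, i.e. rows of W1
    and the matching columns of W2, so the layers really shrink."""
    n_hidden = W1.shape[0]
    n_keep = max(1, n_hidden - int(fraction * n_hidden))
    # Assumed ranking criterion (hypothetical): L2 norm of incoming weights.
    scores = np.linalg.norm(W1, axis=1)
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return W1[keep], b1[keep], W2[:, keep]

rng = np.random.default_rng(0)
W1 = rng.normal(size=(64, 32))  # hidden x input (illustrative sizes)
b1 = np.zeros(64)
W2 = rng.normal(size=(10, 64))  # output x hidden

masked = mask_weights(W1, 0.75)
W1_t, b1_t, W2_t = trim_units(W1, b1, W2, 0.75)

print("masked shape:", masked.shape)             # same shape as W1
print("trimmed shapes:", W1_t.shape, W2_t.shape)  # fewer hidden units
```

In this sketch the masked matrix still occupies the memory of the full layer unless a sparse format is used, while the trimmed layers need fewer multiply-accumulate operations on ordinary dense hardware, which is the property the abstract relies on for CPU and embedded deployment.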