PyTSMod: A Python Implementation of Time-Scale Modification Algorithms

Overview

Time-scale modification (TSM) is a digital audio effect that adjusts the length of an audio signal while preserving its pitch. TSM is widely used not only in sound production but also in music and audio research, for example for data augmentation. In this paper, we present PyTSMod, an open-source Python library that implements several classical TSM algorithms. We expect PyTSMod to help MIR and audio researchers use TSM algorithms easily in Python-based environments.

Available TSM algorithms

Overlap-Add (OLA)
- OLA is the simplest TSM algorithm. It changes the length of the signal by using different hop sizes for the analysis frames and the synthesis frames.
Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA)
- TD-PSOLA analyzes the original waveform to create pitch-synchronous analysis windows, then synthesizes the output signal from them. It can modify both the time scale and the pitch scale.
Waveform Similarity Overlap-Add (WSOLA)
- WSOLA allows a timing tolerance in the position of each analysis frame and uses cross-correlation to pick the position within that tolerance that maximizes waveform similarity.
Phase Vocoder (PV)
- The phase vocoder estimates instantaneous frequencies and uses them to update the phases of the input signal’s frequency components in the short-time Fourier transform. Although TSM with the phase vocoder gives high phase continuity, it causes transient smearing for percussive audio sources and a coloring artifact called phasiness.
TSM based on harmonic-percussive source separation (HPTSM)
- HPTSM is a TSM algorithm that applies the phase vocoder only to the harmonic sources and OLA only to the percussive sources.
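
To make the hop-size idea behind OLA concrete, here is a minimal NumPy sketch. It is not PyTSMod's implementation: the function name `ola_sketch`, the Hann window, and the default frame and hop sizes are all assumptions chosen for illustration. Frames are read from the input every `syn_hop / alpha` samples and written to the output every `syn_hop` samples, so the output is roughly `alpha` times as long as the input.

```python
import numpy as np


def ola_sketch(x, alpha, frame_len=1024, syn_hop=512):
    """Naive overlap-add time stretch of a mono signal (illustrative only).

    alpha > 1 lengthens the signal, alpha < 1 shortens it.
    """
    x = np.asarray(x, dtype=float)
    ana_hop = syn_hop / alpha                 # analysis hop differs from synthesis hop
    out_len = int(round(len(x) * alpha))
    n_frames = int(np.ceil(out_len / syn_hop)) + 1
    window = np.hanning(frame_len)

    # Zero-pad the input so every frame read stays inside the buffer.
    x_pad = np.concatenate([x, np.zeros(frame_len + int(np.ceil(ana_hop)))])
    y = np.zeros(n_frames * syn_hop + frame_len)
    norm = np.zeros_like(y)                   # running sum of the overlapped windows

    for m in range(n_frames):
        a = int(round(m * ana_hop))           # read position in the input
        s = m * syn_hop                       # write position in the output
        if a >= len(x):
            break
        y[s:s + frame_len] += window * x_pad[a:a + frame_len]
        norm[s:s + frame_len] += window

    norm[norm < 1e-8] = 1.0                   # avoid dividing by zero where no window landed
    return (y / norm)[:out_len]


# Stretch a one-second 220 Hz tone with the factors used in the examples on this page.
sr = 8000
tone = np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)
for alpha in (0.5, 1.2, 1.5):
    print(alpha, len(ola_sketch(tone, alpha)))
```

Because the frames are copied without any alignment, their waveforms generally do not line up in the overlap regions. This is the weakness that WSOLA's similarity search and the phase vocoder's phase update address.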

Installation

PyTSMod is hosted on PyPI. To install, run the following command in your Python environment:

$ pip install pytsmod
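
After installation, a quick check might look like the sketch below. It assumes that the package exposes a `wsola(x, s)` function that takes a NumPy array and a stretch factor, as in the project's usage notes; verify the exact signature against the GitHub repo. The `make_tone` helper is hypothetical and only stands in for audio you would normally load from a file.

```python
import numpy as np


def make_tone(sr=22050, seconds=1.0, freq=440.0):
    """Return a mono sine tone (hypothetical helper, replaces a real audio file)."""
    t = np.arange(int(sr * seconds)) / sr
    return 0.5 * np.sin(2 * np.pi * freq * t)


try:
    import pytsmod as tsm
except ImportError:
    tsm = None  # package not installed; the stretch calls below are skipped

x = make_tone()

if tsm is not None:
    # Assumed call: tsm.wsola(x, s), where s is the time-stretch factor.
    for alpha in (0.5, 1.2, 1.5):
        y = tsm.wsola(x, alpha)
        print(f"alpha={alpha}: {x.shape[-1]} -> {y.shape[-1]} samples")
```

The stretch factors are the same α values used in the audio examples on this page.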

Examples

Singing Voice

Original

TSM Results

Algorithm	α=0.5	α=1.2	α=1.5
OLA
TD-PSOLA ¹
WSOLA
PV
PV (with phase lock)
HPTSM

Music Clip

Original

TSM Results

Algorithm	α=0.5	α=1.2	α=1.5
OLA
WSOLA
PV
PV (with phase lock)
HPTSM

For more audio examples, please visit the GitHub repo of PyTSMod.

References

[1] Jonathan Driedger, Meinard Müller. “TSM Toolbox: MATLAB Implementations of Time-Scale Modification Algorithms”, Proceedings of the 17th International Conference on Digital Audio Effects (DAFx-14). 2014.

[2] Jonathan Driedger, Meinard Müller. “A review of time-scale modification of music signals”, Applied Sciences, 6(2), 57. 2016.

[3] Udo Zölzer. “DAFX: digital audio effects”, John Wiley & Sons. 2011.

¹ CREPE is used to extract the pitch of the audio source.

Music and Audio Computing Lab., KAIST

Sangeon Yong, Soonbeom Choi, and Juhan Nam
