Demonstration Page

SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model

This page demonstrates SoundMorpher with a selection of morphed results from our experiments, including timbral morphing of musical instruments, dynamic music morphing, and environmental sound morphing. To avoid misunderstanding, we use sound perceptual distance proportion (SPDP) points 𝑝 to indicate how close each morphed result is to the source and target sounds. We hope you enjoy our demonstration and find some interesting sounds!
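As a small illustration of the SPDP notation, the sketch below generates evenly spaced pairs p = [a, b] with a + b = 1, sweeping from [0,1] to [1,0]. The function name `spdp_points` is hypothetical, and the even spacing is an assumption read off the p values listed on this page. It is not code from the paper.

```python
from fractions import Fraction


def spdp_points(num_points):
    """Return evenly spaced SPDP pairs (a, b) with a + b == 1.

    Assumption: the pairs sweep from [0, 1] to [1, 0] in equal steps,
    matching the p values listed in the tables on this page.
    """
    if num_points < 2:
        raise ValueError("need at least two points (both endpoints)")
    # Exact rational arithmetic avoids float drift such as 0.30000000000000004.
    step = Fraction(1, num_points - 1)
    return [(float(i * step), float(1 - i * step)) for i in range(num_points)]


# 11 points reproduce the grid used in the timbral morphing comparison.
for a, b in spdp_points(11):
    print(f"p = [{a:g},{b:g}]")
```

Calling `spdp_points(5)` gives the coarser grid used in the environmental sound examples.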


Timbral Morphing for Musical Instruments

This section demonstrates how SoundMorpher performs timbral morphing for musical instruments compared to SMT [1]. We provide 4 groups of comparisons across different musical instrument timbres. Please refer to Section 5.2 of the paper.

Input | Method | p = [0,1] | p = [0.1,0.9] | p = [0.2,0.8] | p = [0.3,0.7] | p = [0.4,0.6] | p = [0.5,0.5] | p = [0.6,0.4] | p = [0.7,0.3] | p = [0.8,0.2] | p = [0.9,0.1] | p = [1,0]

Group 1
Source | SMT
Target | SoundMorpher

Group 2
Source | SMT
Target | SoundMorpher

Group 3
Source | SMT
Target | SoundMorpher

Group 4
Source | SMT
Target | SoundMorpher


Music Morphing

This section demonstrates how SoundMorpher smoothly morphs source music into target music through dynamic morphing. For further explanation and experimental results, please refer to Section 5.4 of the paper. Under "Selected intermediate morphed samples", we provide a selection of the N=15 intermediate results to showcase how SoundMorpher's morphed results ensure intermediateness and correspondence between the two inputs. This also shows that SoundMorpher can be applied to creative music composition, given two music compositions as input.
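To make the idea of perceptually even steps between α = 0 and α = 1 concrete, here is a generic sketch. It is not the method from the paper. It assumes we already have a few measured pairs (α, proportion of perceptual distance travelled from the source), with the proportion rising monotonically from 0 to 1. It then inverts that curve by linear interpolation to pick α values whose perceptual proportions are evenly spaced. The function name and the sample measurements are hypothetical.

```python
def alphas_for_uniform_steps(alphas, proportions, num_steps):
    """Pick alpha values whose perceptual proportions are evenly spaced.

    alphas, proportions: measured samples of a monotone curve
        (hypothetical data; proportions must run from 0.0 to 1.0).
    num_steps: number of output points, including both endpoints.
    """
    if len(alphas) != len(proportions) or len(alphas) < 2:
        raise ValueError("need matching lists with at least two samples")
    if proportions[0] != 0.0 or proportions[-1] != 1.0:
        raise ValueError("proportions must start at 0.0 and end at 1.0")
    if any(b < a for a, b in zip(proportions, proportions[1:])):
        raise ValueError("proportions must be non-decreasing")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")

    result = []
    for i in range(num_steps):
        target = i / (num_steps - 1)
        # Find the first measured segment that brackets the target proportion.
        for k in range(len(alphas) - 1):
            p0, p1 = proportions[k], proportions[k + 1]
            if p0 <= target <= p1:
                w = 0.0 if p1 == p0 else (target - p0) / (p1 - p0)
                result.append(alphas[k] + w * (alphas[k + 1] - alphas[k]))
                break
    return result


# Hypothetical S-shaped measurements: most perceptual change happens mid-way.
measured_alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
measured_props = [0.0, 0.05, 0.5, 0.95, 1.0]
for alpha in alphas_for_uniform_steps(measured_alphas, measured_props, 15):
    print(f"alpha = {alpha:.3f}")
```

With these made-up measurements, the selected α values cluster around 0.5, where the perceptual change is fastest, rather than being spaced linearly.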

Source Target Dynamic Morphing α = 0 α = 1 Selected intermediate morphed samples

Environmental Sound Morphing

This section demonstrates how SoundMorpher smoothly morphs a source environmental sound into a target environmental sound across different categories. Source and target audio recordings are from the ESC-50 dataset. For further explanation and experimental results, please refer to Section 5.3 of the paper. This section also includes failure cases in which SoundMorpher produces abrupt transitions because the two input sounds have significant semantic differences in content.

1. Dog and cat vocal morphing

Source | Target | p = [0,1] | p = [0.25,0.75] | p = [0.5,0.5] | p = [0.75,0.25] | p = [1,0]

2. Baby crying and laughing human sounds morphing

Source | Target | p = [0,1] | p = [0.25,0.75] | p = [0.5,0.5] | p = [0.75,0.25] | p = [1,0]

3. Church bells and clock alarm sounds morphing

Source | Target | p = [0,1] | p = [0.25,0.75] | p = [0.5,0.5] | p = [0.75,0.25] | p = [1,0]

4. Wood door knocking and clapping sounds morphing

Source | Target | p = [0,1] | p = [0.25,0.75] | p = [0.5,0.5] | p = [0.75,0.25] | p = [1,0]

Reference

[1] Marcelo Caetano. Morphing musical instrument sounds with the sinusoidal model in the sound morphing toolbox. In International Symposium on Computer Music Multidisciplinary Research, pp. 481–503. Springer, 2019.