SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model

Return to index page

Ablation Study

This page contains demonstrations for the ablation studies discussed in Section V.E of the paper.



1. Ablation study on sound perceptual features

Example 1

Source music

Target music




(log) Mel-Spectrogram

MFCCs

Reduced (log) Mel Spectrogram (n=2)

Reduced (log) Mel Spectrogram (n=3)

Spectral Contrast

Example 2

Source music

Target music




(log) Mel-Spectrogram

MFCCs

Reduced (log) Mel Spectrogram (n=2)

Reduced (log) Mel Spectrogram (n=3)

Spectral Contrast

Example 3

Source music

Target music




(log) Mel-Spectrogram

MFCCs

Reduced (log) Mel Spectrogram (n=2)

Reduced (log) Mel Spectrogram (n=3)

Spectral Contrast

Example 4

Source music

Target music




(log) Mel-Spectrogram

MFCCs

Reduced (log) Mel Spectrogram (n=2)

Reduced (log) Mel Spectrogram (n=3)

Spectral Contrast


2. Ablation study on LoRA rank

Example 1

Source music

Target music




Reconstruction α=0.0

w/o LoRA

r = 4

r = 8

r = 16

r = 32

Reconstruction α=1.0

w/o LoRA

r = 4

r = 8

r = 16

r = 32

Dynamic morph

w/o LoRA

r = 4

r = 8

r = 16

r = 32

Example 2

Source music

Target music




Reconstruction α=0.0

w/o LoRA

r = 4

r = 8

r = 16

r = 32

Reconstruction α=1.0

w/o LoRA

r = 4

r = 8

r = 16

r = 32

Dynamic morph

w/o LoRA

r = 4

r = 8

r = 16

r = 32


3. Failure cases

Example 1

Source audio

Target audio




α=0.0




α=1.0

Example 2

Source audio

Target audio




α=0.0




α=1.0

Example 3

Source audio

Target audio




α=0.0




α=1.0
