UniPASE: A Generative Model for Universal Speech Enhancement
with High Fidelity and Low Hallucinations

Xiaobin Rong1,2, Zheng Wang1,2, Yushi Wang1,2, Jun Gao1,2, Jing Lu1,2
1Key Laboratory of Modern Acoustics, Nanjing University
2NJU-Horizon Intelligent Audio Lab, Horizon Robotics
Contents
  1. Audio Demos from the DNS 2020 No-reverb Test Set
  2. Audio Demos from the DNS 2020 With-reverb Test Set
  3. Audio Demos from the PLC 2024 Validation Set
  4. Audio Demos from the VoiceFixer GSR Test Set
  5. Audio Demos from the URGENT 2025 Non-blind Test Set
  6. Audio Examples in Ablation Study
  7. References
Audio Demos from the DNS 2020 No-reverb Test Set
Example 1
Noisy Signal
TF-GridNet [1] Output
StoRM [2] Output
LLaSE-G1 [3] Output
AnyEnhance [4] Output
PASE [5] Output
UniPASE Output (Ours)
Clean
Example 2
Noisy Signal
TF-GridNet [1] Output
StoRM [2] Output
LLaSE-G1 [3] Output
AnyEnhance [4] Output
PASE [5] Output
UniPASE Output (Ours)
Clean
Audio Demos from the DNS 2020 With-reverb Test Set
Example 1
Noisy Signal
TF-GridNet [1] Output
StoRM [2] Output
LLaSE-G1 [3] Output
AnyEnhance [4] Output
PASE [5] Output
UniPASE Output (Ours)
Clean
Example 2
Noisy Signal
TF-GridNet [1] Output
StoRM [2] Output
LLaSE-G1 [3] Output
AnyEnhance [4] Output
PASE [5] Output
UniPASE Output (Ours)
Clean
Audio Demos from the PLC 2024 Validation Set
Example 1
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Example 2
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Example 3
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Example 4
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Example 5
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Example 6
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Example 7
Noisy Signal
TF-GridNet [1] Output
LLaSE-G1 [3] Output
UniPASE Output (Ours)
Clean
Audio Demos from the VoiceFixer GSR Test Set
Example 1
Noisy Signal
TF-GridNet [1] Output
VoiceFixer [6] Output
AnyEnhance [4] Output
UniPASE Output (Ours)
Clean
Example 2
Noisy Signal
TF-GridNet [1] Output
VoiceFixer [6] Output
AnyEnhance [4] Output
UniPASE Output (Ours)
Clean
Example 3
Noisy Signal
TF-GridNet [1] Output
VoiceFixer [6] Output
AnyEnhance [4] Output
UniPASE Output (Ours)
Clean
Example 4
Noisy Signal
TF-GridNet [1] Output
VoiceFixer [6] Output
AnyEnhance [4] Output
UniPASE Output (Ours)
Clean
Example 5
Noisy Signal
TF-GridNet [1] Output
VoiceFixer [6] Output
AnyEnhance [4] Output
UniPASE Output (Ours)
Clean
Example 6
Noisy Signal
TF-GridNet [1] Output
VoiceFixer [6] Output
AnyEnhance [4] Output
UniPASE Output (Ours)
Clean
Audio Demos from the URGENT 2025 Non-blind Test Set
Example 1
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 2
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 3
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 4
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 5
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 6
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 7
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Example 8
Noisy Signal
TF-GridNet [1] Output
UniPASE Output (Ours)
Clean
Audio Examples in Ablation Study
Example 1
Enhanced Signal without MSRD
Enhanced Signal with MSRD
Example 2
Enhanced Signal without MSRD
Enhanced Signal with MSRD
Example 3
Enhanced Signal without MSRD
Enhanced Signal with MSRD
References

[1] Z.-Q. Wang, S. Cornell, S. Choi, Y. Lee, B.-Y. Kim, and S. Watanabe, “TF-GridNet: Integrating full- and sub-band modeling for speech separation,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 3221-3236, 2023.

[2] J.-M. Lemercier, J. Richter, S. Welker, and T. Gerkmann, “StoRM: A diffusion-based stochastic regeneration model for speech enhancement and dereverberation,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 2724-2737, 2023.

[3] B. Kang, X. Zhu, Z. Zhang, Z. Ye, M. Liu, Z. Wang, Y. Zhu, G. Ma, J. Chen, L. Xiao, C. Weng, W. Xue, and L. Xie, “LLaSE-G1: Incentivizing generalization capability for LLaMA-based speech enhancement,” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vienna, Austria, Jul. 2025, pp. 13292-13305.

[4] J. Zhang, J. Yang, Z. Fang, Y. Wang, Z. Zhang, Z. Wang, F. Fan, and Z. Wu, “AnyEnhance: A unified generative model with prompt-guidance and self-critic for voice enhancement,” IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 3085-3098, 2025.

[5] X. Rong, Q. Hu, M. Yesilbursa, K. Wojcicki, and J. Lu, “PASE: Leveraging the phonological prior of WavLM for low-hallucination generative speech enhancement,” in Proceedings of the 40th AAAI Conference on Artificial Intelligence (AAAI 2026), 2026, accepted.

[6] H. Liu, X. Liu, Q. Kong, Q. Tian, Y. Zhao, D. Wang, C. Huang, and Y. Wang, “VoiceFixer: A unified framework for high-fidelity speech restoration,” in Proceedings of Interspeech, 2022, pp. 4232-4236.