In this new paper, we examine the nature of sound representations in the intermediate layers of convolutional DNNs through in silico experiments involving a new sound dataset of materials and actions. Furthermore, using a new methodology based on invertible neural networks, we show that there is a causal relationship between these internal representations and the semantic model output.
Deciphering the Transformation of Sounds into Meaning: Insights from Disentangling Intermediate Representations in Sound-to-Event DNNs. Tim Dick, Alexia Briassouli, Enrique Hortal Quesada and Elia Formisano (2024) Available at SSRN: https://ssrn.com/abstract=4979651 or http://dx.doi.org/10.2139/ssrn.4979651.
In this new article in Scientific Reports (Open Access), we show that deep neural networks (DNNs) mapping sounds to distributed language semantics approximate human listeners’ behavior better than standard DNNs with categorical output:
Esposito, M., Valente, G., Plasencia-Calaña, Y., Dumontier, M., Giordano, B.L., Formisano, E. Bridging auditory perception and natural language processing with semantically informed deep neural networks. Scientific Reports 14, 20994 (2024). https://doi.org/10.1038/s41598-024-71693-9
Read our preprint on arXiv reporting a survey of the many available audio-language datasets. The survey includes a useful overview of the different audio-language tasks and an informative analysis examining the overlap between the datasets.
Audio-Language Datasets of Scenes and Events: A Survey. Gijs Wijngaard, Elia Formisano, Michele Esposito, Michel Dumontier. https://arxiv.org/abs/2407.06947
Read our new preprint in bioRxiv on enhancing sound recognition with semantics. We found that integrating NLP embeddings into CNNs enhances the network’s sound recognition performance and aligns it with human auditory perception:
Bridging Auditory Perception and Natural Language Processing with Semantically Informed Deep Neural Networks. Michele Esposito, Giancarlo Valente, Yenisel Plasencia-Calaña, Michel Dumontier, Bruno L. Giordano, and Elia Formisano.
Read our new preprint on arXiv on a novel metric to evaluate and compare audio captions generated by humans or automated models: Gijs Wijngaard, Elia Formisano, Bruno L. Giordano, and Michel Dumontier. ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds. (2024) arXiv, 2403.18572. https://arxiv.org/abs/2403.18572.
You may also want to check Gijs’s Hugging Face model associated with this article, which classifies tokens of sound captions with auditory semantic tags (Who/What/How/Where): here
Read our new article in Nature Neuroscience (Open Access) on how deep neural networks and other computational models can help us understand how sounds are transformed into meaning in the human brain:
Giordano, B.L., Esposito, M., Valente, G., Formisano, E. Intermediate acoustic-to-semantic representations link behavioral and neural responses to natural sounds. Nat Neurosci (2023). https://doi.org/10.1038/s41593-023-01285-9
Maria de Araújo Nobre Duarte Vitória has joined the Aud2Sem research team as a full-time PhD student. Welcome, Maria!
Do you have a genuine interest in auditory perception? Are you familiar with neuroimaging acquisition and analysis methods? Apply now for a PhD student position in Cognitive Neuroscience and join our team! (closed)
Are you passionate about combining AI research and neuroscience to find out how the brain recognizes sounds? In the coming year, PhD student positions will be available in Maastricht (1 position, 4 years) and Marseille (2 positions, 3 years), giving you the opportunity to join our team. We will post information and links to the official applications here, so stay tuned…