{"id":593,"date":"2024-04-16T13:40:15","date_gmt":"2024-04-16T13:40:15","guid":{"rendered":"https:\/\/mbic-auditorylab.nl\/home\/?page_id=593"},"modified":"2025-10-02T15:19:55","modified_gmt":"2025-10-02T13:19:55","slug":"publications","status":"publish","type":"page","link":"https:\/\/mbic-auditorylab.nl\/home\/publications\/","title":{"rendered":"Publications"},"content":{"rendered":"<div class=\"wpb-content-wrapper\"><p>[vc_row][vc_column][vc_column_text css=&#8221;&#8221;]<\/p>\n<h3><strong>Articles\u00a0<\/strong><\/h3>\n<ul>\n<li><span dir=\"ltr\" role=\"presentation\"><em>Article:<\/em> Deciphering the Transformation of Sounds into Meaning: Insights from Disentangling Intermediate Representations in Sound-to-Event DNNs. <strong>Tim Dick<\/strong>, Alexia Briassouli, Enrique Hortal Quesada and Elia Formisano (in press). Neurocomputing <a href=\"https:\/\/doi.org\/10.1016\/j.neucom.2025.131600\">https:\/\/doi.org\/10.1016\/j.neucom.2025.131600<\/a><\/span>\n<ul>\n<li><span dir=\"ltr\" role=\"presentation\">Preprint on SSRN: <a href=\"https:\/\/ssrn.com\/abstract=4979651\" target=\"_blank\" rel=\"noopener\">https:\/\/ssrn.com\/abstract=4979651<\/a>\u00a0or\u00a0<a class=\"textlink\" href=\"https:\/\/dx.doi.org\/10.2139\/ssrn.4979651\" target=\"_blank\" rel=\"noopener\">http:\/\/dx.doi.org\/10.2139\/ssrn.4979651<\/a><\/span><span dir=\"ltr\" role=\"presentation\">.<\/span><\/li>\n<\/ul>\n<\/li>\n<li><em>Article:<\/em> Audio-Language Datasets of Scenes and Events: A Survey, <strong>Gijs Wijngaard,<\/strong> E. Formisano, M. Esposito and M. Dumontier (2025) <em>IEEE Access, <\/em>13,20328-20360. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/10854210?source=authoralert\">https:\/\/ieeexplore.ieee.org\/document\/10854210?source=authoralert<\/a>\n<ul>\n<li>\u00a0Preprint on\u00a0arXiv:2407.06947v1 \u00a0<a href=\"https:\/\/arxiv.org\/abs\/2407.06947#\">https:\/\/arxiv.org\/abs\/2407.06947#\u00a0<\/a><\/li>\n<\/ul>\n<\/li>\n<li><span dir=\"ltr\" role=\"presentation\"><em>Article:<\/em> Bridging Auditory Perception and Natural <\/span><span dir=\"ltr\" role=\"presentation\">Language Processing with Semantically <\/span><span dir=\"ltr\" role=\"presentation\">informed Deep Neural Networks. <\/span><strong><span dir=\"ltr\" role=\"presentation\">Michele Esposito<\/span><\/strong><span dir=\"ltr\" role=\"presentation\">, Giancarlo Valente<\/span><span dir=\"ltr\" role=\"presentation\">, Yenisel <\/span><span dir=\"ltr\" role=\"presentation\">Plasencia-Cala\u00f1<\/span><span dir=\"ltr\" role=\"presentation\">a<\/span><span dir=\"ltr\" role=\"presentation\">, Michel Dumontier<\/span><span dir=\"ltr\" role=\"presentation\">, Bruno L. Giordano, <\/span><span dir=\"ltr\" role=\"presentation\">and Elia Formisano (2024) \u00a0<i>Scientific Reports<\/i>\u00a0<b>14<\/b>, 20994 (2024). <a href=\"https:\/\/doi.org\/10.1038\/s41598-024-71693-9\">https:\/\/doi.org\/10.1038\/s41598-024-71693-9<\/a><\/span>\n<ul>\n<li><span dir=\"ltr\" role=\"presentation\"><span class=\"highwire-cite-metadata-journal highwire-cite-metadata\">Preprint on bioRxiv <\/span><span class=\"highwire-cite-metadata-pages highwire-cite-metadata\">2024.04.29.591634.\u00a0<a href=\"https:\/\/www.biorxiv.org\/content\/10.1101\/2024.04.29.591634v1\">https:\/\/www.biorxiv.org\/content\/10.1101\/2024.04.29.591634v1<\/a><\/span><\/span><\/li>\n<\/ul>\n<\/li>\n<li><em>Preprint:<\/em> ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds.<strong> Gijs Wijngaard<\/strong> and Elia Formisano and Bruno L. 
Giordano and Michel Dumontier (2024) <em>arXiv, <\/em>2403.18572. <a href=\"https:\/\/arxiv.org\/abs\/2403.18572\">https:\/\/arxiv.org\/abs\/2403.18572<\/a><\/li>\n<li><em>Article:<\/em> Intermediate acoustic-to-semantic representations link behavioural and neural responses to natural sounds. <strong>Bruno L. Giordano<\/strong>, Michele Esposito, Giancarlo Valente, Elia Formisano\u00a0(2023)\u00a0<em>Nature Neuroscience<\/em>, 26(4):664-672.\u00a0<a href=\"https:\/\/www.nature.com\/articles\/s41593-023-01285-9\">https:\/\/www.nature.com\/articles\/s41593-023-01285-9<\/a><\/li>\n<li><em>Article:<\/em> What do we mean with sound semantics, exactly? A survey of taxonomies and ontologies of everyday sounds.<strong> Bruno L. Giordano<\/strong>, de Miranda Azevedo R, Plasencia-Cala\u00f1a Y, Formisano E*, Dumontier M* (2022) Front Psychol. 2022 Sep 29;13:964209. \u00a0<a href=\"https:\/\/doi.org\/10.3389\/fpsyg.2022.964209\">https:\/\/doi.org\/10.3389\/fpsyg.2022.964209<\/a><\/li>\n<\/ul>\n<h3>Conference proceedings<\/h3>\n<ul>\n<li>AudSemThinker: Enhancing Audio-Language Models through Reasoning over Semantics of Sound. <strong>Wijngaard G<\/strong>, Formisano E, Esposito M, Dumontier M. Accepted at <a href=\"https:\/\/neurips.cc\/\">NeurIPS2025.<\/a> Preprint: <a href=\"https:\/\/arxiv.org\/abs\/2505.14142\">arXiv:2505.14142<\/a><\/li>\n<li>ACES: Evaluating automated audio captioning models on the semantics of sounds <strong>Wijngaard G,<\/strong> Formisano E, Giordano B, Dumontier M, ACES: Evaluating automated audio captioning models on the semantics of sounds. Accepted at 31st European Signal Processing Conference, (EUSIPCO 2023)<a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/10289793\"> https:\/\/ieeexplore.ieee.org\/abstract\/document\/10289793<\/a><\/li>\n<li>Semantically informed deep neural networks for sound recognition. <strong><span dir=\"ltr\" role=\"presentation\">Michele Esposito<\/span><\/strong><span dir=\"ltr\" role=\"presentation\"><strong>,<\/strong> Giancarlo Valente<\/span><span dir=\"ltr\" role=\"presentation\">, Yeni <\/span><span dir=\"ltr\" role=\"presentation\">Plasencia-Cala<\/span><span dir=\"ltr\" role=\"presentation\">na<\/span><span dir=\"ltr\" role=\"presentation\">, Bruno L. Giordano<\/span>\u00a0<span dir=\"ltr\" role=\"presentation\">and Elia Formisano.\u00a0 Presented at <\/span>IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Session: &#8220;Synergy between human and machine approaches to sound\/scene recognition and processing&#8221;\u00a0<span dir=\"ltr\" role=\"presentation\"> <a href=\"https:\/\/ieeexplore.ieee.org\/document\/10095606\">https:\/\/ieeexplore.ieee.org\/document\/10095606\u00a0<\/a><\/span><\/li>\n<\/ul>\n<h3>Conference abstracts\/posters\/presentations<\/h3>\n<ul>\n<li>Evaluating the impact of multiscale temporal processing on sound-to-event recurrent neural networks. <strong>Michele Esposito<\/strong>, Bruno, L. Giordano, Giancarlo Valente, Elia Formisano (Poster Presented at Cognitive Computational Neuroscience, August 6\u20139, 2024, Boston, Massachusetts)\n<ul>\n<li><a href=\"https:\/\/2024.ccneuro.org\/pdf\/478_Paper_authored_CCN_Abstract_Esposito.pdf\">https:\/\/2024.ccneuro.org\/pdf\/478_Paper_authored_CCN_Abstract_Esposito.pdf<\/a><\/li>\n<\/ul>\n<\/li>\n<li>Disentangling Intermediate Representations in Sound-to-Event DNNs using Invertible Flow Models. 
<strong>Tim Dick<\/strong>, Enrique Hortal Quesada, Alexia Briassouli, Elia Formisano\u00a0(Poster Presented at Cognitive Computational Neuroscience, August 6\u20139, 2024, Boston, Massachusetts)\n<ul>\n<li><a href=\"https:\/\/2024.ccneuro.org\/pdf\/158_Paper_authored_Disentangling-Intermediate-Representations-in-Sound-to-Event-DNNs.pdf\">https:\/\/2024.ccneuro.org\/pdf\/158_Paper_authored_Disentangling-Intermediate-Representations-in-Sound-to-Event-DNNs.pdf\u00a0<\/a><\/li>\n<\/ul>\n<\/li>\n<li>Optimal stimulus selection for dissociating acoustic and semantic processing of natural sounds. <strong>Maria Ara\u00fajo Vitoria<\/strong>, Marie Plegat, Giorgio Marinato, Michele Esposito, Christian Herff, Bruno L. Giordano, Elia Formisano (Poster Presented at Cognitive Computational Neuroscience, August 6\u20139, 2024, Boston, Massachusetts)\n<ul>\n<li><a href=\"https:\/\/2024.ccneuro.org\/pdf\/14_Paper_authored_ccn_manuscript_authors_MariaAraujoVitoria.pdf\">https:\/\/2024.ccneuro.org\/pdf\/14_Paper_authored_ccn_manuscript_authors_MariaAraujoVitoria.pdf<\/a><\/li>\n<\/ul>\n<\/li>\n<li>Semantically informed deep neural networks for sound recognition. <strong><span dir=\"ltr\" role=\"presentation\">Michele Esposito<\/span><\/strong><span dir=\"ltr\" role=\"presentation\"><strong>,<\/strong> Giancarlo Valente<\/span><span dir=\"ltr\" role=\"presentation\">, Yeni <\/span><span dir=\"ltr\" role=\"presentation\">Plasencia-Cala<\/span><span dir=\"ltr\" role=\"presentation\">na<\/span><span dir=\"ltr\" role=\"presentation\">, Bruno L. Giordano<\/span>\u00a0<span dir=\"ltr\" role=\"presentation\">and Elia Formisano. (Poster presented\u00a0 at International Conference on Auditory Cortex 2022)<\/span><\/li>\n<li>Intermediate acoustic-to-semantic representations link behavioural and neural responses to natural sounds. <strong>Bruno L. Giordano<\/strong>, Michele Esposito, Giancarlo Valente, Elia Formisano <span dir=\"ltr\" role=\"presentation\">(Poster presented\u00a0 at ICAC 2022)<\/span><\/li>\n<\/ul>\n<h3>Master Theses<\/h3>\n<ul>\n<li>2023-2024: Master Data Science for Decision Making. How do we recognize auditory events? Disentangling intermediate representations in the acoustic-to-semantic transformation. <strong>Tim Dick<\/strong> (supervision: Formisano, Briassouli, Quesada) (2024 Master program&#8217;s best thesis award)<\/li>\n<li>2021-2022: Research Master Cognitive Neuroscience: The Influence of Semantics on Neural Models of Sound Recognition. <strong>Martin Bremm<\/strong>, (supervision: Formisano, Dumontier)<\/li>\n<li>2020-2021: Research Master Cognitive Neuroscience:\u00a0 Opening the Black Box: Interpreting a Neural Network for Semantic Sound Recognition Using Layer-wise Relevance Propagation. <strong>Anna Vorreuther\u00a0<\/strong> (supervision, Formisano, Valente)<\/li>\n<\/ul>\n<p>[\/vc_column_text][\/vc_column][\/vc_row]<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>[vc_row][vc_column][vc_column_text css=&#8221;&#8221;] Articles\u00a0 Article: Deciphering the Transformation of Sounds into Meaning: Insights from Disentangling Intermediate Representations in Sound-to-Event DNNs. Tim Dick, Alexia Briassouli, Enrique Hortal Quesada and Elia Formisano (in press). Neurocomputing https:\/\/doi.org\/10.1016\/j.neucom.2025.131600 Preprint on SSRN: https:\/\/ssrn.com\/abstract=4979651\u00a0or\u00a0http:\/\/dx.doi.org\/10.2139\/ssrn.4979651. Article: Audio-Language Datasets of Scenes and Events: A Survey, Gijs Wijngaard, E. Formisano, M. Esposito and M. 