Main Conference Accepted Papers

Long Papers

Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement
Xin Quan, Marco Valentino, Louise A. Dennis and Andre Freitas

Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions
Marco Valentino, Danilo Carvalho and Andre Freitas

Anisotropy Is Inherent to Self-Attention in Transformers
Nathan Godey, Éric Villemonte de la Clergerie and Benoît Sagot

Generating Benchmarks for Factuality Evaluation of Language Models
Dor Muhlgay, Ori Ram, Inbal Magar, Yoav Levine, Nir Ratner, Yonatan Belinkov, Omri Abend, Kevin Leyton-Brown, Amnon Shashua and Yoav Shoham

Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu, Patrícia Schmidtová, Mateusz Lango and Ondrej Dusek

Archer: A Human-Labeled Text-to-SQL Dataset with Arithmetic, Commonsense and Hypothetical Reasoning
Danna Zheng, Mirella Lapata and Jeff Z. Pan

GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution
Yining Lu, Haoping Yu and Daniel Khashabi

LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models
Adian Liusie, Potsawee Manakul and Mark Gales

Parameter-Efficient Conversational Recommender System as a Language Processing Task
Mathieu Ravaut, Hao Zhang, Lu Xu, Aixin Sun and Yong Liu

OpenPI2.0: An Improved Dataset for Entity Tracking in Texts
Li Zhang, Hainiu Xu, Abhinav Kommula, Chris Callison-Burch and Niket Tandon

A Comparative Multidimensional Analysis of Empathetic Systems
Andrew Lee, Jonathan K. Kummerfeld, Larry Ann and Rada Mihalcea

Few-Shot Data Synthesis for Open Domain Multi-Hop Question Answering
Mingda Chen, Xilun Chen and Wen-tau Yih

Language Models as Inductive Reasoners
Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao and Furu Wei

FTC-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba Oluwadara Alabi, Yanke Mao, Haonan Gao and En-shiun Annie Lee

FinBPM: A Framework for Portfolio Management-based Financial Investor Behavior Perception Model
Zhilu Zhang, Procheta Sen, Zimu Wang, Ruoyu Sun, Zhengyong Jiang and Jionglong Su

Asking the Right Question at the Right Time: Human and Model Uncertainty Guidance To Ask Clarification Questions
Alberto Testoni and Raquel Fernández

Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification
Luke Bates and Iryna Gurevych

Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon
Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych and Timothy Baldwin

CEAN: Contrastive Event Aggregation Network with LLM-based Augmentation for Event Extraction
Zihao Meng, Tao Liu, Heng Zhang, Kai Feng and Peng Zhao

How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?
Danni Liu and Jan Niehues

MultiMUC: Multilingual Template Filling on MUC-4
William Gantt, Shabnam Behzad, Hannah Youngeun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme and Mahsa Yarmohammadi

Align and Augment: Generative Data Augmentation for Compositional Generalization
Francesco Cazzaro, Davide Locatelli and Ariadna Quattoni

UNSEE: Unsupervised Non-contrastive Sentence Embeddings
Ömer Veysel Çağatan

EXPLORER: Exploration-guided Reasoning for Textual Reinforcement Learning
Kinjal Basu, Keerthiram Murugesan, Subhajit Chaudhury, Murray Campbell, Kartik Talamadupula and Tim Klinger

From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions
Fabian Retkowski and Alex Waibel

Fréchet Distance for Offline Evaluation of Information Retrieval Systems with Sparse Labels
Negar Arabzadeh and Charles Clarke

Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models
Erik Arakelyan, Zhaoqi Liu and Isabelle Augenstein

Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties
Katya Artemova, Verena Blaschke and Barbara Plank

PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents
Simeng Sun, Yang Liu, Shuohang Wang, Dan Iter, Chenguang Zhu and Mohit Iyyer

LAraBench: Benchmarking Arabic AI with Large Language Models
Ahmed Abdelali, Hamdy Mubarak, Shammur Absar Chowdhury, Maram Hasanain, Basel Mousi, Sabri Boughorbel, Samir Abdaljalil, Yassine El Kheir, Daniel Izham, Fahim Dalvi, Majd Hawasly, Nizi Nazar, Youssef Ibrahim Elshahawy, Ahmed Ali, Nadir Durrani, Natasa Milic-Frayling and Firoj Alam

SentenceLDA: Discriminative and Robust Document Representation with Sentence Level Topic Model
Taehun Cha and Donghun Lee

AdaPT: A Set of Guidelines for Hyperbolic Multimodal Multilingual NLP
Ramit Sawhney, Megh Thakkar, Vishwa Shah, Shrey Pandit and Shafiq Joty

Towards Hierarchical Spoken Language Disfluency Modeling
Jiachen Lian and Gopala Anumanchipalli

Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion
Aly M. Kassem and Sherif Saad

FAIR: Filtering of Automatically Induced Rules
Divya Jyoti Bajpai, Ayush Maheshwari, Manjesh Kumar Hanawal and Ganesh Ramakrishnan

NNOSE: Nearest Neighbor Occupational Skill Extraction
Mike Zhang, Rob van der Goot, Min-Yen Kan and Barbara Plank

GAINER: Graph Machine Learning with Node-specific Radius for Classification of Short Texts and Documents
Naganand Yadati

MAFIA: Multi-Adapter Fused Inclusive Language Models
Prachi Jain, Ashutosh Sathe, Varun Gumma, Kabir Ahuja and Sunayana Sitaram

Code-Switched Language Identification is Harder Than You Think
Laurie Burchell, Alexandra Birch, Robert Peter Thompson and Kenneth Heafield

Generation-driven Contrastive Self-training for Zero-shot Text Classification with Instruction-tuned GPT
Ruohong Zhang, Yau-shian Wang and Yiming Yang

Quantifying the Hyperparameter Sensitivity of Neural Networks for Character-level Sequence-to-Sequence Tasks
Adam Wiemerslage, Kyle Gorman and Katharina von der Wense

Examining Gender and Racial Bias in Large Vision–Language Models Using a Novel Dataset of Parallel Images
Kathleen C. Fraser and Svetlana Kiritchenko

ConstraintChecker: A Plugin for Large Language Models to Reason on Commonsense Knowledge Bases
Quyet V. Do, Tianqing Fang, Shizhe Diao, Zhaowei Wang and Yangqiu Song

A* shortest string decoding for non-idempotent semirings
Kyle Gorman and Cyril Allauzen

Importance-Aware Data Augmentation for Document-Level Neural Machine Translation
Minghao Wu, Yufei Wang, George Foster, Lizhen Qu and Gholamreza Haffari

Lost in Translationese? Reducing Translation Effect Using Abstract Meaning Representation
Shira Wein and Nathan Schneider

Comparing Template-based and Template-free Language Model Probing
Sagi Shaier, Kevin Bennett, Lawrence Hunter and Katharina von der Wense

Desiderata For The Context Use Of Question Answering Systems
Sagi Shaier, Lawrence Hunter and Katharina von der Wense

Scaled-up Discovery of Latent Concepts in Deep NLP Models
Majd Hawasly, Fahim Dalvi and Nadir Durrani

AnthroScore: A Computational Linguistic Measure of Anthropomorphism
Myra Cheng, Kristina Gligoric, Tiziano Piccardi and Dan Jurafsky

Centering the Speech Community
Steven Bird and Dean Yibarbuk

Beyond Automated Evaluation Metrics: Evaluating Topic Models On Practical Social Science Content Analysis Tasks
Zongxia Li, Andrew Mao, Daniel Kofi Stephens, Pranav Goel, Emily Walpole, Alden Dima, Juan Francisco Fung and Jordan Lee Boyd-Graber

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
Surangika Ranathunga, Nisansa de Silva, Charitha S.M. Rathnayake, Aloka Fernando and Velayuthan Menan

VOLTAGE: A Versatile contrastive learning based OCR methodology for ultra Low-resource scripts Through Auto Glyph feature Extraction
Prawaal Sharma, Poonam Goyal, Vidisha Sharma and Navneet Goyal

Unsupervised Contrast-Consistent Ranking with Language Models
Niklas Stoehr, Pengxiang Cheng, Jing Wang, Daniel Preotiuc-Pietro and Rajarshi Bhowmik

Entity-level Factual Adaptiveness of Fine-tuning based Abstractive Summarization Models
Jongyoon Song, Nohil Park, Bongkyu Hwang, Jaewoong Yun, Seongho Joe, Youngjune Gwon and Sungroh Yoon

Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations
Prince Jha, Krishanu Maity, Raghav Jain, Apoorv Verma, Sriparna Saha and Pushpak Bhattacharyya

LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed and Alham Fikri Aji

Automated Cognate Detection as a Supervised Link Prediction Task with Cognate Transformer
V.S.D.S. Mahesh Akavarapu and Arnab Bhattacharya

Leveraging Multi-lingual Positive Instances in Contrastive Learning to Improve Sentence Embedding
Kaiyan Zhao, Qiyu Wu, Xin-qiang Cai and Yoshimasa Tsuruoka

Moderation in the Wild: Investigating User-Driven Moderation in Online Discussions
Neele Falk, Eva Maria Vecchi, Iman Jundi and Gabriella Lapesa

Cross-Lingual Transfer from Related Languages: Treating Low-Resource Maltese as Multilingual Code-Switching
Kurt Micallef, Nizar Habash, Claudia Borg, Fadhl Eryani and Houda Bouamor

Where Do We Go From Here? Multi-scale Allocentric Relational Inference from Natural Spatial Descriptions
Tzuf Paz-Argaman, John Palowitch, Sayali Kulkarni, Reut Tsarfaty and Jason Michael Baldridge

Bias in Opinion Summarisation from Pre-training to Adaptation: A Case Study in Political Bias
Nannan Huang, Haytham M. Fayek and Xiuzhen Zhang

Document Structure in Long Document Transformers
Jan Buchmann, Max Eichler, Jan-Micha Bodensohn, Ilia Kuznetsov and Iryna Gurevych

The Role of Data Curation in Image Captioning
Wenyan Li, Jonas F. Lotz, Chen Qiu and Desmond Elliott

Large-Scale Bitext Corpora Provide New Evidence for Cognitive Representations of Spatial Terms
Peter Viechnicki, Kevin Duh, Anthony Kostacos and Barbara Landau

REFINER: Reasoning Feedback on Intermediate Representations
Debjit Paul, Mete Ismayilzada, Maxime Peyrard, Beatriz Borges, Antoine Bosselut, Robert West and Boi Faltings

HumBEL: A Human-in-the-Loop Approach for Evaluating Demographic Factors of Language Models in Human-Machine Conversations
Anthony Sicilia, Jennifer C. Gates and Malihe Alikhani

LOCOST: State-Space Models for Long Document Abstractive Summarization
Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier and Patrick Gallinari

A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation
Sahar Sadrizadeh, Ljiljana Dolamic and Pascal Frossard

Improving Generalization in Semantic Parsing by Increasing Natural Language Variation
Irina Saparina and Mirella Lapata

Text-to-Code Generation with Modality-relative Pre-training
Fenia Christopoulou, Guchun Zhang and Gerasimos Lampouras

No Error Left Behind: Multilingual Grammatical Error Correction with Pre-trained Translation Models
Agnes Luhtaru, Elizaveta Korotkova and Mark Fishel

Quantifying Stereotypes in Language
Yang Liu

Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with a Foundational Language Model
Andrew Brown, Jiading Zhu, Mohamed Abdelwahab, Alec Dong, Cindy Wang and Jonathan Rose

Multi-Reference Benchmarks for Russian Grammatical Error Correction
Frank Palma Gomez and Alla Rozovskaya

Plan-Grounded Large Language Models for Dual Goal Conversational Settings
Diogo Glória-Silva, Rafael Ferreira, Diogo Tavares, David Semedo and Joao Magalhaes

“Define Your Terms”: Enhancing Efficient Offensive Speech Classification with Definition
Huy Nghiem, Umang Gupta and Fred Morstatter

VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension
Thinh Phuoc Ngo, Khoa Tran Anh Dang, Son T. Luu, Kiet Van Nguyen and Ngan Luu-thuy Nguyen

CEV-LM: Controlled Edit Vector Language Model for Shaping Natural Language Generations
Samraj Moorjani, Adit Krishnan and Hari Sundaram

It's All Relative: Learning Interpretable Models for Scoring Subjective Bias in Documents from Pairwise Comparisons
Aswin Suresh, Wu Chi Hsuan and Matthias Grossglauser

HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification
Vidit Jain, Mukund Rungta, Yuchen Zhuang, Yue Yu, Zeyu Wang, Mu Gao, Jeffrey Skolnick and Chao Zhang

A Truly Joint Neural Architecture for Segmentation and Parsing
Danit Yshaayahu Levi and Reut Tsarfaty

ViLexNorm: A Lexical Normalization Corpus for Vietnamese Social Media Text
Thanh-nhi Nguyen, Thanh-phong Le and Kiet Van Nguyen

Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation
Kun Zhou, Yifan Li, Xin Zhao and Ji-rong Wen

Unleashing the Power of Discourse-Enhanced Transformers for Propaganda Detection
Alexander Chernyavskiy, Dmitry Ilvovsky and Preslav Nakov

Predicting Client Emotions and Therapist Interventions in Psychotherapy Dialogues
Tobias Mayer, Neha Warikoo, Amir Eliassaf, Dana Atzil-Slonim and Iryna Gurevych

Who Needs Decoders? Efficient Estimation of Sequence-Level Attributes with Proxies
Yassir Fathullah, Puria Radmard, Adian Liusie and Mark Gales

3D Rotation and Translation for Hyperbolic Knowledge Graph Embedding
Yihua Zhu and Hidetoshi Shimodaira

Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking
Yong Cao, Ruixue Ding, Boli Chen, Xianzhi Li, Min Chen, Daniel Hershcovich, Pengjun Xie and Fei Huang

Style-News: Incorporating Stylized News Generation and Adversarial Verification for Neural Fake News Detection
Wei-yao Wang, Yu-chieh Chang and Wen-chih Peng

Graph-based Clustering for Detecting Semantic Change Across Time and Languages
Xianghe Ma, Michael Strube and Wei Zhao

Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models
Haoqiang Kang, Terra Blevins and Luke Zettlemoyer

Anchor Points: Benchmarking Models with Much Fewer Examples
Rajan Pathe Vivek, Kawin Ethayarajh, Diyi Yang and Douwe Kiela

SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling
Eileen Wang, Caren Han and Josiah Poon

Discovering and Articulating Frames of Communication from Social Media Using Chain-of-Thought Reasoning
Maxwell Weinzierl and Sanda Harabagiu

VEIL: Vetting Extracted Image Labels from In-the-Wild Captions for Weakly-Supervised Object Detection
Arushi Rai and Adriana Kovashka

WSC+: Enhancing The Winograd Schema Challenge Using Tree-of-Experts
Pardis Sadat Zahraei and Ali Emami

An Interactive Framework for Profiling News Media Sources
Nikhil Mehta and Dan Goldwasser

Kardeş-NLU: Transfer to Low-Resource Languages with Big Brother's Help – A Benchmark and Evaluation for Turkic Languages
Lütfi Kerem Senel, Benedikt Ebing, Konul Baghirova, Hinrich Schuetze and Goran Glavaš

Inductive Reasoning Elicitation for Temporal Relation Understanding
Jongho Kim, Dohyeon Lee, Minsoo Kim and Seung-won Hwang

Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks
Huajian Zhang, Yumo Xu and Laura Perez-Beltrachini

AnaDE1.0: A Novel Data Set for Benchmarking Analogy Detection and Extraction
Bhavya Bhavya, Shradha Sehgal, Jinjun Xiong and Chengxiang Zhai

Beyond Words: A Comprehensive Survey of Sentence Representations
Abhinav Ramesh Kashyap, Thanh-tung Nguyen, Viktor Schlegel, Stefan Winkler, See-kiong Ng and Soujanya Poria

Learning to Retrieve In-Context Examples for Large Language Models
Liang Wang, Nan Yang and Furu Wei

EnCore: Fine-Grained Entity Typing by Pre-Training Entity Encoders on Coreference Chains
Frank Martin Mtumbuka and Steven Schockaert

Unsupervised stance detection for social media discussions: A generic baseline
Maia Sutter, Antoine Gourru, Amine Trabelsi and Christine Largeron

Putting Context in Context: the Impact of Discussion Structure on Text Classification
Nicolò Penzo, Antonio Longa, Bruno Lepri, Sara Tonelli and Marco Guerini

Aligning Large Language Models via Chain-of-Thought Reasoning
Leonardo Ranaldi and Andre Freitas

Disentangling the Roles of Target-side Transfer and Regularization in Multilingual Machine Translation
Yan Meng and Christof Monz

Uncovering Stereotypes in Large Language Models: A Task Complexity-based Approach
Hari Shrawgi, Prasanjit Rath, Tushar Singhal and Sandipan Dandapat

Nearest-neighbor-assisted Fine-tuning for Neural Machine Translation
Jiayi Wang, Ke Wang, Yuqi Zhang, Zhongqiang Huang, Yu Zhao and Pontus Stenetorp

Rainbow - A Benchmark for Systematic Testing of How Sensitive Visio-Linguistic Models are to Color Naming
Marie Bexte, Andrea Horbach and Torsten Zesch

CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Rachneet Singh Sachdeva, Martin Tutek and Iryna Gurevych

UP5: Unbiased Foundation Model for Fairness-aware Recommendation
Wenyue Hua, Yingqiang Ge, Shuyuan Xu, Jianchao Ji, Zelong Li and Yongfeng Zhang

Human Temporal Inferences Go Beyond Aspectual Class
Katarzyna Pruś, Mark Steedman and Adam Lopez

It is not True that Transformers are Inductive Learners: Probing NLI Models with External Negation
Michael Sullivan

Polarized Opinion Detection Improves the Detection of Toxic Language
John Pavlopoulos and Aristidis Likas

Improving Acoustic Word Embeddings through Correspondence Training of Self-supervised Speech Representations
Amit Meghanani and Thomas Hain

Investigating Agency of LLMs in Human-AI Collaboration Tasks
Ashish Sharma, Sudha Rao, Chris Brockett, Akanksha Malhotra, Nebojsa Jojic and Bill Dolan

Do We Really Need Training Data for Dialogue State Tracking with Large Language Models?
Atharva Kulkarni, Bo-hsiang Tseng, Joel Ruben Antony Moniz, Dhivya Piraviperumal, Hong Yu and Shruti Bhargava

Argument Mining as a Text-to-Text Generation Task
Masayuki Kawarada, Tsutomu Hirao, Wataru Uchida and Masaaki Nagata

Answering legal questions from laymen in German civil law system
Marius Büttner and Ivan Habernal

An Empirical Analysis of Diversity in Argument Summarization
Michiel van der Meer, Piek Vossen, Catholijn M Jonker and Pradeep Kumar Murukannaiah

What Makes Medical Claims (Un)Verifiable? Analyzing Entity and Relation Properties for Fact Verification
Amelie Wuehrl, Yarik Menchaca Resendiz, Lara Grimminger and Roman Klinger

Approximate Attributions for Off-the-Shelf Siamese Transformers
Lucas Moeller, Dmitry Nikolaev and Sebastian Padó

Describing Images Fast and Slow: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes
Ece Takmaz, Sandro Pezzelle and Raquel Fernández

Unraveling Cross-Lingual Dynamics in Language Models: Independent, Shared and Transferred Factual Knowledge
Xin Zhao, Naoki Yoshinaga and Daisuke Oba

Exploring Open-Domain Fact Verification of Scientific Claims: A Comparative Analysis of Knowledge Sources
Juraj Vladika and Florian Matthes

Measuring Uncertainty in Neural Machine Translation with Similarity-Sensitive Entropy
Julius Cheng and Andreas Vlachos

LegalLens: Leveraging LLMs for Legal Violation Identification in Unstructured Text
Dor Bernsohn, Yaron Vazana, Ben Hagag and Joel Niklaus

μPLAN: Summarizing using a Content Plan as Cross-Lingual Bridge
Fantine Huot, Joshua Maynez, Chris Alberti, Reinald Kim Amplayo, Priyanka Agrawal, Constanza Fierro, Shashi Narayan and Mirella Lapata

Exploring Data Augmentation in Neural DRS-to-Text Generation
Muhammad Saad Amin, Alessandro Mazzei and Luca Anselma

Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models
Lukáš Mikula, Michal Štefánik, Marek Petrovič and Petr Sojka

Improving Contrastive Learning in Emotion Recognition in Conversation via Data Augmentation and Decoupled Neutral Emotion
Yujin Kang and Yoon-sik Cho

CroCoAlign: A Cross-Lingual, Context-Aware and Fully-Neural Sentence Alignment System for Long Texts
Francesco Maria Molfese, Andrei Stefan Bejgu, Simone Tedeschi, Simone Conia and Roberto Navigli

Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features
Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy and Elena Baralis

Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training
Jianfeng He, Julian Salazar, Kaisheng Yao, Haoqi Li and Jason Cai

Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
Natalie Shapira, Mosh Levy, Seyed Hossein Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap and Vered Shwartz

NevIR: Negation in Neural Information Retrieval
Orion Weller, Dawn Lawrie and Benjamin Van Durme

“According to ...”: Prompting Language Models Improves Quoting from Pre-Training Data
Orion Weller, Marc Marone, Nathaniel Weir, Dawn Lawrie, Daniel Khashabi and Benjamin Van Durme

Accurate and Well-Calibrated ICD Code Assignment with a Chunk-Based Classifier Attending over Diverse Label Embeddings
Goncalo Emanuel Cavaco Gomes, Bruno Martins and Isabel Pereira Coutinho

Investigating Content Planning for Navigating Trade-offs in Knowledge-Grounded Dialogue
Kushal Chawla, Hannah Rashkin, Gaurav Singh Tomar and David Reitter

SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
Xiang Gao, Jiaxin Zhang, Lalla Mouatadid and Kamalika Das

TESS: Text-to-Text Self-Conditioned Simplex Diffusion
Rabeeh Karimi Mahabadi, Hamish Ivison, Jaesung Tae, James Henderson, Iz Beltagy, Matthew E Peters and Arman Cohan

Advancing Precise Outline-Conditioned Text Generation with Task Duality and Explicit Outline Control
Yunzhe Li, Qian Chen, Weixiang Yan, Wen Wang, Qinglin Zhang and Hari Sundaram

Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models
Zhuowan Li, Cihang Xie, Benjamin Van Durme and Alan Yuille

Creating Suspenseful Stories with Large Language Models
Kaige Xie and Mark Riedl

Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer
Kaige Xie, Tong Yu, Haoliang Wang, Junda Wu, Handong Zhao, Ruiyi Zhang, Kanak Mahadik, Ani Nenkova and Mark Riedl

Ask, Assess, and Refine: Rectifying Factual Consistency and Hallucination in LLMs with Metric-Guided Feedback Learning
Dongyub Lee, Eunhwan Park, Hodong Lee and Heuiseok Lim

Effective Controllable Bias Mitigation for Classification and Retrieval using Gate Adapters
Shahed Masoudian, Cornelia Volaucnik, Markus Schedl and Navid Rekabsaz

STable: Table Generation Framework for Encoder-Decoder Models
Michał Pietruszka, Michał Turski, Łukasz Borchmann, Tomasz Dwojak, Gabriela Nowakowska, Karolina Szyndler, Dawid Jurkiewicz and Łukasz Garncarek

A RelEntLess Benchmark for Modelling Graded Relations between Named Entities
Asahi Ushio, Jose Camacho-Collados and Steven Schockaert

A Multimodal Framework to Detect Target Aware Aggression in Memes
Shawly Ahsan, Eftekhar Hossain, Omar Sharif, Avishek Das, Mohammed Moshiul Hoque and M. Ali Akber Dewan

Graph Guided Question Answer Generation for Procedural Question-Answering
Hai X. Pham, Isma Hadji, Xinnuo Xu, Ziedune Degutyte, Jay Rainey, Evangelos Kazakos, Afsaneh Fazly, Georgios Tzimiropoulos and Brais Martinez

Contrastive Decoding Reduces Hallucinations in Large Multilingual Machine Translation Models
Jonas Waldendorf, Barry Haddow and Alexandra Birch

Leveraging fine-tuned Large Language Models with LoRA for Effective Claim, Claimer, and Claim Object Detection
Sotiris Kotitsas, Panagiotis Kounoudis, Eleni Koutli and Haris Papageorgiou

Should I try multiple optimizers when fine-tuning a pre-trained Transformer for NLP tasks? Should I tune their hyperparameters?
Nefeli Gkouti, Prodromos Malakasiotis, Stavros Toumpis and Ion Androutsopoulos

GUMsley: Evaluating Entity Salience in Summarization for 12 English Genres
Jessica Lin and Amir Zeldes

Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting
Tilman Beck, Hendrik Schuff, Anne Lauscher and Iryna Gurevych

Extraction of Narratives from Podcast Transcripts
Yosra Abdessamed, Steven R. Wilson and Shadi Rezapour

Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times
Byung-doh Oh, Shisen Yue and William Schuler

Presentations by the Humans and For the Humans: Harnessing LLMs for Generating Persona-Aware Slides from Documents
Ishani Mondal, Shwetha S, Anandhavelu Natarajan, Aparna Garimella, Sambaran Bandyopadhyay and Jordan Lee Boyd-Graber

ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks
Bolei Ma, Ercong Nie, Shuzhou Yuan, Helmut Schmid, Michael Färber, Frauke Kreuter and Hinrich Schuetze

Small Language Models Improve Giants by Rewriting Their Outputs
Giorgos Vernikos, Arthur Brazinskas, Jakub Adamek, Jonathan Mallinson, Aliaksei Severyn and Eric Malmi

Unintended Bias Detection and Mitigation in Misogynous Memes
Gitanjali Kumari, Anubhav Sinha and Asif Ekbal

A Weak Supervision Approach for Few-Shot Aspect Based Sentiment Analysis
Robert Vacareanu, Siddharth Varia, Kishaloy Halder, Shuai Wang, Giovanni Paolini, Neha Anna John, Miguel Ballesteros and Smaranda Muresan

Counterfactual Reasoning with Knowledge Graph Embeddings
Lena Zellinger, Andreas Stephan and Benjamin Roth

System-Level Natural Language Feedback
Weizhe Yuan, Kyunghyun Cho and Jason E Weston

Syntactic Preposing and Discourse Relations
Yunfang Dong, Xixian Liao and Bonnie L. Webber

Can we obtain significant success in RST discourse parsing by using Large Language Models?
Aru Maekawa, Tsutomu Hirao, Hidetaka Kamigaito and Manabu Okumura

Ameli: Enhancing Multimodal Entity Linking with Fine-Grained Attributes
Barry Menglong Yao, Sijia Wang, Yu Chen, Qifan Wang, Minqian Liu, Zhiyang Xu, Licheng Yu and Lifu Huang

Generative Dense Retrieval: Memory Can Be a Burden
Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, Heda Wang, Xupeng Miao and Kan Li

Backward Compatibility During Data Updates by Weight Interpolation
Raphael Schumann, Elman Mansimov, Yi-an Lai, Nikolaos Pappas, Xibin Gao and Yi Zhang

Gradient-Based Language Model Red Teaming
Nevan Wichers, Carson Denison and Ahmad Beirami

Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test
Aditi Khandelwal, Utkarsh Agarwal, Kumar Tanmay and Monojit Choudhury

Analyzing the Evaluation of Cross-Lingual Knowledge Transfer in Multilingual Language Models
Sara Rajaee and Christof Monz

Large-Scale Label Interpretation Learning for Few-Shot Named Entity Recognition
Jonas Golde, Felix Hamborg and Alan Akbik

MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks
Lei Zhang, Yuge Zhang, Kan Ren, Dongsheng Li and Yuqing Yang

Text-Guided Image Clustering
Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Bela Gipp, Claudia Plant and Benjamin Roth

CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification
Yang Li, Canran Xu, Guodong Long, Tao Shen, Chongyang Tao and Jing Jiang

Threat Behavior Textual Search by Attention Graph Isomorphism
Chanwoo Bae, Guanhong Tao, Zhuo Zhang and Xiangyu Zhang

Short Papers

French GossipPrompts: Dataset For Prevention of Generating French Gossip Stories By LLMs
Msvpj Sathvik, Abhilash Dowpati and Revanth Kumar Narra

More Discriminative Sentence Embeddings via Semantic Graph Smoothing
Chakib Fettal, Lazhar Labiod and Mohamed Nadif

Multi-Level Attention Aggregation for Language-Agnostic Speaker Replication
Yejin Jeon and Gary Lee

Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding
Rico Sennrich, Jannis Vamvas and Alireza Mohammadshahi

Injecting Wiktionary to improve token-level contextual representations using contrastive learning
Anna Mosolova, Marie Candito and Carlos Ramisch

Multilingual Gradient Word-Order Typology from Universal Dependencies
Emi Baylor, Esther Ploeger and Johannes Bjerva

Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains
Sanjana Ramprasad, Kundan Krishna, Zachary Chase Lipton and Byron C Wallace

Leveraging Implicit Feedback from Deployment Data in Dialogue
Richard Yuanzhe Pang, Stephen Roller, Kyunghyun Cho, He He and Jason E Weston

Characterizing the Confidence of Large Language Model-Based Automatic Evaluation Metrics
Rickard Stureborg, Dimitris Alikaniotis and Yoshi Suhara

Equipping Language Models with Tool Use Capability for Tabular Data Analysis in Finance
Adrian Theuma and Ehsan Shareghi

Commonsense-augmented Memory Construction and Management in Long-term Conversations via Context-aware Persona Refinement
Hana Kim, Kai Tzu-iunn Ong, Seoyeon Kim, Dongha Lee and Jinyoung Yeo

Investigating the Potential of Task Arithmetic for Cross-Lingual Transfer
Marinela Parović, Ivan Vulić and Anna Korhonen

On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization
Lorenzo Jaime Yu Flores and Arman Cohan

Evaluating Unsupervised Argument Aligners via Generation of Conclusions of Structured Scientific Abstracts
Yingqiang Gao, Nianlong Gu, Jessica Lam, James Henderson and Richard Hahnloser

Over-Reasoning and Redundant Calculation of Large Language Models
Cheng-han Chiang and Hung-yi Lee

Multimodal Fallacy Classification in Political Debates
Eleonora Mancini, Federico Ruggeri and Paolo Torroni

The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
Anders Giovanni Møller, Arianna Pera, Jacob Aarup Dalsgaard and Luca Maria Aiello

Language Model Sentence Completion with a Parser-Driven Rhetorical Control Method
Joshua Zingale and Jugal Kalita

“It's how you do things that matters”: Attending to Process to Better Serve Indigenous Communities with Language Technologies
Ned Cooper, Courtney Heldreth and Ben Hutchinson

Source Identification in Abstractive Summarization
Yoshi Suhara and Dimitris Alikaniotis

From Partial to Strictly Incremental Constituent Parsing
Ana Ezquerro, Carlos Gómez-Rodríguez and David Vilares

Predict the Next Word: <Humans exhibit uncertainty in this task and language models _____>
Evgenia Ilia and Wilker Aziz

A Prompt Response to the Demand for Automatic Gender-Neutral Translation
Beatrice Savoldi, Andrea Piergentili, Dennis Fucci, Matteo Negri and Luisa Bentivogli

Evaluating and Representing Uncertainty in NLP: Two (Conflicting?) Perspectives
Joris Baan, Raquel Fernández, Barbara Plank and Wilker Aziz

Smaller Language Models are Better Zero-shot Machine-Generated Text Detectors
Niloofar Mireshghallah, Justus Mattern, Sicun Gao, Reza Shokri and Taylor Berg-Kirkpatrick

Robust Neural Machine Translation for Abugidas by Glyph Perturbation
Hour Kaing, Chenchen Ding, Hideki Tanaka and Masao Utiyama

Translation Errors Significantly Impact Low-Resource Languages in Cross-Lingual Learning
Barah Fazili, Ashish Sunil Agrawal and Preethi Jyothi

Less is More for Long Document Summary Evaluation by LLMs
Yunshu Wu, Hayate Iso, Pouya Pezeshkpour, Nikita Bhutani and Estevam Hruschka

Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study
Zhaoyue Sun, Gabriele Pergola, Byron C Wallace and Yulan He

A Comparative Analysis of Conversational Large Language Models in Knowledge-Based Text Generation
Phillip Schneider, Manuel Klettner, Elena Simperl and Florian Matthes

Extreme Fine-tuning: A Novel and Fast Fine-tuning Approach
Boonnithi Jiaramaneepinit, Thodsaporn Chay-intr, Kotaro Funakoshi and Manabu Okumura

Flow Matching for Conditional Text Generation in a Few Sampling Steps
Vincent Tao Hu, Di Wu, Yuki M Asano, Pascal Mettes, Basura Fernando, Björn Ommer and Cees G. M. Snoek

Corpus-Steered Query Expansion with Large Language Models
Yibin Lei, Yu Cao, Tianyi Zhou, Tao Shen and Andrew Yates

Defending Against Disinformation Attacks in Open-Domain Question Answering
Orion Weller, Aleem Khan, Nathaniel Weir, Dawn Lawrie and Benjamin Van Durme

Sentence Representations via Gaussian Embedding
Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano and Koichi Takeda

STORiCo: Storytelling TTS for Hindi with Character Voice Modulation
Pavan Kalyan Tankala, Preethi Jyothi, Preeti Rao and Pushpak Bhattacharyya

Rethinking Loss Functions for Fact Verification
Yuta Mukobara, Yutaro Shigeto and Masashi Shimbo

A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry
Michael Toker, Yonatan Belinkov, Oren Mishali, Ophir Münz-Manor and Benny Kimelfeld

SOCIALITE-LLAMA: An Instruction-Tuned Model for Social Scientific Tasks
Gourab Dey, Adithya V Ganesan, Yash Kumar Lal, Manal Shah, Shreyashee Sinha, Matthew Matero, Salvatore Giorgi, Vivek Kulkarni and H. Schwartz

Pre-Training Methods for Question Reranking
Stefano Campese, Ivano Lauriola and Alessandro Moschitti

Dynamic Masking Rate Schedules for MLM Pretraining
Zachary Ankner, Naomi Saphra, Davis Blalock, Jonathan Frankle and Matthew L Leavitt

CharSpan: Utilizing Lexical Similarity to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages
Kaushal Kumar Maurya, Rahul Kejriwal, Maunendra Sankar Desarkar and Anoop Kunchukuttan