EMNLP 2022 Conference Recap - Talking Language AI Ep#4

Empirical Methods in Natural Language Processing (EMNLP) is one of the leading annual research conferences for language processing, and it’s also a great place to see some cutting-edge developments in AI. This past December, I attended EMNLP 2022 and brought my camera with me. I wanted to record my experience at this fascinating event, along with a few conversations with people I met there, and share it with you all in our Talking Language series.

Join me on the show floor at EMNLP 2022 and get a taste of what it’s like to attend the conference. View the full episode (also embedded below), and feel free to post questions or comments in the thread on this episode in the Cohere Discord channel.

EMNLP 2022 took place in Abu Dhabi, UAE and featured 24 workshops and a whopping 828 papers. I spoke with a number of folks and asked them to share their thoughts on the most exciting developments in NLP in 2022 and what they were looking forward to in 2023. I also wanted to know what new ideas they thought had great potential but were currently underrated or overlooked by the research community.

I spent most of the first day at the Generation, Evaluation & Metrics (GEM) workshop. GEM included a full day of sessions and three keynotes that focused on trust, collaboration, and safety regarding language generators and large language models (LLMs). I also dropped in on the Massively Multilingual NLU 2022 workshop, which focused on ways to overcome current limitations and bring natural language understanding technology to every language on earth, both for production systems and for research endeavors.

On the second day, I spent most of my time in the BlackboxNLP 2022 workshop, which brought together researchers focused on interpreting and explaining NLP models by taking inspiration from machine learning, psychology, linguistics, and neuroscience. One of the highlights was David Bau’s talk on direct model editing using the ROME method. I also have an interest in Arabic NLP and attended a few sessions of the Arabic Natural Language Processing (WANLP 2022) workshop.

A few posters were particularly interesting, including ones on a retrieval-augmented transformer, a GENIE evaluation leaderboard, the Bloom Library with multimodal datasets in 300+ languages, and a data-efficient music playlist captioning project.
