
In this episode, Robin Jia talks about how to build robust NLP systems. We discuss the different senses in which a system can be robust, reasons to care about system robustness, and the challenges involved in evaluating robustness of NLP models. We talk about how to build certifiably robust models through interval bound propagation and discrete encoding functions, as well as how to modify data collection procedures through active learning for more robust model development.
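For listeners unfamiliar with interval bound propagation, here is a minimal sketch of the core idea, not the method from the episode's papers: given elementwise lower and upper bounds on an input, push those bounds through a linear layer and a ReLU to get guaranteed bounds on the output. The function names, the toy weights, and the perturbation radius `eps` are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def linear_bounds(lower: Vector, upper: Vector,
                  weights: Matrix, bias: Vector) -> Tuple[Vector, Vector]:
    """Propagate elementwise input bounds through y = W x + b."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                # A non-negative weight preserves the interval's orientation.
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps which endpoint is the minimum.
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Toy example (hypothetical numbers): an input allowed to move by eps per coordinate.
x = [1.0, -1.0]
eps = 0.5
lower = [v - eps for v in x]
upper = [v + eps for v in x]

W = [[1.0, -1.0], [0.5, 2.0]]
b = [0.0, 0.0]

h_lower, h_upper = relu_bounds(*linear_bounds(lower, upper, W, b))
```

Every input inside the box `[lower, upper]` is guaranteed to produce hidden activations inside `[h_lower, h_upper]`. A certificate of robustness follows when the bounds at the final layer still imply the same prediction.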

Robin Jia is currently a visiting researcher at Facebook AI Research, and will be an assistant professor in the Department of Computer Science at the University of Southern California starting Fall 2021.

123 - Robust NLP, with Robin Jia

We invited Jayant Krishnamurthy and Hao Fang, researchers at Microsoft Semantic Machines, to discuss their platform for building task-oriented dialog systems and their recent TACL paper on the topic. The paper introduces a new formalism for task-oriented dialog that handles references and revisions in complex dialog, along with a large, realistic dataset that uses this formalism.

Leaderboard associated with the dataset: https://microsoft.github.io/task_oriented_dialogue_as_dataflow_synthesis/
Jayant's Twitter handle: https://twitter.com/jayantkrish
Hao's Twitter handle: https://twitter.com/hfang90

124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang

How can we build Visual Question Answering systems for real users? For this episode, we chatted with Danna Gurari about her work on building datasets and models for VQA for people who are blind. We talked about the differences between existing datasets and VizWiz, a dataset built by Gurari et al., and the algorithmic changes that result. We also discussed the unsolved challenges in this field and the new tasks they give rise to.

Danna Gurari is an Assistant Professor and the Founding Director of the Image and Video Computing group in the School of Information at the University of Texas at Austin (UT-Austin).

VizWiz project page: https://vizwiz.org/

The hosts for this episode are Ana Marasović and Pradeep Dasigi.

125 - VQA for Real Users, with Danna Gurari

We invited Lisa Li to talk about her recent work, Prefix-Tuning: Optimizing Continuous Prompts for Generation. Prefix tuning is a lightweight alternative to finetuning, and the idea is to tune only a fixed-length task-specific continuous vector, and to keep the pretrained transformer parameters frozen. We discussed how prefix tuning compares with finetuning and other efficient alternatives on two tasks in various experimental settings, and in what scenarios prefix tuning is preferable.

Lisa is a PhD student at Stanford University. Lisa's webpage: https://xiangli1999.github.io/

The hosts for this episode are Pradeep Dasigi and Ana Marasović.

126 - Optimizing Continuous Prompts for Generation, with Lisa Li

In this episode, we talk to Shunyu Yao about recent insights into how transformers can represent hierarchical structure in language. Bounded-depth hierarchical structure is thought to be a key feature of natural languages, motivating Shunyu and his coauthors to show that transformers can efficiently represent bounded-depth Dyck languages, which can be thought of as a formal model of the structure of natural languages. We went on to discuss some of the intuitive ideas that emerge from the proofs, connections to RNNs, and insights about positional encodings that may have practical implications. More broadly, we also touched on the role of formal languages and other theoretical tools in modern NLP.
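To make the notion of a bounded-depth Dyck language concrete, here is a small illustrative recognizer. It is a plain stack-based check, not the transformer construction discussed in the episode. The function name `in_bounded_dyck` and the two bracket types are assumptions chosen for the example.

```python
from typing import Dict, List

# Hypothetical bracket inventory: two bracket types.
PAIRS: Dict[str, str] = {"(": ")", "[": "]"}


def in_bounded_dyck(s: str, max_depth: int,
                    pairs: Dict[str, str] = PAIRS) -> bool:
    """Return True if s is well-nested and never nests deeper than max_depth."""
    closers = set(pairs.values())
    stack: List[str] = []  # closing brackets we still expect, innermost last
    for ch in s:
        if ch in pairs:
            stack.append(pairs[ch])
            if len(stack) > max_depth:
                return False  # exceeds the depth bound
        elif ch in closers:
            if not stack or stack.pop() != ch:
                return False  # unmatched or mismatched closing bracket
        else:
            return False  # symbol outside the bracket alphabet
    return not stack  # every opened bracket must be closed
```

For example, `in_bounded_dyck("([])", 2)` accepts, while `in_bounded_dyck("((()))", 2)` rejects because the string nests three levels deep. The depth bound is what keeps the required memory finite.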

Papers discussed in this episode:

- Self-Attention Networks Can Process Bounded Hierarchical Languages (https://arxiv.org/abs/2105.11115)
- Theoretical Limitations of Self-Attention in Neural Sequence Models (https://arxiv.org/abs/1906.06755)
- RNNs can generate bounded hierarchical languages with optimal memory (https://arxiv.org/abs/2010.07515)
- On the Practical Computational Power of Finite Precision RNNs for Language Recognition (https://arxiv.org/abs/1805.04908)

Shunyu Yao's webpage: https://ysymyth.github.io/

The hosts for this episode are William Merrill and Matt Gardner.

129 - Transformers and Hierarchical Structure, with Shunyu Yao

In this special episode, we chatted with Chris Callison-Burch about his testimony in the recent U.S. Congress Hearing on the Interoperability of AI and Copyright Law. We started by asking Chris about the purpose and the structure of this hearing. Then we talked about the ongoing discussion of how copyright law applies to content generated by AI systems, the potential risks generative AI poses to artists, and Chris’ take on all of this. We ended the episode with a recording of Chris’ opening statement at the hearing.

140 - Generative AI and Copyright, with Chris Callison-Burch

In this special episode of NLP Highlights, we discussed building and open sourcing language models. What is the usual recipe for building large language models? What does it mean to open source them? What new research questions can we answer by open sourcing them? We focused in particular on the ongoing Open Language Model (OLMo) project at AI2, and invited Iz Beltagy and Dirk Groeneveld, the research and engineering leads of the OLMo project, to chat.

Blog post announcing OLMo: https://blog.allenai.org/announcing-ai2-olmo-an-open-language-model-made-by-scientists-for-scientists-ab761e4e9b76

Organizations interested in partnership can express their interest here: https://share.hsforms.com/1blFWEWJ2SsysSXFUEJsxuA3ioxm

You can find Iz at twitter.com/i_beltagy and Dirk at twitter.com/mechanicaldirk

141 - Building an open source LM, with Iz Beltagy and Dirk Groeneveld

Our first guest with this new format is Kyle Lo, the most senior lead scientist in the Semantic Scholar team at Allen Institute for AI (AI2), who kindly agreed to share his perspective on Science of Science (SciSci) on our podcast. SciSci is concerned with studying how people do science, and includes developing methods and tools to help people consume and produce science. Kyle has made several critical contributions in this field which enabled a lot of SciSci work over the past 5+ years, ranging from novel NLP methods (e.g., SciBERT https://lnkd.in/gTP_tYiF), to open data collections (e.g., S2ORC https://lnkd.in/g4J6tXCG), to toolkits for manipulating scientific documents (e.g., PaperMage https://lnkd.in/gwU7k6mJ, which just received a Best Paper Award 🏆 at EMNLP 2023).

Kyle Lo's homepage: https://kyleclo.github.io/

142 - Science Of Science, with Kyle Lo

This podcast episode features Dr. Mohamed Elhoseiny, a true luminary in the realm of computer vision with over a decade of groundbreaking research. As an Assistant Professor at KAUST, Dr. Elhoseiny's work delves into the intersections of Computer Vision, Language & Vision, and Computational Creativity in Art, Fashion, and AI. Notably, he co-organized the 1st and 2nd Workshops on Closing the Loop between Vision and Language, demonstrating his commitment to advancing interdisciplinary research. With a rich educational background from Stanford University's Graduate School of Business Ignite Program, and Rutgers University as MS/PhD Researcher, coupled with influential stints at Stanford, Baidu Research, Facebook AI Research, Adobe Research, and SRI International, Dr. Elhoseiny brings a wealth of experience to our discussion.

"Imaginative AI" with Mohamed Elhoseiny

Curious about the safety of LLMs? 🤔 Join us for an insightful new episode featuring Suchin Gururangan, Young Investigator at Allen Institute for Artificial Intelligence and Data Science Engineer at Appuri. 🚀 Don't miss out on expert insights into the world of LLMs!

Are LLMs safe?

In this episode, we invite Hao Tan and Mohit Bansal to talk about multi-modal training of transformers, focusing in particular on their EMNLP 2019 paper that introduced LXMERT, a vision+language transformer. We spend the first third of the episode talking about why you might want to have multi-modal representations. We then move to the specifics of LXMERT, including the model structure, the losses that are used to encourage cross-modal representations, and the data that is used. Along the way, we mention latent alignments between images and captions, the granularity of captions, and machine translation even comes up a few times. We conclude with some speculation on the future of multi-modal representations.

Hao's website: http://www.cs.unc.edu/~airsplay/
Mohit's website: http://www.cs.unc.edu/~mbansal/
LXMERT paper: https://www.aclweb.org/anthology/D19-1514/

107 - Multi-Modal Transformers, with Hao Tan and Mohit Bansal

**The podcast is currently on hiatus. For more active NLP content, check out the Holistic Intelligence Podcast linked below.**

Welcome to the NLP Highlights podcast, where we invite researchers to talk about their work in various areas of natural language processing. All views expressed belong to the hosts and guests, and do not represent their employers.

NLP Highlights
