Publications

* and ^ represent equal-contribution groups.

Non-Archival Tech Reports

  1. 👀 WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences
    Yujie Lu, Dongfu Jiang, Wenhu Chen, William Yang Wang, Yejin Choi, Bill Yuchen Lin
    arXiv
    [🤗 Leaderboard] [💾 Github] [🐦 Tweet]

  2. 🦁 WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
    Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, Yejin Choi
    arXiv
    [🤗 Leaderboard] [💾 Github] [🐦 Tweet]

  3. 🚨 WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
    Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri
    arXiv
    [💾 Github]

  4. 🐦‍⬛ Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
    Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
    arXiv
    [💻 Website] [🤗 HF] [🤗 Demo (by @davanstrien)] [💾 Github] [🐦 Tweet]

  5. 🏆 RewardBench: Evaluating Reward Models for Language Modeling
    Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi
    arXiv

  6. 🔥 Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
    Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
    arXiv

  7. 🧩 L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
    ✍️ Yutaro Yamada, Khyathi Chandu, Bill Yuchen Lin, Jack Hessel, Ilker Yildirim, Yejin Choi
    arXiv

  8. 🍲 Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
    ✍️ Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu
    arXiv
    [💾 Github]

2024

  1. 🐏 The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
    ✍️ Bill Yuchen Lin, Abhilasha Ravichander, Ximing Lu, Nouha Dziri, Melanie Sclar, Khyathi Chandu, Chandra Bhagavatula, Yejin Choi
    ICLR 2024
    [📃 Website] [💾 Github]
    [🤗 Demo (BaseChat)] [🤗 URIAL-Bench] [🐦 Tweet 1] [🐦 2]

  2. 🪄 Agent Lumos: Unified and Modular Training for Open-Source Language Agents
    ✍️ Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin
    ACL 2024 Main Conference
    [📃 Website] [💾 Github] [🐦 Tweet]

  3. 🤖 Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents
    ✍️ Yifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin
    ACL 2024 Main Conference
    [💾 Github]

  4. 🛡️ SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
    ✍️ Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
    ACL 2024 Main Conference
    [💾 Github]

  5. 👀 Selective “Selective Prediction”: Reducing Unnecessary Abstention in Vision-Language Reasoning
    Tejas Srinivasan, Jack Hessel, Tanmay Gupta, Bill Yuchen Lin, Yejin Choi, Jesse Thomason, Khyathi Raghavi Chandu
    ACL 2024 Findings

  6. 💻 OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
    ✍️ Tianyu Zheng, Ge Zhang, Tianhao Shen, Xueling Liu, Bill Yuchen Lin, Jie Fu, Wenhu Chen, Xiang Yue
    ACL 2024 Findings
    [💾 Website] [💾 Demo] [💾 Code] [💾 Models]

  7. 🃏 Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT4
    Jiaxian Guo*, Bo Yang*, Paul Yoo, Bill Yuchen Lin, Yusuke Iwasawa, Yutaka Matsuo
    COLM 2024
    [💾 Github] [🐦 Tweet]

  8. ⚛️ LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
    ✍️ Chengsong Huang*, Qian Liu*, Bill Yuchen Lin*, Tianyu Pang, Chao Du, Min Lin
    COLM 2024
    [💾 Demo] [💾 Github] [🐦 Tweet]

  9. 🕸️ VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
    Junpeng Liu, Yifan Song, Bill Yuchen Lin, Wai Lam, Graham Neubig, Yuanzhi Li, Xiang Yue
    COLM 2024

  10. 🐯 TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks
    ✍️ Dongfu Jiang*, Yishan Li*, Ge Zhang, Wenhao Huang, Bill Yuchen Lin, Wenhu Chen
    TMLR
    [💾 Github] [📃 Website] [🐦 Tweet]

2023

  1. 🔥 SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
    ✍️ Bill Yuchen Lin, Yicheng Fu, Karina Yang, Prithviraj Ammanabrolu, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Yejin Choi, Xiang Ren
    NeurIPS 2023 (spotlight)
    [📃 Website] [💾 Github] [🐦 Tweet] [📰 Blog]

  2. 🔥 Faith and Fate: Limits of Transformers on Compositionality
    ✍️ Nouha Dziri*, Ximing Lu*, Melanie Sclar*, Xiang Lorraine Li^, Liwei Jiang^, Bill Yuchen Lin^, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi
    NeurIPS 2023 (spotlight)
    [🐦 Tweet]

  3. 🔥 LLM-Blender: Ensembling Large Language Models with Pairwise Comparison and Generative Fusion
    ✍️ Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
    to appear in Proc. of ACL 2023
    [📃 Website] [💾 Github] [🐦 Tweet]
    Media coverage: MarkTechPost

  4. 🍺 Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
    ✍️ Ximing Lu, Faeze Brahman, Peter West, Jaehun Jang, Khyathi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, Yejin Choi
    EMNLP 2023 (Main)

  5. NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation
    ✍️ Peter West, Ronan Le Bras, Taylor Sorensen, Bill Yuchen Lin, Liwei Jiang, Ximing Lu, Khyathi Chandu, Jack Hessel, Ashutosh Baheti, Chandra Bhagavatula, Yejin Choi
    EMNLP 2023 (Findings)

  6. On Grounded Planning for Embodied Tasks with Language Models
    ✍️ Bill Yuchen Lin*, Chengsong Huang*, Qian Liu, Wenda Gu, Sam Sommerer, Xiang Ren
    in Proc. of AAAI 2023
    [📃 Website] [💾 Github] [🤗 Data]
    Media coverage: USC Viterbi News

  7. AutoTriggER: Named Entity Recognition with Auxiliary Trigger Extraction
    ✍️ Dong-Ho Lee, Ravi Kiran Selvam, Sheikh Muhammad Sarwar, Bill Yuchen Lin, Mahak Agarwal, Fred Morstatter, Jay Pujara, Elizabeth Boschee, James Allan, Xiang Ren
    in Proc. of EACL 2023, also presented at TrustNLP @ NAACL 2021 (best paper award)

  8. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
    ✍️ 442 authors including Bill Yuchen Lin
    in TMLR
    [💾 Github]

2022

  1. Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality
    ✍️ Pei Zhou, Hyundong J. Cho, Pegah Jandaghi, Dong-Ho Lee, Bill Yuchen Lin, Jay Pujara, Xiang Ren
    in Proc. of EMNLP 2022
    [🐦 Tweet]

  2. Unsupervised Cross-Task Generalization via Retrieval Augmentation
    ✍️ Bill Yuchen Lin, Kangmin Tan, Chris Miller, Beiwen Tian, Xiang Ren
    in Proc. of NeurIPS 2022
    [📃 Website] [💾 Github] [🖼️ Slides] [🎦 Video] [🐦 Tweet]

  3. On Continual Model Refinement in Out-of-Distribution Data Streams
    ✍️ Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, Xiang Ren, Scott Yih
    in Proc. of ACL 2022
    [📃 Website] [💾 Github] [🖼️ Slides] [🎦 Video] [🐦 Tweet]

  4. FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks
    ✍️ Bill Yuchen Lin*, Chaoyang He*, Zihang Zeng, Hulin Wang, Yufen Huang, Mahdi Soltanolkotabi, Xiang Ren^, Salman Avestimehr^
    in Proc. of NAACL 2022 Findings
    [💾 Github] [🐦 Tweet]

  5. On the Robustness of Reading Comprehension Models to Entity Renaming
    ✍️ Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, Xiang Ren
    in Proc. of NAACL 2022

2021

  1. CrossFit: A Few-shot Learning Challenge for Cross-Task Generalization in NLP
    ✍️ Qinyuan Ye, Bill Yuchen Lin, Xiang Ren
    in Proc. of EMNLP 2021
    [💾 Github] [🐦 Tweet]

  2. Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning
    ✍️ Xisen Jin, Bill Yuchen Lin, Mohammad Rostami, Xiang Ren
    in Proc. of EMNLP 2021 Findings
    [💾 Github]

  3. RockNER: A Simple Method to Create Adversarial Examples for Evaluating the Robustness of NER Models
    ✍️ Bill Yuchen Lin, Wenyang Gao, Jun Yan, Ryan Moreno, Xiang Ren
    in Proc. of EMNLP 2021 (short)
    [📃 Website]

  4. RICA: Evaluating Robust Inference Capabilities Based on Commonsense Axioms
    ✍️ Pei Zhou, Rahul Khanna, Seyeon Lee, Bill Yuchen Lin, Daniel Ho, Jay Pujara, Xiang Ren
    in Proc. of EMNLP 2021
    [📃 Website]

  5. Probing Commonsense Explanation in Dialogue Response Generation
    ✍️ Pei Zhou, Pegah Jandaghi, Hyundong Cho, Bill Yuchen Lin, Jay Pujara, Xiang Ren
    in Proc. of EMNLP 2021 Findings

  6. Common Sense Beyond English: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning
    ✍️ Bill Yuchen Lin, Seyeon Lee, Xiaoyang Qiao, Xiang Ren
    in Proc. of ACL 2021
    [💾 Github] [📃 Website]

  7. RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
    ✍️ Bill Yuchen Lin, Ziyi Wu, Yichi Yang, Dong-Ho Lee, Xiang Ren
    in Proc. of ACL 2021 Findings
    [💾 Github] [📃 Website]

  8. Differentiable Open-Ended Commonsense Reasoning
    ✍️ Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen
    in Proc. of NAACL 2021
    [🖼️ Slides] [🎦 Video] [💾 Github] [📃 Website]

  9. Pre-training Text-to-Text Transformers for Concept-Centric Common Sense
    ✍️ Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Bill Yuchen Lin, Xiang Ren
    in Proc. of ICLR 2021
    [💾 Github]

  10. IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization
    ✍️ Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren
    in Proc. of AAAI 2021

2020

  1. CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
    ✍️ Bill Yuchen Lin, Wangchunshu Zhou, Ming Shen, Pei Zhou, Chandra Bhagavatula, Yejin Choi, Xiang Ren
    in Proc. of EMNLP 2020 Findings (presented at AKBC 2020 as a non-archival paper)
    [📃 Website]
    Media coverage: The Register, Tech Xplore, Techzine, Radio.com, ScienceDaily, USC Viterbi

  2. Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
    ✍️ Bill Yuchen Lin, Seyeon Lee, Rahul Khanna, Xiang Ren
    in Proc. of EMNLP 2020 (short)
    [📃 Website]

  3. Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
    ✍️ Yanlin Feng*, Xinyue Chen*, Bill Yuchen Lin, Peifeng Wang, Jun Yan, Xiang Ren
    in Proc. of EMNLP 2020
    [💾 Github]

  4. FreeDOM: A Transferable Neural Architecture for Structured Information Extraction on Web Documents
    ✍️ Bill Yuchen Lin, Ying Sheng, Nguyen Vo, Sandeep Tata
    in Proc. of KDD 2020 (Research Track)
    [🖼️ Slides] [🎦 Video]

  5. TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
    ✍️ Bill Yuchen Lin*, Dongho Lee*, Ming Shen, Ryan Moreno, Xiao Huang, Prashant Shiralkar, Xiang Ren
    in Proc. of ACL 2020 (short)
    [🖼️ Slides] [🎦 Video] [💾 Github] [📃 Website]

  6. Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
    ✍️ Ouyu Lan, Xiao Huang, Bill Yuchen Lin, He Jiang, Liyuan Liu, Xiang Ren
    in Proc. of ACL 2020
    [💾 Github]

  7. LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation
    ✍️ Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Jamin Chen, Seyeon Lee, Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren
    in Proc. of ACL 2020 (Demo Track)
    [📃 Website]

  8. NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction
    ✍️ Wenxuan Zhou, Hongtao Lin, Bill Yuchen Lin, Ziqi Wang, Junyi Du, Leonardo Neves, Xiang Ren
    in Proc. of TheWebConf (WWW) 2020
    Best Paper Runner-up (2/1500+) [💾 Github]

2019

  1. KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
    ✍️ Bill Yuchen Lin, Xinyue Chen, Jamin Chen, Xiang Ren
    in Proc. of EMNLP-IJCNLP 2019
    [💾 Github]

  2. AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging
    ✍️ Bill Yuchen Lin*, Dongho Lee*, Frank F. Xu, Ouyu Lan, Xiang Ren
    in Proc. of ACL 2019 (Demo Track)
    [📃 Website]

2018

  1. Neural Adaptation Layers for Cross-domain Named Entity Recognition
    ✍️ Bill Yuchen Lin, Wei Lu
    in Proc. of EMNLP 2018
    [💾 Github]

  2. ExtRA: Extracting Prominent Review Aspects from Customer Feedback
    ✍️ Zhiyi Luo, Shanshan Huang, Frank F. Xu, Bill Yuchen Lin, Hanyuan Shi, Kenny Q. Zhu
    in Proc. of EMNLP 2018
    [💾 Github]

  3. Mining Cross-Cultural Differences and Similarities in Social Media
    ✍️ Bill Yuchen Lin*, Frank F. Xu*, Kenny Q. Zhu, Seung-won Hwang
    in Proc. of ACL 2018
    [💾 Github]

  4. Automatic Extraction of Commonsense LocatedNear Knowledge
    ✍️ Frank F. Xu*, Bill Yuchen Lin*, Kenny Q. Zhu
    in Proc. of ACL 2018 (short)
    [💾 Github]

2017

  1. Multi-channel BiLSTM-CRF Model for Emerging Named Entity Recognition in Social Media
    ✍️ Bill Y. Lin*, Frank F. Xu*, Zhiyi Luo, Kenny Q. Zhu
    in Proc. of EMNLP 2017, Workshop on Noisy User-generated Text
    [💾 Github]