Publications
* and ^ represent equal-contribution groups.
Non-Archival Tech Reports
- Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
arXiv
[Website] [HF] [Demo (by @davanstrien)] [Github] [Tweet]
- WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences
Yujie Lu, Dongfu Jiang, Wenhu Chen, William Yang Wang, Yejin Choi, Bill Yuchen Lin
arXiv
[Leaderboard] [Github] [Tweet]
- WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, Yejin Choi
arXiv
[Leaderboard] [Github] [Tweet]
- SimulBench: Evaluating Language Models with Creative Simulation Tasks
Qi Jia, Xiang Yue, Tianyu Zheng, Jie Huang, Bill Yuchen Lin
arXiv
[Website]
- WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri
arXiv
[Github]
- OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation
Zilong Wang, Yuedong Cui, Li Zhong, Zimin Zhang, Da Yin, Bill Yuchen Lin, Jingbo Shang
arXiv
[Github]
- RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi
arXiv
- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Seungone Kim, Juyoung Suk, Shayne Longpre, Bill Yuchen Lin, Jamin Shin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
arXiv
- L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
Yutaro Yamada, Khyathi Chandu, Bill Yuchen Lin, Jack Hessel, Ilker Yildirim, Yejin Choi
arXiv
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu
arXiv
[Github]
2024
- The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
Bill Yuchen Lin, Abhilasha Ravichander, Ximing Lu, Nouha Dziri, Melanie Sclar, Khyathi Chandu, Chandra Bhagavatula, Yejin Choi
ICLR 2024
[Website] [Github] [Demo (BaseChat)] [URIAL-Bench] [Tweet 1] [Tweet 2]
- Agent Lumos: Unified and Modular Training for Open-Source Language Agents
Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin
ACL 2024 Main Conference
[Website] [Github] [Tweet]
- Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents
Yifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin
ACL 2024 Main Conference
[Github]
- SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
ACL 2024 Main Conference
[Github]
- Selective “Selective Prediction”: Reducing Unnecessary Abstention in Vision-Language Reasoning
Tejas Srinivasan, Jack Hessel, Tanmay Gupta, Bill Yuchen Lin, Yejin Choi, Jesse Thomason, Khyathi Raghavi Chandu
ACL 2024 Findings
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng, Ge Zhang, Tianhao Shen, Xueling Liu, Bill Yuchen Lin, Jie Fu, Wenhu Chen, Xiang Yue
ACL 2024 Findings
[Website] [Demo] [Code] [Models]
- Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT4
Jiaxian Guo*, Bo Yang*, Paul Yoo, Bill Yuchen Lin, Yusuke Iwasawa, Yutaka Matsuo
COLM 2024
[Github] [Tweet]
- LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Chengsong Huang*, Qian Liu*, Bill Yuchen Lin*, Tianyu Pang, Chao Du, Min Lin
COLM 2024
[Demo] [Github] [Tweet]
- VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
Junpeng Liu, Yifan Song, Bill Yuchen Lin, Wai Lam, Graham Neubig, Yuanzhi Li, Xiang Yue
COLM 2024
- TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks
Dongfu Jiang*, Yishan Li*, Ge Zhang, Wenhao Huang, Bill Yuchen Lin, Wenhu Chen
TMLR
[Github] [Website] [Tweet]
2023
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
Bill Yuchen Lin, Yicheng Fu, Karina Yang, Prithviraj Ammanabrolu, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Yejin Choi, Xiang Ren
NeurIPS 2023 (spotlight)
[Website] [Github] [Tweet] [Blog]
- Faith and Fate: Limits of Transformers on Compositionality
Nouha Dziri*, Ximing Lu*, Melanie Sclar*, Xiang Lorraine Li^, Liwei Jiang^, Bill Yuchen Lin^, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi
NeurIPS 2023 (spotlight)
[Tweet]
- LLM-Blender: Ensembling Large Language Models with Pairwise Comparison and Generative Fusion
Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
to appear in Proc. of ACL 2023
[Website] [Github] [Tweet]
Media coverage: MarkTechPost
- Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
Ximing Lu, Faeze Brahman, Peter West, Jaehun Jang, Khyathi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, Yejin Choi
EMNLP 2023 (Main)
- NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation
Peter West, Ronan Le Bras, Taylor Sorensen, Bill Yuchen Lin, Liwei Jiang, Ximing Lu, Khyathi Chandu, Jack Hessel, Ashutosh Baheti, Chandra Bhagavatula, Yejin Choi
EMNLP 2023 (Findings)
- On Grounded Planning for Embodied Tasks with Language Models
Bill Yuchen Lin*, Chengsong Huang*, Qian Liu, Wenda Gu, Sam Sommerer, Xiang Ren
in Proc. of AAAI 2023
[Website] [Github] [Data]
Media coverage: USC Viterbi News
- AutoTriggER: Named Entity Recognition with Auxiliary Trigger Extraction
Dong-Ho Lee, Ravi Kiran Selvam, Sheikh Muhammad Sarwar, Bill Yuchen Lin, Mahak Agarwal, Fred Morstatter, Jay Pujara, Elizabeth Boschee, James Allan, Xiang Ren
in Proc. of EACL 2023, also presented at TrustNLP @ NAACL 2021 (best paper award)
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
442 authors including Bill Yuchen Lin
in TMLR
[Github]
2022
- Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality
Pei Zhou, Hyundong J. Cho, Pegah Jandaghi, Dong-Ho Lee, Bill Yuchen Lin, Jay Pujara, Xiang Ren
in Proc. of EMNLP 2022
[Tweet]
- Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin, Kangmin Tan, Chris Miller, Beiwen Tian, Xiang Ren
in Proc. of NeurIPS 2022
[Website] [Github] [Slides] [Video] [Tweet]
- On Continual Model Refinement in Out-of-Distribution Data Streams
Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, Xiang Ren, Scott Yih
in Proc. of ACL 2022
[Website] [Github] [Slides] [Video] [Tweet]
- FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks
Bill Yuchen Lin*, Chaoyang He*, Zihang Zeng, Hulin Wang, Yufen Huang, Mahdi Soltanolkotabi, Xiang Ren^, Salman Avestimehr^
in Proc. of NAACL 2022 Findings
[Github] [Tweet]
- On the Robustness of Reading Comprehension Models to Entity Renaming
Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, Xiang Ren
in Proc. of NAACL 2022
2021
- CrossFit: A Few-shot Learning Challenge for Cross-Task Generalization in NLP
Qinyuan Ye, Bill Yuchen Lin, Xiang Ren
in Proc. of EMNLP 2021
[Github] [Tweet]
- Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning
Xisen Jin, Bill Yuchen Lin, Mohammad Rostami, Xiang Ren
in Proc. of EMNLP 2021 Findings
[Github]
- RockNER: A Simple Method to Create Adversarial Examples for Evaluating the Robustness of NER Models
Bill Yuchen Lin, Wenyang Gao, Jun Yan, Ryan Moreno, Xiang Ren
in Proc. of EMNLP 2021 (short)
[Website]
- RICA: Evaluating Robust Inference Capabilities Based on Commonsense Axioms
Pei Zhou, Rahul Khanna, Seyeon Lee, Bill Yuchen Lin, Daniel Ho, Jay Pujara, Xiang Ren
in Proc. of EMNLP 2021
[Website]
- Probing Commonsense Explanation in Dialogue Response Generation
Pei Zhou, Pegah Jandaghi, Hyundong Cho, Bill Yuchen Lin, Jay Pujara, Xiang Ren
in Proc. of EMNLP 2021 Findings
- Common Sense Beyond English: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning
Bill Yuchen Lin, Seyeon Lee, Xiaoyang Qiao, Xiang Ren
in Proc. of ACL 2021
[Github] [Website]
- RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
Bill Yuchen Lin, Ziyi Wu, Yichi Yang, Dong-Ho Lee, Xiang Ren
in Proc. of ACL 2021 Findings
[Github] [Website]
- Differentiable Open-Ended Commonsense Reasoning
Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen
in Proc. of NAACL 2021
[Slides] [Video] [Github] [Website]
- Pre-training Text-to-Text Transformers for Concept-Centric Common Sense
Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Bill Yuchen Lin, Xiang Ren
in Proc. of ICLR 2021
[Github]
- IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization
Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren
in Proc. of AAAI 2021
2020
- CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
Bill Yuchen Lin, Wangchunshu Zhou, Ming Shen, Pei Zhou, Chandra Bhagavatula, Yejin Choi, Xiang Ren
in Proc. of EMNLP 2020 Findings (presented at AKBC 2020 as a non-archival paper)
[Website]
Media coverage: The Register, Tech Xplore, Techzine, Radio.com, ScienceDaily, USC Viterbi
- Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
Bill Yuchen Lin, Seyeon Lee, Rahul Khanna, Xiang Ren
in Proc. of EMNLP 2020 (short)
[Website]
- Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
Yanlin Feng*, Xinyue Chen*, Bill Yuchen Lin, Peifeng Wang, Jun Yan, Xiang Ren
in Proc. of EMNLP 2020
[Github]
- FreeDOM: A Transferable Neural Architecture for Structured Information Extraction on Web Documents
Bill Yuchen Lin, Ying Sheng, Nguyen Vo, Sandeep Tata
in Proc. of KDD 2020 (Research Track)
[Slides] [Video]
- TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
Bill Yuchen Lin*, Dongho Lee*, Ming Shen, Ryan Moreno, Xiao Huang, Prashant Shiralkar, Xiang Ren
in Proc. of ACL 2020 (short)
[Slides] [Video] [Github] [Website]
- Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling.
Ouyu Lan, Xiao Huang, Bill Yuchen Lin, He Jiang, Liyuan Liu, Xiang Ren
in Proc. of ACL 2020
[Github]
- LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation
Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Jamin Chen, Seyeon Lee, Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren
in Proc. of ACL 2020 (Demo Track)
[Website]
- NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction.
Wenxuan Zhou, Hongtao Lin, Bill Yuchen Lin, Ziqi Wang, Junyi Du, Leonardo Neves, Xiang Ren
in Proc. of TheWebConf (WWW) 2020
Best Paper Runner-up (2/1500+) [Github]
2019
- KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning.
Bill Yuchen Lin, Xinyue Chen, Jamin Chen, Xiang Ren
in Proc. of EMNLP-IJCNLP 2019
[Github]
- AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging.
Bill Yuchen Lin*, Dongho Lee*, Frank F. Xu, Ouyu Lan, Xiang Ren
in Proc. of ACL 2019 (Demo Track)
[Website]
2018
- Neural Adaptation Layers for Cross-domain Named Entity Recognition.
Bill Yuchen Lin, Wei Lu
in Proc. of EMNLP 2018
[Github]
- ExtRA: Extracting Prominent Review Aspects from Customer Feedback.
Zhiyi Luo, Shanshan Huang, Frank F. Xu, Bill Yuchen Lin, Hanyuan Shi, Kenny Q. Zhu
in Proc. of EMNLP 2018
[Github]
- Mining Cross-Cultural Differences and Similarities in Social Media.
Bill Yuchen Lin*, Frank F. Xu*, Kenny Q. Zhu, Seung-won Hwang
in Proc. of ACL 2018
[Github]
- Automatic Extraction of Commonsense LocatedNear Knowledge.
Frank F. Xu*, Bill Yuchen Lin*, Kenny Q. Zhu
in Proc. of ACL 2018 (short)
[Github]
2017
- Multi-channel BiLSTM-CRF Model for Emerging Named Entity Recognition in Social Media.
Bill Y. Lin*, Frank F. Xu*, Zhiyi Luo, Kenny Q. Zhu
in Proc. of EMNLP 2017, Workshop on Noisy User-generated Text
[Github]