Jinjie Ni

a.k.a. Oliver

I'm an AI Researcher at NUS working with Prof. Michael Shieh.

At present, I'm interested in investigating LLM pretraining and architectures, (multi-modal) reinforcement learning for reasoning, and diffusion language models.

When I'm not deeply engrossed in research, you can find me pondering all the time.

Google Scholar  /  X (Twitter)  /  Blog Posts  /  Github  /  LinkedIn  /  Zhihu  /  Email

Experiences

Academia

National University of Singapore 2023 - now
Research Fellow
- Foundation Models.
Nanyang Technological University 2020 - 2023
Ph.D. in Computer Science
- Efficient Language Models and Dialogue Systems.
Northwestern Polytechnical University 2016 - 2020
B.Eng. in Electrical Engineering
- Multimodal Models.

Industry

SEA AI Lab, Singapore 2024.10 - present
Research Associate
- Work on LLM pretraining and architectures, (multi-modal) reinforcement learning for reasoning, and diffusion language models.
DAMO Academy, Alibaba Group, Singapore 2022.04 - 2022.10
Research Intern
- Worked on modality alignment for pre-trained models.

Featured Research

2025 >> Diffusion Language Models are Super Data Learners
- Diffusion Language Models are Super Data Learners [blog][tweet]
- Jinjie Ni and the team.
- The first work to empirically show that diffusion language models have more than 3x the data potential of autoregressive language models at scale (up to 8B parameters, 480B tokens, 480 epochs), with clear crossovers across model sizes and data budgets.
>> NoisyRollout
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation [paper][tweet1][tweet2]
- Xiangyan Liu*, Jinjie Ni*, Zijian Wu*, Chao Du, Longxu Dou, Haonan Wang, Tianyu Pang, Michael Qizhe Shieh.
- NoisyRollout is a simple, zero-cost method that strengthens visual-language reinforcement learning with data augmentation. It achieves state-of-the-art visual reasoning and perception results across five out-of-domain benchmarks with exceptional sample efficiency (2.1K training samples), and it scales without extra training cost or complex modifications to the RL objective.
>> SynthRL
- SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis [paper][tweet1][tweet2]
- Zijian Wu*, Jinjie Ni*, Xiangyan Liu*, Zichen Liu, Hang Yan, Michael Qizhe Shieh
- SynthRL is a scalable method that automatically synthesizes verifiably correct and more challenging training questions for visual reasoning models from an initial 8K seed dataset. It delivers consistent, significant gains across five out-of-domain visual math reasoning benchmarks, with the largest improvements on the hardest evaluation samples that require deeper, more complex reasoning.
2024 >> MixEval-X
- MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures [paper][tweet]
- ICLR 2025 (Spotlight, top 5.1% of papers)
- Jinjie Ni, Yifan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Qizhe Shieh
- MixEval-X is the first any-to-any, real-world benchmark featuring diverse input-output modalities, real-world task distributions, consistent high standards across modalities, and dynamism. It achieves up to 0.98 correlation with arena-like multi-modal evaluations while being far more efficient.
>> MixEval
- MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures [paper][tweet]
- NeurIPS 2024 main track (poster)
- Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You
- Building gold-standard LLM evaluation from off-the-shelf benchmark mixtures. The best LLM evaluation at the time of release, with SOTA model ranking accuracy (0.96 correlation with Chatbot Arena) and efficiency (6% of the time and cost of running MMLU). Moreover, it's dynamic.
>> OpenMoE
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models [paper][tweet]
- ICML 2024 (poster)
- Fuzhao Xue, Zian Zheng, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou, Yang You
- The first fully open MoE-based decoder-only LLM trained beyond the Chinchilla scaling law.
2023 >> InstructWild
- Instruction in the Wild: A User-Based Instruction Dataset [Github]
- Jinjie Ni, Fuzhao Xue, Yuntian Deng, Jason Phang, Kabir Jain, Mahir Hitesh Shah, Zangwei Zheng, Yang You.
- The first large-scale instruction tuning dataset harvested from the web.
>> GHA
- Finding the Pillars of Strength for Multi-head Attention [paper]
- ACL 2023 main track (poster)
- Jinjie Ni, Rui Mao, Zonglin Yang, Han Lei, Erik Cambria
- Cutting redundancy from Transformer layers. SOTA efficiency and performance among efficient Transformers. Concurrent work with GQA, cited and discussed in the GQA paper.
>> PAD
- Adaptive Knowledge Distillation between Text and Speech Pre-trained Models [paper]
- Jinjie Ni, Yukun Ma, Wen Wang, Qian Chen, Dianwen Ng, Han Lei, Trung Hieu Nguyen, Chong Zhang, Bin Ma, Erik Cambria
- Knowledge distillation between text and speech pre-trained models. The SOTA text-speech distillation method at the time of release.
2022 >> HiTKG
- HiTKG: Towards Goal-Oriented Conversations Via Multi-Hierarchy Learning [paper]
- AAAI 2022 (oral)
- Jinjie Ni, Vlad Pandelea, Tom Young, Haicang Zhou, Erik Cambria
- The first work that trains agents to actively guide conversations, bringing a new level of proactivity to dialogue agents. The SOTA approach for turn-level dialogue reasoning tasks at the time of release.
>> FusedChat
- FusedChat: Towards Fusing Task-Oriented Dialogues and Chitchat in Multi-turn Conversational Agents [paper]
- AAAI 2022 (oral)
- Tom Young, Frank Xing, Vlad Pandelea, Jinjie Ni, Erik Cambria
- The first attempt at fusing task-oriented and open-domain dialogue systems.
2021 >> Recent Advances in Deep Learning Based Dialogue Systems
- Recent Advances in Deep Learning Based Dialogue Systems [paper]
- Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Erik Cambria
- An 80-page systematic review of dialogue systems, and one of the most-cited dialogue system reviews.

Activities

Teaching

2021 NTU-SC1003: Introduction to Computational Thinking and Programming
Teaching Assistant
NTU-CE2100: Probability and Statistics for Computing
Lecturer
2020 NTU-CE1113: Physics for Computing
Teaching Assistant
NTU-CZ2007: Introduction to Databases
Teaching Assistant
NTU-CZ2004: Human Computer Interaction
Teaching Assistant

Services

Conference Reviewer NeurIPS 2025, ICML 2025, ICLR 2025, NeurIPS 2024, ACL 2024, EMNLP 2024, ACL 2023, EMNLP 2023, AAAI 2023, ICASSP 2023
Journal Reviewer Knowledge-Based Systems, Information Fusion, Artificial Intelligence Review, Cognitive Computation
Co-organizer MLNLP community