I am a fourth-year Ph.D. student in the joint program between the University of Science and Technology of China (USTC) and Microsoft Research Asia (MSRA), co-supervised by Prof. Qiang Huo at MSRA and Prof. Jun Du at USTC. My Ph.D. research focuses on Document Intelligence (including OCR, document layout analysis, and document understanding) and Large Language Models (including MLLM, Agent, and RAG). Prior to this, I received my B.S. degree from the School of the Gifted Young (a.k.a. 少年班) at the University of Science and Technology of China in 2021, majoring in Computer Science.

During my Ph.D. studies, I gained valuable industry experience through internships at MSRA, DeepSeek, and ByteDance. At DeepSeek, I contributed to the development of DeepSeek-VL2 and DeepSeek-V3. My internship at MSRA involved working on the Microsoft OneOCR project and the Microsoft Document Intelligence project under the guidance of researchers Qiang Huo and Lei Sun. Most recently, I began an internship with the ByteDance Seed team, where I am working on LLM/MLLM Agent projects. I have published more than 10 papers (3,200+ citations) in top-tier international AI journals and conferences.

I am currently seeking full-time job opportunities. If you are interested in my resume, please feel free to email me at jarvisustc@gmail.com. I am currently based in Beijing, China. If you would like to have a coffee chat, please feel free to reach out! ☕😊✨

🔥 News

💻 Experiences

  • 2025.04-Now: Research Intern, Seed Team, ByteDance, Beijing, China.
  • 2024.09-2025.03: Research Intern, Multimodal Interaction Group, Microsoft Research Asia, Beijing, China.
  • 2024.06-2024.08: AGI Research Intern, Multimodal LLM Team, DeepSeek, Beijing, China.
  • 2020.09-2024.05: Research Intern, Multimodal Interaction Group, Microsoft Research Asia, Beijing, China.

📖 Education

  • 2021.09-2026.06: Ph.D. in Information and Communication Engineering, University of Science and Technology of China, Hefei, Anhui, China.
  • 2017.09-2021.06: B.S. in the School of the Gifted Young (major in Computer Science), University of Science and Technology of China, Hefei, Anhui, China.

📝 Publications

  • ✉️ means Corresponding Author; * means Equal Contribution

🤖 LLMs & MLLMs

  1. ICCV 2025 (CCF-A) Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks, Jiawei Wang*$^✉️$, Yushen Zuo*, Yuanjun Chai, Zhendong Liu, Yicheng Fu, Yichun Feng$^✉️$, Kin-man Lam$^✉️$
  2. Submitted to Nature Communications (2025) A Scalable Retrieval-Augmented Reasoning Framework Based on Large Language Models for Knowledge Mining in Biomedical Literature, Yichun Feng, Jiawei Wang, Lu Zhou, Yixue Li$^✉️$
  3. arXiv 2025 (Cutting-edge Project) DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning, DeepSeek-AI
  4. arXiv 2024 (Cutting-edge Project) DeepSeek-V3 Technical Report, DeepSeek-AI
  5. arXiv 2024 (Cutting-edge Project) DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding, Zhiyu Wu*, Xiaokang Chen*, Zizheng Pan*, Xingchao Liu*, Wen Liu*, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, Zhenda Xie, Yu Wu, Kai Hu, Jiawei Wang, Yaofeng Sun, Yukun Li, Yishi Piao, Kang Guan, Aixin Liu, Xin Xie, Yuxiang You, Kai Dong, Xingkai Yu, Haowei Zhang, Liang Zhao, Yisong Wang, Chong Ruan$^✉️$
  6. Submitted to AAAI 2026 DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue, Yichun Feng*, Jiawei Wang*, Lu Zhou, Yixue Li$^✉️$
  7. arXiv 2025 WideSearch: Benchmarking Agentic Broad Info-Seeking, Ryan Wong*, Jiawei Wang*, Junjie Zhao, Li Chen, Yan Gao, Long Zhang, Xuan Zhou, Zuo Wang, Kai Xiang, Ge Zhang, Wenhao Huang, Yang Wang$^✉️$, Ke Wang$^✉️$

📄 Document Intelligence

  1. Pattern Recognition 2025 (SCI Q1 Journal) UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis, Jiawei Wang$^✉️$, Kai Hu, Qiang Huo
  2. ICDAR 2024 (Oral) DLAFormer: An End-to-End Transformer For Document Layout Analysis, Jiawei Wang*$^✉️$, Kai Hu*$^✉️$, Qiang Huo
  3. ICDAR 2024 (Oral) Dynamic Relation Transformer for Contextual Text Block Detection, Jiawei Wang*$^✉️$, Shunchi Zhang*$^✉️$, Kai Hu*$^✉️$, Chixiang Ma, Zhuoyao Zhong, Lei Sun, Qiang Huo
  4. ICDAR 2024 (Oral) UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-Like Documents, Kai Hu$^✉️$, Jiawei Wang, Weihong Lin, Zhuoyao Zhong, Lei Sun, Qiang Huo
  5. Pattern Recognition 2024 (SCI Q1 Journal) Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis, Jiawei Wang$^✉️$, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo
  6. ICDAR 2023 (Oral) DQ-DETR: Dynamic Queries Enhanced Detection Transformer for Arbitrary Shape Text Detection, Chixiang Ma$^✉️$, Lei Sun, Jiawei Wang, Qiang Huo
  7. ICDAR 2023 A Hybrid Approach to Document Layout Analysis for Heterogeneous Document Images, Zhuoyao Zhong$^✉️$, Jiawei Wang, Haiqing Sun, Kai Hu, Erhan Zhang, Lei Sun, Qiang Huo
  8. Pattern Recognition 2023 (SCI Q1 Journal) Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer, Jiawei Wang, Weihong Lin, Chixiang Ma, Mingze Li, Zheng Sun, Lei Sun$^✉️$, Qiang Huo
  9. ACM Multimedia 2022 (CCF-A) TSRFormer: Table Structure Recognition with Transformers, Weihong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo$^✉️$
  1. The European Physical Journal Plus 2022 Study of nonequilibrium phase transitions mechanisms in exclusive network and node model of heterogeneous assignment based on real experimental data of KIF3AC and KIF3CC motors, Yuqing Wang$^✉️$, Chang Xu, Molin Fang, Tianze Li, Liwen Zhang, Dasen Wei, Kaichen Ouyang, Tunyu Zhang, Chuzhao Xu, Haosong Sun, Yunzhi Wang, Jiawei Wang
  2. International Journal of Modern Physics B 2019 Physical mechanisms in impacts of interaction factors on totally asymmetric simple exclusion processes, Yuqing Wang$^✉️$, Jiawei Wang, Binghong Wang
  3. International Journal of Modern Physics B 2019 Stochastic dynamics in nonequilibrium phase transitions of multiple totally asymmetric simple exclusion processes coupled with strong and weak interacting effects, Yuqing Wang*$^✉️$, Jiawei Wang*, Ziang Zhu*, Binghong Wang
  4. International Journal of Modern Physics B 2019 Evolvement laws and stability analyses of traffic network constituted by changing ramps and main road, Yuqing Wang$^✉️$, Chaofan Zhou, Jiawei Wang, Xinpeng Ni
  5. Modern Physics Letters B 2018 A macroscopic model for VOC emissions process complemented by real data, Yuqing Wang$^✉️$, Chaofan Zhou, Ziang Zhu, Jiawei Wang, Zimeng Wang, Chenhao Fang, Bin Jia
  6. ICMRA 2018 (Best Presentation Award) Control Strategies for Reducing VOCs Emission Process Based on Empirical Data, Yuqing Wang*$^✉️$, Jiawei Wang*, Ziang Zhu*, Chaofan Zhou, Yiyao Kou, Jing Sun, Zhengwei Mei, Ziwu Li, Peng Wu, Donghu Wang, Si Zhang, Wenli Zhang

📚 Academic Services

  • ICDAR Reviewer (2023, 2024)
  • IJDAR Reviewer (2024)
  • ACM MM Reviewer (2025)
  • AAAI Reviewer (2026)

🎖 Honors and Awards

  • 2021‑2024: Core Contributor of Microsoft Azure AI Document Intelligence, Outstanding Contribution Award 📍 Microsoft
  • 2023: 2nd Prize, Visual Prompt Tuning Challenge @ CVPR 2023 HIT Workshop (CNY 200,000 bonus) (2/200+) 📍 China
  • 2022: 2nd Prize, Panoptic Scene Graph Challenge @ ECCV 2022 SenseHuman Workshop (CNY 100,000 bonus) (2/100+) 📍 China
  • 2021: Provincial Excellent Graduate (Top 1%) 📍 Anhui, China
  • 2020: Outstanding Student Scholarship Gold Award 📍 USTC
  • 2019: Tang Lixin Scholarship (Annual funding of CNY 10,000 until Ph.D., Top 1%) 📍 USTC
  • 2019: Suzhou Yucai Scholarship (Top 10 undergraduates per year) 📍 USTC
  • 2018: Outstanding Student Scholarship Gold Award 📍 USTC
  • 2018: First prize for freshman seminar papers 📍 USTC
  • 2017‑2021: Cyrus Tang Scholarship (Awarded to college students who excel academically and are enthusiastic about social welfare) 📍 USTC

💬 Invited Talks

  • 2024.10: Towards Universal Layout Analysis. Hosted by Microsoft.