🔥 News

2026.02: 🔥 We released a blog post about the Experience-Driven Learning paradigm and shared many interesting findings. 🚧 Please find all details in our Notion blog.
2025.12: 🎉 We are thrilled to announce that WideSearch has been selected as a key benchmark for evaluating agent capabilities in Seed 1.8. We are also honored to contribute to the Seed 1.8 API as core developers of the new context management feature.
2025.09: 🔥 We introduce EMPG: a new framework that solves the credit assignment bottleneck in long-horizon agent training by fixing a fundamental flaw in policy gradients. 🚧 Please find all details in our project page.
2025.09: 🎉 We're excited to have contributed to MCP Mark, a solid benchmark for stress-testing comprehensive MCP use. We have open-sourced all details on GitHub. You are welcome to join us!
2025.08: 🔥 We introduce WideSearch: a new benchmark to test if AI agents can handle large-scale, repetitive information gathering — the real bottleneck in productivity. 🚧 Please find all details in our project page.
2025.06: 🎉 Our paper on VLM robustness, "Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks," has been accepted by ICCV 2025! See you in Hawaii!
2025.05: 🔥 Our latest research, DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue, is now available. This work highlights a crucial principle in Human-Agent Interaction: Agents must proactively request necessary information to excel, as humans may not always volunteer it. This "Agent-must-ask" paradigm is central to DoctorAgent-RL's ability to facilitate better task completion in complex multi-turn dialogues.
2025.04: 🎉 Thrilled to kick off my new internship with the ByteDance Seed Team!
2025.04: 🔥 Our latest work on VLM robustness, "Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks," has been released on arXiv. We've open-sourced the Robust-VLGuard dataset and DiffPure-VLM defense.
2025.03: 🎉 Our paper "UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis" has been accepted by Pattern Recognition Journal!
2024.12: 💻 We've launched a new GitHub project: Awesome-Multimodal-RAG! Check out the latest in multimodal RAG and contribute!
2024.12: 🤝 We're excited to have contributed to DeepSeek-VL2, an advanced Vision-Language Model with strong performance and fewer parameters.

🔥 More News

2024.08-09: 🗣️ Presented DLAFormer and DRFormer at ICDAR in Athens! Photos can be found here. A memorable experience meeting colleagues and exploring the city.
2024.08: ✍️ The complete version of DLAFormer, titled "UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis", has been submitted to Pattern Recognition Journal.
2024.07: 🎉 Our paper Detect-Order-Construct has been accepted by Pattern Recognition!
2024.06: 🗣️ Our DLAFormer, UniVIE, and DRFormer were selected for oral presentation at ICDAR 2024!
2024.03: 🚀 Azure AI Document Intelligence now supports Hierarchical Document Structure Analysis (HDSA), based on our "Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis" paper. Details on arXiv and the official announcement.
2024.03: 💻 Source code released for our Language-Enhanced Image New Category Discovery solution from the CVPR 2023 HIT Workshop.
2024.02: ✍️ Our new work on Document Layout Analysis, DLAFormer: An End-to-End Transformer for Document Layout Analysis, has been submitted to ICDAR 2024.
2024.01: 💡 Introduced UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents! Reframing VIE as relation prediction with a unified label space.
2024.01: 📄 New technical paper released: Dynamic Relation Transformer for Contextual Text Block Detection!
2023.12: 🏆 2nd Prize, 2023 International Algorithm Case Competition (Visual Prompt Tuning Challenge @ CVPR 2023 HIT Workshop), 200,000 RMB bonus!
2023.11: ✍️ Our new progress on Hierarchical Document Structure Analysis submitted to Pattern Recognition Journal.
2023.07: 🎉 "Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer" accepted by Pattern Recognition Journal!
2023.04: 🎉 Two papers accepted by ICDAR 2023!
2023.03: 💡 Proposed a new Dynamic Queries based Detection Transformer for more robust table structure recognition!
2022.12: 🏆 2nd Prize, 2022 International Algorithm Case Competition (Panoptic Scene Graph Challenge @ ECCV 2022 SenseHuman Workshop), 100,000 RMB bonus!
2022.09: 🎉 One paper accepted by ACM MM 2022!

Jiawei Wang
