🔥 News
- 2025.09: 🔥 We introduce EMPG: a new framework that solves the credit assignment bottleneck in long-horizon agent training by fixing a fundamental flaw in policy gradients. Please find all details on our project page.
- 2025.09: We're excited to have contributed to MCP Mark, a solid benchmark for stress-testing comprehensive MCP use. We have open-sourced all details on GitHub. You are welcome to join us!
- 2025.08: 🔥 We introduce WideSearch: a new benchmark to test whether AI agents can handle large-scale, repetitive information gathering, the real bottleneck in productivity. Please find all details on our project page.
- 2025.06: Our paper on VLM robustness, "Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks," has been accepted by ICCV 2025! See you in Hawaii!
- 2025.05: 🔥 Our latest research, DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue, is now available. This work highlights a crucial principle in Human-Agent Interaction: agents must proactively request the information they need, because humans may not always volunteer it. This "Agent-must-ask" paradigm is central to DoctorAgent-RL's ability to improve task completion in complex multi-turn dialogues.
- 2025.04: Thrilled to kick off my new internship with the ByteDance Seed Team!
- 2025.04: 🔥 Our latest work on VLM robustness, "Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks," has been released on arXiv. We've open-sourced the Robust-VLGuard dataset and the DiffPure-VLM defense.
- 2025.03: Our paper "UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis" has been accepted by Pattern Recognition Journal!
- 2024.12: 💻 We've launched a new GitHub project: Awesome-Multimodal-RAG! Check out the latest in multimodal RAG and contribute!
- 2024.12: We're excited to have contributed to DeepSeek-VL2, an advanced Vision-Language Model with strong performance and fewer parameters.
🔥 More News
- 2024.08-09: 🗣️ Presented DLAFormer and DRFormer at ICDAR in Athens! Photos can be found here. It was a memorable experience meeting colleagues and exploring the city.
- 2024.08: The complete version of DLAFormer, titled "UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis", has been submitted to Pattern Recognition Journal.
- 2024.07: Our Detect-Order-Construct paper has been accepted by Pattern Recognition!
- 2024.06: 🗣️ Our DLAFormer, UniVIE, and DRFormer papers were selected for oral presentation at ICDAR 2024!
- 2024.03: Azure AI Document Intelligence now supports Hierarchical Document Structure Analysis (HDSA), based on our "Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis" paper. Details are available on arXiv and in the official announcement.
- 2024.03: 💻 Source code released for our Language-Enhanced Image New Category Discovery solution from the CVPR 2023 HIT Workshop.
- 2024.02: Our new work on Document Layout Analysis, DLAFormer: An End-to-End Transformer for Document Layout Analysis, has been submitted to ICDAR 2024.
- 2024.01: 💡 Introduced UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents! It reframes VIE as relation prediction with a unified label space.
- 2024.01: New technical paper released: Dynamic Relation Transformer for Contextual Text Block Detection!
- 2023.12: 2nd Prize, 2023 International Algorithm Case Competition (Visual Prompt Tuning Challenge @ CVPR 2023 HIT Workshop), 200,000 RMB bonus!
- 2023.11: Our latest work on Hierarchical Document Structure Analysis has been submitted to Pattern Recognition Journal.
- 2023.07: "Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer" accepted by Pattern Recognition Journal!
- 2023.04: Two papers accepted by ICDAR 2023!
- 2023.03: 💡 Proposed a new Dynamic Queries based Detection Transformer for more robust table structure recognition!
- 2022.12: 2nd Prize, 2022 International Algorithm Case Competition (Panoptic Scene Graph Challenge @ ECCV 2022 SenseHuman Workshop), 100,000 RMB bonus!
- 2022.09: One paper accepted by ACM MM 2022!