Magellan Technology Research Institute (MTRI) is pleased to announce that two papers authored by its research scientists have been accepted to the 43rd International Conference on Machine Learning (ICML 2026).
ICML is one of the world’s most influential conferences in machine learning, bringing together researchers and practitioners who are shaping the future of artificial intelligence. The acceptance of these papers highlights MTRI’s continued commitment to advancing fundamental AI research while exploring technologies that can contribute to real-world applications.
The accepted papers address two challenges in real-world AI deployment: efficient multimodal retrieval over long documents and robust video understanding across domains. Both aim to be technically innovative while remaining practical for scalable AI systems.
1. “Very Efficient Listwise Multimodal Reranking for Long Documents”
Authors: Yiqun Sun, Pengfei Wei, Lawrence B. Hsieh
This paper introduces ZipRerank, a highly efficient multimodal reranking framework designed for long documents such as PDFs, reports, webpages, and visually rich enterprise documents. In modern retrieval and multimodal retrieval-augmented generation systems, reranking is essential for improving search quality, but it often becomes a major computational bottleneck when documents contain many pages and visual tokens.
ZipRerank addresses this challenge through two key innovations: it reduces input length with a lightweight query-image early interaction mechanism, and it eliminates slow autoregressive decoding by scoring candidates in a single forward pass. Experiments on the MMDocIR benchmark show that ZipRerank matches or surpasses state-of-the-art multimodal rerankers while reducing LLM inference latency by up to an order of magnitude, making it promising for latency-sensitive real-world systems.
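The two ideas described above can be illustrated with a minimal sketch. This is not the authors' implementation, and the announcement does not specify the mechanism. The sketch assumes, purely for illustration, that visual tokens are pruned by cosine similarity to a query embedding and that each candidate's relevance score is read from a hidden state at a known position after one forward pass. The function names, the `score_head` vector, and the array shapes are all hypothetical.

```python
import numpy as np

def prune_visual_tokens(query_vec, patch_vecs, keep):
    """Keep the `keep` patch tokens most similar to the query.

    Hypothetical stand-in for a query-image early interaction step:
    the real mechanism is not described in the announcement.
    """
    q = query_vec / np.linalg.norm(query_vec)
    p = patch_vecs / np.linalg.norm(patch_vecs, axis=1, keepdims=True)
    sims = p @ q                      # cosine similarity per patch
    top = np.argsort(-sims)[:keep]    # indices of the most similar patches
    kept = np.sort(top)               # restore original patch order
    return patch_vecs[kept], kept

def rank_in_single_pass(hidden_states, candidate_positions, score_head):
    """Score every candidate from one forward pass, with no decoding loop.

    `hidden_states` is assumed to be the (seq_len, dim) output of a single
    model call; `candidate_positions` marks one token per candidate.
    """
    scores = hidden_states[candidate_positions] @ score_head
    order = np.argsort(-scores, kind="stable")  # best candidate first
    return order, scores
```

In this sketch the cost of ranking is one matrix product over the candidate positions, whereas a generative reranker would emit the ranking token by token. Pruning patches before the model call shortens the input the model has to process.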
This research has potential applications in enterprise search, AI assistants, document intelligence, and multimodal knowledge systems where users need accurate answers from long and visually complex documents.
2. “Return of Frustratingly Easy Unsupervised Video Domain Adaptation”
Authors: Pengfei Wei, Yiqun Sun, Zhiqiang Xu, Yiping Ke, Lawrence B. Hsieh
This paper proposes MetaTrans, a streamlined method for unsupervised video domain adaptation, a practical yet under-explored problem in video AI. In many real-world scenarios, video models trained in one environment must be deployed in another, where backgrounds, cameras, lighting, motion patterns, and visual styles may differ. Collecting labeled data for every new environment is costly, making unsupervised adaptation an important research direction.
MetaTrans adopts a concise learning objective with only two fundamental loss terms, and uses a temporal-static subtraction module to address spatial and temporal divergence in cross-domain videos separately. In extensive experiments on cross-domain action recognition tasks, it shows substantial improvements over state-of-the-art unsupervised video domain adaptation (UVDA) baselines.
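One way to picture a temporal-static subtraction is sketched below. This is an assumption made for illustration, not the module from the paper, whose design the announcement does not describe. The sketch treats the per-clip mean of the frame features as the static (appearance) component and the per-frame residual as the temporal (motion) component. The function name and the feature shape are hypothetical.

```python
import numpy as np

def split_static_temporal(frame_feats):
    """Split (num_frames, dim) features into static and temporal parts.

    Assumption for illustration only: static = mean over frames,
    temporal = each frame minus that mean.
    """
    static = frame_feats.mean(axis=0)   # shared appearance across frames
    temporal = frame_feats - static     # what changes from frame to frame
    return static, temporal
```

With this split, the two components could in principle be aligned across domains separately, since the static part carries scene appearance and the residual carries frame-to-frame change.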
The method may contribute to more adaptable video AI systems for video content analysis, workplace safety, smart environments, media understanding, and other applications where deployment conditions can vary significantly from training data.
Looking Ahead
Having two papers accepted to ICML 2026 is an important recognition of the MTRI team’s research efforts. MTRI continues to strengthen its research in multimodal AI, efficient machine learning, and robust visual understanding. These results reflect MTRI’s focus on both academic excellence and practical AI innovation, including technologies that can improve the speed, reliability, and adaptability of AI systems deployed in real-world environments.
