Ph.D. Student, Computer Science
Nanyang Technological University, Singapore
About Me
I am a third-year PhD student, fortunate to be advised by Prof. Ziwei Liu. My research focuses on multimodal models and building true intelligence.
I am lucky to work with many brilliant researchers at LMMs-Lab, a non-profit, research-oriented organization where we share a sincere passion for developing multimodal intelligence.
Email: drluodian[at]gmail[dot]com
Selected Publications
- [17] LLaVA-OneVision: Easy Visual Task Transfer
- [16] Long Context Transfer from Language to Vision
- [12] LLaVA-NeXT: Improved reasoning, OCR, and world knowledge
  [code]
- [11] MIMIC-IT: Multi-modal In-Context Instruction Tuning
  [code]
- [10] Otter: A multi-modal model with in-context instruction tuning
  [code]
- [9] Coordinating Multiple Vision-Language Models for Visual Reasoning
  NeurIPS 2023, In Conference on Neural Information Processing Systems.
- [8] Sparse Mixture-of-Experts are Domain Generalizable Learners
  ICLR 2023 (Oral), In International Conference on Learning Representations 2023. [code] Short version in NeurIPS 2022 Workshop on Distribution Shift.
- [6] Invariant information bottleneck for domain generalization
  AAAI 2022, In Proceedings of the AAAI Conference on Artificial Intelligence. [code]
- [5] Energy-Based Open-World Uncertainty Modeling for Confidence Calibration
  ICCV 2021, In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). [code]
- [4] Learning invariant representations and risks for semi-supervised domain adaptation
  CVPR 2021, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. [code]
- [3] MADAN: multi-source adversarial domain aggregation network for domain adaptation
  IJCV 2021, International Journal of Computer Vision.
- [2] Rethinking distributional matching based domain adaptation
- [1] Multi-source domain adaptation for semantic segmentation
  NeurIPS 2019, In Neural Information Processing Systems. [code]
Experiences
I have been fortunate to collaborate and do research at/with:
- Sep. 2020 - Dec. 2021: Microsoft Research, Shanghai
  Supervised by Dr. Dongsheng Li in the beautiful and relaxing West Bund office, with chill and smart colleagues.
- Oct. 2019 - Aug. 2020 (remote till May 2021): Berkeley AI Research, CA, USA
  Supervised by Prof. Kurt Keutzer, Prof. Sicheng Zhao, Prof. Xiangyu Yue, Prof. Shanghang Zhang, and Dr. Colorado Reed. Enjoyed the weather and the frontier research atmosphere. Go Cal and Roll on you Golden Bears!
- Jan. 2020 - Nov. 2022: Dr. Tong Che, MILA/Nvidia Research
  Deeply grateful for his guidance in exploring many fascinating ML topics.
- May 2020 - Dec. 2021: Prof. Han Zhao, UIUC
  Learned to write papers with machine learning taste.
- May 2018 - Oct. 2019: DiDi Visual Perception Team, Beijing
  My first internship, which led to two papers.
Professional Services
- Talk/Technical Sharing:
  - LMMs-Lab Projects@TwelveLabs (2024), Hosted by James Le
  - LMMs-Lab Projects@TikTok (2024)
  - Otter & MIMIC-IT@Alibaba DAMO Academy, Hosted by Dr. Lidong Bing, Sep. 2023.
  - Otter & MIMIC-IT@HITSZ, Hosted by Prof. Rui Shao, Jul. 2023.
- Slab@NTU: Cluster Administrator (70+ users, 400+ GPUs)
- The AI Talk: Organizer
- Conference Reviewer / Program Committee:
  - ICCV (2021, 2023), NeurIPS (2022), BMVC (2023), AAAI (2023), CVPR (2022, 2023), AISTATS (2023), ICML (2023).
  - Workshop: ICLR 2023 (DG)
- Journal Reviewer:
  - Pattern Recognition (PR)
  - Transactions on Multimedia (TMM)
  - Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  - International Journal of Computer Vision (IJCV)
Acknowledgements: this website builds on al-folio and Jiaming Song.