Haodong Chen

04s | From EE to CV
Xi'an City, China

About Me

I'm a third-year undergraduate student at the School of Automation, Northwestern Polytechnical University.

I am very fortunate to be advised by Prof. Dian Shao of DianLab at NPU and Shanghai AI Laboratory, whose guidance has contributed significantly to my learning and growth in this field.

As an EE major, I have more than a passing interest in computer vision and cognitive intelligence. My research interests include:

  • Video Understanding: multi-modal/fine-grained, memory-assisted
  • Generation & Editing: human-centric interactive, cognition-driven
  • Large Language Models: machine cognition, human-like LLM/Agent


  • [07/2024] FineCLIPER and CREST have been accepted to MM'24!
  • [05/2024] Served as a reviewer for NeurIPS'24.
  • [02/2024] Served as a reviewer for ECCV'24.
  • [01/2024] Served as a reviewer for MM'24.
  • [01/2024] UrbanCLIP has been accepted to WWW'24!


Research Intern | HKGAI, HKUST
Time: 05/2024 - Present. Topic: Video Generation


GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
Haodong Chen, Yongle Huang, Haojian Huang, Xiangsheng Ge, Dian Shao
arXiv, 2024
[arXiv] [Project] [Code]

FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
Haodong Chen, Haojian Huang, Junhao Dong, Mingzhe Zheng, Dian Shao
ACM International Conference on Multimedia (MM), 2024
[arXiv] [Project]

CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning
Haojian Huang, Xiaozhen Qiao, Zhuo Chen, Haodong Chen, Bingyu Li, Zhe Sun, Mulin Chen, Xuelong Li
ACM International Conference on Multimedia (MM), 2024
[arXiv] [Code]

UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
ACM International World Wide Web Conference (WWW), 2024
[Paper] [arXiv] [Video] [Code]
Oral Presentation


  • Conference Reviewer:
     Neural Information Processing Systems (NeurIPS)
     European Conference on Computer Vision (ECCV)
     ACM International Conference on Multimedia (MM)