Haodong Chen

04s | From EE to CV
Xi'an City, China
NPU

About Me

I'm a final-year undergraduate student in the School of Automation at Northwestern Polytechnical University (NPU). I'm fortunate to work closely with Prof. Ser-Nam Lim from the University of Central Florida (UCF) and Prof. Harry Yang from The Hong Kong University of Science and Technology (HKUST).

Previously, I was advised by Prof. Dian Shao from both NPU and Shanghai AI Laboratory, whose guidance contributed significantly to my development in this field. I also worked with Dr. Jinliang Deng and Prof. Wenhan Luo at the Hong Kong Generative AI Research and Development Center (HKGAI), as well as Prof. Yuxuan Liang from HKUST(GZ).

As an EE major, I have more than a passing interest in computer vision and welcome all forms of collaboration! My research interests include:

  • Video Understanding: multi-modal/fine-grained, memory-assisted
  • Generation & Editing: human-centric, personalized, fashion
  • Large Language Models: machine cognition, human-like LLM/Agent

News

  • [10/2024] Served as a reviewer for AISTATS'25.
  • [08/2024] Served as a reviewer for ICLR'25.
  • [07/2024] FineCLIPER and CREST were accepted to MM'24!
  • [05/2024] Served as a reviewer for NeurIPS'24.
  • [02/2024] Served as a reviewer for ECCV'24.
  • [01/2024] Served as a reviewer for MM'24.
  • [01/2024] UrbanCLIP was accepted to WWW'24!

Experience

Research Intern | UCF
Time: 7/2024 - Present. Advisor: Prof. Ser-Nam Lim

Research Intern | HKGAI
Time: 5/2024 - 8/2024. Mentor: Prof. Wenhan Luo

Research Intern | DianLab, NPU
Time: 6/2023 - 8/2024. Advisor: Prof. Dian Shao

Publications

Beyond Uncertainty: Evidential Deep Learning for Robust Video Temporal Grounding
Kaijing Ma*, Haojian Huang*, Jin Chen*, Haodong Chen, Pengliang Ji, Xianghao Zang, Han Fang, Chao Ban, Hao Sun, Mulin Chen, Xuelong Li
arXiv, 2024
[Arxiv] [Project] [Code]

GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
Haodong Chen, Yongle Huang, Haojian Huang, Xiangsheng Ge, Dian Shao
arXiv, 2024
[Arxiv] [Project] [Code]

FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
Haodong Chen, Haojian Huang, Junhao Dong, Mingzhe Zheng, Dian Shao
ACM International Conference on Multimedia (MM), 2024
[Arxiv] [Project]

CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning
Haojian Huang, Xiaozhen Qiao, Zhuo Chen, Haodong Chen, Bingyu Li, Zhe Sun, Mulin Chen, Xuelong Li
ACM International Conference on Multimedia (MM), 2024
[Arxiv] [Code]

UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
ACM International World Wide Web Conference (WWW), 2024
[Paper] [Arxiv] [Video] [Code]
Oral Presentation

Services

  • Conference Reviewer:
     International Conference on Learning Representations (ICLR)
     Conference on Neural Information Processing Systems (NeurIPS)
     European Conference on Computer Vision (ECCV)
     International Conference on Artificial Intelligence and Statistics (AISTATS)
     ACM International Conference on Multimedia (MM)