Haodong Chen

04s | From EE to CV
Xi'an City, China
NPU

About Me

I'm a third-year undergraduate student at the School of Automation, Northwestern Polytechnical University.

As an EE major, I have more than a passing interest in deep learning. My research interests include Multi-Modal Video Understanding, 3D Generation and Editing, and Large Language Models.

I am very fortunate to be advised by Prof. Dian Shao of DianLab at NPU and Shanghai AI Laboratory, whose guidance has contributed significantly to my learning and growth in this field.

If you are interested in my work or have ideas you want to share, feel free to contact me at haroldchen328@gmail.com.

I'm applying for PhD programs starting in Fall 2025. Wish me good luck~

News

  • [04/2024] One paper on Multi-modal Video Understanding was submitted to MM'24.
  • [04/2024] One paper on Zero-shot Learning was submitted to MM'24.
  • [03/2024] One paper on 3D Gaussian Splatting Editing was submitted to ECCV'24.
  • [03/2024] One paper on Fine-grained Video Understanding was submitted to ECCV'24.
  • [02/2024] Served as a reviewer for ECCV'24.
  • [01/2024] Served as a reviewer for MM'24.
  • [01/2024] UrbanCLIP was accepted to WWW'24!
  • [01/2024] We are currently maintaining the FineGym dataset to address some of the issues that have been raised. The refined version will be released soon!
  • [11/2023] One paper on Video Understanding was submitted to CVPR'24.
  • [10/2023] One paper on Multi-modal Representation Learning was submitted to WWW'24.

Experience

Publications

GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
Haodong Chen, Yongle Huang, Haojian Huang, Xiangsheng Ge, Dian Shao
Under Review
[Arxiv] [Project] [Code]

CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning
Haojian Huang, Xiaozhen Qiao, Zhuo Chen, Haodong Chen, Bingyu Li, Zhe Sun, Mulin Chen, Xuelong Li
Under Review
[Arxiv] [Code]

UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
ACM International World Wide Web Conference (WWW), 2024
[Paper] [Arxiv] [Video] [Code]
Oral Presentation

Services

  • Conference Reviewer:
     European Conference on Computer Vision (ECCV)
     ACM International Conference on Multimedia (MM)

Extra-curricular

  • 2023, Academic Advancement Individual Honor of NPU.
  • 2023, Exchange visit to The University of Melbourne. Thanks to NPU!
  • 2023, Selected for the 12th Leading Talent Development Program, Yikun Class of NPU.
  • 2023, United Nations IOCDP Outstanding Trainee in Macau, China.
  • 2023, United Nations Messenger for the Sustainable Development Goals (SDGs).