
Biography
I'm a final-year undergraduate student at Northwestern Polytechnical University (NPU) and a research intern at Everlyn AI. I'm fortunate to work closely with Prof. Ser-Nam Lim, Prof. Harry Yang, Prof. Qifeng Chen and Prof. Dian Shao.
I am always open to all forms of research collaboration. Feel free to contact me if you are interested in working with me! My research interests include:
- Generation & Editing: generic, personalized
- Video Understanding: multi-modal, fine-grained
- Large Language Models: hallucinations, unified understanding & generation
Experience
Research Intern | Everlyn AI
Research Intern | HKGAI
Research Intern | DianLab, NPU
Selected Preprints
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
Temporal Regularization Makes Your Video Generator Stronger
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
Beyond Generation: Unlocking Universal Editing via Self-Supervised Fine-Tuning
GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
Publications
FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization
FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning
UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
Awards and Honors
2024 | |
2024 | |
2024 | |
2023 | |
2023 | |
2023 |
Services
Computer Vision and Pattern Recognition (CVPR), 2025
International Conference on Computer Vision (ICCV), 2025
European Conference on Computer Vision (ECCV), 2024
International Conference on Machine Learning (ICML), 2025
International Conference on Learning Representations (ICLR), 2025
Neural Information Processing Systems (NeurIPS), 2024-2025
ACM International Conference on Multimedia (MM), 2024
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Pattern Recognition
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)