I am a first-year Ph.D. student in Data Science & Analytics at the Hong Kong University of Science and Technology (Guangzhou), supervised by Prof. Yuxuan Liang and Prof. Yangqiu Song. My research focuses on Spatio-Temporal Data Mining, Multimodal Learning, and Time Series Analysis. I am also currently interning at Huawei 2012 Lab, where I conduct research on foundation models for spatio-temporal data.
I hold an M.Phil. degree in Data Science & Analytics from HKUST (Guangzhou) and a B.Eng. degree in Computer Science & Information Engineering from Hefei University of Technology. Prior to that, I gained industry experience as an Algorithm Research Intern at the Autonomous Driving Center of XPENG and spent one year as a full-time Software Engineer at the CloudIDE & Workflow Engine Center of Tencent.
Siru Zhong, Weilin Ruan, Min Jin, Huan Li, Qingsong Wen, Yuxuan Liang
Under review. 2025
Propose Time-VLM, a novel multimodal framework that leverages pre-trained Vision-Language Models (VLMs) to bridge temporal, visual, and textual modalities for enhanced time series forecasting.
Weilin Ruan, Siru Zhong, Haomin Wen, Yuxuan Liang
Under review. 2025
Propose LDM4TS, a novel framework that leverages the powerful image reconstruction capabilities of latent diffusion models for vision-enhanced time series forecasting.
Xixuan Hao, Wei Chen, Yibo Yan, Siru Zhong, Kun Wang, Qingsong Wen, Yuxuan Liang
AAAI Conference on Artificial Intelligence (AAAI) 2025 Poster
Present UrbanVLP, a novel vision-language pretraining framework that integrates both macro and micro-level urban data and enhances interpretability through automatic text generation, achieving superior performance in urban region profiling.
Qiongyan Wang, Yutong Xia, Siru Zhong, Weichuang Li, Yuankai Wu, Shi Fen Cheng, Junbo Zhang, Yu Zheng, Yuxuan Liang
AAAI Conference on Artificial Intelligence (AAAI) 2025 Poster
Introduce AirRadar, a deep neural network that infers air quality at unmonitored locations. It uses learnable mask tokens in a two-stage process for feature reconstruction and outperforms baselines in validation on a dataset.
Siru Zhong, Xixuan Hao, Yibo Yan, Ying Zhang, Yangqiu Song, Yuxuan Liang
ACM International Conference on Multimedia (ACM MM) 2024 Poster
Introduce UrbanCross, a cross-domain satellite image-text retrieval framework that leverages multimodal enhancements and adaptive domain adaptation techniques to bridge diverse urban landscapes, achieving up to a 15% improvement in retrieval performance.
Yutong Feng, Qiongyan Wang, Yutong Xia, Junlin Huang, Siru Zhong, Kun Wang, Shifen Cheng, Yuxuan Liang
The International Joint Conference on Artificial Intelligence (IJCAI) 2024 Spotlight
Present the Spatio-Temporal Field Neural Network and Pyramidal Inference framework, which integrate field and graph perspectives to achieve state-of-the-art nationwide air quality inference in Mainland China.
Huaiwu Zhang, Yutong Xia, Siru Zhong, Kun Wang, Zekun Tong, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
The International Joint Conference on Artificial Intelligence (IJCAI) 2024 Spotlight
Introduce DeepPA, a deep-learning framework, and the SINPA dataset for accurately predicting real-time parking availability across Singapore, outperforming existing models and supporting urban planning through a deployed web platform.
Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
The International World Wide Web Conference (WWW) 2024 Oral
Introduce UrbanCLIP, the first large language model–enhanced framework that integrates textual descriptions with satellite imagery through contrastive language-image pretraining, significantly improving urban region profiling performance across major cities.