Jianxiang Zhou, Erdong Liu, Wei Chen, Siru Zhong, Yuxuan Liang
Under review. 2024
Introduce STGormer, a Spatio-Temporal Graph Transformer that integrates traffic data attributes and structures with a mixture-of-experts module to capture spatio-temporal heterogeneity, achieving state-of-the-art performance in traffic forecasting.
Siru Zhong, Xixuan Hao, Yibo Yan, Ying Zhang, Yangqiu Song, Yuxuan Liang
ACM International Conference on Multimedia (ACM MM) 2024 Poster
Introduce UrbanCross, a cross-domain satellite image-text retrieval framework that leverages multimodal enhancements and adaptive domain adaptation techniques to bridge diverse urban landscapes, achieving up to a 15% improvement in retrieval performance.
Yutong Feng, Qiongyan Wang, Yutong Xia, Junlin Huang, Siru Zhong, Kun Wang, Shifen Cheng, Yuxuan Liang
The International Joint Conference on Artificial Intelligence (IJCAI) 2024
Present the Spatio-Temporal Field Neural Network and Pyramidal Inference framework, which integrate field and graph perspectives to achieve state-of-the-art nationwide air quality inference in Mainland China.
Huaiwu Zhang, Yutong Xia, Siru Zhong, Kun Wang, Zekun Tong, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
The International Joint Conference on Artificial Intelligence (IJCAI) 2024
Introduce DeepPA, a deep-learning framework, and the SINPA dataset for accurately predicting real-time parking availability across Singapore, outperforming existing models and supporting urban planning through a deployed web platform.
Xixuan Hao, Wei Chen, Yibo Yan, Siru Zhong, Kun Wang, Qingsong Wen, Yuxuan Liang
Under review. 2024
Present UrbanVLP, a novel vision-language pretraining framework that integrates both macro- and micro-level urban data and enhances interpretability through automatic text generation, achieving superior performance in urban region profiling.
Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
The International World Wide Web Conference (WWW) 2024 Oral
Introduce UrbanCLIP, the first large language model–enhanced framework that integrates textual descriptions with satellite imagery through contrastive language-image pretraining, significantly improving urban region profiling performance across major cities.