We are currently preparing a CVPR 2023 workshop and have brought together internationally renowned teams for it. We believe that holding this forum in advance at PRCV 2022 will further strengthen our foundation in this area.
Speaker Bio: Xiaodan Liang is currently an Associate Professor at Sun Yat-sen University. She was previously a Project Scientist at Carnegie Mellon University, working with Prof. Eric Xing (IEEE/AAAI Fellow). She received her Ph.D. in Computer Science in 2016. Her research focuses on interpretable and cognitive intelligence and its applications in large-scale visual recognition, cross-modal analysis and understanding, and digital human analysis. She has published over 100 cutting-edge papers in the field's most prestigious journals (e.g., TPAMI) and conferences (e.g., CVPR/ICCV/ECCV/NeurIPS), with over 15,000 Google Scholar citations. She has served as an Area Chair for ICCV 2019, WACV 2020, NeurIPS 2021-2022, and CVPR 2020, as Tutorial Chair (Organizing Committee) of CVPR 2021, and on the Ombud Committee of CVPR 2023. She also serves as an Associate Editor of the journal Neural Networks (Impact Factor > 8). She has received the ACM China Best Doctoral Dissertation Award (one of only two in China), the CCF Best Doctoral Dissertation Award, the Alibaba DAMO Academy Young Fellowship (Top 10 under 35 in China), and an ACL 2019 Best Demo Paper nomination. She was named one of the 30 Under 30 young innovators by Forbes China. She is a Senior Member of IEEE.
Talk Title: Unified Autonomous Driving via Multi-modality Multi-task Learning
Abstract: Aiming toward a holistic understanding of multiple downstream tasks simultaneously, there is a need for extracting features with better transferability. Although many recent self-supervised pre-training methods have achieved impressive performance on various vision tasks under the prevailing pretrain-finetune paradigm, their generalization capacity in multi-task learning scenarios is yet to be explored. Here we present a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training, in which off-the-shelf pretrained models can be effectively adapted without increasing the training overhead. In addition, we propose a novel adapter named LV-Adapter, which incorporates language priors into the multi-task model via task-specific prompting and alignment between visual and textual features. Moreover, we collect a series of real-world cases with noisy data distributions for developing 3D detection fusion methods, and we systematically formulate a robustness benchmark toolkit that can simulate these cases on any clean dataset with camera and LiDAR input modalities. Finally, the future of developing efficient multi-modality multi-task learning paradigms is discussed.
Speaker Bio: Bolei Zhou is an Assistant Professor in the Computer Science Department at the University of California, Los Angeles (UCLA). He earned his Ph.D. from MIT in 2018 and was a faculty member at The Chinese University of Hong Kong (CUHK) for the three years prior to joining UCLA. His research lies at the intersection of computer vision and machine autonomy, focusing on enabling interpretable and trustworthy human-AI interaction. He has developed widely used interpretation methods such as CAM and Network Dissection, as well as the computer vision benchmarks Places and ADE20K. He has served as an Area Chair for CVPR, ECCV, ICCV, and AAAI, and received the MIT Technology Review Innovators Under 35 Asia-Pacific Award. More about his research is at https://boleizhou.github.io/.
Talk Title: Toward Generalizable Embodied AI in Machine Autonomy
Abstract: Embodied AI, as an emerging research topic, has been studied in various visuomotor tasks such as indoor navigation and autonomous driving. Most embodied AI studies are conducted in fixed simulation environments, where the generalizability and safety of autonomous agents in unseen complex scenes remain questionable. In this talk, I will introduce my lab's work on building three pillars that facilitate generalizable embodied AI for machine autonomy: the training data/environment, the representation, and the learning pipeline. First, I will introduce our effort in building the MetaDrive driving simulator, which incorporates the capability of importing real-world scenarios and learning to generate novel ones. Then I will talk about learning generalizable representations for decision-making from hours of uncurated YouTube driving videos. Finally, I will discuss how our work on human-in-the-loop learning brings safe training and inference to human-AI shared control.
Speaker Bio: Chunjing Xu received his Bachelor's degree in Mathematics from Wuhan University in 1999, his Master's degree in Mathematics from Peking University in 2002, and his Ph.D. from the Chinese University of Hong Kong in 2009. He was an Assistant Professor and then an Associate Professor at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. He joined Huawei in April 2012 as a senior research engineer and later became a principal research engineer in the Media Lab; he is now director of the Computer Vision Lab in Noah's Ark Lab, Central Research Institute. His main research interests are machine learning and computer vision. He has published about 40 research papers in top-tier conferences and journals such as TPAMI, CVPR, ICCV, IJCAI, NeurIPS, AAAI, and ICML.
Speaker Bio: Prof. Ziwei Liu is currently an Assistant Professor at Nanyang Technological University, Singapore. Previously, he was a senior research fellow at the Chinese University of Hong Kong and a postdoctoral researcher at the University of California, Berkeley. Ziwei received his Ph.D. from the Chinese University of Hong Kong. His research revolves around computer vision, machine learning, and computer graphics. He has published extensively in top-tier conferences and journals in these fields, including CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, TPAMI, TOG, and Nature Machine Intelligence. He is the recipient of the Microsoft Young Fellowship, the Hong Kong PhD Fellowship, the ICCV Young Researcher Award, and the HKSTP Best Paper Award. He also serves as an Area Chair for ICCV, NeurIPS, and ICLR.
Talk Title: Robust and Data-Efficient 3D Perception
Abstract: Perceiving the underlying 3D world behind RGB and LiDAR sensors has been a long-pursued goal of computer vision, with extensive real-life applications, and it is at the core of embodied intelligence. In this talk, I will discuss our work on robust and data-efficient 3D perception, with an emphasis on learning structured deep representations under incomplete inputs or supervision. I will also discuss the challenges posed by the naturally distributed data (e.g., long-tailed and open-ended) that emerges from real-world sensors, and how we can overcome these challenges by incorporating new neural computing mechanisms such as dynamic memory and routing. Our approach has shown its effectiveness and generalizability on a wide range of tasks.
Speaker Bio: Dr. Hongyang Li is a Young Scientist at Shanghai AI Laboratory. His research focuses on downstream applications of general vision and on perception and decision-making algorithms for autonomous driving. He received his Ph.D. from the Chinese University of Hong Kong. His first-author work has been published at international conferences such as CVPR/ICCV/NeurIPS/ICML, with over 1,400 citations and more than 10 granted patents. Since 2021, he has been the lead lecturer of the graduate course Advanced Computer Vision at Tsinghua University. He led his team to first place in the Waymo Open Challenge 2022, an international autonomous driving competition, achieving internationally leading results in the vision-only and LiDAR tracks; his proposed BEVFormer work provides a practical solution for the mass-production deployment of autonomous driving.