• Lecture Information

4.15 | Deep high-resolution representation learning for visual recognition

2019.04.03

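Speaker: Dr. Jingdong Wang
Title: Senior Researcher, Microsoft Research Asia
Time: 4:00 PM, April 15, 2019
Venue: IBM Conference Room 105, Software Building, Zhangjiang Campus
Contact: Li Xiran, lixiran@fudan.edu.cn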

Abstract

High-resolution representation learning plays an essential role in vision problems and has been attracting increasing attention. Most existing techniques recover high-resolution representations mainly from the low-resolution representations output by a network similar to a classification network. In this work, we propose a high-resolution network (HRNet). The HRNet maintains high-resolution representations by connecting high-to-low resolution convolutions in parallel, and strengthens high-resolution representations by repeatedly performing multi-scale fusions across the parallel convolutions. We demonstrate its effectiveness on pixel-level classification (semantic segmentation, face alignment, and human pose estimation), region-level classification (COCO object detection), and image-level classification. The project page is https://jingdongwang2017.github.io/Projects/HRNet/
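As a rough illustration of the idea in the abstract (parallel branches at different resolutions that repeatedly exchange information), here is a toy one-dimensional sketch in plain Python. It is not the HRNet implementation: the function names, the two-branch setup, the averaging used in place of learned convolutions, and the number of stages are all assumptions made for illustration only.

```python
# Toy 1-D sketch of parallel multi-resolution branches with repeated fusion.
# All names are hypothetical; learned convolutions are replaced by fixed averages.

def downsample(x):
    """Halve the resolution by averaging adjacent pairs (stand-in for a strided convolution)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def upsample(x):
    """Double the resolution by repeating each value (nearest-neighbour upsampling)."""
    return [v for v in x for _ in range(2)]

def smooth(x):
    """Per-branch processing: 3-tap average with edge replication; length is preserved."""
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3 for i in range(n)]

def fuse(high, low):
    """Multi-scale fusion: each branch adds the other branch, resampled to its own resolution."""
    new_high = [h + u for h, u in zip(high, upsample(low))]
    new_low = [l + d for l, d in zip(low, downsample(high))]
    return new_high, new_low

def run(signal, num_stages=3):
    """Keep a high-resolution branch alive throughout, alongside a low-resolution one."""
    high = list(signal)
    low = downsample(high)
    for _ in range(num_stages):
        high, low = smooth(high), smooth(low)  # branches are processed in parallel
        high, low = fuse(high, low)            # then exchange information
    return high, low

high, low = run([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
print(len(high), len(low))  # 8 4
```

The point of the sketch is only that the high-resolution branch is never discarded and rebuilt: it keeps its full length at every stage while receiving information from the coarser branch.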

About the Speaker

Jingdong Wang is a Senior Researcher with the Visual Computing Group, Microsoft Research, Beijing, China. His current areas of interest include CNN architecture design, human pose estimation, semantic segmentation, person re-identification, large-scale indexing, and salient object detection. He has authored one book and 100+ papers in top conferences and prestigious international journals in computer vision, multimedia, and machine learning. He authored a comprehensive survey on learning to hash in TPAMI. His paper was selected as a Best Paper Finalist at ACM MM 2015. Dr. Wang is an Associate Editor of IEEE TPAMI, IEEE TCSVT, and IEEE TMM. He has served as an Area Chair or a Senior Program Committee Member of top conferences such as CVPR, ICCV, ECCV, AAAI, IJCAI, and ACM Multimedia. He is an ACM Distinguished Member and a Fellow of the IAPR. His homepage is https://jingdongwang2017.github.io/.

His representative works include deep high-resolution representation learning (HRNets), interleaved group convolutions, supervised saliency detection (discriminative regional feature integration, DRFI), neighborhood graph search (NGS) for large-scale similarity search, composite quantization for compact coding, and the Market-1501 dataset for person re-identification. He has shipped a dozen technologies to Microsoft products, including Bing search, Bing Ads, Cognitive Services, and the XiaoIce chatbot. His NGS algorithm is a foundational element of many products. He developed the Bing image search color filter using his efficient salient object detection algorithm, as well as the first commercial color-sketch image search system.