Speaker: Patrick P. C. Lee, The Chinese University of Hong Kong, China
Time: Friday, October 31, 9:30-11:00 a.m.
Venue: Room 405, Computer Science Building, Zhangjiang Campus
Contact: 王新 (Wang Xin), xinw@fudan.edu.cn
Abstract:
Modern clustered storage systems increasingly adopt erasure coding to reduce the storage overhead of traditional 3-way replication. However, maintaining high performance in erasure-coded clustered storage systems remains challenging. In this talk, I will share our experiences of deploying erasure-coded storage in Hadoop, a popular cluster platform for big data analytics. I will present two new designs: (1) CORE, which augments existing optimal regenerating codes to support the recovery of a general number of failures, including single and concurrent failures, and (2) Degraded-First Scheduling, which improves MapReduce performance in erasure-coded storage. I will present new analytical results, as well as experimental findings based on our prototypes in a Hadoop cluster. If time allows, I will discuss other work on distributed storage systems done by our group.
Bio:
Patrick P. C. Lee received his Ph.D. degree in Computer Science from Columbia University in 2008. He is now an assistant professor in the Department of Computer Science and Engineering at The Chinese University of Hong Kong. He is interested in various applied/systems topics, including cloud computing and storage, distributed systems and networks, operating systems, and security/resilience. His current research focuses on building dependable storage systems, in particular on improving the fault tolerance, recovery, security, and performance of different types of storage architectures, including cloud file systems and SSDs.