Position: Ph.D. Candidate

Current Institution: University of Illinois at Urbana-Champagne

Abstract:
Data-centric scheduling: novel algorithms, state space collapse and delay minimization in heavy traffic

Data-processing applications are posing increasingly significant challenges to scheduling in
today’s computing clusters. The presence of data induces an extremely heterogeneous cluster
where processing speed depends on the task-server pair. The situation is further complicated by ever-changing technologies of networking, memory, and software architecture. As a result, a suboptimal scheduling algorithm causes unnecessary delay in job completion and wastes system capacity.

We propose a versatile model featuring a multi-class parallel-server system that readily incorporates the different characteristics of a variety of systems. The model had been studied by Harrison, Williams and Stolyar. However, to achieve delay optimality in heavy-traffic with unknown arrival rate vectors has remained an open problem.

We propose a novel algorithm that achieves delay optimality with unknown arrival rates. This enables the application of the algorithm to data-centric clusters. New proof techniques were required including construction of an ideal load decomposition. To demonstrate the effectiveness of the algorithm, we implemented a Hadoop MapReduce scheduler and showed that it achieves >10X improvement over existing schedulers.

Bio:
Qiaomin Xie is a final-year Ph.D. candidate in the Department of Electrical and Computer Engineering at the University of Illinois Urbana-Champaign. She received a B.E. in Electronic Engineering from Tsinghua University in 2010. Her research focuses on scalability, multi-resource packing and scheduling of efficient data-centric systems. She is the recipient of the Yi-Min Wang and Pi-Yu Chung Research Award (2015), and best paper award from IFIP Performance Conference (2011).