Research
My research interests lie in large-scale scheduling and management for machine learning systems, e.g., ML cluster scheduling and datacenter network management.
|
MuxFlow: Efficient GPU Sharing in Production-Level Clusters with More Than 10,000 GPUs
Xuanzhe Liu (Advisor), Yihao Zhao, Shufan Liu, Xiang Li, Yibo Zhu, Xin Liu, Xin Jin.
SCIS, 2024
to appear
|
GreenFlow: A Carbon Efficient Scheduler for Deep Learning Workloads
Diandian Gu, Yihao Zhao, Peng Sun, Xuanzhe Liu, Xin Jin.
TPDS, 2024
paper
|
Klotski: Efficient and Safe Network Migration of Large Production Datacenters
Yihao Zhao*, Xiaoxiang Zhang*, Hang Zhu, Ying Zhang, Zhaodong Wang, Yuandong Tian, Alex Nikulkov, Joao Ferreira, Xuanzhe Liu, Xin Jin.
SIGCOMM, 2023
paper
|
ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning
Diandian Gu*, Yihao Zhao*, Yinmin Zhong, Yifan Xiong, Zhenhua Han, Peng Cheng, Fan Yang, Gang Huang, Xin Jin, Xuanzhe Liu.
ASPLOS, 2023
paper / code
|
Multi-Resource Interleaving for Deep Learning Training
Yihao Zhao, Yuanqiang Liu, Yanghua Peng, Yibo Zhu, Xuanzhe Liu, Xin Jin.
SIGCOMM, 2022
paper / code
|
Unpaired Image-to-Image Translation using Adversarial Consistency Loss
Yihao Zhao, Ruihai Wu, Hao Dong.
ECCV, 2020
paper / code / bibtex
|
Internships
Research Intern @ ByteDance, Beijing
- Jan. 2021 - Oct. 2021, May 2022 - Oct. 2023
- Mentors: Yanghua Peng, Xin Liu, Yibo Zhu
|
Teaching Experience
TA, Distributed Machine Learning, PKU (Fall 2022, Fall 2023)
TA, Introduction to Computing (A), PKU (Fall 2021)
TA, Introduction to Computer Systems, PKU (Fall 2019)