news

Feb 12, 2024 My work “Flover” will be presented at the NVIDIA GTC 2024 keynote session. I look forward to seeing you in San Jose in March.
Dec 23, 2023 My paper “Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference” has been accepted to IPDPS’24.
Nov 11, 2023 Joining the amazing SC’23 conference as a student volunteer in Denver, Colorado, together with our extraordinary team at OSU!
Oct 04, 2023 My paper “Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference” has been accepted to HiPC’23.