publications
publications by category in reverse chronological order. generated by jekyll-scholar.
2024
- Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference. In the 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS 24), 2024
2023
- Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference. In the 30th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 23), 2023
- A novel framework for efficient offloading of communication operations to BlueField SmartNICs. In 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS 23), 2023
- MPI-xCCL: A Portable MPI Library over Collective Communication Libraries for Various Accelerators. In Proceedings of the SC’23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, 2023
2021
- SOFT: Softmax-free transformer with linear complexity. Advances in Neural Information Processing Systems (NeurIPS 21), 2021
2020
- SPRNet: Single-pixel reconstruction for one-stage instance segmentation. IEEE Transactions on Cybernetics, 2020