Improving service availability of cloud systems by predicting disk error Y Xu, K Sui, R Yao, H Zhang, Q Lin, Y Dang, P Li, K Jiang, W Zhang, ... 2018 USENIX Annual Technical Conference (USENIX ATC 18), 481-494, 2018 | 157 | 2018 |
Dynamic TCP initial windows and congestion control schemes through reinforcement learning X Nie, Y Zhao, Z Li, G Chen, K Sui, J Zhang, Z Ye, D Pei IEEE Journal on Selected Areas in Communications 37 (6), 1231-1247, 2019 | 130 | 2019 |
Characterizing and improving WiFi latency in large-scale operational networks K Sui, M Zhou, D Liu, M Ma, D Pei, Y Zhao, Z Li, T Moscibroda Proceedings of the 14th Annual International Conference on Mobile Systems …, 2016 | 127 | 2016 |
Predicting node failure in cloud service systems Q Lin, K Hsieh, Y Dang, H Zhang, K Sui, Y Xu, JG Lou, C Li, Y Wu, R Yao, ... Proceedings of the 2018 26th ACM joint meeting on European software …, 2018 | 124 | 2018 |
Practical root cause localization for microservice systems via trace analysis Z Li, J Chen, R Jiao, N Zhao, Z Wang, S Zhang, Y Wu, L Jiang, L Yan, ... 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS), 1-10, 2021 | 92 | 2021 |
Cross-dataset time series anomaly detection for cloud systems X Zhang, J Kim, Q Lin, K Lim, SO Kanaujia, Y Xu, K Jamieson, ... 2019 USENIX Annual Technical Conference (USENIX ATC 19), 1063-1076, 2019 | 88 | 2019 |
Root-cause metric location for microservice systems via log anomaly detection L Wang, N Zhao, J Chen, P Li, W Zhang, K Sui 2020 IEEE International Conference on Web Services (ICWS), 142-150, 2020 | 75 | 2020 |
Neural feature search: A neural architecture for automated feature engineering X Chen, Q Lin, C Luo, X Li, H Zhang, Y Xu, Y Dang, K Sui, X Zhang, ... 2019 IEEE International Conference on Data Mining (ICDM), 71-80, 2019 | 64 | 2019 |
EDUM: classroom education measurements via large-scale WiFi networks M Zhou, M Ma, Y Zhang, K Sui, D Pei, T Moscibroda Proceedings of the 2016 ACM International Joint Conference on Pervasive and …, 2016 | 64 | 2016 |
Identifying bad software changes via multimodal anomaly detection for online service systems N Zhao, J Chen, Z Yu, H Wang, J Li, B Qiu, H Xu, W Zhang, K Sui, D Pei Proceedings of the 29th ACM Joint Meeting on European Software Engineering …, 2021 | 61 | 2021 |
An empirical investigation of practical log anomaly detection for online service systems N Zhao, H Wang, Z Li, X Peng, G Wang, Z Pan, Y Wu, Z Feng, X Wen, ... Proceedings of the 29th ACM Joint Meeting on European Software Engineering …, 2021 | 61 | 2021 |
Understanding and handling alert storm for online service systems N Zhao, J Chen, X Peng, H Wang, X Wu, Y Zhang, Z Chen, X Zheng, ... Proceedings of the ACM/IEEE 42nd International Conference on Software …, 2020 | 60 | 2020 |
Causal inference-based root cause analysis for online service systems with intervention recognition M Li, Z Li, K Yin, X Nie, W Zhang, K Sui, D Pei Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022 | 58 | 2022 |
Real-time incident prediction for online service systems N Zhao, J Chen, Z Wang, X Peng, G Wang, Y Wu, F Zhou, Z Feng, X Nie, ... Proceedings of the 28th ACM Joint Meeting on European Software Engineering …, 2020 | 55 | 2020 |
Actionable and interpretable fault localization for recurring failures in online service systems Z Li, N Zhao, M Li, X Lu, L Wang, D Chang, X Nie, L Cao, W Zhang, K Sui, ... Proceedings of the 30th ACM Joint European Software Engineering Conference …, 2022 | 54 | 2022 |
FluxRank: A widely-deployable framework to automatically localizing root cause machines for software service failure mitigation P Liu, Y Chen, X Nie, J Zhu, S Zhang, K Sui, M Zhang, D Pei 2019 IEEE 30th International Symposium on Software Reliability Engineering …, 2019 | 45 | 2019 |
Generic and robust localization of multi-dimensional root causes Z Li, C Luo, Y Zhao, Y Sun, K Sui, X Wang, D Liu, X Jin, Q Wang, D Pei 2019 IEEE 30th International Symposium on Software Reliability Engineering …, 2019 | 44 | 2019 |
Automatically and adaptively identifying severe alerts for online service systems N Zhao, P Jin, L Wang, X Yang, R Liu, W Zhang, K Sui, D Pei IEEE INFOCOM 2020-IEEE Conference on Computer Communications, 2420-2429, 2020 | 42 | 2020 |
Practical root cause localization for microservice systems via trace analysis. In 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS) Z Li, J Chen, R Jiao, N Zhao, Z Wang, S Zhang, Y Wu, L Jiang, L Yan, ... IEEE, Tokyo, Japan, 1-10, 2021 | 36 | 2021 |
Identifying root-cause metrics for incident diagnosis in online service systems C Wu, N Zhao, L Wang, X Yang, S Li, M Zhang, X Jin, X Wen, X Nie, ... 2021 IEEE 32nd International Symposium on Software Reliability Engineering …, 2021 | 29 | 2021 |