Weixin Chen
Title
Cited by
Year
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu, Z Xiong, R Dutta, ...
Advances in Neural Information Processing Systems (NeurIPS), 2023
Cited by: 114 | Year: 2023
Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples
W Chen, B Wu, H Wang
Advances in Neural Information Processing Systems (NeurIPS), 2022
Cited by: 41 | Year: 2022
TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
W Chen, D Song, B Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 34 | Year: 2023
GRATH: Gradual Self-Truthifying for Large Language Models
W Chen, D Song, B Li
International Conference on Machine Learning (ICML), 2024
Cited by: 1 | Year: 2024
Articles 1–4