Micah Carroll
PhD student, UC Berkeley
Verified email at berkeley.edu
Title
Cited by
Year
On the Utility of Learning About Humans for Human-AI Coordination
M Carroll, R Shah, MK Ho, T Griffiths, S Seshia, P Abbeel, A Dragan
Advances in Neural Information Processing Systems, 5174-5185, 2019
Cited by 446, 2019
Open problems and fundamental limitations of reinforcement learning from human feedback
S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ...
arXiv preprint arXiv:2307.15217, 2023
Cited by 400, 2023
Harms from Increasingly Agentic Algorithmic Systems
A Chan, R Salganik, A Markelius, C Pang, N Rajkumar, D Krasheninnikov, ...
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023
Cited by 93*, 2023
Estimating and Penalizing Induced Preference Shifts in Recommender Systems
M Carroll, A Dragan, S Russell, D Hadfield-Menell
International Conference on Machine Learning (Spotlight), 2686-2708, 2022
Cited by 75*, 2022
Characterizing Manipulation from AI Systems
M Carroll*, A Chan*, H Ashton, D Krueger
EEAMO 2023, 2023
Cited by 51, 2023
Uni[MASK]: Unified inference in sequential decision problems
M Carroll, O Paradise, J Lin, R Georgescu, M Sun, D Bignell, S Milani, ...
NeurIPS 2022 (Oral), 2022
Cited by 39*, 2022
Evaluating the Robustness of Collaborative Agents
P Knott, M Carroll, S Devlin, K Ciosek, K Hofmann, AD Dragan, R Shah
AAMAS 2021 (Extended Abstract), 2021
Cited by 31, 2021
Twitter’s algorithm: Amplifying anger, animosity, and affective polarization
S Milli, M Carroll, Y Wang, S Pandey, S Zhao, AD Dragan
arXiv preprint arXiv:2305.16941, 2023
Cited by 27, 2023
Engagement, user satisfaction, and the amplification of divisive content on social media
S Milli, M Carroll, Y Wang, S Pandey, S Zhao, AD Dragan
arXiv preprint arXiv:2305.16941, 2023
Cited by 23, 2023
Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration
M Yang, M Carroll, A Dragan
NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop, 2022
Cited by 10, 2022
AI Alignment with Changing and Influenceable Reward Functions
M Carroll, D Foote, A Siththaranjan, S Russell, A Dragan
arXiv preprint arXiv:2405.17713, 2024
Cited by 7, 2024
Time-Efficient Reward Learning via Visually Assisted Cluster Ranking
D Zhang, M Carroll, A Bobu, A Dragan
NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop, 2022
Cited by 6, 2022
Beyond preferences in AI alignment
T Zhi-Xuan, M Carroll, M Franklin, H Ashton
Philosophical Studies, 1-51, 2024
Cited by 5, 2024
Who Needs to Know? Minimal Knowledge for Optimal Coordination
N Lauffer, A Shah, M Carroll, MD Dennis, S Russell
International Conference on Machine Learning, 18599-18613, 2023
Cited by 4, 2023
Overview of current AI alignment approaches
M Carroll
Cited by 4, 2018
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
M Williams*, M Carroll*, A Narang, C Weisser, B Murphy, A Dragan
arXiv preprint arXiv:2411.02306, 2024
2024