Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. S. Gururangan, A. Marasović, S. Swayamdipta, K. Lo, I. Beltagy, D. Downey, et al. arXiv preprint arXiv:2004.10964, 2020. Cited by 1983.
Annotation Artifacts in Natural Language Inference Data. S. Gururangan, S. Swayamdipta, O. Levy, R. Schwartz, S. R. Bowman, et al. arXiv preprint arXiv:1803.02324, 2018. Cited by 1142.
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. S. Gehman, S. Gururangan, M. Sap, Y. Choi, N. A. Smith. arXiv preprint arXiv:2009.11462, 2020. Cited by 724.
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text. E. Clark, T. August, S. Serrano, N. Haduong, S. Gururangan, N. A. Smith. arXiv preprint arXiv:2107.00061, 2021. Cited by 277.
Show Your Work: Improved Reporting of Experimental Results. J. Dodge, S. Gururangan, D. Card, R. Schwartz, N. A. Smith. arXiv preprint arXiv:1909.03004, 2019. Cited by 250.
Editing Models with Task Arithmetic. G. Ilharco, M. T. Ribeiro, M. Wortsman, S. Gururangan, L. Schmidt, et al. arXiv preprint arXiv:2212.04089, 2022. Cited by 145.
Variational Pretraining for Semi-supervised Text Classification. S. Gururangan, T. Dang, D. Card, N. A. Smith. arXiv preprint arXiv:1906.02242, 2019. Cited by 125.
Detoxifying Language Models Risks Marginalizing Minority Voices. A. Xu, E. Pathak, E. Wallace, S. Gururangan, M. Sap, D. Klein. arXiv preprint arXiv:2104.06390, 2021. Cited by 92.
DEMix Layers: Disentangling Domains for Modular Language Modeling. S. Gururangan, M. Lewis, A. Holtzman, N. A. Smith, L. Zettlemoyer. arXiv preprint arXiv:2108.05036, 2021. Cited by 86.
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models. M. Li, S. Gururangan, T. Dettmers, M. Lewis, T. Althoff, N. A. Smith, et al. arXiv preprint arXiv:2208.03306, 2022. Cited by 85.
Time Waits for No One! Analysis and Challenges of Temporal Misalignment. K. Luu, D. Khashabi, S. Gururangan, K. Mandyam, N. A. Smith. arXiv preprint arXiv:2111.07408, 2021. Cited by 53.
Nearest Neighbor Zero-Shot Inference. W. Shi, J. Michael, S. Gururangan, L. Zettlemoyer. Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022. Cited by 31.
SILO Language Models: Isolating Legal Risk in a Nonparametric Datastore. S. Min, S. Gururangan, E. Wallace, H. Hajishirzi, N. A. Smith, L. Zettlemoyer. arXiv preprint arXiv:2308.04430, 2023. Cited by 17.
Analysis of Graph Invariants in Functional Neocortical Circuitry Reveals Generalized Features Common to Three Areas of Sensory Cortex. S. S. Gururangan, A. J. Sadovsky, J. N. MacLean. PLoS Computational Biology 10(7), e1003710, 2014. Cited by 17.
Scaling Expert Language Models with Unsupervised Domain Discovery. S. Gururangan, M. Li, M. Lewis, W. Shi, T. Althoff, N. A. Smith, L. Zettlemoyer. arXiv preprint arXiv:2303.14177, 2023. Cited by 14.
M2D2: A Massively Multi-Domain Language Modeling Dataset. M. Reid, V. Zhong, S. Gururangan, L. Zettlemoyer. arXiv preprint arXiv:2210.07370, 2022. Cited by 13.
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection. S. Gururangan, D. Card, S. K. Dreier, E. K. Gade, L. Z. Wang, Z. Wang, et al. arXiv preprint arXiv:2201.10474, 2022. Cited by 13.
Emergent Coordination Underlying Learning to Reach to Grasp with a Brain-Machine Interface. M. Vaidya, K. Balasubramanian, J. Southerland, I. Badreldin, A. Eleryan, et al. Journal of Neurophysiology 119(4), 1291–1304, 2018. Cited by 11.
LESS: Selecting Influential Data for Targeted Instruction Tuning. M. Xia, S. Malladi, S. Gururangan, S. Arora, D. Chen. arXiv preprint arXiv:2402.04333, 2024. Cited by 9.
Expected Validation Performance and Estimation of a Random Variable's Maximum. J. Dodge, S. Gururangan, D. Card, R. Schwartz, N. A. Smith. arXiv preprint arXiv:2110.00613, 2021. Cited by 5.