Tim Dettmers
Verified email at cs.washington.edu
Title | Cited by | Year
Convolutional 2D knowledge graph embeddings
T Dettmers, P Minervini, P Stenetorp, S Riedel
AAAI 2018, 2018
2548 | 2018
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
arXiv preprint arXiv:2211.05100, 2022
1095 | 2022
QLoRA: Efficient finetuning of quantized LLMs
T Dettmers, A Pagnoni, A Holtzman, L Zettlemoyer
NeurIPS 2023 (Oral), 2023
597 | 2023
LLM.int8(): 8-bit matrix multiplication for transformers at scale
T Dettmers, M Lewis, Y Belkada, L Zettlemoyer
NeurIPS 2022, 2022
399* | 2022
Sparse networks from scratch: Faster training without losing performance
T Dettmers, L Zettlemoyer
arXiv preprint arXiv:1907.04840, 2019
319 | 2019
8-bit Approximations for Parallelism in Deep Learning
T Dettmers
ICLR 2016, 2016
207 | 2016
Base layers: Simplifying training of large, sparse models
M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer
ICML 2021, 2021
159 | 2021
8-bit Optimizers via Block-wise Quantization
T Dettmers, M Lewis, S Shleifer, L Zettlemoyer
ICLR 2022 (Spotlight), 2022
124 | 2022
Branch-train-merge: Embarrassingly parallel training of expert language models
M Li, S Gururangan, T Dettmers, M Lewis, T Althoff, NA Smith, ...
arXiv preprint arXiv:2208.03306, 2022
78 | 2022
The case for 4-bit precision: k-bit inference scaling laws
T Dettmers, L Zettlemoyer
ICML 2023, 2023
67 | 2023
SpQR: A sparse-quantized representation for near-lossless LLM weight compression
T Dettmers, R Svirschevski, V Egiazarian, D Kuznedelev, E Frantar, ...
arXiv preprint arXiv:2306.03078, 2023
57 | 2023
Petals: Collaborative inference and fine-tuning of large models
A Borzunov, D Baranchuk, T Dettmers, M Ryabinin, Y Belkada, ...
ACL 2022, Demonstration, 2022
29 | 2022
Jack the Reader - A machine reading framework
D Weissenborn, P Minervini, T Dettmers, I Augenstein, J Welbl, ...
arXiv preprint arXiv:1806.08727, 2018
11 | 2018
SWARM parallelism: Training large models can be surprisingly communication-efficient
M Ryabinin, T Dettmers, M Diskin, A Borzunov
NeurIPS 2023, 2023
10 | 2023
Stable and low-precision training for large-scale vision-language models
M Wortsman, T Dettmers, L Zettlemoyer, A Morcos, A Farhadi, L Schmidt
NeurIPS 2023, 2023
9 | 2023
Training transformers together
A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ...
NeurIPS 2021 Demonstration, 2022
9 | 2022
High performance natural language processing
G Ilharco, C Ilharco, I Turc, T Dettmers, F Ferreira, K Lee
EMNLP 2020, Tutorial, 2020
3 | 2020
BLOOM: A 176B-parameter open-access multilingual language model
BS Workshop, TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, ...
arXiv preprint arXiv:2211.05100, 2022
2 | 2022
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
LZ Liu, T Dettmers, XV Lin, V Stoyanov, X Li
EMNLP 2023, 2023
1 | 2023
MatFormer: Nested Transformer for Elastic Inference
S Kudugunta, A Kusupati, T Dettmers, K Chen, I Dhillon, Y Tsvetkov, ...
arXiv preprint arXiv:2310.07707, 2023
2023