Sparsified SGD with memory. SU Stich, JB Cordonnier, M Jaggi. NeurIPS 2018, 4447-4458, 2018 | 174 | 2018 |
On the Relationship between Self-Attention and Convolutional Layers. JB Cordonnier, A Loukas, M Jaggi. ICLR 2020, 2019 | 35 | 2019 |
Convex optimization using sparsified stochastic gradient descent with memory. JB Cordonnier | 9 | 2018 |
Robust Cross-lingual Embeddings from Parallel Sentences. A Sabet, P Gupta, JB Cordonnier, R West, M Jaggi. arXiv preprint arXiv:1912.12481, 2019 | 2 | 2019 |
Extrapolating paths with graph neural networks. JB Cordonnier, A Loukas. IJCAI 2019, 2019 | 2 | 2019 |
Multi-Head Attention: Collaborate Instead of Concatenate. JB Cordonnier, A Loukas, M Jaggi. arXiv preprint arXiv:2006.16362, 2020 | 1 | 2020 |
Group Equivariant Stand-Alone Self-Attention For Vision. DW Romero, JB Cordonnier. arXiv preprint arXiv:2010.00977, 2020 | | 2020 |