LLaMA: Open and Efficient Foundation Language Models

Resource type
Preprint
Authors/contributors
Touvron, Hugo; Lavril, Thibaut; Izacard, Gautier; Martinet, Xavier; Lachaux, Marie-Anne; Lacroix, Timothée; Rozière, Baptiste; Goyal, Naman; Hambro, Eric; Azhar, Faisal; Rodriguez, Aurelien; Joulin, Armand; Grave, Edouard; Lample, Guillaume
Title
LLaMA: Open and Efficient Foundation Language Models
Abstract
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
Repository
arXiv
Archive ID
arXiv:2302.13971
Date
2023-02-27
Accessed
2024-02-24, 17:41
Short Title
LLaMA
Extra
arXiv:2302.13971 [cs]
Citation
Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., & Lample, G. (2023). LLaMA: Open and Efficient Foundation Language Models (arXiv:2302.13971). arXiv. https://doi.org/10.48550/arXiv.2302.13971