May 4, 2022

Meta AI is sharing OPT-175B, the first 175-billion-parameter language model to be made available to the broader AI research community

Large language models — natural language processing (NLP) systems with more than 100 billion parameters — have transformed NLP and AI research over the last few years. Trained on a massive and varied volume of text, they show surprising new capabilities to generate creative text, solve basic math problems, answer reading comprehension questions, and more. While in some cases the public can interact with these models through paid APIs, full research access is still limited to only a few highly resourced labs. This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues such as bias and toxicity.

In line with Meta AI’s commitment to open science, we are sharing Open Pretrained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available data sets, to allow for more community engagement in understanding this foundational new technology. For the first time for a language technology system of this size, the release includes both the pretrained models and the code needed to train and use them. To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license to focus on research use cases. Access to the model will be granted to academic researchers; those affiliated with organizations in government, civil society, and academia; along with industry research laboratories around the world.

We believe the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular, given their centrality in many downstream language applications. A much broader segment of the AI community needs access to these models in order to conduct reproducible research and collectively drive the field forward. With the release of OPT-175B and smaller-scale baselines, we hope to increase the diversity of voices defining the ethical considerations of such technologies.
