Meta, the company formerly known as Facebook, will this year have the world's most powerful supercomputer dedicated to artificial intelligence tasks.
The machine, known as the AI Research SuperCluster, or RSC, is already in operation, although not yet at its final computing capacity, and will be used to build machine learning models capable of working in all kinds of scenarios, from comment moderation to the design of virtual environments.
“We hope that RSC will help us build completely new systems that can, for example, power real-time voice translations for large groups of people, each speaking a different language, so they can collaborate seamlessly on a research project or play an augmented reality game together,” explain Kevin Lee and Shubho Sengupta, the engineers in charge of the project.
Once completed, RSC will be able to perform trillions of operations per second thanks to more than 16,000 graphics processing units (GPUs), which are widely used in artificial intelligence work because of their capacity for massively parallel computation.
Today the computer has almost a third of those units.
According to the company, this computing capacity is needed to keep advancing in fields such as computer vision and language recognition, where machine learning techniques have produced very useful tools in recent years but where important obstacles remain to be overcome.
“We wanted an infrastructure that can train models with more than 1 billion parameters on data sets as large as an exabyte, which, to give a little context, is the equivalent of 36,000 years of high-quality video,” say Lee and Sengupta.
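To get a feel for the exabyte comparison in the quote, the figures can be turned into an average video bitrate. The short Python sketch below is only a back-of-the-envelope check, not something taken from Meta: it assumes a decimal exabyte (10^18 bytes) and a 365.25-day year, and the function name `implied_bitrate_mbps` is purely illustrative.

```python
# Back-of-the-envelope check: what average video bitrate makes one exabyte
# equal to 36,000 years of footage?
# Assumptions (not stated in the article): a decimal exabyte (10**18 bytes)
# and a 365.25-day year.

EXABYTE_BYTES = 10**18
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def implied_bitrate_mbps(total_bytes: float, years: float) -> float:
    """Return the average bitrate, in megabits per second, needed to fill
    total_bytes with continuous video lasting the given number of years."""
    seconds = years * SECONDS_PER_YEAR
    bits = total_bytes * 8
    return bits / seconds / 1e6


if __name__ == "__main__":
    rate = implied_bitrate_mbps(EXABYTE_BYTES, 36_000)
    print(f"Implied bitrate: {rate:.1f} Mbit/s")
```

Under those assumptions the result is roughly 7 Mbit/s, which is in the range commonly associated with high-definition streaming, so the comparison is at least internally consistent.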
The previous supercomputer that Facebook used to train its artificial intelligence models consists of 22,000 NVIDIA V100 GPUs.
The new machine, which will be up to 20 times faster at computer vision workloads, is instead built on NVIDIA DGX A100 systems.
In practical terms, this extra power will allow machine learning models to be trained much faster.
A model with tens of billions of parameters, for example, can finish training in three weeks, compared with the nine weeks it would take on the previous system.
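The training-time comparison can be expressed as a simple speedup ratio. The snippet below is a minimal illustration using only the two durations quoted above; the helper name `speedup` is hypothetical. Note that the resulting factor applies to this particular example and is separate from the 20-times figure cited for computer vision workloads.

```python
def speedup(old_duration: float, new_duration: float) -> float:
    """Return how many times faster the new run is than the old one.
    Both durations must use the same unit (here, weeks)."""
    if new_duration <= 0:
        raise ValueError("new_duration must be positive")
    return old_duration / new_duration


if __name__ == "__main__":
    old_weeks, new_weeks = 9, 3  # figures quoted in the article
    print(f"Speedup: {speedup(old_weeks, new_weeks):.1f}x")
    print(f"Time saved: {old_weeks - new_weeks} weeks")
```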
Once trained, these models can be run very quickly, and they are the technology behind many of the tools we now use routinely across Meta's services, such as automatic photo tagging or video transcription.
To train these machine learning algorithms, Meta uses data from Facebook, Instagram and WhatsApp users, but the company says that several mechanisms have been built into RSC to protect users' privacy.
The data, for example, remains encrypted until the moment it has to be processed, and RSC has no direct connection to the internet: information can only be sent to or received from the company's own data centers.