Zuckerberg says Meta will need 10x more computing power to train Llama 4 than Llama 3

Meta, which develops one of the biggest foundational open-source large language models, Llama, believes it will need significantly more computing power to train models in the future.

Mark Zuckerberg said on Meta’s second-quarter earnings call on Tuesday that to train Llama 4 the company will need 10x more compute than what was needed to train Llama 3. But he still wants Meta to build capacity to train models rather than fall behind its competitors.

“The amount of computing needed to train Llama 4 will likely be almost 10 times more than what we used to train Llama 3, and future models will continue to grow beyond that,” Zuckerberg said.

“It’s hard to predict how this will trend multiple generations out into the future. But at this point, I’d rather risk building capacity before it is needed rather than too late, given the long lead times for spinning up new inference projects.”

Meta released Llama 3 in April, in versions with 8 billion and 70 billion parameters. The company last week released an upgraded version of the model, called Llama 3.1 405B, which has 405 billion parameters, making it Meta’s biggest open-source model.

Meta’s CFO, Susan Li, also said the company is thinking about different data center projects and building capacity to train future AI models. She said Meta expects this investment to increase capital expenditures in 2025.

Training large language models can be a costly business. Meta’s capital expenditures rose nearly 33% to $8.5 billion in Q2 2024, from $6.4 billion a year earlier, driven by investments in servers, data centers and network infrastructure.

According to a report from The Information, OpenAI spends $3 billion on training models and an additional $4 billion on renting servers at a discounted rate from Microsoft.

“As we scale generative AI training capacity to advance our foundation models, we’ll continue to build our infrastructure in a way that provides us with flexibility in how we use it over time. This will allow us to direct training capacity to gen AI inference or to our core ranking and recommendation work, when we expect that doing so would be more valuable,” Li said during the call.

During the call, Meta also talked about usage of its consumer-facing Meta AI and said India is the largest market for its chatbot. But Li noted that the company doesn’t expect gen AI products to contribute to revenue in a significant way.
