Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America

Latam-GPT is a new large language model being developed in and for Latin America. The project, led by the nonprofit Chilean National Center for Artificial Intelligence (CENIA), aims to help the region achieve technological independence by developing an open source AI model trained on Latin American languages and contexts.

“This work cannot be undertaken by just one group or one country in Latin America: It is a challenge that requires everyone’s participation,” says Álvaro Soto, director of CENIA, in an interview with WIRED en Español. “Latam-GPT is a project that seeks to create an open, free, and, above all, collaborative AI model. We’ve been working for two years with a very bottom-up process, bringing together citizens from different countries who want to collaborate. Recently, it has also seen some more top-down initiatives, with governments taking an interest and beginning to participate in the project.”

The project stands out for its collaborative spirit. “We’re not looking to compete with OpenAI, DeepSeek, or Google. We want a model specific to Latin America and the Caribbean, aware of the cultural requirements and challenges that this entails, such as understanding different dialects, the region’s history, and unique cultural aspects,” explains Soto.

Thanks to 33 strategic partnerships with institutions in Latin America and the Caribbean, the project has gathered a corpus of data exceeding eight terabytes of text, the equivalent of millions of books. This information base has enabled the development of a language model with 50 billion parameters, a scale that makes it comparable to GPT-3.5 and gives it a medium to high capacity to perform complex tasks such as reasoning, translation, and associations.

Latam-GPT is being trained on a regional database that compiles information from 20 Latin American countries and Spain, for a total of 2,645,500 documents. The data is heavily concentrated in the largest countries: Brazil leads with 685,000 documents, followed by Mexico with 385,000, Spain with 325,000, Colombia with 220,000, and Argentina with 210,000. The numbers reflect the size of these markets, their digital development, and the availability of structured content.
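
For a rough sense of how concentrated the corpus is, the counts quoted above can be converted into shares of the total. The following is a minimal Python sketch that uses only the figures given in this article; the variable and function names are illustrative, and because the article does not break down the other 16 sources, they are lumped into a single remainder.

```python
# Document counts as quoted in the article (top five sources only).
TOTAL_DOCUMENTS = 2_645_500
top_five = {
    "Brazil": 685_000,
    "Mexico": 385_000,
    "Spain": 325_000,
    "Colombia": 220_000,
    "Argentina": 210_000,
}


def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


# Share of the corpus contributed by each named country.
shares = {name: share_percent(n, TOTAL_DOCUMENTS) for name, n in top_five.items()}

# Combined weight of the five largest sources, and what is left for
# the remaining 16 countries (not itemized in the article).
top_five_total = sum(top_five.values())
remaining = TOTAL_DOCUMENTS - top_five_total

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"Top five combined: {share_percent(top_five_total, TOTAL_DOCUMENTS):.1f}%")
print(f"Other 16 sources: {remaining:,} documents")
```

By this count, the five largest sources account for about 69 percent of the documents, and Brazil alone for roughly a quarter.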

“Initially, we’ll launch a language model. We expect its performance in general tasks to be close to that of large commercial models, but with superior performance in topics specific to Latin America. The idea is that, if we ask it about topics relevant to our region, its knowledge will be much deeper,” Soto explains.

The first model is the starting point for developing a family of more advanced technologies in the future, including ones with image and video, and for scaling up to larger models. “As this is an open project, we want other institutions to be able to use it. A group in Colombia could adapt it for the school education system or one in Brazil could adapt it for the health sector. The idea is to open the door for different organizations to generate specific models for particular areas like agriculture, culture, and others,” explains the CENIA director.


