Together, an open-source AI startup, announced today that it has raised $20 million in seed funding to support its mission of creating decentralized alternatives to closed AI systems and democratizing AI for everyone. The company says its goal is to establish open source as the default way to incorporate AI, and to help create open models that outperform closed models. To this end, the company has already collaborated with decentralized infrastructure providers, open-source groups and academic and corporate research labs. Together believes that it has assembled an impressive team of researchers, engineers and AI practitioners.
“Our mission is to empower innovation and creativity by providing leading open-source generative AI models and an innovative cloud platform that makes AI accessible to anyone, anywhere,” Jamie de Guerre, founding SVP of product at Together, told VentureBeat.
Together has released several generative AI projects, including GPT-JT, OpenChatKit, and RedPajama, which have received support from hundreds of thousands of AI developers. The company stated that it will use the new $20 million seed funding to expand the team, research, product and infrastructure.
“Foundation models are a new general-purpose technology broadly applicable to industries and applications. We believe an open-source ecosystem for these models will truly unlock their potential and create vast value,” said de Guerre. “Further, as enterprises define their generative AI strategies, they seek privacy, transparency, customization and ease of deployment. Open-source models pre-trained on open datasets enable organizations to fully inspect, understand and customize models to their applications.”
Open-source for data transparency
The company said its priority is to lay the groundwork for open-source AI by providing datasets, models and research. The RedPajama project is a promising initial undertaking, but represents only the start of the company’s efforts. Together’s second goal is to make computational resources for training, fine-tuning and operating large models more accessible. It plans to do so with an AI-specific cloud platform built on a decentralized computing network.
De Guerre noted that closed models also pose liability risks and other challenges, because customers have no visibility into how a model works or what it was trained on. To counteract this, Together makes its datasets and models fully open-source and is working to make the computing infrastructure for training and using large models more accessible.
“With closed models, researchers cannot access the training data, they are not able to download the models and customize them, and they cannot learn from the training process for future research,” said de Guerre. “Open-source generative AI models and datasets enable the open community to do more advanced research and to build on these models creating new ones that push innovation in new directions.
“We’ve already seen this in incredible ways. Since releasing RedPajama, in just the past few weeks, we have seen new … models available in open-source, implementations to run foundation models on a laptop or mobile phone, and new models tuned for specific applications.”
The company aims to use the newly acquired funding to improve its specialized cloud platform, which is designed to efficiently scale training and inference for large models through distributed optimization. This will make it possible to quickly customize and connect foundation models with production tasks.
“In the coming months, we plan to open up access to this platform, enabling rapid customization and coupling foundation models with production tasks confidentially and securely,” said de Guerre. “This will further enable the open community by making the computing resources needed for training and operating these large models more efficient and accessible.”
Tackling open-source bottlenecks to foster AI progress
According to de Guerre, the company promotes open-source AI advancements in two ways. First, it partners with open-source groups and corporate research labs to release open research, datasets and models. Second, the company is partnering with decentralized infrastructure providers to provide improved access to computing for training, fine-tuning and running large models.
He noted that networking is one of the key bottlenecks for training large foundation models.
“Not only do you need a large number of powerful GPUs, but you also need those GPUs to be connected by incredibly fast networking, typically in a single physical location. Unfortunately, this type of data center is only available to a handful of organizations,” he said. “Our research enables [a more than] 200-times reduction in the network traffic during model training or fine-tuning. This means you can now leverage GPUs across multiple disparate networks to participate in the training or fine-tuning of large models without losing the quality of the model produced.”
He added that this enables a more scalable infrastructure and gives the customer various computing options at different performance and cost levels. This makes the platform accessible to more individuals.
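To give a rough sense of scale for the 200-times figure, the sketch below works through the arithmetic with illustrative numbers. The payload size and link speed are hypothetical assumptions chosen for the example, not figures from Together, and the helper `sync_seconds` is invented here; it says nothing about how the company's method actually achieves the reduction.

```python
def sync_seconds(payload_gigabytes: float,
                 link_gigabits_per_s: float,
                 reduction: float = 1.0) -> float:
    """Seconds needed to send one synchronization payload over a link.

    `reduction` is the factor by which network traffic is cut
    (1.0 means no reduction).
    """
    payload_gigabits = payload_gigabytes * 8 / reduction
    return payload_gigabits / link_gigabits_per_s


# Hypothetical values for illustration only (not from Together).
PAYLOAD_GB = 12.0   # assumed data exchanged per training step
LINK_GBPS = 1.0     # assumed link speed between separate networks
REDUCTION = 200.0   # the quoted "more than 200-times", taken as exactly 200

baseline = sync_seconds(PAYLOAD_GB, LINK_GBPS)
reduced = sync_seconds(PAYLOAD_GB, LINK_GBPS, REDUCTION)

print(f"baseline: {baseline:.1f} s per step, reduced: {reduced:.2f} s per step")
```

Under these assumed numbers, a transfer that would take about a minute and a half per step on a slow link drops to under half a second, which is why a cut of this size would make GPUs on separate networks practical for training.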
Additionally, the company has developed technologies that improve inference throughput by an order of magnitude.
The company claims it does not store or use customer or training data by default. Customers can opt in and share their data with Together for training models.
“We are investing heavily in building the best open-source generative AI models and an AI-specific cloud platform. We will continue to release open-source models and other projects to support this goal,” said de Guerre. “We believe that AI will be pervasive and have a huge impact on our culture and society. We want this future to be based on open and participatory systems so that we all, as a society, can shape this future.”
Together’s seed funding round was led by Lux Capital and supported by several venture funds, angel investors and prominent entrepreneurs, including the cofounder of PayPal, Scott Banister; the cofounder of Cloudera, Jeff Hammerbacher; and Lip-Bu Tan, the founder of Cadence Systems.