NeuralNation: Do you see this as a weakness for OpenAI in the enterprise market, if customers are limited to using only Azure, for example?
Kon: Different enterprises have different priorities. Through conversations with over 100 senior executives and enterprises since joining Cohere, I’ve learned that data privacy, data protection, and the ability to customize models with their own data in a secure environment are crucial. We’ve made choices that align with these priorities.
NeuralNation: Cohere has an impressive list of investors, including Oracle, Nvidia, Salesforce, and renowned researchers like Geoffrey Hinton and Fei-Fei Li. How important is this variety?
Gomez: Having such a diverse group of investors is a major asset for Cohere. In our latest funding round, we aimed to bring together international strategic and institutional investors who can support us now and in the future. That breadth is quite unusual and sets us apart from companies that rely on a single major backer. We wanted to create a financially healthy future for Cohere.
NeuralNation: There are rumors that Cohere is in talks to raise more funds. Can you comment on expanding the range of investors?
Kon: I find it amusing that rumors are already circulating, although I haven’t read them myself. We don’t comment on speculation. However, I can say that our major investors are not just looking for financial returns; they also want to support an independent provider like us. We share common priorities with partners like Oracle when it comes to data protection and security.
NeuralNation: Aidan, you co-founded Cohere with Nick Frosst after working at Google Brain, and Geoffrey Hinton is one of your investors. What are your thoughts on his recent comments about AI risk and his decision to leave Google?
Gomez: I have great respect for Geoff and his expertise in AI and deep learning. While we have differing opinions on the profile of risks associated with this technology, I value his insights. Geoff focuses more on long-term risks to humanity, while I prioritize more immediate risks like synthetic media and the dissemination of false information. It’s important to have a spectrum of perspectives on risks, and I appreciate Geoff’s attention to that side.
NeuralNation: Surveys show that a significant percentage of CEOs believe AI could lead to humanity’s extinction in the next 10 years. Do you encounter similar concerns from the executives you speak with?
Kon: I haven’t personally heard that belief from the executives we engage with. They are concerned about issues like bias and the risks associated with deploying AI systems today. Our AI research group, led by Sara Hooker, focuses on addressing these current risks through collaboration with researchers worldwide.
NeuralNation: How do you explain to customers how problems like hallucination and bias, which are currently in the news, can be controlled or addressed in large language models?
Gomez: Educating customers is crucial when it comes to deploying large language models. We discuss the opportunities and capabilities of the technology, but also highlight its limitations. We inform them about potential failure modes and guide them on implementing systems and processes to mitigate risks, such as constant benchmarking and evaluation of models. We release new models every week, and we want to ensure that adopting them improves the user experience and fits each customer’s risk profile.
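To make "constant benchmarking and evaluation" concrete, here is a minimal sketch of gating a new model version behind a fixed evaluation set before adopting it. This is a generic illustration, not Cohere's actual tooling; the `generate` function and the benchmark file are placeholders you would replace with your own model call and test cases.

```python
import json

def generate(model_version: str, prompt: str) -> str:
    """Placeholder for a call to the candidate model (hypothetical)."""
    raise NotImplementedError

def evaluate(model_version: str, benchmark_path: str, threshold: float = 0.95) -> bool:
    """Run a fixed benchmark against a candidate model and decide whether to adopt it."""
    with open(benchmark_path) as f:
        cases = json.load(f)  # e.g. [{"prompt": "...", "expected": "..."}, ...]

    passed = 0
    for case in cases:
        output = generate(model_version, case["prompt"])
        # A real harness would use task-specific scoring
        # (exact match, rubrics, human review); substring match is only a stand-in.
        if case["expected"].lower() in output.lower():
            passed += 1

    score = passed / len(cases)
    print(f"{model_version}: {passed}/{len(cases)} cases passed ({score:.0%})")
    return score >= threshold
```

Running a harness like this against every new release, and only switching when the score clears the threshold, is one simple way an enterprise team can keep model upgrades within its risk tolerance.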
NeuralNation: What are your thoughts on the debate between enterprises implementing open-source models on their own data versus using a platform like Cohere?
Gomez: Open source has made significant progress in AI technology, and it’s commendable. However, there is still a gap between open-source models and what Cohere offers. Our models continuously improve, with weekly updates and the ability for customers to influence the model’s direction. This level of customization and rapid development sets us apart from open-source models, making Cohere a unique value proposition for enterprises.
NeuralNation: Some argue that LLMs from companies like Cohere, OpenAI, and Anthropic are black boxes, with limited visibility into training data and underlying processes. How do you address this concern?
Gomez: We strive to be transparent and provide as much visibility as possible. While there may be limitations in sharing every aspect of our training data and processes, we work closely with customers to educate them about our methods, encourage constant communication and feedback, and promptly investigate and address any issues that arise. Whenever customers have questions about the data our models are trained on, we provide concrete answers. We genuinely care about the origin of our data, ensuring that we can trace its sources and screen out toxic content before it influences our models. We also strictly adhere to robots.txt and respect permissions when training on data. Our commitment to our customers is unwavering.
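As an illustration of what honoring robots.txt during data collection involves, the sketch below checks a site's crawl permissions before a page is fetched for a training corpus. It uses Python's standard-library robotparser and is not Cohere's pipeline; the user-agent string and URL are hypothetical examples.

```python
from urllib import robotparser
from urllib.parse import urlparse

def allowed_to_fetch(url: str, user_agent: str = "example-crawler") -> bool:
    """Check a site's robots.txt before fetching a page for a training corpus."""
    root = "{0.scheme}://{0.netloc}".format(urlparse(url))
    parser = robotparser.RobotFileParser(root + "/robots.txt")
    parser.read()  # downloads and parses the site's robots.txt
    return parser.can_fetch(user_agent, url)

# Pages the site owner has disallowed are skipped rather than scraped.
if allowed_to_fetch("https://example.com/some/article"):
    pass  # fetch the page and add it to the candidate corpus
```

In a real pipeline this check would sit alongside source tracing and toxicity screening, so that disallowed or problematic content never reaches the training set.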
NeuralNation: A recent study on how large language models would comply with the draft EU AI Act caught attention. What does it mean to comply with such regulations?
Gomez: While the EU AI Act is still a draft, we are pleased with our position alongside industry leaders. It’s important to note that this is just the beginning, and there is still much work to be done in terms of deployment. However, it demonstrates that we are already aligned with the intent of the regulations. We don’t wait for regulations to dictate our actions; proactive adherence is a core value at Cohere.
NeuralNation: Let’s talk about the future of LLMs. A recent paper discussed the possibility of model collapse when models are trained on synthetic data over time. As a co-author of the Transformer paper, what are your thoughts on the future limits of LLMs? Will they become smaller?
Gomez: I don’t anticipate LLMs becoming smaller; that would be quite surprising. Contrary to the mentioned paper, I believe the future lies in synthetic data. Model collapse is an issue in certain domains, but I see it as a symptom of our current methodologies rather than a fundamental flaw of synthetic data. Synthetic data has the potential to expand a model’s knowledge and lead to information and knowledge discovery. This is the next frontier, where models can self-improve and acquire new knowledge without human intervention.
NeuralNation: Why do you believe that?
Gomez: As we approach the limits of human knowledge and the breadth of available data, models need to be able to discover new knowledge independently. Relying solely on existing human knowledge becomes insufficient. It’s inevitable that models will have to expand their knowledge autonomously. I see this as the next major breakthrough.
NeuralNation: And you’re confident that it will happen?
Gomez: Absolutely. Without a doubt.