Larry Ellison has delivered a pointed message to the artificial intelligence industry, arguing that the real differentiator in AI is no longer models, compute power, or research talent, but data. Speaking about the current landscape, the Oracle Corporation cofounder said leading AI systems such as ChatGPT, Gemini, Grok, and Llama are largely trained on the same body of public internet data, including Wikipedia pages, Reddit discussions, and news articles. According to Ellison, this shared training foundation means many models are converging toward similar outputs and capabilities, effectively becoming commodities distinguished more by branding than by substance.
Ellison’s assessment centers on the idea that while model architectures and optimization techniques vary, the underlying public datasets remain largely identical. As a result, he suggested that innovation in model training alone may offer diminishing returns if competitors are drawing from the same digital commons. His argument shifts the focus from training scale to data exclusivity, particularly private enterprise data that is not accessible on the open web. Medical records stored in hospital systems, financial transaction histories held by banks, and supply chain intelligence maintained by major corporations represent what he described as the real gold in the AI economy. Much of that high-value information, Ellison noted, already resides within Oracle databases, positioning the company at the center of a potential shift in AI development.
To capitalize on this opportunity, Oracle has introduced AI Database 26ai, a platform designed to allow leading AI models to reason over a company’s private data without requiring that data to leave secure environments. The system relies on Retrieval-Augmented Generation, or RAG, a technique in which the AI searches for and retrieves relevant information at query time rather than absorbing it during training. Under this approach, enterprise data remains inside existing database systems while AI models query it dynamically to generate responses or insights. For example, a bank could analyze decades of loan performance data without exposing customer records externally. A hospital might use AI to assist in diagnosis by referencing a patient’s complete medical history while remaining compliant with health privacy regulations. Defense contractors and other sensitive organizations could similarly apply AI across classified datasets without transferring information outside secure perimeters.
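The retrieval step described above can be illustrated with a minimal sketch. This is not Oracle's API or the 26ai implementation: the records, function names, and the simple word-overlap scoring are all hypothetical, chosen only to show the general RAG pattern of looking up relevant private records at query time and handing them to a model as context instead of training on them.

```python
import math
import re
from collections import Counter

# Hypothetical private records (assumption for illustration only).
# In a real deployment these would stay inside the database.
RECORDS = [
    "Loan 1001: small business loan, repaid in full after 5 years",
    "Loan 1002: mortgage, defaulted after 18 months of missed payments",
    "Loan 1003: auto loan, repaid early with no late payments",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, records, k=2):
    """Return the k records most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        records,
        key=lambda record: cosine(query_vec, Counter(tokenize(record))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, records, k=2):
    """Assemble a prompt that carries only the retrieved records as context."""
    context = "\n".join(f"- {record}" for record in retrieve(query, records, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    # The resulting prompt would be sent to a language model of choice;
    # the model itself is never trained on RECORDS.
    print(build_prompt("Which mortgage defaulted?", RECORDS, k=1))
```

Production systems typically replace the word-overlap scoring with vector embeddings and run the search inside the database itself, but the division of labor is the same: the data store answers the lookup, and the model sees only the retrieved snippets.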
Ellison has framed this database-centric AI strategy as potentially more significant than the current surge in training infrastructure and graphics processing unit demand. Oracle’s financial metrics underscore the scale of its ambition. The company recently reported remaining performance obligations of 523 billion dollars, meaning contracted revenue not yet delivered, with approximately 300 billion dollars attributed to OpenAI agreements. Cloud revenue reached 8 billion dollars in a single quarter, while Oracle Cloud Infrastructure recorded 66 percent growth and GPU-related revenue climbed 177 percent. These figures reflect growing enterprise demand for AI-ready cloud services and database integration. Ellison’s broader contention is that if private data becomes the primary competitive moat in artificial intelligence, then control over enterprise databases could shape the next phase of the industry’s power structure, raising strategic and governance questions about who ultimately steers the future of AI deployment at scale.