IBM announced on Tuesday that it had struck a deal to acquire DataStax, a company known for its database platform, data streaming technology, and tools for developing data-intensive AI applications that use retrieval-augmented generation (RAG). The acquisition aims to bolster IBM’s Watsonx AI portfolio by integrating DataStax’s capabilities for managing vast amounts of unstructured data. With it, IBM looks to accelerate the deployment of generative AI at scale and unlock value from that data.
This move aligns with IBM’s ongoing commitment to open-source AI, as DataStax’s Astra DB cloud platform and DataStax Enterprise NoSQL and vector database are based on the open-source Apache Cassandra database. Ritika Gunnar, IBM’s General Manager for Data and AI, stated, “The strategic acquisition of DataStax brings cutting-edge capabilities in managing unstructured and semi-structured data to Watsonx. This builds on our investments in open-source Cassandra for enterprise applications and enables clients to develop next-generation AI solutions.”
Gunnar added that with DataStax’s Astra DB and DataStax Enterprise, Watsonx will benefit from advanced NoSQL and vector database capabilities, resulting in more efficient and accurate AI outcomes.
DataStax CEO Chet Kapoor commented, “With our technologies and IBM’s hybrid open data lakehouse, we will enable vector and AI search across the entire data estate, making IBM’s capabilities available to every developer.”
In addition to its database, DataStax offers a range of products that will be integrated into IBM’s portfolio, including Astra Streaming for real-time data pipelines, the DataStax AI Platform for building and deploying AI applications, and an enterprise AI platform incorporating Nvidia technology.
Integrating DataStax technologies into Watsonx
Gunnar further highlighted that combining DataStax’s data management expertise with Watsonx’s data and AI solutions will deliver enterprise-ready data for AI applications, improving data performance, search relevancy, and overall operational efficiency.
Another key aspect of the acquisition is DataStax’s Langflow, an open-source, low-code tool for developing AI applications that use RAG. “Langflow empowers developers to quickly prototype, build, and deploy RAG and multi-agent AI applications,” noted Gunnar. “This low-code interface simplifies the integration of generative AI models, data processing, and AI workflows, allowing developers to concentrate on innovation rather than technical complexities.”
IBM has not disclosed the financial terms of the acquisition, which is expected to be finalized in the second quarter of this year, pending customary closing conditions and regulatory approvals.
DataStax, founded in 2010 and headquartered in Santa Clara, California, is privately held and raised $115 million in a 2022 funding round led by Goldman Sachs. The acquisition comes as IBM navigates regulatory hurdles for another significant purchase, its $6.4 billion acquisition of HashiCorp. That deal was originally expected to close by the end of 2024 but was delayed by regulatory reviews.
However, the U.K.’s Competition and Markets Authority recently cleared IBM’s acquisition of HashiCorp, moving it closer to completion.
Cameron is a highly regarded contributor in the rapidly evolving fields of artificial intelligence (AI) and machine learning. His articles delve into the theoretical underpinnings of AI, the practical applications of machine learning across industries, ethical considerations of autonomous systems, and the societal impacts of these disruptive technologies.