Varada, the data lake query acceleration innovator, is releasing version 3.0 of its data analytics platform, which now delivers at-scale big data analytics to users who rely on the power of indexing to extract insights from massive, unstructured data sets.
The new version marries cloud elasticity with the query power of indexing for big data analytics, giving data teams the ability to scale analytics workloads rapidly and meet fluctuating demand, according to the vendor.
It delivers a dramatic increase in cost-performance and cluster elasticity compared with the previous version. In addition, version 3.0 eliminates the need to keep expensive, high-performance SSD NVMe (Solid-State Drive Non-Volatile Memory Express) compute instances idling when the cluster is not in use.
“Varada was built on the premise that indexing can transform big data analytics, if done correctly,” said Eran Vanounou, CEO of Varada. “With version 3.0, the Varada platform is now the most powerful and cost-effective way to leverage the power of big data directly atop your data lake.”
This third iteration of the Varada platform marks the latest step in a journey that began last December with version 1.0, in which adaptive indexing chooses the optimal index for each data set to deliver 10x-100x faster performance compared to other data lake query engines.
Version 1.0 used pre-defined materialization to enable indexing. This past spring, version 2.0 eliminated the need for materialization and added a dynamic, smart observability layer that automatically decides which data to index and when to index it, making the platform easier to use and giving users a dramatic improvement in TCO (Total Cost of Ownership).
Version 3.0 extends these advantages with rapid, elastic scaling that lets users add and remove nodes and clusters as current workloads require, further improving TCO for large-scale users.
Version 3.0 of the Varada platform includes three layers. The first is the hot data and index layer, in which SSD NVMe-attached nodes (in the customer’s Virtual Private Cloud) process queries and store hot data and cache for optimal performance.
The second is the warm index and data layer, where an object storage bucket on the customer’s data lake is used to store all indexes for scaling purposes. The third layer is the customer data layer (“cold”), which remains the single source of truth.
Varada’s platform is based on a multi-cluster approach, which allows different clusters to share warm indexed data by accessing the designated bucket on the data lake.
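The layering described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not Varada's implementation: the names `WarmBucket`, `Cluster`, `get_index` and `build_index` are hypothetical, plain dictionaries stand in for SSD NVMe storage and the object storage bucket, and the "index" is a toy value-to-positions map.

```python
from dataclasses import dataclass, field


def build_index(rows):
    """Toy index (assumption): map each value to the positions where it appears."""
    index = {}
    for position, value in enumerate(rows):
        index.setdefault(value, []).append(position)
    return index


@dataclass
class WarmBucket:
    """Hypothetical stand-in for the shared object storage bucket of indexes."""
    indexes: dict = field(default_factory=dict)


@dataclass
class Cluster:
    """Hypothetical stand-in for one cluster with its own hot layer."""
    name: str
    warm: WarmBucket
    hot: dict = field(default_factory=dict)

    def get_index(self, dataset, cold_data):
        """Return (index, layer) where layer names the tier that served it."""
        if dataset in self.hot:
            return self.hot[dataset], "hot"
        if dataset in self.warm.indexes:
            index, layer = self.warm.indexes[dataset], "warm"
        else:
            # Fall back to the cold layer (the source of truth), build the
            # index, and publish it to the shared bucket for other clusters.
            index, layer = build_index(cold_data[dataset]), "cold"
            self.warm.indexes[dataset] = index
        self.hot[dataset] = index  # keep a local copy for later queries
        return index, layer


if __name__ == "__main__":
    cold = {"events": ["click", "view", "click"]}
    bucket = WarmBucket()
    first, second = Cluster("first", bucket), Cluster("second", bucket)
    print(first.get_index("events", cold)[1])   # cold
    print(second.get_index("events", cold)[1])  # warm
    print(second.get_index("events", cold)[1])  # hot
```

In this sketch the second cluster never touches the cold data, because the first cluster already published the index to the shared bucket. That is the multi-cluster benefit the vendor describes.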
In addition to behavior-based indexing, data platform teams can opt to have indexes built in the background by low-cost nodes running on spot instances. The resulting indexes are stored on the “warm data” layer so that clusters can warm up quickly later. This can be used to prepare in advance for upcoming spikes in analytics requirements or to significantly reduce TCO.
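As a rough illustration of that background option, the Python sketch below splits the work into two steps: a pre-indexing pass that could run on cheap nodes, and a later warm-up that only copies finished indexes. Everything here is an assumption for illustration: `preindex`, `warm_up` and `build_index` are hypothetical names, and dictionaries stand in for the warm bucket, the hot cache and the data lake.

```python
def build_index(rows):
    """Toy index (assumption): map each value to the positions where it appears."""
    index = {}
    for position, value in enumerate(rows):
        index.setdefault(value, []).append(position)
    return index


def preindex(datasets, warm_bucket, cold_data):
    """Background step: build any missing indexes and store them warm."""
    built = []
    for name in datasets:
        if name not in warm_bucket:
            warm_bucket[name] = build_index(cold_data[name])
            built.append(name)
    return built


def warm_up(hot_cache, warm_bucket, datasets):
    """Later step: copy ready-made indexes from warm to hot; nothing is rebuilt."""
    loaded = []
    for name in datasets:
        if name in warm_bucket:
            hot_cache[name] = warm_bucket[name]
            loaded.append(name)
    return loaded


if __name__ == "__main__":
    cold = {"orders": ["eu", "us", "eu"], "clicks": ["a", "b"]}
    warm, hot = {}, {}
    print(preindex(["orders", "clicks"], warm, cold))  # ['orders', 'clicks']
    print(warm_up(hot, warm, ["orders"]))              # ['orders']
```

The point of the split is that the expensive step happens ahead of the spike, so the hot nodes only need to load indexes when demand arrives.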