Crate.io on Wednesday announced the general availability of CrateDB 1.0, the first non-beta release of its open source SQL database that enables real-time analytics for machine data applications. This release is an upgrade from version 0.57.
CrateDB is an SQL-based alternative to NoSQL machine data management solutions. It gives mainstream SQL developers access to machine data applications that previously were available only through NoSQL solutions.
“CrateDB is one of the few systems in the space that can enable JOIN to handle a large amount of machine data,” said Christian Lutz, CEO of Crate.io.
The company was founded in 2014 with the goal of reinventing SQL for the machine data era, he told LinuxInsider. Today, 75 percent of its customers use CrateDB to manage machine and Internet of Things data because of its ease of use, performance and versatility.
CrateDB provides an alternative to existing analytic data stores, combining the familiarity of SQL with the versatility of search and the ease of scalability of containers.
“The growth of machine data and the opportunities that businesses have to capitalize on it are outstripping the ability of their data management infrastructure to act on it,” said Jason Stamper, an analyst for data platforms and analytics at 451 Research.
CrateDB’s power lies in its ability “to enable users to collect and analyze vast amounts of data in real time, using SQL commands they already know,” he said.
Crate.io focused on developing a database for the billion-dollar machine data market, Lutz said. “This market is very broad with very special characteristics. It involves more than just industrial machines.”
The machine data management market involves IoT sensors, wearables, industrial IoT, network monitoring, IT/cloud infrastructure monitoring, security audit monitoring and machine learning.
CrateDB addresses the unique challenges of machine data management and analysis. It can handle millions of data points per second, a mix of structured and unstructured data with real-time query performance, and complex queries over big data volumes.
CrateDB sits in the new machine data stack between input software and specialized apps, Lutz said.
“It should work well in cloud environments. In fact, one of the customer testimonials for CrateDB came from Skyhigh Networks, a successful cloud access security broker,” pointed out Charles King, principal analyst at Pund-IT.
CrateDB 1.0 supports the PostgreSQL wire protocol, which enables easier access and integration with existing tools. It also supports outer JOINs and subqueries.
New functions include trigonometric, percentile and conditional (IF, CASE) operations, as well as schema and metadata discovery functions. Version 1.0 also has performance and quality improvements.
“The query engine is the secret sauce,” Lutz said.
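For readers less familiar with the two SQL features named above, the following is a minimal sketch of an outer JOIN and a subquery over machine data. The `sensors` and `readings` tables and their contents are hypothetical, and Python's built-in sqlite3 module is used only as a local stand-in so the snippet runs without a server. Against CrateDB, one would presumably pass a connection from a PostgreSQL client library instead; whether CrateDB 1.0 accepts these exact statements is an assumption, not something the announcement confirms.

```python
import sqlite3

# Hypothetical schema: device metadata plus temperature readings.
SETUP = """
CREATE TABLE sensors (id INTEGER PRIMARY KEY, location TEXT);
CREATE TABLE readings (sensor_id INTEGER, temperature REAL);
INSERT INTO sensors VALUES (1, 'boiler'), (2, 'roof'), (3, 'spare');
INSERT INTO readings VALUES (1, 71.5), (1, 73.0), (2, 18.2);
"""

# LEFT OUTER JOIN keeps sensors that have not reported any readings yet.
OUTER_JOIN = """
SELECT s.location, COUNT(r.temperature) AS n_readings
FROM sensors s
LEFT OUTER JOIN readings r ON r.sensor_id = s.id
GROUP BY s.location
ORDER BY s.location
"""

# Subquery: sensors with at least one reading above the overall average.
SUBQUERY = """
SELECT DISTINCT sensor_id
FROM readings
WHERE temperature > (SELECT AVG(temperature) FROM readings)
ORDER BY sensor_id
"""


def run_examples(conn):
    """Run both queries on any DB-API connection and return their rows."""
    cur = conn.cursor()
    cur.execute(OUTER_JOIN)
    joined = cur.fetchall()
    cur.execute(SUBQUERY)
    hot = cur.fetchall()
    return joined, hot


def demo():
    """Build the sample tables in memory (SQLite stand-in, not CrateDB)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    try:
        return run_examples(conn)
    finally:
        conn.close()
```

The outer JOIN still lists the 'spare' sensor with a count of zero, which an inner JOIN would drop. That matters for machine data, where silent devices are often the ones worth noticing.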
Crate.io built something that lets you choose between trade-offs and analysis without waiting. You can do this at scale with simplicity, said George Gilbert, big data and machine learning analyst at Wikibon.
“I haven’t seen others do it with the same flexibility and simplicity. It can do more of the traditional in-warehouse style analysis at the same time, ingesting data at almost the speed of a transactional database,” he told LinuxInsider.
CrateDB has advantages over other products, according to Jodok Batlogg, COO at Crate.io. It is easy to integrate in any enterprise activity. Only one configuration is necessary for the entire cluster — no special nodes are needed. It is very easy to run in a containerized environment. It supports all types of data.
Columnar field caches and a fully distributed query planner enable CrateDB to perform complex queries in real time and overcome many of the performance and flexibility limitations of first-generation distributed SQL databases.
CrateDB also provides SQL with integrated search for data and query versatility. This innovation enables a wide range of analytics, including machine learning and predictive analytics, on time series, full text, JSON, geospatial, and other structured and unstructured data, without requiring separate database engines.
Several factors recently have been driving interest in more effective management of machine data, and CrateDB is tapping into this growth, observed Charles King, principal analyst at Pund-IT.
“First and foremost, as companies utilize cloud services and infrastructures, getting the most from those investments requires automating related processes and services,” he told LinuxInsider. “That demands the efficient gathering of machine data automatically generated by systems applications, transaction applications, customers and users which has resulted in success for machine data players like Splunk.”
In addition, the anticipated rise of IoT devices and deployments will increase the volume of machine data by orders of magnitude, he said, so the efforts that companies make today around managing machine data should pay significant dividends in the future.
CrateDB is the first open source SQL database that enables real-time analytics for machine data applications, according to Crate.io.
If that is indeed the case, that would make it superior to previous applications based on NoSQL technologies.
“That should make CrateDB easier to use for developers familiar and comfortable with SQL databases,” King said. “If the company can deliver on its promises, it should become a significant player in this space.”
This release supports container architecture and automatic data sharding for simple scaling. Database scalability is vital for handling variations in machine data volume, but this is normally difficult to do, according to Lutz.
CrateDB can run as a cluster of containers, which enables it to be scaled easily with Docker, Kubernetes or Mesos container platforms. In addition, CrateDB automatically shards and redistributes data across the cluster as it changes size to optimize performance and high availability.
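The general idea behind automatic sharding and redistribution can be sketched in a few lines. This is an illustrative model only, not CrateDB's actual algorithm: the fixed shard count, the CRC32 routing hash, the round-robin placement and the node names are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 6  # assumed fixed number of shards for one table


def shard_for(routing_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a row's routing key to a shard using a stable hash."""
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def assign_shards(nodes, num_shards: int = NUM_SHARDS):
    """Spread shards round-robin over the nodes currently in the cluster."""
    placement = defaultdict(list)
    for shard in range(num_shards):
        placement[nodes[shard % len(nodes)]].append(shard)
    return dict(placement)


def rebalance_demo():
    """Compare shard placement before and after a third node joins."""
    before = assign_shards(["node-1", "node-2"])
    after = assign_shards(["node-1", "node-2", "node-3"])
    return before, after
```

In this model a row always maps to the same shard because the shard count does not change; only the placement of whole shards on nodes changes when the cluster grows. A production system would also try to move as few shards as possible, which this simple round-robin placement does not attempt.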
CrateDB is available immediately from Crate.io and the Docker Store and is open source under the Apache 2.0 license.