Apache Kudu Guide
Apache Kudu is a columnar storage manager developed for the Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: It runs on commodity hardware, is horizontally scalable, and supports highly available operation.
Apache Kudu is a top-level project in the Apache Software Foundation.
Kudu's benefits include:
- Fast processing of OLAP workloads.
- Integration with MapReduce, Spark, Flume, and other Hadoop ecosystem components.
- Tight integration with Apache Impala, making it a good, mutable alternative to using HDFS with Apache Parquet.
- Strong but flexible consistency model, allowing you to choose consistency requirements on a per-request basis, including the option for strict-serializable consistency.
- Strong performance for running sequential and random workloads simultaneously.
- Easy administration and management through Cloudera Manager.
- High availability. Tablet Servers and Masters use the Raft consensus algorithm, which keeps a tablet available as long as more than half of its replicas are available. For example, a tablet stays available if 2 out of 3 replicas are up. Reads can be serviced by read-only follower tablets, even in the event of a leader tablet failure.
- Structured data model.
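The majority rule behind the high availability item above can be illustrated with a little arithmetic. The following is a minimal sketch, not Kudu code: the function names are hypothetical and the replica counts (3, 5, 7) are example values chosen for illustration.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Largest number of replicas that can be down while a majority remains."""
    return replicas - majority(replicas)


def is_available(replicas: int, unavailable: int) -> bool:
    """True when more replicas are available than unavailable."""
    available = replicas - unavailable
    return available > unavailable


if __name__ == "__main__":
    # Illustrative replica counts only; these are assumptions, not Kudu defaults.
    for n in (3, 5, 7):
        print(f"replicas={n} majority={majority(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under this rule, a group of 3 replicas survives the loss of 1, and a group of 5 survives the loss of 2. An even replica count adds no extra tolerance: 4 replicas still tolerate only 1 failure.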
By combining all of these properties, Kudu targets support for applications that are difficult or impossible to implement on currently available Hadoop storage technologies. Applications for which Kudu is a viable solution include:
- Reporting applications where new data must be immediately available for end users.
- Time-series applications that must support queries across large amounts of historic data while simultaneously answering granular queries about an individual entity.
- Applications that use predictive models to make real-time decisions, with periodic refreshes of the predictive model based on all historical data.
Continue reading:
- Apache Kudu Concepts and Architecture
- Apache Kudu Usage Limitations
- Overview of Apache Kudu Installation and Upgrade in CDH
- Apache Kudu Configuration
- Apache Kudu Administration
- Developing Applications With Apache Kudu
- Using Apache Impala with Kudu
- Apache Kudu Schema Design
- Apache Kudu Transaction Semantics
- Apache Kudu Background Maintenance Tasks
- Troubleshooting Apache Kudu
- More Resources for Apache Kudu
Related information throughout the CDH library:
In CDH 5 and CDH 6, the Kudu documentation for Release Notes, Installation, Upgrading, and Security has been integrated alongside the corresponding information for other Hadoop components.
Page generated July 25, 2018.