Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...
Custom Catalog It’s possible to read an Iceberg table either from an HDFS path or from a Hive table. It’s also possible to use a custom metastore in place of Hive. The steps to do...
Introduction Record format Configuration General configuration values Authentication No credentials Using certificates Using bucket password Document level expiration 1 - Ex...
Querying with SQL Querying Mixed-Format table by merge on read Query on change store Querying with DataFrames ...
Native implementation Client-side caching is implemented using a client tracking listener through the RESP3 protocol available in Redis or Valkey. It’s used to speed up read operation...
Gobblin Execution Modes Overview One important feature of Gobblin is that it can be run on different platforms. Currently, Gobblin can run in standalone mode (which runs on a sing...
Development Roadmap List of Features and Milestones Connector Implementation Status Event Store Implementation Status Development Roadmap The development roadmap of Apache Ev...