Iceberg Java API Tables The main purpose of the Iceberg API is to manage table metadata, like schema, partition spec, metadata, and data files that store table data. Table metad...
Branching and Tagging Overview Iceberg table metadata maintains a snapshot log, which represents the changes applied to a table. Snapshots are fundamental in Iceberg as they are ...
Introduction Dataset Config Management Requirement Data Model Versioning Client library Config Store Current Dataset Config Management Implementation Data model Client appli...
Object holder The Java implementation of the Redis- or Valkey-based RBucket object is a holder for any type of object. Its size is limited to 512 MB. Code example: RBucket<AnyObject> buc...
Gobblin General Questions What is Gobblin? What programming languages does Gobblin support? Does Gobblin require any external software to be installed? What Hadoop versions can ...
Flink Queries Iceberg supports streaming and batch reads with Apache Flink's DataStream API and Table API. Reading with SQL Iceberg supports both streaming and batch reads in Flink...
Kyuubi provides several auxiliary SQL functions as a supplement to Flink's built-in functions. Name Description Return Type Since kyuubi_version Return the version of K...
Managing Watermarks in a Job Basics Task Failures Multi-Dataset Jobs Gobblin State Deep Dive State class hierarchy How States are Used in a Gobblin Job This page has two p...