Spark Structured Streaming Iceberg uses Apache Spark’s DataSourceV2 API for data source and catalog implementations. Spark DSv2 is an evolving API with different levels of support...
Flink Queries Iceberg support streaming and batch read With Apache Flink ‘s DataStream API and Table API. Reading with SQL Iceberg support both streaming and batch read in Flink...
Documentation Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and Impala u...
Set Flink configuration information in the job How to set up a simple Flink job How to run a job in a project Flink is a powerful high-performance distributed stream processing...
Apache AGE is a PostgreSQL extension that provides graph database functionality. AGE is an acronym for A Graph Extension, and is inspired by Bitnine’s fork of PostgreSQL 10, Agens...
Gobblin General Questions What is Gobblin? What programming languages does Gobblin support? Does Gobblin require any external software to be installed? What Hadoop versions can ...
Branching and Tagging Overview Iceberg table metadata maintains a snapshot log, which represents the changes applied to a table. Snapshots are fundamental in Iceberg as they are ...
Java API Quickstart Create a table Tables are created using either a Catalog or an implementation of the Tables interface. Using a Hive catalog The Hive catalog connects to...