Spark Structured Streaming Iceberg uses Apache Spark’s DataSourceV2 API for data source and catalog implementations. Spark DSv2 is an evolving API with different levels of support...
Overview Information Recorded Job Execution Information Task Execution Information Default Implementation Rest Query API Example Queries Job Execution History Server Over...
Syncing to Glue Data Catalog This document walks through the steps to register an Apache XTable™ (Incubating) synced table in Glue Data Catalog on AWS. Pre-requisites Source ta...
Features and Limitations Features Apache XTable™ (Incubating) provides users with the ability to translate metadata from one table format to another. Apache XTable™ (Incubatin...
Overview Information Recorded Job Execution Information Task Execution Information Default Implementation Rest Query API Example Queries Job Execution History Server Over...
Metrics Reporting As of 1.1.0 Iceberg supports the MetricsReporter and the MetricsReport APIs. These two APIs allow expressing different metrics reports while supporting a plu...
Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...
Flink Connector Apache Flink supports creating Iceberg table directly without creating the explicit Flink catalog in Flink SQL. That means we can just create an iceberg table by s...