To read a OneTable synced target table (regardless of the table format) in Apache Spark locally or on services like Amazon EMR, Google Cloud’s Dataproc, Azure HDInsight, or Databr...
Iceberg is designed for huge tables and is used in production where a single table can contain tens of petabytes of data. Even ...
Apache Paimon (Incubating) is a streaming data lake platform that suppo...
Apache Hudi (pronounced “hoodie”) is the next-generation streaming data lake platform. Apache Hudi brings core warehouse and dat...
Support Those Engines Description Using Dependency Key features Data Type Mapping Sink Options hosts [array] index [string] primary_keys [list] key_delimiter [string] user...
Support Apache RocketMQ Version Support These Engines Key Features Description Source Options start.mode.offsets Task Example Simple: Specified format consumption Simple: S...