Spark DataSource API; Daft. Spark DataSource API: The hudi-spark module offers the DataSource API to read a Hudi table into a Spark DataFrame. A time-travel query example: val ...
Introduction Support Those Engines API Event Data API Event Listener API Event Collect API Configuration Listener Zeta Engine Flink Engine Spark Engine Introduction The...
Building With Maven Building A Submodule Individually Building Submodules Individually Skipping Some Modules Building Kyuubi Against Different Apache Spark Versions Building K...
Support Those Engines Key features Description Data Type Mapping Options How to Create a Socket Data Synchronization Jobs Socket source connector Support Those Engines ...
Hudi Integration; Dependencies; Configurations; Hudi Operations. Apache Hudi (pronounced “hoodie”) is the next generation streaming data lake platform. Apache Hudi brings core war...
Catalogs configuration Using Mixed-Format in a standalone catalog Using Mixed-Format in session catalog The high availability configuration Catalogs configuration Using Mixe...
Support Those Engines Key features Description Data Type Mapping Options Task Example Simple: Changelog new version Slack sink connector Support Those Engines Spark...
Incremental collection; Use in single connections; Change incremental collection mode in session. Typically, when a user submits a SELECT query to the Spark SQL engine, the Driver cal...
Building (with Velox Backend); Build Gluten Velox backend package; Usage; Installing; Configure. Gluten is a Spark plugin developed by Intel, designed to accelerate Apache Spark w...