Spark DataSource API. The hudi-spark module offers the DataSource API to write a Spark DataFrame into a Hudi table. There are a number of options available: HoodieWriteConfig :...
Approaches: Use Hudi for new partitions alone; Convert existing table to Hudi; Using Hudi Streamer; Using Spark Datasource Writer; Using Spark SQL CALL Procedure; Using Hudi CLI; C...
Connector V2 Health. SeaTunnel uses a grading system for connectors to help you understand what to expect from a connector: Alpha, Beta, General Availability (GA). Expec...
Introduce multi-catalog (How to use; Future work). A catalog is a metadata namespace that stores information about databases, tables, views, indexes, users...
Syncing to Hive Metastore (Pre-requisites; Steps; Running sync; Register the target table in Hive Metastore; Conclusion). This document walks through the...
Key Generators: SimpleKeyGenerator; ComplexKeyGenerator; NonpartitionedKeyGenerator; CustomKeyGenerator; Bring your own implementation; TimestampBasedKeyGenerator; Timestamp is GMT ...
The table format (a.k.a. format) was first proposed by Iceberg and can be described as follows: it defines the relationship between tables and files, and any engine can query and r...
Pre-requisites; Steps; Create BigLake Catalog; Create BigLake Database; Running sync; Validating the results; Conclusion. This document walks through the steps to register a OneTa...