When is Hudi useful for me or my organization? What are some non-goals for Hudi? What is incremental processing? Why do Hudi docs/talks keep talking about it? How is Hudi opti...
Writing Tables | org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field ‘col1’ not found | java.lang.UnsupportedOperationException: org.apache.parquet....
Background | How is compaction different from clustering? | Clustering Architecture | Overall, there are 2 steps to clustering: Schedule clustering, Execute clustering | Clustering Use...
HTTP asynchronous write | Write with Apache StreamPark™ | HTTP asynchronous write support type | Configuration list of HTTP asynchronous write | HTTP writes data asynchronously | Other ...
Preparation when using Flink SQL Client | Flink’s Python API | Adding catalogs | Catalog Configuration | Hive catalog | Creating a table | Writing | Branch Writes | Reading | Type conversi...
Deploying | Hudi Streamer | Spark Datasource Writer Jobs | Upgrading | Downgrading | Migrating | This section provides all the help you need to deploy and operate Hudi tables at scale. ...
Support Those Engines | Key Features | Description | Supported DataSource Info | Source Options | Task Example | Simple | Regex Topic | AWS MSK SASL/SCRAM | AWS MSK IAM | Kerberos Authenticat...