Spark Tuning Guide Writing General Tips Spark failures Hudi consumes too much space in a temp folder while upsert How to tune shuffle parallelism of Hudi jobs ? GC Tuning ...
Spark DataSource API The hudi-spark module offers the DataSource API to write a Spark DataFrame into a Hudi table. There are a number of options available: HoodieWriteConfig :...
first initial last rest flatten without union intersection difference uniq zip unzip object chunk indexOf lastIndexOf sortedIndex findIndex findLastIndex range ...
Introduction How to use How to implement user-defined encryption and decryption Introduction In most production environments, sensitive configuration items such as passwords a...
Support Iceberg Version Support Those Engines Key features Description Supported DataSource Info Database Dependency Data Type Mapping Source Options Task Example Simple: ...
all any append concat drop zipWith zip xprod uniq filter find flatten head indexOf join lastIndexOf map nth pluck prepend range reduce reduceRight reject re...