A Demo using Docker containers Prerequisites Setting up Docker Cluster Build Hudi Bringing up Demo Cluster Demo Step 1: Publish the first batch to Kafka Step 2: Incrementall...
Referencing the JDBC Driver Libraries Using the Driver in Java Code Maven sbt Gradle Using the Driver in a JDBC Application Registering the Driver Class Building the Connect...
Setup Spark 3 Support Matrix Spark Shell/SQL Setup project Create Table Insert data Query data Update data Merging Data Delete data Time Travel Query Incremental query ...
What are some ways to write a Hudi table? How is a Hudi writer job deployed? Can I implement my own logic for how input records are merged with record on storage? How do I delet...
Deployment models with supported concurrency controls Model A: Single writer with inline table services Single Writer Guarantees Model B: Single writer with async table services ...
When is Hudi useful for me or my organization? What are some non-goals for Hudi? What is incremental processing? Why do Hudi docs/talks keep talking about it? How is Hudi opti...
Writing Tables org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field ‘col1’ not found java.lang.UnsupportedOperationException: org.apache.parquet....
Background How is compaction different from clustering? Clustering Architecture Overall, there are 2 steps to clustering Schedule clustering Execute clustering Clustering Use...