Using Apache Hadoop resource in Flink on Kubernetes 1. Apache HDFS 1.1. Add the shaded jar 1.2. Add core-site.xml and hdfs-site.xml 2. Apache Hive 2.1. Add Hive-related jars 2...
Kyuubi On Apache Kudu What is Apache Kudu Why Kyuubi on Kudu Kudu Integration with Apache Spark Kudu Integration with Kyuubi Install Kudu Spark Dependency Start Kyuubi Start B...
Simple Data Types Null Agtype NULL vs Postgres NULL Integer Float Numeric Bool String Composite Data Types List Lists in general NULL in a List Access Individual Elements...
Deploying Hudi Streamer Spark Datasource Writer Jobs Upgrading Downgrading Migrating This section provides all the help you need to deploy and operate Hudi tables at scale. ...
Pre-Splitting New Tables Multiple Ingest Clients Bulk Ingest Logical Time for Bulk Ingest MapReduce Ingest Accumulo is often used as part of a larger data processing and stor...
Upgrading from 1.10 or 2.0 to 2.1 Create ZooKeeper snapshot (optional - but recommended) Rename master Properties, Config Files, and Script References Pre-Upgrade the property st...
The Basics of AQE Dynamically Switch Join Strategies Dynamically Coalesce Shuffle Partitions Other Tips for Best Practices How to set spark.sql.adaptive.advisoryPartitionSizeInByt...
Indexing Multi-modal Indexing Index Types in Hudi Global and Non-Global Indexes Configs Spark based configs Flink based configs Indexing Strategies Workload 1: Late arriving...