Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...
Components like Druid, Hive, Ranger, Oozie, and Superset require an operational database. During installation, you have the option to use an existing database or have Ambari insta...
Custom Catalog It’s possible to read an Iceberg table either from an HDFS path or from a Hive table. It’s also possible to use a custom metastore in place of Hive. The steps to do...
Session Conf Advisor New in version 1.5.0. Kyuubi supports injecting session configs with custom config ad...
RisingWave RisingWave is a Postgres-compatible SQL database designed for real-time event streaming data processing, analysis, and management. It can ingest millions of events per...
Flink Connector Apache Flink supports creating an Iceberg table directly in Flink SQL, without creating an explicit Flink catalog. That means we can just create an Iceberg table by s...
Features and Limitations Features Apache XTable™ (Incubating) provides users with the ability to translate metadata from one table format to another. Apache XTable™ (Incubatin...
How Hive Registration Works in Gobblin HiveSpec HiveRegistrationPolicy HiveSerDeManager Predicate and Activity How to Use Hive Registration in Your Gobblin Job Hive Regist...
Querying from Apache Spark To read an Apache XTable™ (Incubating) synced target table (regardless of the table format) in Apache Spark locally or on services like Amazon EMR, Goog...