Getting Started The latest version of Iceberg is 1.8.1 . Spark is currently the most feature-rich compute engine for Iceberg operations. We recommend you to get started with Spar...
Disclaimer Supported Storage System Verified Combination of Spark and storage system HDInsight Spark2.4 on Azure Data Lake Storage Gen 2 Databricks Spark2.4 on Azure Data Lake S...
Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...
Yihui typed out most of the words in this book, which is the only justification for him being the “first” author. Christophe has made substantial contribution to this book by help...
If you are not familiar with Markdown yet, or do not prefer writing Markdown code, RStudio v1.4 has included an experimental visual editor for Markdown documents, which feels simi...
Hive Connector Integration Dependencies Configurations Hive Connector Operations The Kyuubi Hive Connector is a datasource for both reading and writing Hive table, It is imple...
Evolution Iceberg supports in-place table evolution . You can evolve a table schema just like SQL — even in nested structures — or change partition layout when data volume chang...
Mixed-Hive format is a format that has better compatibility with Hive than Mixed-Iceberg format. Mixed-Hive format uses a Hive table as the BaseStore and an Iceberg table as the C...