Introduction Pre-requisites Steps Configuration Details What Next? Introduction The Kafka writer allows users to create pipelines that ingest data from Gobblin sources into ...
Custom Catalog It’s possible to read an iceberg table either from an hdfs path or from a hive table. It’s also possible to use a custom metastore in place of hive. The steps to do...
Flink Connector Apache Flink supports creating Iceberg table directly without creating the explicit Flink catalog in Flink SQL. That means we can just create an iceberg table by s...
Spark Queries To use Iceberg in Spark, first configure Spark catalogs . Iceberg uses Apache Spark’s DataSourceV2 API for data source and catalog implementations. Querying with S...
Gobblin Execution Modes Overview One important feature of Gobblin is that it can be run on different platforms. Currently, Gobblin can run in standalone mode (which runs on a sing...
DDL commands CREATE Catalog Hive catalog This creates an Iceberg catalog named hive_catalog that can be configured using 'catalog-type'='hive' , which loads tables from Hive m...
Introduction Docker Docker Repositories Run the docker image with simple wikipedia jobs Use Gobblin Standalone on Docker for Kafka and HDFS Ingestion Run Gobblin as a Service ...