Advantages of Migrating to Gobblin Kafka Ingestion
Related Job Config Properties
Config properties for pulling Kafka topics
Config properties for compaction
Deployment and Chec...
Custom Catalog
It’s possible to read an Iceberg table either from an HDFS path or from a Hive table. It’s also possible to use a custom metastore in place of Hive. The steps to do...
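A minimal sketch of the two built-in options using the Iceberg Java API: loading a table directly from an HDFS location via HadoopTables, and loading it through the Hive Metastore via HiveCatalog. The paths, metastore URI, and table names are placeholders, and the exact HiveCatalog initialization shown here is version-dependent; a custom metastore would be wired in as its own Catalog implementation (commonly by extending BaseMetastoreCatalog), which is only noted in a comment.

```java
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.hadoop.HadoopTables;
import org.apache.iceberg.hive.HiveCatalog;

public class IcebergTableLoadingSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Option 1: load a table directly from its HDFS location (no metastore involved).
    Table fromPath = new HadoopTables(conf)
        .load("hdfs://namenode/warehouse/db/events");   // placeholder path

    // Option 2: load a table registered in the Hive Metastore.
    HiveCatalog hive = new HiveCatalog();
    hive.setConf(conf);
    hive.initialize("hive", Map.of(
        "uri", "thrift://metastore:9083",               // placeholder metastore URI
        "warehouse", "hdfs://namenode/warehouse"));     // placeholder warehouse dir
    Table fromHive = hive.loadTable(TableIdentifier.of("db", "events"));

    // A custom metastore would plug in as its own Catalog implementation
    // (typically by extending BaseMetastoreCatalog) and be used like HiveCatalog above.
    System.out.println(fromPath.location());
    System.out.println(fromHive.schema());
  }
}
```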
Over the years, LinkedIn’s data infrastructure team built custom solutions for ingesting diverse data entities into our Hadoop ecosystem. At one point, we were running 15 t...
Introduction
Docker
Docker Repositories
Run the docker image with simple wikipedia jobs
Use Gobblin Standalone on Docker for Kafka and HDFS Ingestion
Run Gobblin as a Service ...
Spark Queries
To use Iceberg in Spark, first configure Spark catalogs. Iceberg uses Apache Spark’s DataSourceV2 API for data source and catalog implementations. Querying with S...
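A minimal sketch of both query paths in Spark (SQL through a configured catalog, and the DataFrameReader with the "iceberg" DataSourceV2 format), assuming the iceberg-spark-runtime jar matching your Spark version is on the classpath; the catalog name "demo", the metastore URI, and the table/path names are placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class IcebergSparkQuerySketch {
  public static void main(String[] args) {
    // Register an Iceberg catalog named "demo" backed by the Hive Metastore;
    // catalog name, metastore URI, and table names are placeholders.
    SparkSession spark = SparkSession.builder()
        .appName("iceberg-query-sketch")
        .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.demo.type", "hive")
        .config("spark.sql.catalog.demo.uri", "thrift://metastore:9083")
        .getOrCreate();

    // SQL path: query the table through the configured catalog.
    Dataset<Row> viaSql = spark.sql("SELECT * FROM demo.db.events LIMIT 10");
    viaSql.show();

    // DataFrame path: the "iceberg" format can also load a table by its HDFS path.
    Dataset<Row> viaPath = spark.read().format("iceberg")
        .load("hdfs://namenode/warehouse/db/events");
    viaPath.show();

    spark.stop();
  }
}
```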