Over the years, LinkedIn’s data infrastructure team built custom solutions for ingesting diverse data entities into our Hadoop ecosystem. At one point, we were running 15 t...
Source schema Converters Converters available in Gobblin Schema specification Supported data types by different converters Primitive types Complex types Array Map Record En...
Introduction The Google Search Console data ingestion project downloads query and analytics data f...
Custom Catalog It’s possible to read an Iceberg table either from an HDFS path or from a Hive table. It’s also possible to use a custom metastore in place of Hive. The steps to do...
Advantages of Migrating to Gobblin Kafka Ingestion Related Job Config Properties Config properties for pulling Kafka topics Config properties for compaction Deployment and Chec...
Features and Limitations Features Apache XTable™ (Incubating) provides users with the ability to translate metadata from one table format to another. Apache XTable™ (Incubatin...
Id generator Redis or Valkey based Java Id generator RIdGenerator generates unique numbers, but they are not monotonically increasing. On the first request, a batch of id numbers is allocated and...
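The batch allocation idea in the Id generator entry can be sketched with a small in-memory model. This is not the RIdGenerator API: `SharedCounter` and `BatchIdGenerator` are hypothetical names, the shared counter stands in for Redis or Valkey, and the assumption that each client hands out ids from its reserved batch until the batch is exhausted is ours, since the excerpt is cut off. It only illustrates why ids handed out this way are unique but not monotonically increasing across clients.

```python
import threading


class SharedCounter:
    """Hypothetical in-memory stand-in for the shared Redis/Valkey counter."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()

    def reserve(self, size):
        """Atomically reserve `size` consecutive ids; return the first one."""
        with self._lock:
            first = self._next
            self._next += size
        return first


class BatchIdGenerator:
    """Hypothetical client that serves ids from a locally held batch."""

    def __init__(self, counter, allocation_size):
        self._counter = counter
        self._size = allocation_size
        self._current = 0
        self._limit = 0  # exclusive upper bound of the local batch

    def next_id(self):
        # Assumption: a new batch is reserved only when the local one is used up.
        if self._current >= self._limit:
            first = self._counter.reserve(self._size)
            self._current = first
            self._limit = first + self._size
        value = self._current
        self._current += 1
        return value


counter = SharedCounter(start=1)
client_a = BatchIdGenerator(counter, allocation_size=3)
client_b = BatchIdGenerator(counter, allocation_size=3)

# Interleave requests from two clients that share one counter.
ids = [client_a.next_id(), client_b.next_id(),
       client_a.next_id(), client_b.next_id()]
print(ids)  # [1, 4, 2, 5]
```

Each client draws from its own reserved range, so no id is issued twice, but the combined sequence seen across clients jumps back and forth between ranges.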
Flink Connector Apache Flink supports creating an Iceberg table directly in Flink SQL, without creating an explicit Flink catalog. That means we can just create an Iceberg table by s...
Evolution Iceberg supports in-place table evolution. You can evolve a table schema just like SQL, even in nested structures, or change partition layout when data volume chang...