Introduction The Kafka writer allows users to create pipelines that ingest data from Gobblin sources into ...
Over the years, LinkedIn’s data infrastructure team built custom solutions for ingesting diverse data entities into our Hadoop eco-system. At one point, we were running 15 t...
Introduction The Google Search Console data ingestion project is to download query and analytics data f...
Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...
RisingWave RisingWave is a Postgres-compatible SQL database designed for real-time event streaming data processing, analysis, and management. It can ingest millions of events per...
Advantages of Migrating to Gobblin Kafka Ingestion Related Job Config Properties Config properties for pulling Kafka topics Config properties for compaction Deployment and Chec...
Custom Catalog It’s possible to read an Iceberg table either from an HDFS path or from a Hive table. It’s also possible to use a custom metastore in place of Hive. The steps to do...
Lock Redis- or Valkey-based distributed reentrant Lock object for Java that implements the Lock interface. Uses a pub/sub channel to notify other threads across all Redisson instances w...