Querying from Apache Spark To read an Apache XTable™ (Incubating) synced target table (regardless of the table format) in Apache Spark locally or on services like Amazon EMR, Goog...
Over the years, LinkedIn’s data infrastructure team built custom solutions for ingesting diverse data entities into our Hadoop ecosystem. At one point, we were running 15 t...
Querying from Google BigQuery Iceberg tables To read an Apache XTable™ (Incubating) synced Iceberg table from BigQuery, you have two options: Using Iceberg JSON metadata file ...
Description Key features Options project_id [string] collection [string] credentials [string] common options Example Changelog next version Google Firestore sink connec...
Contributing to Gobblin You can contribute to Gobblin in multiple ways. For resources and guides, please...
Collect Trace with Jaeger Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing ...
For Hudi storage on GCS, regional buckets provide a DFS API with strong consistency. GCS Configs There are two configurations required ...
Avro files File copy Query based Rest Api Google Analytics Google Drive Google Webmaster Hadoop Text Input Hello World Hive Avro-to-ORC Hive compliance purging JSON Kaf...