Partitioning What is partitioning? Partitioning is a way to make queries faster by grouping similar rows together when writing. For example, queries for log entries from a logs ...
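The idea in the excerpt above can be sketched in plain Python, independent of any particular table format. Rows are grouped by a partition key when written, so a query that filters on that key reads only the matching group. The helper names `write_partitioned` and `query_partition`, and the row fields, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def write_partitioned(rows, key):
    """Group rows by a partition key at write time (hypothetical helper)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[key(row)].append(row)
    return dict(partitions)

def query_partition(partitions, wanted_key, predicate=lambda row: True):
    """Scan only the partition matching wanted_key, skipping all others."""
    return [row for row in partitions.get(wanted_key, []) if predicate(row)]

# Illustrative log rows; the field names are assumptions.
rows = [
    {"ts": "2024-01-01T10:00:00", "level": "info", "message": "started"},
    {"ts": "2024-01-01T10:05:00", "level": "error", "message": "disk full"},
    {"ts": "2024-01-02T09:00:00", "level": "info", "message": "restarted"},
]

# Partition by the calendar day taken from the timestamp prefix.
partitions = write_partitioned(rows, key=lambda r: r["ts"][:10])

# Only the 2024-01-01 partition is scanned for error entries.
errors = query_partition(partitions, "2024-01-01", lambda r: r["level"] == "error")
```

The speed-up comes from the rows of other days never being touched, which is the effect a table format aims for when it lays out files by partition value.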
Source schema Converters Converters available in Gobblin Schema specification Supported data types by different converters Primitive types Complex types Array Map Record En...
Introduction Filter on aggregate function results Sort results before using collect on them Limit branching of a path search Introduction Using WITH, you can manipulate the ...
Introduction Record format Configuration General configuration values Authentication No credentials Using certificates Using bucket password Document level expiration 1 - Ex...
Spark Streaming You can write Hudi tables using Spark’s Structured Streaming. Scala // spark-shell // prepare to stream write to new table import org ....
Evolution Iceberg supports in-place table evolution. You can evolve a table schema just like SQL, even in nested structures, or change partition layout when data volume chang...
Lock A Redis or Valkey based distributed reentrant Lock object for Java that implements the Lock interface. It uses a pub/sub channel to notify other threads across all Redisson instances w...
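To illustrate what "reentrant" means in the excerpt above, here is a minimal single-process sketch using Python's standard `threading.RLock`. It is not Redisson's API and is not distributed. It only shows the property that a thread already holding the lock can acquire it again without blocking. The function names `outer` and `inner` are hypothetical.

```python
import threading

lock = threading.RLock()  # reentrant: the owning thread may re-acquire it
log = []

def inner():
    # Second acquisition by the same thread. A non-reentrant
    # threading.Lock would deadlock at this point.
    with lock:
        log.append("inner")

def outer():
    with lock:
        log.append("outer")
        inner()

outer()
```

Each acquisition must be matched by a release. The lock becomes available to other threads only after the outermost `with` block exits.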
Based on your Internet access, choose one of the following options: No Internet Access This option involves downloading the repository tarball, moving the tarball to the sele...
The Ranger database user in Amazon RDS PostgreSQL Server should be created before installing Ranger and granted an existing role that has the CREATEDB attribute. Usi...
Tomcat Session Redisson implements a Redis or Valkey based Tomcat Session Manager. It stores Apache Tomcat sessions in Redis or Valkey and allows requests to be distributed across a ...