Querying with SQL
Querying Mixed-Format table by merge on read
Using Select statement to query on Mixed-Format tables.
SELECT * FROM arctic_catalog.db.sample
The Mixed-Format connector will merge the data from BaseStore and ChangeStore.
Query on change store
For a Mixed-Format table with primary keys. you can query on ChangeStore by .change.
SELECT * FROM arctic_catalog.db.sample.change+---+----+----+---------------+------------+--------------+| id|name|data|_transaction_id|_file_offset|_change_action|+---+----+----+---------------+------------+--------------+| 1|dddd|abcd| 3| 1| INSERT|| 1|dddd|abcd| 3| 2| DELETE|+---+----+----+---------------+------------+--------------+
The addition columns are:
- _transaction_id: The transaction ID allocated by AMS during data write is assigned per SQL execution in batch mode and per checkpoint in streaming mode.
- _file_offset:The order of data written with the same
_transaction_id. - _change_action:The type of change record,
INSERTorDELETE.
Querying with DataFrames
You can read the Mixed-Format table by Spark DataFrames:
val df = spark.read.table("arctic_catalog.db.sample")df.count
And visit the ChangeStore by .change.
val df = spark.read.table("arctic_catalog.db.sample.change")df.count
