Since spanner does not have ddl feature like
insert into dest as (select * from source_table)
How do we select subset of a table and copy that rows into another table ?
I am trying to write data to temporary table and then move data to archive table at the end of day. But only solution i could find so far is, select rows from source table and write them to new table. Which is done using java api, and it does not have a ResultSet to Mutation converter, so i need to map every column of table to new table, even they are exactly same.
Another thing is updating just one column data, like there is no way of doing "update table_name set column= column-1 "
Again to do that, i need to read that row and map every field to update Mutation, but this is not useful if have many tables, i need to code for all of them, a ResultSet -> Mutation converted would be nice too.
Is there any generic Mutation cloner and/or any other way to copy data between tables?
As of version 0.15 this open source JDBC Driver supports bulk INSERT-statements that can be used to copy data from one table to another. The INSERT-syntax can also be used to perform bulk UPDATEs on data.
Bulk insert example:
Bulk update is done using an INSERT-statement with the addition of ON DUPLICATE KEY UPDATE. You have to include the value of the primary key in your insert statement in order to 'force' a key violation which in turn will ensure that the existing rows will be updated:
You can use the JDBC driver with for example SQuirreL to test it, or to do ad-hoc data manipulation.
Please note that the underlying limitations of Cloud Spanner still apply, meaning a maximum of 20,000 mutations in one transaction. The JDBC Driver can work around this limit by specifying the value
AllowExtendedMode=true
in your connection string or in the connection properties. When this mode is allowed, and you issue a bulk INSERT- or UPDATE-statement that will exceed the limits of one transaction, the driver will automatically open an extra connection and perform the bulk operation in batches on the new connection. This means that the bulk operation will NOT be performed atomically, and will be committed automatically after each successful batch, but at least it will be done automatically for you.Have a look here for some more examples: http://www.googlecloudspanner.com/2018/02/data-manipulation-language-with-google.html