Scriptella - How to get and re-use autogenerated ids?

909 views Asked by At

I'm using Scriptella for ETL operations, working with many tables referenced with autogenerated ids. I want to re-use these ids without using subqueries, this is my script fragment of the etl file:

<script connection-id="out" if="rownum>1">

SELECT nextval('SQC_CLASE') AS claseId;
INSERT INTO zoologia.clase VALUES( ?claseId, ?phylumId, ?clase, ?subclase, ?infraclase, true );

SELECT nextval('SQC_ORDEN') AS ordenId;
INSERT INTO zoologia.orden VALUES( ?ordenId, ?claseId, ?orden, ?suborden, true );

SELECT nextval('SQC_SUPERFAMILIA') AS superfamiliaId;
INSERT INTO zoologia.superfamilia VALUES( ?superfamiliaId, ?ordenId, ?superfamilia, true );

SELECT nextval('SQC_FAMILIA') AS familiaId;
INSERT INTO zoologia.familia VALUES( ?familiaId, ?superfamiliaId, ?familia, ?subfamilia, ?tribu, true );

SELECT nextval('SQC_GENERO') AS generoId;
INSERT INTO zoologia.genero VALUES( ?generoId, ?familiaId, ?genero, true );

SELECT nextval('SQC_ESPECIE') AS especieId;
INSERT INTO zoologia.especie VALUES( ?especieId, ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

</script>

This is obviously wrong because SELECT can't be performed inside script, right? I'm not really sure how to do it whitout subqueries. I'm using PostgreSQL.

EDIT: What I want to achieve is, for example, get the value of the autogenerated id used in the insertions for the first table, for using it in the insertions of the second table, because the records should be referenced

2

There are 2 answers

1
Nelson On BEST ANSWER

Thanks to a user from Scriptella forums, this is the solution, a single query with all the values of the sequences:

<query connection-id="external">
    <query connection-id="sizoo">
        SELECT nextval('SQC_PHYLUM') AS phylumId
        , nextval('SQC_CLASE') AS claseId
        , nextval('SQC_ORDEN') AS ordenId
        , nextval('SQC_SUPERFAMILIA') AS superfamiliaId
        , nextval('SQC_FAMILIA') AS familiaId
        , nextval('SQC_GENERO') AS generoId
        , nextval('SQC_ESPECIE') AS especieId;

        <script connection-id="sizoo" if="rownum>1">
            INSERT INTO zoologia.phylum VALUES( ?phylumId, ?phylum, true );
            INSERT INTO zoologia.clase VALUES( ?claseId, ?phylumId, ?clase, ?subclase, ?infraclase, true );
            INSERT INTO zoologia.orden VALUES( ?ordenId, ?claseId, ?orden, ?suborden, true );
            INSERT INTO zoologia.superfamilia VALUES( ?superfamiliaId, ?ordenId, ?superfamilia, true );
            INSERT INTO zoologia.familia VALUES( ?familiaId, ?superfamiliaId, ?familia, ?subfamilia, ?tribu, true );
            INSERT INTO zoologia.genero VALUES( ?generoId, ?familiaId, ?genero, true );
            INSERT INTO zoologia.especie VALUES( ?especieId, ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );
        </script>
    </query>
</query>
2
Craig Ringer On

Your code initally looked totally bogus, as nothing seems to connect the SELECT to the following INSERT so even if it ran you'd just be generating an ID and throwing it away. It looks like your scripting tool might be automatically defining the results of SELECT column-aliases as variables that can be referenced in later queries, though; see "copy to another database" in the Scriptella tutorial. A quick glance at that suggest that what you want to do might work, but you'd have to use nested <query/> and <script/> blocks to do it.

The correct way to use a sequence-generated ID is one of:

INSERT INTO zoologia.especie VALUES( nextval('SQC_ESPECIE'), ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

INSERT INTO zoologia.especie VALUES( DEFAULT, ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

INSERT INTO zoologia.especie(generoId, especie, subespecie, variedad, genero, someothercol) 
VALUES( ?generoId, ?especie, ?subespecie, ?variedad, ?genero, true );

Your question initially appeared to be about re-using deleted IDs (ie gapless sequences) but it looks like you've clarified it to remove that.


Update after edits + comment changed the meaning of the question:

If you're trying to use the generated ID from one row in subsequent INSERTs, you must either:

  • Capture the ID using INSERT ... RETURNING or by calling currval('the_id_sequence') after the INSERT, store the ID in a client-side variable in your scripting language, and pass that to subsequent INSERTs; or

  • Use currval('the_id_sequence') in the VALUES list of subsequent inserts.

I've never even heard of Scriptella let alone used it, so I can't help with the 1st option of using client side variables. I'd be very surprised if it didn't have a way to store the results of a SELECT or INSERT ... RETURNING for later use, though. A really quick glance suggests it's done with nested <query/> and <script/> blocks, but that's just a 30-second glance at the tutorial.

The 2nd option is simple. Say you've just inserted:

INSERT INTO zoologia.genero VALUES( DEFAULT, ?familiaId, ?genero, true );

and want to insert a new row in especie that has the generoId of the just-inserted genero. Presuming that the ID sequence for genero follows the standard naming PostgreSQL uses, tablename_columnname_id_seq, you'd use:

INSERT INTO zoologia.especie VALUES( DEFAULT, currval('genero_generoId_seq'), ...);

See: