Importing a massive table from Oracle to PostgreSQL with oracle_fdw returns ORA-01406


I am working on a project that transfers data from an Oracle database to a PostgreSQL database to build a data warehouse, using bash and SQL scripts. To access the Oracle database, I use the PostgreSQL extension oracle_fdw.

One of my scripts imports data from a massive table (~100,000,000 new rows/day). This table is partitioned, and each partition contains one day of data. The query I use to import data looks like this:

INSERT INTO postgre_target_table (some_fields)
SELECT some_aggregated_fields -- (~150 fields)
FROM oracle_source_table
WHERE partition_id = :v_partition_id AND some_others_filters
GROUP BY primary_key;
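For context, `:v_partition_id` above is a psql variable bound from the bash scripts. A minimal sketch of how that binding might look (the variable value and the `psql` invocation shown in the comment are hypothetical, not my actual scripts):

```sql
-- Bind the partition id as a psql variable, then run the import.
-- From bash this could be invoked as:
--   psql -d warehouse -v v_partition_id=20201231 -f import.sql
\set v_partition_id 20201231

INSERT INTO postgre_target_table (some_fields)
SELECT some_aggregated_fields
FROM oracle_source_table
WHERE partition_id = :v_partition_id
GROUP BY primary_key;
```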

On the DEV server, the query works fine (there is much less data there), but on PREPROD it returns the error ORA-01406: fetched column value was truncated.

In some posts, people say that the output fields may be too small, but I get the same error even with a simple SELECT query, without the INSERT or the GROUP BY.

Another idea I found in another post is to create an Oracle-side view, but my query uses multiple parameters that I cannot embed in a view.

The last idea I found is to create an Oracle stored procedure that fills a table with the aggregated data and then import from that table, but the Oracle database is critical and my customer prefers to avoid adding more data to it.

Now, I'm starting to think there's no solution and it's not good...

PostgreSQL version: 12.4 / Oracle version: 11.2

UPDATE

It seems my problem is more complicated than I thought.

After applying the modification suggested by Laurenz Albe, the query runs correctly in pgAdmin, but the problem still appears when I use the psql command. Moreover, another query seems to have the same problem. This other query does not use the same source table as the first one; it joins 4 tables without any partitioning. The common point between these queries is their structure.

The detail I omitted to mention in the original post is that the purpose of both queries is to pivot a table. They look like this:

SELECT osr.id,
   MIN(CASE osr.category
      WHEN 123 THEN
         1
      END) AS field1,
   MIN(CASE osr.category
      WHEN 264 THEN
         1
      END) AS field2,
   MIN(CASE osr.category
      WHEN 975 THEN
         1
      END) AS field3,
...
FROM oracle_source_table osr
WHERE osr.category IN (123, 264, 975, ...)
GROUP BY osr.id;

Now that I have detailed what the queries look like, here are some results I got with the second one without changing the value of max_long (this query is lighter than the first):

  1. Sometimes it works (~10%), sometimes it fails (~90%) in pgAdmin, but it never works with the psql command
  2. If I delete the WHERE clause, it always works

I don't understand why deleting the WHERE clause changes anything: the field used in this clause is a NUMBER(6, 0) between 0 and 2500, and it is still used in the SELECT clause... Also, none of the 4 Oracle tables used by this query has a LONG column; only the NUMBER datatype is used.

Among the 20 queries I have, only these two have a problem; their structure is similar, and I don't believe in coincidences.

1 answer

Answer by Laurenz Albe:

Don't despair!

Set the max_long option on the foreign table to a value big enough that all your oversized data fit.

The documentation has the details:

  • max_long (optional, defaults to "32767")

The maximal length of any LONG, LONG RAW and XMLTYPE columns in the Oracle table. Possible values are integers between 1 and 1073741823 (the maximal size of a bytea in PostgreSQL). This amount of memory will be allocated at least twice, so large values will consume a lot of memory.

If max_long is less than the length of the longest value retrieved, you will receive the error message

ORA-01406: fetched column value was truncated

Example:

ALTER FOREIGN TABLE my_tab OPTIONS (ADD max_long '1000000');
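After raising the option, one way to confirm it took effect is to look at the foreign table's options in the system catalogs (using the same hypothetical table name my_tab as in the example above):

```sql
-- Show the options currently set on the foreign table;
-- max_long should appear in the ftoptions array after the ALTER.
SELECT ft.ftoptions
FROM pg_foreign_table ft
JOIN pg_class c ON c.oid = ft.ftrelid
WHERE c.relname = 'my_tab';
```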