OracleDB Fetch from Select when handling output CLOB Type cause Python crash

104 views Asked by At

I have been using SQLAlchemy with OracleDB as driver to build my applications.

I observed one of the process being killed unexpectedly while being inside a try except Exception block.

And after trial and error, I found the reason to be the handler being used to read data from CLOB and BLOB to string.

def output_type_handler(cursor, metadata):
    if metadata.type_code is oracledb.DB_TYPE_CLOB:
        return cursor.var(oracledb.DB_TYPE_LONG, arraysize=cursor.arraysize)
    if metadata.type_code is oracledb.DB_TYPE_BLOB:
        return cursor.var(oracledb.DB_TYPE_LONG_RAW, arraysize=cursor.arraysize)
    if metadata.type_code is oracledb.DB_TYPE_NCLOB:
        return cursor.var(oracledb.DB_TYPE_LONG_NVARCHAR, arraysize=cursor.arraysize)

I tried using OracleDB alone without SqlAlchemy and successfully fetched the data and read the CLOB with read().

However, after I enabled the outputtypehandler, even fetchone() cause Jupyter to crash with ExitCode: 3221225477.

I have tried other settings like oracledb.defaults.fetch_lobs = False and it also caused the same problem.

Are there any configs I can use with SQLAlchemy to fix this?

Edit: I found the problem happen when I access a table with multiple CLOB columns across DBLink. It works fine without DBLink.

  • DB Version: 19.6c
  • Client Version: 21c
  • Python: 3.9.5
  • oracledb: 1.4.2
  • SQLAlchemy: 2.0.23
  • Connection Type: Thick Mode

Table Schema:

CREATE TABLE AUTOMAIL_JOB_LIST (
ID NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY START WITH 1 INCREMENT BY 1,
ID_DEFINITION NUMBER,
SERVICE_GROUP VARCHAR2 (100 Char),
SUBJECT VARCHAR2 (300 Char),
EMAIL_CONTENT CLOB,
ATTACHMENT CLOB,
SEND_TO CLOB,
SEND_CC CLOB,
SEND_BCC CLOB,
STATUS NUMBER(2) DEFAULT 0,
CREATE_DATE DATE DEFAULT SYSDATE,
SEND_DATE DATE DEFAULT NULL,
OWNER_EMAIL VARCHAR2(200 CHAR))

The table is storing emails to send. The columns attachment, send_to, send_cc, and send_bcc usually are just a few email/filepath. The problem came from EMAIL_CONTENT column that store the entire html of the email, which could contain tables.

Fetching them all but EMAIL_CONTENT and then EMAIL_CONTENT alone is fine, but it will cause problem when SELECT *

Sample Code:

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine("", connect_args = {
                                "encoding": "UTF-16",
                                "nencoding": "UTF-16"
                            },
                        thick_mode = True,
                        pool_pre_ping = True,
                        echo_pool = True,
                        pool_size = 1)

query = """
            SELECT *
            FROM AUTOMAIL_JOB_LIST@DBLINK
        """

df = pd.read_sql(query, engine)

OracleDB-Python Version:

import oracledb
import pandas as pd
oracledb.init_oracle_client()

def output_type_handler(cursor, metadata):
    if metadata.type_code is oracledb.DB_TYPE_CLOB:
        return cursor.var(oracledb.DB_TYPE_LONG, arraysize=cursor.arraysize)
    if metadata.type_code is oracledb.DB_TYPE_BLOB:
        return cursor.var(oracledb.DB_TYPE_LONG_RAW, arraysize=cursor.arraysize)
    if metadata.type_code is oracledb.DB_TYPE_NCLOB:
        return cursor.var(oracledb.DB_TYPE_LONG_NVARCHAR, arraysize=cursor.arraysize)

query = """
            SELECT *
            FROM AUTOMAIL_JOB_LIST@DBLINK
        """

oracle_connection_string = oracledb.makedsn(oracle_host, oracle_port, service_name = oracle_service_name)
oracle_connection = oracledb.connect(user = oracle_username, password = oracle_password, dsn = oracle_connection_string, encoding = "UTF-16", nencoding = "UTF-16")
oracle_connection.outputtypehandler = output_type_handler

df = pd.read_sql(query, oracle_connection)

# cursor = oracle_connection.cursor()
# cursor.execute(query)
# row = cursor.fetchone()

Jupyter Notebook to reproduce the problem https://github.com/TuanDang293/BugReproduce/blob/main/Code.ipynb

0

There are 0 answers