How to rename tables and columns in SQL query using Python?

164 views Asked by At

I am trying to rename tables and columns in a SQL query string using Python. I have a mapping of old names to new names stored in a Python dictionary. The SQL query can be complex, with multiple joins, nested structures, and table/column aliases. The column names in the query might not be prefixed with a table name or alias, especially in the case of nested subqueries.

Here is an example SQL query:

SELECT p.name, p.age, j.title     
FROM person AS p     
LEFT JOIN job AS j ON p.job_id = j.job_id     
WHERE p.age > 30  

And here is an example mapping:

mapping = {  
    "person": "individual",  
    "name": "full_name",  
    "age": "years_old",  
    "job": "occupation",  
    "title": "role",  
    "job_id": "identifier"  
}  

I want to transform the SQL query based on this mapping. I have tried using the sqlglot library in Python to parse the SQL query into an abstract syntax tree (AST), traverse the AST, and replace the names.

Here is the code I have tried so far:

from sqlglot import parse
from sqlglot.expressions import Identifier, Column, Table  
  
def transform(node, mapping):  
    if isinstance(node, (Identifier, Column, Table)) and node.args['this'] in mapping:  
        return type(node)(this=mapping[node.args['this']], **{arg: val for arg, val in node.args.items() if arg != 'this'})  
    return node 

expressions = parse(sql)  
  
new_expressions = [expression.transform(lambda node: transform(node, mapping)) for expression in expressions]  
  
new_sql = "".join(expression.sql() for expression in new_expressions)   
  
print(new_sql)  

This script works for simple cases. However, I encountered difficulties when

  1. the same column name is present in multiple tables, and each has a different mapping based on the table it belongs to. Example:
sql = '''SELECT p.name, p.age, j.title   
FROM person AS p   
LEFT JOIN job AS j ON p.job_id = j.job_id   
WHERE p.age > 30'''

job_id is available in both,

  • for person
    • job_id --> occupation_id
  • for job
    • job_id --> identifier
  1. in nested sub-queries, when the same alias is used for different table.
0

There are 0 answers