Using the Python library sqlglot, where can I find documentation that explains:
- Which attributes I should expect to find on which expression node types (which arg types do Join, Table, Select, etc. have?)
- What overall structure I should expect the AST to have for various kinds of SQL statements? (e.g. that a Select has a "joins" child, which in turn has a list of tables) And what "arg" name do I use to access each of these?
For example, what documentation could I look at to know that code like the snippet below (from here) will find the names of the tables within the joins? How would I know to request "joins" from node.args? What does "this" refer to?
import sqlglot
from sqlglot import exp

node = sqlglot.parse_one(sql)  # sql holds the SQL text to parse
for join in node.args["joins"]:
    table = join.find(exp.Table).text("this")
My use case is to parse a bunch of Hive SQL scripts in order to find FROM, INSERT, ADD/DROP TABLE statements/clauses within the scripts, for analyzing which statements interact with which tables. So I am using sqlglot as a general-purpose SQL parser/AST, rather than as a SQL translator.
I have generated a copy of the pdoc documentation locally, but it only tells me which Python API methods are available on the Expression nodes. It does not seem to answer the questions above, unless I am looking in the wrong place.
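You can look in the expressions.py file:
https://github.com/tobymao/sqlglot/blob/main/sqlglot/expressions.py
Every expression type defines arg_types, a dictionary whose keys are the arg names that node accepts and whose values say whether each arg is required.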