I'm a last year college student and I'm doing my thesis right now. My title is "Index Suggestion based on Log Analysis". This project will analyze the PostgreSQL transaction log to give index recommendation to the database that will be tested.
This research will develop an index recommender tool by analyzing the attribute that is frequently accessed (using SELECT statement).
But, I found it's hard to find the PostgreSQL log file. My question is, where can I find PostgreSQL log transaction dataset? Or maybe other database log transaction dataset?
You are mixing up the transaction log (WAL) and the regular text log file.
The latter does contain statements (if the configuration is set like that), while the transaction log doesn't contain statements at all, just binary information about what has changed in which block.
You won't be able to recommend an index just from looking at the query, I can't do that either.
I have a suggestion for you: if you want to write a tool that suggests indexes, it should take the output of
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT /* your query */as input.Moreover, the tool will have to be connected to the database to query table and index metadata (and perhaps statistics). That makes you dependent on the database version, because metadata can change (and do – see partitioned tables), but that won't concern you so much in a thesis paper.
The task is still not simple (query optimization is AI), but then you have at least a chance.