Configuring Merge Join in PostgreSQL

724 views Asked by At

I'm using PostgreSQL with big tables, and query takes too much time.

I have two tables. The first one has about 6 million rows (data table), and the second one has about 30000 rows (users table).

Each user has about 200 rows in data table. Later, data and users tables may increase up to 30 times.

My query is:

SELECT d.name, count(*) c
  FROM users AS u JOIN data AS d on d.id = u.id  
  WHERE u.language = 'eng' GROUP BY d.name ORDER BY c DESC LIMIT 10;

90% of users has eng language, and query time is 7 seconds. Each column is indexed!

I read about Merge Join and it should be really fast, so I sorted tables by id and forced Merge Join, but time increased up to 20 seconds.

I suppose, the tables configuration is wrong, but I don't know how to fix it.

Should I make other improvements?

1

There are 1 answers

1
Gordon Linoff On

For this query:

SELECT d.name, count(*) c
FROM users u JOIN
     data d
     on d.id = u.id  
WHERE u.language = 'eng'
GROUP BY d.name
ORDER BY c DESC
LIMIT 10;

First, try indexes: users(language, id), data(id, name). See if this speeds up the query.

Second, what is d.name? Can a user have more than one of them? Is there a table of valid values? Depending on the answers to these questions, there may be other ways to structure the query.