Postgres default sort by id - worldship

I need to set up WorldShip to pull from one of our Postgres databases, with the packages sorted by id. As far as I am aware, I have no way of making WorldShip send an ORDER BY clause, so I need the records to come back ordered by id by default.

On a second note, I have no idea how Postgres sorts by default. It looks like it goes by the last time the record was changed: if I write two records with ids 1 and 2 and then change record 2, the query returns them with record 2 first.

There are 5 answers

1
Denis de Bernardy On BEST ANSWER

Per the SQL spec, rows are returned in an unspecified order unless you add an ORDER BY clause. In Postgres, that means you'll get rows in, basically, the order in which live rows are read from disk.

If you want a consistent order without needing to add an order by clause, create a view as suggested in Jack's comment.
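A minimal sketch of the view approach, assuming the table is called packages and has an id column (both names are guesses, as is the view name packages_by_id). WorldShip would then be pointed at the view instead of the table:

```sql
-- Hypothetical names: "packages" and "id" stand in for the real table and column.
CREATE VIEW packages_by_id AS
SELECT *
FROM packages
ORDER BY id;

-- What the client would effectively run, with no ORDER BY of its own:
SELECT * FROM packages_by_id;
```

Postgres stores the ORDER BY as part of the view definition, so a plain SELECT from the view is planned with the sort included. If the client wraps the view in a join or a subquery, the ordering of the final result is no longer something to rely on.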

2
Daniel Harcek On

You could possibly use a sorted index, which should give you the rows in index order when the query plan hits the index, or if you force it to, but this approach is more than circuitous :) and the planner is free to pick another plan. An ORDER BY clause is the way to go, as mentioned already.

5
Webucator On

For what it's worth, which probably isn't much, from my testing, it appears that PostgreSQL's "default" ordering is based on the time the records were last updated. The most recently updated records will appear last. Note that I couldn't find any documentation to support this. It's just what I've found from my own testing.
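The likely explanation is that an UPDATE in Postgres writes a new row version rather than changing the old one in place, so an updated row can end up at a different physical position. Here is a small sketch to try on a scratch database; the table sort_demo is made up for the demonstration:

```sql
-- Hypothetical scratch table, not part of the question's schema.
CREATE TABLE sort_demo (id integer PRIMARY KEY, note text);
INSERT INTO sort_demo VALUES (1, 'first'), (2, 'second');

-- ctid is the physical location (block, offset) of each row version.
SELECT ctid, id, note FROM sort_demo;

-- An UPDATE writes a new row version instead of overwriting the old one.
UPDATE sort_demo SET note = 'first, edited' WHERE id = 1;

-- Row 1 now has a different ctid; on a tiny table like this it will
-- typically be listed after row 2, but nothing guarantees that.
SELECT ctid, id, note FROM sort_demo;

DROP TABLE sort_demo;
```

The order seen here is a side effect of where the row versions happen to sit on disk. Vacuuming, free-space reuse, or a different query plan can change it at any time.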

0
AudioBubble On

There is no such thing as a "default sort". Rows in a table are not sorted.

You could fake this with a view (as suggested by Jack Maney); apart from that, there is no way you can influence the order of the rows that are returned.

But if you do that, be aware that adding an additional ORDER BY to a SELECT based on that view will sort the data twice.

Another option might be to run the CLUSTER command on that table to physically order the rows on disk according to the column you want. But this still does not guarantee that the rows are returned in that order, not even with a plain SELECT * FROM your_table (although chances are reasonably high in that case). You will need to re-run the command on a regular basis, because the order created by CLUSTER is not maintained automatically.
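A rough sketch of the CLUSTER approach, assuming a table named packages with an id column; the index name packages_id_idx is invented here, and an existing primary key index would serve just as well:

```sql
-- Hypothetical names: packages, id, packages_id_idx.
CREATE INDEX packages_id_idx ON packages (id);

-- Rewrite the table in index order. This takes an ACCESS EXCLUSIVE lock,
-- so reads and writes are blocked while it runs.
CLUSTER packages USING packages_id_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE packages;

-- Later runs can omit the index; Postgres remembers the one used last time.
CLUSTER packages;
```

Rows inserted or updated after the CLUSTER go wherever there is free space, which is why the command has to be repeated.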

0
Theodore R. Smith On

I have several tables with several million rows each, and the ORDER BY package in the vast majority of queries was dramatically slowing things down.

Here is how I did this with the Eloquent ORM:

  1. Create two migrations with a far-future date. I chose 9999_99_99_999999_resort_packages_table.php and 9999_99_99_999999_reset_migrations.php so that they will always run last.
  2. Inside both migrations, make sure to end the script with a DELETE FROM migrations WHERE id IN ( SELECT id FROM migrations ORDER BY id DESC LIMIT 1 ). Each migration thereby deletes its own record, so both are rerun every time.
  3. Inside the resort migration, use the following SQL:
-- Create a temporary sorted table.
CREATE TEMPORARY TABLE packages_tmp AS SELECT * FROM packages ORDER BY package;

-- Drop all of the foreign keys
ALTER TABLE "package_stats"          DROP CONSTRAINT "package_stats_package_id_foreign";
ALTER TABLE "packagist_stats"        DROP CONSTRAINT "packagist_stats_package_id_foreign";
ALTER TABLE "code_quality"           DROP CONSTRAINT "raw_code_quality_package_id_foreign";
ALTER TABLE "code_stats"             DROP CONSTRAINT "raw_code_stats_package_id_foreign";

-- Resort the original table.
TRUNCATE packages;
INSERT INTO packages SELECT * FROM packages_tmp;

-- Re-add all of the foreign keys.
ALTER TABLE "package_stats"          ADD CONSTRAINT "package_stats_package_id_foreign"          FOREIGN KEY (package) REFERENCES packages(package) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE "packagist_stats"        ADD CONSTRAINT "packagist_stats_package_id_foreign"        FOREIGN KEY (package) REFERENCES packages(package) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE "code_quality"           ADD CONSTRAINT "raw_code_quality_package_id_foreign"       FOREIGN KEY (package) REFERENCES packages(package) ON UPDATE CASCADE ON DELETE CASCADE;
ALTER TABLE "code_stats"             ADD CONSTRAINT "raw_code_stats_package_id_foreign"         FOREIGN KEY (package) REFERENCES packages(package) ON UPDATE CASCADE ON DELETE CASCADE;

-- Reset the reset migration.
DELETE FROM migrations WHERE id IN (SELECT id FROM migrations ORDER BY id desc LIMIT 1);

This way, you can run the migration from a cron job, manually, or whenever you like. Your huge table will then come back sorted from a plain SELECT, except for new rows, although Postgres does not formally guarantee this. I run the cron job once a day and it cuts overall DB load by about 15%, since the ORDER BY is only needed for data younger than 24 hours.