Why does this Postgres Select Query work in the local Docker Container but not on Amazon RDS Postgres?

1.4k views Asked by At

My team has a local development environment that contains a Ruby on Rails app that connects to a Postgres database to retrieve some information about airports and use that to make further queries. The local environment is meant to duplicate our production environment (both running in the same docker containers).

The local Postgres container is using the official Postgres:9.6-alpine container found here: https://hub.docker.com/_/postgres/

The query in question is:

SELECT airports.iata FROM "routes" INNER JOIN "regions" ON "regions"."id" = "routes"."origin_id" INNER JOIN "airports_regions" ON "airports_regions"."region_id" = "regions"."id" INNER JOIN "airports" ON "airports"."id" = "airports_regions"."airport_id";

Local Query Execution

  1. The above command executes properly when called by our rails app locally.
  2. I can also run the command using psql after "docker exec -it'ing" into the Postgres container locally.

Amazon RDS Postgres Query Execution

  1. In production our Rails app attempts to run the query in question but returns an empty data set: {}

It should be noted that it successfully connects to the DB and "successfully" makes the query after properly authenticating, so really the issue here is that the query doesn't properly return / match the data.

  1. When I manually connect to our Amazon RDS Postgres database using psql, I can properly authenticate, and I can execute the query but I again receive the null / {} result. All other queries appear to work.

In both of the above cases the query executes properly. When executing manually it specifically returns the following:

iata 
------ 
(0 rows)

Other Interesting facts

  1. Point local rails container at Production RDS Postgres — Result: Postgres returns {} even though the Postgres containers COULD execute that same query and return expected list of data when pointed at the local DB. Returned result in this case reproduces the production behavior: {}
  2. SSH into Amazon EC2 instance connected to RDS DB in the same VPC / Security group and run the above command to make sure it was not a permissions problem — Result: Same as remote query: {}
  3. Counting the table on both instances (both return the same result ~5880): $ SELECT COUNT(*) FROM airports;
  4. Listing tables returns identical table list on both Amazon RDS and local Postgres

Moving forward

At this point I am operating under the assumption that there must be differences between how Amazon RDS for Postgres handles data vs the local Postgres 9.6 db but I haven't been able to find any indication of this on le google. Hopefully some has run into something similar.

Postgres Versions

SELECT version(); returns the following:

RDS Postgres Version

PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit (1 row)

Local Postgres Version

PostgreSQL 9.6.3 on x86_64-pc-linux-musl, compiled by gcc (Alpine 6.2.1) 6.2.1 20160822, 64-bit (1 row)

StackOverflow Suggestions w/ Results

From @Aleroot Modify Query to:

SELECT a.iata FROM routes ru JOIN regions re ON re.id = ru.origin_id JOIN airports_regions ar ON ar.region_id = re.id JOIN airports a ON a.id = ar.airport_id;

Result: (Same)

 iata 
------
(0 rows)
2

There are 2 answers

0
Vao Tsun On BEST ANSWER

as we spoke in comments, you can check if the result was not influenced by other tables. Changing INNER JOIN to OUTER JOIN will append rows with nulls for missing keys, thus amount of rows of airports table will full. Either from the result, or just by checking counts on regions, airports_regions, routes tables should reveal the difference between RDS and local databases.

0
Necevil On

Since @VaoTsun's comment originally got me on the correct path I accepted his answer (also found here).

The real question, which I should have been asking, actually ended up being "How can I make sure that a DB dump that I imported into Amazon RDS Postgres is intact?"

Much of the confusion here resulted from the fact that Amazon RDS did not throw any errors when I imported the dump, and that the web facing portion of my Rails app properly showed almost all data.

This was because MOST tables were imported properly with all data and ALL tables were created. Therefore listing the tables / schema resulted in everything looking correct and all queries against all but one table resulted in correct answers.

I finally went back through and selected all on each table in the DB: SELECT * FROM each_table_name;

One specific table returned no results on the remote RDS Database while it returned the expected results locally. Once this happened I dropped the database out of Amazon RDS and re-imported everything. No errors during import (again just like last time) but all tables exist with data this time around after selecting all in each table one by one to verify the data imported properly.