Combining columns into records for WHERE IN()

60 views Asked by At

First, I am sorry. I do not have control to change how the data is stored, so I have to work with it the way it is.

I have some tables in a Pervasive database that looks like this:

Records:
id |  first_name  |  last_name  | address   | phone ....
 1 |  John
 2 |  Bill
 3 |  Jared
 4 |  Dave


Users:
id | parent_id| first_name
 2 |        2 | Bill
 3 |        1 | Jared
 4 |        1 | Dave

Times:
user_id | Date
 4      | 2020-09-29

These are medical records, the User would be the patient of record, who may be 3 years old, while the Records would contain the billing info, but they may or may not be patients.

I need the data from the Records table based on the Times table, but I have to go through the Users table to get the association. The trouble is that what I want is to not just get the exact matching records, but assign them to family groups and return the Records for all members of that family

So in this case, I would be looking at the Times table, getting user_id 4, which matches to the Users table, gives me Dave who is in the family 1. I now need to find all the Users in that family, then getting all those records from the Records table, along with the record for John.

My previous plan was is to flatten the two columns id and parent_id and only return those that are within a certain time frame. This is the only way I have figured it out so far:

SELECT "parent_id" as ids from  "Users" as u
where u."parent_id" IN(
    SELECT  "parent_id" FROM "Users" where "id" IN(  
        SELECT "user_id"
        FROM Times as a
        WHERE a.Date >=  CURRENT_DATE() - 2
        AND a.Date <=  CURRENT_DATE() + 5
    )
)
UNION ALL

SELECT "id" as ids from  "Users" as u
where u."parent_id" IN(
    SELECT  "parent_id" FROM "Users" where "id" IN(  
        SELECT "user_id"
        FROM Times as a
        WHERE a.Date >=  CURRENT_DATE() - 2
        AND a.Date <=  CURRENT_DATE() + 5
    )
)

The trouble is, I need to use it as a filter for another query like this:

SELECT id, first_name, last_name 
FROM Records WHERE id 
IN(
   insert query here
)

Which would return:

Output:
id |  first_name  |  last_name  
 1 |  John        |  Smith
 3 |  Jared       |  Smith
 4 |  Dave        |  Smith

What can I do here? Efficiency is important, these tables have hundreds of thousands of records. Because of the UNION ALL, I am not able to use what I have in Pervasive.

1

There are 1 answers

2
mirtheil On BEST ANSWER

I created your tables and added some data based on your post and tried your query as posted and it worked. I don't have enough data to test performance but you should be able to use the Query Plan / Query Plan Viewer to determine if the statement is optimized. I used both PSQL 11 and Zen 14 (current version of Pervasive PSQL) and got the same (expected) results. The query I used is:

create table "Users" 
(id int,
parent_id int, 
first_name char(100)
);
create table times 
(user_id int, 
"Date" date);

create table records
(id int,
first_name char(100), 
last_name char(100), 
address char(250), 
phone char(13));

insert into records values (1, 'john', 'smith', 'addr1', 'phone');
insert into records values (2, 'Bill', 'smith', 'addr1', 'phone');
insert into records values (3, 'Jared', 'smith', 'addr1', 'phone');
insert into records values (4, 'Dave', 'smith', 'addr1', 'phone');

insert into "Users" values (2,2,'Bill');
insert into "Users" values (3,1,'Jared');
insert into "Users" values (4,1,'Dave');

insert into Times values (4,'2020-09-29');

SELECT id, first_name, last_name 
FROM Records WHERE id 
IN(
   SELECT "parent_id" as ids from  "Users" as u
where u."parent_id" IN(
    SELECT  "parent_id" FROM "Users" where "id" IN(  
        SELECT "user_id"
        FROM Times as a
        WHERE a.Date >=  CURRENT_DATE() - 2
        AND a.Date <=  CURRENT_DATE() + 5
    )
)
UNION ALL

SELECT "id" as ids from  "Users" as u
where u."parent_id" IN(
    SELECT  "parent_id" FROM "Users" where "id" IN(  
        SELECT "user_id"
        FROM Times as a
        WHERE a.Date >=  CURRENT_DATE() - 2
        AND a.Date <=  CURRENT_DATE() + 5
    )
)
)