I have a database which contains main symptoms, additional symptoms and diseases. I need to replace 2 queries because I am sure my first query is not efficient and my 2nd is not correct at all. I hope anyone can help me because I am new to this area..
Database explanation:
The database is being used by a medical app:
- The user selects a specific bodypart
- The app lists all main symptoms of that specific bodypart
- The user selects a main symptom (common or less common)
The app lists all diseases of the selected main symptom. There also appear 2 checkboxes (additional symptoms) that can be checked by the user. The order of the listed diseases (d_weight) depends on age, gender, selected main symptom and what boxes the user has checked. A disease with
d_weight
<= 5 is considered to be a common disease. A disease withd_weight
> 5 is considered to be less-common. The possibilities of user input (age, gender, bodypart, main symptom) is stored in the symptom_disease_combi table.asa_id
is the id of all symptom that apply (addtional symptoms that are checked by the user)asc_id
is the id of all possibilities of additional symptoms that belong to the specific main symptom. For example,asc_id
= 0, in case there is no additional symptom selected.asc_id
= 1, in case only additional symptom 'Insomnia' is selected.asc_id
= 2, in case both 'Insomnia' and 'Blowing up' are selected.
1. Query to get all symptoms of a specific bodypart (can be improved):
SELECT DISTINCT s.name
, s.id
, sdc.s_common
FROM symptom as s
, symptom_disease_combi as sdc
WHERE sdc.age = ".$age."
AND sdc.gender = ".$gender."
AND sdc.bodypart = ".$bodypart."
AND sdc.s_id = s.id
2. Query to get all diseases and additional symptoms of selected symptom (doesn't work):
SELECT DISTINCT d.name
, d.id
, sdc.d_weight
, adls.id
, adls.name
FROM disease as d
, symptom_disease_combi as sdc
, symptom as s
, adlsymptom as adls
WHERE sdc.age = ".$age."
AND sdc.gender = ".$gender."
AND sdc.bodypart = ".$bodypart."
AND s.id = ".$sid."
AND sdc.s_id = s.id
3. Database structure (please let me know if my design can be improved)
CREATE TABLE symptom
(id INT NOT NULL AUTO INCREMENT
,name VARCHAR(100) DEFAULT NULL
,critical INT NOT NULL
,PRIMARY KEY (id)
) ENGINE=MyISAM;
id name critical
1 Behavioral disturbances 1
2 Ear pressure 0
3 Elevated temperature 0
4 Fainting 0
5 Head pain 0
CREATE TABLE disease (
id int(11) NOT NULL AUTO_INCREMENT,
name varchar(100) DEFAULT NULL,
critical int(11) NOT NULL,
description text NOT NULL,
tests text NOT NULL,
treatment text NOT NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM AUTO_INCREMENT=19 DEFAULT CHARSET=latin1
id name critical description tests treatment
1 Adjustment disorder 0 lorem ipsum lorem ipsum lorem ipsum
2 ADHD 0 lorem ipsum lorem ipsum lorem ipsum
3 Drug reaction 0 lorem ipsum lorem ipsum lorem ipsum
4 Seizure (epilepsy) 1 lorem ipsum lorem ipsum lorem ipsum
CREATE TABLE adlsymptom (
id int(11) NOT NULL AUTO_INCREMENT,
name varchar(100) NOT NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
id name
1 Insomnia
2 Blowing up
3 Depressed
4 Drug abuse
CREATE TABLE adlsymptom_apply (
id int(11) NOT NULL,
as_id int(11) NOT NULL,
PRIMARY KEY (id,as_id),
KEY FK_additional_symptom_that_apply (as_id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
id as_id
1 1
1 2
CREATE TABLE adlsymptom_combi (
id int(11) NOT NULL,
as_id int(11) NOT NULL,
PRIMARY KEY (id,as_id),
KEY FK_additional_symptom_combination (as_id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
id as_id
1 1
2 2
3 1
3 2
CREATE TABLE symptom_disease_combi (
id int(11) NOT NULL AUTO_INCREMENT,
age int(11) NOT NULL,
gender int(11) NOT NULL,
bodypart int(11) NOT NULL,
s_id int(11) NOT NULL,
s_common int(11) NOT NULL,
asc_id int(11) NOT NULL,
asa_id int(11) NOT NULL,
d_id int(11) NOT NULL,
d_weight int(11) NOT NULL,
PRIMARY KEY (id),
KEY FK_symptom (s_id),
KEY FK_additional_symptom_combination (asc_id),
KEY FK_additional_symptom_that_apply (asa_id),
KEY FK_disease (d_id)
) ENGINE=MyISAM AUTO_INCREMENT=65 DEFAULT CHARSET=latin1
id age gender bodypart s_id s_common asc_id asa_id d_id d_weight
1 1 1 1 1 1 0 1 1 1
2 1 1 1 1 1 0 1 2 2
3 1 1 1 1 1 0 1 3 3
4 1 1 1 1 1 0 1 11 4
UPDATE 1:
critical
in disease and symptom is created to tell the user that they need to go to the hospital immediately, when they click on the disease or symptomage
,gender
,bodypart
are id's, soage
= 1, means 0-5,age
= 2 means 6-17,age
= 3 means 18-59 andage
= 4 means 60+.- Please take a look at the design of the application, it will help a lot to understand the design of the database: https://i.stack.imgur.com/ley6T.png Btw, in the design; cause == disease...
asa_id
refers to id of adlsymptom_applyasc_id
referst to id of adlsymptom_combi- The 'distinct' was used to get all symptoms/diseases just 1 time. I am sure it's not needed, but I don't know how to fix it.
ditch 2 symptom tables, go with one (symptom) and 1 intersect table (sdc)
i would not add a new column to symptom, such as status/level trying to jack up a symptom in importance to main or secondary, despite the temptation, because this could easily make it not flexible.
for instance, fainting may seem main for once disease/condition but it could skew it in general
go for generality, therefore 1 table for symptoms, you correctly have a d_weight in sdc
i like your sdc.d_weight concept.
for instance, fainting might have some weight with epilepsy, but not for the flu. the whole concept is screwed up when someone with the flu is prescribed Zarontin and not Tamiflu
since i like your sdc.d_weight concept, i wonder why you went with the additional symptoms table
in table sdc, you have keys/indexes with names starting with "FK_". Hopefully you have actual FK constrainst
and not just naming conventions that make you think you have them (FK's, you don't have them)
for instance, real FK's for bodypart, symptom, and disease/condition.
as the user selects symptoms, remove them from the GUI's ability to add that symptom again for search
this cuts down work and simplifies querying by ditching the secondary table (line 1 of what i wrote and suggest)
note, again, that your use of KEY (which is a synonym for index, not foreign key constraint), just creates an index ...
.
i think you need to think long and hard about the age column is sdc going to be loaded with a new row, one for each age, 50 to 120, for gender='M', for prostrate cancer? i am basing this on your age=xxxxx line, maybe you mean >=. or for leukemia, <= ... i think your schema needs a bit of thought
so i am excluding age from the below query
-- for the next query, i wonder why you need a distinct? -- is there more than 1 sdc.s_common for given symptom row ? only you would know -- if not, ditch the distinct
-- the order of your join on clause matters, make it follow a good index so results come back fast
-- i am not suggesting these are covered indexes, but you will minimize table scan activity
-- not my comment above about age -- incorporate weights as needed in a computed column
those are my suggestions, and good luck