Optimizing MySQL query with group_concat

47 views Asked by At

I have an old PHP5 MySQL webapp whose code had to be adapted to PHP8/MySQL14.14

There's a view, querying few tables and based on that code:

(SELECT `db_demandes`.`bdd`.`id`
        AS `Id`,
        `db_demandes`.`bdd`.`ref`
        AS `Ref`,
        `db_demandes`.`bdd`.`nodossier`
        AS `NoDossier`,
        `db_demandes`.`bdd`.`agentfirme`
        AS `Agentfirme`,
        `db_demandes`.`bdd`.`agentanterieur`
        AS `AgentAnterieur`,
        Concat(`db_demandes`.`bdd`.`nom`, '\r\n', `db_demandes`.`bdd`.`prenom`)
        AS
`Demandeur`,
Concat(`db_demandes`.`bdd`.`no_o`, ' ', `db_demandes`.`bdd`.`rue_o`, '\r\n', `db_demandes`.`bdd`.`cp_o`, ' ', `db_demandes`.`bdd`.`localite_o`)
        AS `Adresse`,
`db_demandes`.`bdd`.`protection`
        AS `Protection`,
`db_demandes`.`bdd`.`type_objet`
        AS `Type_Objet`,
`db_demandes`.`bdd`.`datation_o`
        AS `Datation_O`,
`db_demandes`.`bdd`.`datedemandesubsideavt`
        AS `DateDemandeSubsideAVT`,
REPLACE(`db_demandes`.`bdd`.`datef1`, '+', '\n')
        AS `DateF1`,
REPLACE(`db_demandes`.`bdd`.`datear1`, '+', '\n')
        AS `DateAR1`,
`db_demandes`.`bdd`.`datedemandeavise`
        AS `DateDemandeAvise`,
`db_demandes`.`bdd`.`dateavisfirme_e`
        AS `DateAvisfirme_E`,
`db_demandes`.`bdd`.`dateavisfirme_o`
        AS `DateAvisfirme_O`,
`db_demandes`.`bdd`.`dtdatedemandeauthtrav`
        AS `dtDateDemandeAuthTrav`,
`db_demandes`.`bdd`.`dtdatedecisionDirection`
        AS `dtDateDecisionDirection`,
Group_concat(DISTINCT `avantpromesse`.`dtdate` SEPARATOR ' ')
        AS `V1`,
`db_demandes`.`bdd`.`daterefusavt`
        AS `DateRefusAVT`,
`db_demandes`.`bdd`.`email`
        AS `Email`,
`db_demandes`.`bdd`.`travaux_a_subventionner`
        AS `Travaux_a_subventionner`,
`db_demandes`.`bdd`.`pc`
        AS `PC`,
`db_demandes`.`bdd`.`ff`
        AS `FF`,
(SELECT Group_concat(`db_demandes`.`tblpromesses`.`montantestimatifpromesse`
        SEPARATOR
        ' ')
 FROM   `db_demandes`.`tblpromesses`
 WHERE  ( `db_demandes`.`tblpromesses`.`fidossier` = `db_demandes`.`bdd`.`ref` )
 ORDER  BY `db_demandes`.`tblpromesses`.`idpromesse`)
        AS `MontantsEstimatifs`,
(SELECT Group_concat(`db_demandes`.`tblpromesses`.`dtdatepromesse` SEPARATOR ' '
        )
 FROM   `db_demandes`.`tblpromesses`
 WHERE  ( `db_demandes`.`tblpromesses`.`fidossier` = `db_demandes`.`bdd`.`ref` )
 ORDER  BY `db_demandes`.`tblpromesses`.`idpromesse`)
        AS `DatesPromesses`,
Group_concat(DISTINCT `db_demandes`.`tblvisites`.`dtdate` SEPARATOR ' ')
        AS
`Visites`,
`db_demandes`.`tblsuivi`.`dtavancementtravaux`
        AS
     `dtAvancementTravaux`,
Trim(Concat(`db_demandes`.`tblsuivi`.`dtobservations`, '\r\n', REPLACE(
          `db_demandes`.`tblsuivi`.`dtavancementtravaux`, '\r\n', 'X')))
        AS
`dtObservations`,
`db_demandes`.`tblsuivi`.`dtetat`
        AS `dtEtat`,
`db_demandes`.`bdd`.`paiementsprevisionnels2011`
        AS `PaiementsPrevisionnels2011`,
`db_demandes`.`bdd`.`paiementsprevisionnels2012`
        AS `PaiementsPrevisionnels2012`,
REPLACE(`db_demandes`.`bdd`.`datef2`, '+', '\n')
        AS `DateF2`,
REPLACE(`db_demandes`.`bdd`.`datear2`, '+', '\n')
        AS `DateAR2`,
`db_demandes`.`bdd`.`daterefusap_t`
        AS `DateRefusAP_T`,
`db_demandes`.`bdd`.`tt`
        AS `TT`,
(SELECT Group_concat(`db_demandes`.`tbldemandes`.`montantsubside` SEPARATOR ' ')
 FROM   `db_demandes`.`tbldemandes`
 WHERE  ( `db_demandes`.`tbldemandes`.`montantsubside` =
          `db_demandes`.`bdd`.`ref` )
 ORDER  BY `db_demandes`.`tbldemandes`.`montantsubside`)
        AS `Montantsdemandes`,
(SELECT Group_concat(`db_demandes`.`tbldemandes`.`date_am_subside` SEPARATOR ' '
        )
 FROM   `db_demandes`.`tbldemandes`
 WHERE  ( `db_demandes`.`tbldemandes`.`date_am_subside` =
          `db_demandes`.`bdd`.`ref` )
 ORDER  BY `db_demandes`.`tbldemandes`.`date_am_subside`)
        AS `Dates_AM_demandes`,
(SELECT Group_concat(`db_demandes`.`tbldemandes`.`date_op_subside` SEPARATOR ' '
        )
 FROM   `db_demandes`.`tbldemandes`
 WHERE  ( `db_demandes`.`tbldemandes`.`date_op_subside` =
          `db_demandes`.`bdd`.`ref` )
 ORDER  BY `db_demandes`.`tbldemandes`.`date_op_subside`)
        AS `Dates_OP_demandes`,
`db_demandes`.`bdd`.`dtflagdeleted`
        AS `dtFlagDeleted`,
`db_demandes`.`tbldevis`.`dtdate`
        AS `dtDateDevis`,
`db_demandes`.`bdd`.`patrimoine`
        AS `Patrimoine`,
`db_demandes`.`bdd`.`telmobile`
        AS `TelMobile`,
`db_demandes`.`bdd`.`telprive`
        AS `TelPrive`,
`db_demandes`.`bdd`.`notes`
        AS `Notes`,
`db_demandes`.`bdd`.`notesagent`
        AS `NotesAgent`
 FROM   (((((`db_demandes`.`bdd`
             LEFT JOIN `db_demandes`.`tblsuivi`
                    ON(( `db_demandes`.`tblsuivi`.`fidossier` =
                      `db_demandes`.`bdd`.`ref` )))
            LEFT JOIN `db_demandes`.`tbldevis`
                   ON(( ( `db_demandes`.`tbldevis`.`fidossier` =
                        `db_demandes`.`bdd`.`ref` )
                        AND ( `db_demandes`.`tbldevis`.`iddevis` = (SELECT Max(
                              `db_demandes`.`tbldevis`.`iddevis`)
                                                                    FROM
                              `db_demandes`.`tbldevis`
                                                                    WHERE  (
                                    `db_demandes`.`tbldevis`.`fidossier` =
                            `db_demandes`.`bdd`.`ref` )) ) )))
           LEFT JOIN `db_demandes`.`tblvisites`
                  ON(( `db_demandes`.`tblvisites`.`fidossier` =
                     `db_demandes`.`bdd`.`ref` )))
          LEFT JOIN `db_demandes`.`tblvisites` `avantpromesse`
                 ON(( `avantpromesse`.`fidossier` =
         `db_demandes`.`bdd`.`ref` )))
         LEFT JOIN `db_demandes`.`vvisitemin`
                ON(( `vvisitemin`.`fidossier` = `db_demandes`.`bdd`.`ref` )))
 WHERE ( ( NOT(( Lcase(`db_demandes`.`bdd`.`nodossier`) LIKE '%avis%' )) )
          AND ( NOT(( Lcase(`db_demandes`.`bdd`.`nodossier`) LIKE '%dble%' )) )
        )
 GROUP  BY `db_demandes`.`bdd`.`nodossier`,
           `db_demandes`.`tblsuivi`.`idsuivi`
 ORDER  BY `db_demandes`.`bdd`.`nodossier` DESC);

which is then queried by another SQL request. As a result, I have very poor performance, even with just 7000 records on the biggest table of the set (table "bdd").

As you can see, there are some concatenation and so on... making the stuff even more messy

I really need to improve performances as it's taking minutes to display.

Where would you start ? I'm not an expert in SQL... not yet!

I tried to get rid of some group_concat distinct to start but it's so messy... I guess there are some kind of legacy functions which may be optimized

2

There are 2 answers

3
slaakso On

You should provide MySQL EXPLAIN output for the query in addition to the table definitions with indexes.

The query is bit hard to read with all the backtics and extra unnecessary paretheses.

To start with the most obvious ones you can check that the related tables have indexes for the db_demandes.bdd.ref column that they are matched against.

The query itself will do a table scan against the db_demandes.bdd table as you have LIKE '%xxx%' in the WHERE-clause.

Additionally the GROUP BY looks bit suspicious, as you have db_demandes.bdd.id in the select list and yet you are grouping over other columns.

Here is a cleaned up version of the query:

SELECT    
  b.id                                                                   AS id,
  b.ref                                                                  AS ref,
  b.nodossier                                                            AS nodossier,
  b.agentfirme                                                           AS agentfirme,
  b.agentanterieur                                                       AS agentanterieur,
  Concat(b.nom, '\r\n', b.prenom)                                        AS demandeur,
  Concat(b.no_o, ' ', b.rue_o, '\r\n', b.cp_o, ' ', b.localite_o)  AS adresse,
  b.protection                                                           AS protection,
  b.type_objet                                                           AS type_objet,
  b.datation_o                                                           AS datation_o,
  b.datedemandesubsideavt                                                AS datedemandesubsideavt,
  Replace(b.datef1, '+', '\n')                                           AS datef1,
  Replace(b.datear1, '+', '\n')                                          AS datear1,
  b.datedemandeavise                                                     AS datedemandeavise,
  b.dateavisfirme_e                                                      AS dateavisfirme_e,
  b.dateavisfirme_o                                                      AS dateavisfirme_o,
  b.dtdatedemandeauthtrav                                                AS dtdatedemandeauthtrav,
  b.dtdatedecisiondirection                                              AS dtdatedecisiondirection,
  group_concat(DISTINCT avantpromesse.dtdate separator ' ')              AS v1,
  b.daterefusavt                                                         AS daterefusavt,
  b.email                                                                AS email,
  b.travaux_a_subventionner                                              AS travaux_a_subventionner,
  b.pc                                                                   AS pc,
  b.ff                                                                   AS ff,
  ( 
          SELECT   group_concat(tblpromesses.montantestimatifpromesse separator ' ') 
          FROM     tblpromesses 
          WHERE    tblpromesses.fidossier = b.ref
          ORDER BY tblpromesses.idpromesse) AS montantsestimatifs, 
  ( 
          SELECT   group_concat(tblpromesses.dtdatepromesse separator ' ' ) 
          FROM     tblpromesses 
          WHERE    tblpromesses.fidossier = b.ref
          ORDER BY tblpromesses.idpromesse)                              AS datespromesses,
  group_concat(DISTINCT tblvisites.dtdate separator ' ')                 AS visites,
  ts.dtavancementtravaux                                                 AS dtavancementtravaux,
  trim(concat(ts.dtobservations, '\r\n', 
     replace( ts.dtavancementtravaux, '\r\n', 'X'))
  )                                                                      AS dtobservations,
  ts.dtetat                                                              AS dtetat,
  b.paiementsprevisionnels2011                                           AS paiementsprevisionnels2011,
  b.paiementsprevisionnels2012                                           AS paiementsprevisionnels2012,
  replace(b.datef2, '+', '\n')                                           AS datef2,
  replace(b.datear2, '+', '\n')                                          AS datear2,
  b.daterefusap_t                                                        AS daterefusap_t,
  b.tt                                                                   AS tt,
  ( 
          SELECT group_concat(tbldemandes.montantsubside separator ' ') 
          FROM tbldemandes 
          WHERE tbldemandes.montantsubside = b.ref
          ORDER BY tbldemandes.montantsubside)                             AS montantsdemandes, 
  ( 
          SELECT group_concat(tbldemandes.date_am_subside separator ' ' ) 
          FROM tbldemandes 
          WHERE  tbldemandes.date_am_subside = b.ref 
          ORDER BY tbldemandes.date_am_subside
  )                                                                       AS dates_am_demandes, 
  ( 
          SELECT group_concat(tbldemandes.date_op_subside separator ' ' ) 
          FROM tbldemandes 
          WHERE tbldemandes.date_op_subside = b.ref
          ORDER BY tbldemandes.date_op_subside
  )                                                                       AS dates_op_demandes, 
  b.dtflagdeleted                                                         AS dtflagdeleted, 
  tbldevis.dtdate                                                         AS dtdatedevis, 
  b.patrimoine                                                            AS patrimoine, 
  b.telmobile                                                             AS telmobile, 
  b.telprive                                                              AS telprive, 
  b.notes                                                                 AS notes, 
  b.notesagent                                                            AS notesagent 
FROM db_demandes.bdd b
  LEFT JOIN tblsuivi ts ON ts.fidossier = b.ref 
  LEFT JOIN tbldevis ON tbldevis.fidossier = b.ref 
     AND tbldevis.iddevis = 
     ( 
            SELECT max( tbldevis.iddevis) 
            FROM   tbldevis 
            WHERE tbldevis.fidossier = b.ref
      )  
  LEFT JOIN tblvisites ON tblvisites.fidossier = b.ref
  LEFT JOIN tblvisites avantpromesse ON avantpromesse.fidossier = b.ref 
  LEFT JOIN vvisitemin ON vvisitemin.fidossier = b.ref
WHERE NOT( lcase(b.nodossier) LIKE '%avis%' ) AND NOT(lcase(b.nodossier) LIKE '%dble%' )  
GROUP BY b.nodossier, ts.idsuivi 
ORDER BY b.nodossier DESC;
2
Rick James On

Some of these may help:

bdd:  INDEX(ref)
bdd:  INDEX(nodossier, ref)
avantpromesse:  INDEX(fidossier,  dtdate)
tblpromesses:  INDEX(fidossier, idpromesse,  montantestimatifpromesse)
tblpromesses:  INDEX(fidossier, idpromesse,  dtdatepromesse)
tbldemandes:  INDEX(montantsubside)
tbldemandes:  INDEX(date_am_subside)
tbldemandes:  INDEX(date_op_subside)
tbldevis:  INDEX(fidossier,  iddevis)
tblvisites:  INDEX(fidossier,  dtdate)
tblsuivi:  INDEX(fidossier,  dtavancementtravaux, dtobservations, dtetat, idsuivi)
tbldevis:  INDEX(fidossier,  dtdate, iddevis)
vvisitemin:  INDEX(fidossier)

In this:

            WHERE  ( ( NOT(( Lcase(`bdd`.`nodossier`) LIKE '%avis%' )) )
                      AND  ( NOT(( Lcase(`bdd`.`nodossier`) LIKE '%dble%' )) ) 
                   )

Three things prevent that WHERE from being optimized:

  • If nodossier has a case insensitive collation, then LCASE() is unnecessary.
  • LIKE with a leading wildcard is not easily optimized; would a FULLTEXT index be appropriate?
  • Even if the other items can be resolved, NOT probably prevents optimization.

I think (but am not sure) that there is an extra sort for this:

            GROUP BY  `bdd`.`nodossier`, `tblsuivi`.`idsuivi`
            ORDER BY  `bdd`.`nodossier` DESC

Suggest changing to

            GROUP BY  `bdd`.`nodossier`,      `tblsuivi`.`idsuivi`
            ORDER BY  `bdd`.`nodossier` DESC, `tblsuivi`.`idsuivi` DESC

When GROUP BY and ORDER BY "match", the optimizer can do both at the same time.

Please provide

SHOW CREATE TABLE bdd;
EXPLAIN SELECT ...