I have a symfony repository method, which fetches a quite complex data set, which will then be placed in a CSV file by an export manager class. I don't want to place the whole code of handling the export job, but I managed to chatch the point at which the query slows down, so my question is regarding any other alternative to make that query faster, rather than the code itself. So the data fetched is some "site" data, which has multpiple "memberships", which then has a "user" . So the problem is, that when my query is trying to join the user information to the site, it slows down the execution. It looks like this:
$qb->leftJoin('s.memberships', 'ex_sm', 'WITH', 'ex_sm.revokedAt IS NULL');
$qb->leftJoin('ex_sm.user', 'ex_jappr', 'WITH', 'ex_sm.approverJobReactiveWeight IS NOT NULL');
There are a few things (I tried or crossed through my mind could help) to mention:
- I checked the tables, all the linked columns have an index and they are the same int data type.
- I red an article about the DQL performance issues, where it was mentioned that overuse of DQL Left Join calls can kill performance as they re-map the same entity object over and over. A possible solution mentioned there was to fetch the main data set, and then looping through the collection, add the additional (joining data fields) to each element directly from the field's entity class. That possibly could work (not sure how much impact it would have), the problem is that what I have is a really complex legacy code, I don't want to touch the export manager's logic, because that would require far too much testing. The export manager expects a query builder class, so I have to find a solution within the query itself.
- The issue is defintely caused by the join, and not the 'WITH' clause or the addtional conditions. I tried to call the query with the plain leftJoin call, same result.
- I know the leftJoin methods can be called chained to each other, the code looks this way because some of these call are used in if statements.
- I spend 2 days trying all sort things found here and other websites.
There are 6 different user types, for now I just called the script fetching the one above and it took 33 minutes to return the data. We are talking about 512 sites, which is not a huge collection of data. So my question is : Is there another DQL or any Doctrine way to simlify or reduce the call nulber of leftJoins in such a complex query, and somehow improve the performance?
Update: I think the problem comes from indexes so I give some details on the relationships: The 'memberships' entity comes from a table named 'access' the relationship to the user in its model looks like this:
/**
* The user this membership encapsulates.
*
* @ORM\ManyToOne(targetEntity="User", inversedBy="siteMemberships", cascade={"persist"})
* @ORM\JoinColumn(name="security_identity_id", referencedColumnName="id")
*
* @var User
*/
protected $user;
Here is a screenshot of the indexes assigend to the 'security_identity_id' columns

And the related User comes from a 'user' table having this relationship pointing to the membershipt
/**
* @ORM\OneToMany(targetEntity="SiteMembership", mappedBy="user", cascade={"persist"}, fetch="EXTRA_LAZY")
*/
protected $siteMemberships;
Primary key is 'id' in the entity. Hope this gives a better view of the problem. I'm not an expert with sql, but tried everything I found and could understand so far.
Update: Here is the query executed:
SELECT s0_.name AS name_0, s0_.id AS id_1, GROUP_CONCAT(DISTINCT u1_.name SEPARATOR ', ') AS sclr_2 FROM site s0_
LEFT JOIN access a2_ ON s0_.id = a2_.entity_id
AND a2_.type IN ('site_member')
AND (a2_.revoked_at IS NULL)
LEFT JOIN user u1_ ON a2_.security_identity_id = u1_.id
AND (a2_.approver_job_reactive_weight IS NOT NULL)
This returns the first site record with it's joined membershipt and user peroperty. But even the fething of this one single row takes more than 2 minutes.
Here is the create table information of the access (membership entity) table
'CREATE TABLE `access` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`buddy_id` int(11) DEFAULT NULL,
`security_identity_id` int(11) DEFAULT NULL,
`revoked_at` datetime DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`type` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`approver_job_reactive_weight` int(11) DEFAULT NULL,
`entity_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `access_idx` (`type`,`security_identity_id`,`entity_id`,`buddy_id`),
KEY `IDX_6692B54395CE8D6` (`buddy_id`),
KEY `IDX_6692B54DF9183C9` (`security_identity_id`),
KEY `IDX_6692B5481257D5D` (`entity_id`),
KEY `idx_revoked_id_approver_type` (`revoked_at`,`entity_id`,`approver_job_reactive_weight`,`approver_job_planned_weight`,`type`),
KEY `idx_user_site_access` (`revoked_at`,`security_identity_id`,`buddy_id`,`type`),
KEY `idx_user` (`security_identity_id`),
KEY `idx_user_id` (`security_identity_id`),
CONSTRAINT `FK_6692B54DF9183C9` FOREIGN KEY (`security_identity_id`) REFERENCES `user` (`id`)
)
ENGINE=InnoDB AUTO_INCREMENT=262441 DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci'
I removed some of the columns which are not relevant.
When doing a
LEFT JOIN, theONneeds to say how the tables are related. TheWHEREclause normally hasIS NULLorIS NOT NULLto say whether to exclude or include the right-hand rows.LEFT JOINandINNER JOINare mostly the same speed. But I need to see the indexes (SHOW CREATE TABLE) and the SQL of theSELECTto see if there are other issues.More
Replace
With