Translate from SQL to mongo

484 views Asked by At

How should this be written in MongoDB?

SELECT field1, field2, product, COUNT(product) as counter
FROM table
GROUP BY product
HAVING counter > 0

I have tried with the following, which is nearly a copy from Doctrine manual.

$query = $this->createQueryBuilder('AcmeBundle:Product')
    ->group(array('product'), array('count' => 0))
    ->reduce('function (obj, prev) { prev.count++; }')
    ->field('product, field1, field2, count')
;

This only gives me 1 element "count = 21" which seems it have taken all products and counted them together.

2

There are 2 answers

0
Neil Lunn On

Like many ODM implementations, the core of doctrine (mongodDB wrapping part) is built "on top of" the basic native driver interface for MongoDB. This allows exposure to those driver objects so you can implement native MongoDB methods to do your queries.

I say this in general as the best option for your query is the aggregation framework of MongoDB. It's a high performance implementation in native code on the server, so does not rely on JavaScript interpretation for "reduce" type methods as is implemented by other methods. Generally speaking, the DSL does not map well to managers that try to be "all things" and therefore generate SQL as well. The concepts are quite different, and therefore optimal.

As it turns out though, there is a Collection wrapper class which you can use without delving right into the native driver object. It has it's own .aggregate() method wrapper. Lacking details on how to access this object but you can trace through the reference from the original commit.

But getting the underlying driver object is fairly straightforward:

$mongoCollection = $dm->getConnection()->getMongo()
    ->selectCollection('collectionName');

$result = $mongoCollection->aggregate(
    array(
       array( '$group' => array(
           '_id' => array( 
               'field1' => '$field1',
               'field2' => '$field2',
               'product' => '$product'
           ),
           'counter' => array( '$sum' => 1 )
       )),
       array( '$match' => array( 
           'counter' => array( '$gt' => 0 )
       )),
       array( '$project' => array( 
           'field1' => '$_id.field1',
           'field2' => '$_id.field2,
           'product' => '$_id.product',
           'counter' => 1
       ))
    )
);

That uses the $group pipeline operator to do the actual "grouping" via the _id key specified. This can be a "compound key" as it is in this case, so all fields are taken in combination. Any fields your don't want to "group by" are used with grouping operators such as $sum here, which is supplied with a static value of 1 to represent the "count" of matches.

If you just want to "group by" the "product" field values, then other fields in your result must also be under a grouping operator. Maybe something like $first:

$result = $mongoCollection->aggregate(
    array(
       array( '$group' => array(
           '_id' => '$product',
           'field1' => array( '$first' => '$field1' ),
           'field2' => array( '$first' => '$field2' ),
           'counter' => array( '$sum' => 1 )
       )),
       array( '$match' => array( 
           'counter' => array( '$gt' => 0 )
       )),
       array( '$project' => array(
           '_id' => 0,
           'product' => '$_id',
           'field1' => 1,
           'field2' => 1,
           'counter' => 1
       ))
    )
);

But any fields you want must be subject to a "grouping operator" or are part of the "group by" statement. The same is true for SQL.

The other "pipeline" stages here are basically $match which is done "post grouping" in order to have the same effect as the "HAVING" clause, and finally the $project which just cleans up the output to have the desired field names rather than being compounded within the required _id ("group by") key.

That's how you do aggregation in MongoDB.

For more information on common SQL operations, see the SQL to Aggregation Mapping Chart in the core documentation.

0
ASSILI Taher On

The aggregation framework provides an easy way to process records and return computed results.
In YourDocumentNameRepository.php file write this lines:

$qb = $this->createAggregationBuilder('Document\YourDocumentName');

/* First stage: group by product and calculate the number of rows foreach group. */
$qb->group()
   ->field('id')
   ->expression('$product')
   ->field('field1')
   ->first('$field1)
   ->field('field2')
   ->first('$field2')  
   ->field('counter')
   ->sum(1);

/* Second stage: similar to  HAVING counter > 0 in SQL*/
$qb->match()
   ->field('counter')
   ->gt(0);

/* Execute the request */
$results = $qb->execute();
return $results;

You can read the documentation.