Problem running Gremlin query in CosmosDb


I have a problem running this Gremlin query in Azure CosmosDB.

g.V().
  has('node', 'id', 'new').
  fold().coalesce(
    unfold(),
    addV('node').
    property('id', 'new').
    property('partitionKey', 'edd1f6ca3b1c446987d7da29e370cc7e')
  ).V().
  has('node', 'id', 'new').
    as('new').
  V().
  has('node', 'id', 'root').
  coalesce(
    outE('contains').where(inV().as('new')),
    addE('contains').to('new')
  ).V().
  has('node', 'id', 'new').
    as('new').
  V().has('userGroup', 'id', 'userGroup1').
  coalesce(
    outE('hasAccess').where(inV().as('new')),
    addE('hasAccess').to('new')
  )

I get two problems:

  1. If I have many other nodes in the DB (350000) the query times out. I have not been able to test this in TinkerPop.
  2. The hasAccess edge is not created. This works in TinkerPop.

The base for the query is (images from gremlify.com):

g.addV('node').
  property('id', 'root').
  property('partitionKey', '33cb2571f8e348eaa875e6a2639af385')
g.addV('userGroup').
  property('id', 'userGroup1').
  property('partitionKey', '1')

and I want to end up with a node 'new' that has a 'contains' edge from 'root' and a 'hasAccess' edge from 'userGroup1', using a query that can be run multiple times without changing anything (idempotent). If I do this in separate queries it works fine:

g.V().
  has('node', 'id', 'new').
  fold().coalesce(
    unfold(),
    addV('node').
    property('id', 'new').
    property('partitionKey', 'edd1f6ca3b1c446987d7da29e370cc7e')
  )
g.V().
  has('node', 'id', 'new').
    as('new').
  V().
  has('node', 'id', 'root').
  coalesce(
    outE('contains').where(inV().as('new')),
    addE('contains').to('new')
  )
g.V().
  has('node', 'id', 'new').
    as('new').
  V().has('userGroup', 'id', 'userGroup1').
  coalesce(
    outE('hasAccess').where(inV().as('new')),
    addE('hasAccess').to('new')
  )

But I want to save two calls to the DB and do it in one go.

There is 1 answer

noam621 (BEST ANSWER)

In my experience, using the V() step in the middle of a traversal is not well optimized by some vendors, even when it is followed by a strong filter like has('id', <name>), so I think you should avoid it if you want a single query. Instead, fetch all the vertices you need once, fold them, and select from that list. You can try:

g.V().hasLabel('node', 'userGroup').has('_id', within('new', 'root', 'userGroup1')).
  fold().as('vertices').coalesce(
    unfold().has('node', '_id', 'new'),
    addV('node').property('_id', 'new').
    property('partitionKey', 'edd1f6ca3b1c446987d7da29e370cc7e')
  ).as('new').
  select('vertices').unfold().has('node', '_id', 'root').coalesce(
    outE('contains').where(inV().as('new')),
    addE('contains').to('new')
  ).
  select('vertices').unfold().has('userGroup', '_id', 'userGroup1').coalesce(
    outE('hasAccess').where(inV().as('new')),
    addE('hasAccess').to('new')
  )

Example: https://gremlify.com/pdtla2bsrc/1
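
One way to check that the combined query is idempotent is to run it a few times and count the edges pointing at the new vertex after each run. This is only a minimal sketch: it assumes the '_id' property name used in the query above (the question's own queries use 'id'), and the result keys 'contains' and 'hasAccess' are just labels chosen for the output.

g.V().has('node', '_id', 'new').
  project('contains', 'hasAccess').
    by(inE('contains').count()).
    by(inE('hasAccess').count())

Starting from the base graph in the question, both counts should stay at 1 however many times the combined query has been run. A count above 1 would mean the matching coalesce() did not find the existing edge and added a duplicate.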