gremlin user defined steps in faunus script map

289 views Asked by At

Background: I have several months experience using Gremlin and Faunus, incl. the ScriptMap step.

Problem: User defined Gremlin steps work fine when loaded in the shell as part of a script. However, the same steps apparently have no effect when defined in a Faunus ScriptMap script.

 /***********Faunus Driver*************/

//usage gremlin -e <hhis file> NOTE: to run in gremlin remove .submit() at end of pipe
import Java.io.Console;
//get args
console = System.console()
mapperpath=console. readLine ('> <map script path>: ')
refns=console.readLine('> <reference namespace>: ')
refinterestkey-console.readLine('> <interest field>: ')
//currently not in use
refinterestval=console.readLine('> <interest value>: ')         
mainpropkey=console.readLine('> ^main field>: ')
delim=console.readLine('> <main delimiter>: ')
args=[]
args [0]=refns
args [1]=refinterestkey
args[2]=refinterestval
args [3]=mainpropkey
args [4]=delim
args=(String[]) args.toArray()
f=FaunusFactory.open('propertyfile')
f.V().filter('{it.get Property("_namespace") =="streamernamespace" && it.getProperty("_entity")==" selector"}').script(mapperpath, args).submit()
f.shutdown()

/***********Script Mapper*************/

Gremlin.defineStep ("findMatch", [Vertex, Pipe], 
    {streamer,  interestindicator, fieldofinterest, fun ->
    _().has (interestindicator , true).has(fieldofinterest, 
                                fun(streamer)
    }
)
Gremlin.defineStep("connectMatch", [Vertex, Pipe], {streamer ->
// copy and link streaming vertices to matching vertices in main graph 
_().transform({if(main!= null) {
        mylog.info("reference vertex " + main.id    
                               +" & streaming vertex"+streamer.id+" match on main " +main.getProperty(fieldofinterest));
        clone=g.addVertex(null);
        ElementHelper.copyProperties(streamer, clone);
        clone.setProperty("_namespace", main.getProperty("__namespace"));
        mylog.info("create clone "+clone.id+" in "+clone.getProperty("_namespace"));
        g.addEdge(main, clone, streamer.getProperty("source");
        mylog.info("created edge "+ e);
        g.commit()
    }})
})

def g
def refns
def refinterestkey
def refinterestval
def mainpropkey
def delim
def normValue

def setup(args) {
    refns=args[0] 
    refinterestkey=args[1]
    refinterestval=args[2] 
    mainpropkey=args[3] 
    delim=args[4] 
    normValue = {obj-> seltype=obj.getProperty("type");
            seltypenorm=seltype.trim().toUpperCase();   
            desc=obj.getProperty("description"); 
            if(desc.contains(delim}) (
                selnum=desc.split(delim) [1].trim ()
            } else selnum=desc.trim();
            selnorm=seltypenorm.concat(delim).concat(selnum); 
            mylog.info ("streamer selector (" + seltype", "+desc+") normalized as "+selnorm);
            return selnorm
   }
    mylog=java.util.logging.Logger.getLogger("script_map")
    mylog.info ("configuring connection to reference graph
    conf=new BaseConfiguration()
    conf.setProperty("storage.backend", "cassandra"}
    conf.setProperty!"storage.keyspace", "titan"}
    conf.setProperty("storage.index.index-name", "titan")
    conf.setProperty("storage.hostname", "localhost")
    g=TitanFactory.open(conf)
    isstepsloaded = Gremlin.getStepnames().contains("findMatch"} && 
    Gremlin.getStepNames().contain("connectMatch"}
    mylog.info("custom steps available?: "+isstepsloaded)
}
def map{v, args) { 
    try{
    incoming=g.v(v.id)
    mylog.info{"current streamer id: "+incoming.id)
    if(incoming.getProperty("_entity")=="selector") {
                    mylog.info("process incoming vertex "+incoming.id)          
                    g.V{"_namespace", refns).findMatch(incoming,refinterestkey, mainpropkey,normValue).connectMatch(incoming).iterate ()
    } 
    }catch(Exception e) {
            mylog.info("map method exception raised");
            mylog.severe(e.getMessage()
    }
            g.commit()
}
def cleanup(args) { g.shutdown()}
2

There are 2 answers

0
fsj On

The root problem was I set an obsolete value for the "storage.index.index-name" property (see titan graph config under setup(). Disregard discussion re getOrCreate methods/blueprints: apparently a broad range of mutations on existing graphs can be achieved at scale using custom Gremlin steps defined inside a script referenced in the Faunus script step, with faunus format NoOpOutputFormat. Lesson learned: Instead of configuring titan graphs in-line in the script, distribute a (centrally maintained) graph properties file for reference in configuring the titan graph CDH5 has simplified distributed cache management

21
Daniel Kuppitz On

I just tested Faunus with user defined steps over The Graph of the Gods and it seems to work just fine. Here's what I did:

father.groovy

Gremlin.defineStep('father', [Vertex, Pipe], {_().out('father')})

def g

def setup(args) {
    conf = new org.apache.commons.configuration.BaseConfiguration()
    conf.setProperty('storage.backend', 'cassandrathrift')
    conf.setProperty('storage.hostname', '192.168.2.110')
    g = com.thinkaurelius.titan.core.TitanFactory.open(conf)
}

def map(v, args) {
    u = g.v(v.id)
    pipe = u.father().name
    if (pipe.hasNext()) u.fathersName = pipe.next()
    u.name + "'s father's name is " + u.fathersName
}

def cleanup(args) {
    g.shutdown()
}

In Faunus' Gremlin REPL:

gremlin> g.V.has('type','demigod','god').script('father.groovy')
...
==>jupiter's father's name is saturn
==>hercules's father's name is jupiter
==>neptune's father's name is null
==>pluto's father's name is null

If this doesn't help to solve your problem, please provide more details, so we can reproduce the errors you see.

Cheers, Daniel