Lazy/Eager loading/fetching in Neo4j/Spring-Data

7.3k views Asked by At

I have a simple setup and encountered a puzzling (at least for me) problem:

I have three pojos which are related to each other:

@NodeEntity
public class Unit {
    @GraphId Long nodeId;
    @Indexed int type;
    String description;
}


@NodeEntity
public class User {
    @GraphId Long nodeId;
    @RelatedTo(type="user", direction = Direction.INCOMING)
    @Fetch private Iterable<Worker> worker;
    @Fetch Unit currentUnit;

    String name;

}

@NodeEntity
public class Worker {
    @GraphId Long nodeId;
    @Fetch User user;
    @Fetch Unit unit;
    String description;
}

So you have User-Worker-Unit with a "currentunit" which marks in user that allows to jump directly to the "current unit". Each User can have multiple workers, but one worker is only assigned to one unit (one unit can have multiple workers).

What I was wondering is how to control the @Fetch annotation on "User.worker". I actually want this to be laoded only when needed, because most of the time I only work with "Worker".

I went through http://static.springsource.org/spring-data/data-neo4j/docs/2.0.0.RELEASE/reference/html/ and it isn't really clear to me:

  • worker is iterable because it should be read only (incoming relation) - in the documentation this is stated clarly, but in the examples ''Set'' is used most of the time. Why? or doesn't it matter...
  • How do I get worker to only load on access? (lazy loading)
  • Why do I need to annotate even the simple relations (worker.unit) with @Fetch. Isn't there a better way? I have another entity with MANY such simple relations - I really want to avoid having to load the entire graph just because i want to the properties of one object.
  • Am I missing a spring configuration so it works with lazy loading?
  • Is there any way to load any relationships (which are not marked as @Fetch) via an extra call?

From how I see it, this construct loads the whole database as soon as I want a Worker, even if I don't care about the User most of the time.

The only workaround I found is to use repository and manually load the entities when needed.

------- Update -------

I have been working with neo4j quite some time now and found a solution for the above problem that does not require calling fetch all the time (and thus does not load the whole graph). Only downside: it is a runtime aspect:

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mapping.model.MappingException;
import org.springframework.data.neo4j.annotation.NodeEntity;
import org.springframework.data.neo4j.support.Neo4jTemplate;

import my.modelUtils.BaseObject;

@Aspect
public class Neo4jFetchAspect {

    // thew neo4j template - make sure to fill it 
    @Autowired private Neo4jTemplate template;

    @Around("modelGetter()")
    public Object autoFetch(ProceedingJoinPoint pjp) throws Throwable {
        Object o = pjp.proceed();
        if(o != null) {
            if(o.getClass().isAnnotationPresent(NodeEntity.class)) {
                if(o instanceof BaseObject<?>) {
                    BaseObject<?> bo = (BaseObject<?>)o;
                    if(bo.getId() != null && !bo.isFetched()) {
                        return template.fetch(o);
                    }
                    return o;
                }
                try {
                    return template.fetch(o);
                } catch(MappingException me) {
                    me.printStackTrace();
                }
            }
        }
        return o;
    }

    @Pointcut("execution(public my.model.package.*.get*())")
    public void modelGetter() {}

}

You just have to adapt the classpath on which the aspect should be applied: my.model.package..get())")

I apply the aspect to ALL get methods on my model classes. This requires a few prerequesites:

  • You MUST use getters in your model classes (the aspect does not work on public attributes - which you shouldn't use anyways)
  • all model classes are in the same package (so you need to adapt the code a little) - I guess you could adapt the filter
  • aspectj as a runtime component is required (a little tricky when you use tomcat) - but it works :)
  • ALL model classes must implement the BaseObject interface which provides:

    public interface BaseObject { public boolean isFetched(); }

This prevents double-fetching. I just check for a subclass or attribute that is mandatory (i.e. the name or something else except nodeId) to see if it is actually fetched. Neo4j will create an object but only fill the nodeId and leave everything else untouched (so everything else is NULL).

i.e.

@NodeEntity
public class User implements BaseObject{
    @GraphId
    private Long nodeId;

        String username = null;

    @Override
    public boolean isFetched() {
        return username != null;
    }
}

If someone finds a way to do this without that weird workaround please add your solution :) because this one works, but I would love one without aspectj.

Base object design that doenst require a custom field check

One optimization would be to create a base-class instead of an interface that actually uses a Boolean field (Boolean loaded) and checks on that (so you dont need to worry about manual checking)

public abstract class BaseObject {
    private Boolean loaded;
    public boolean isFetched() {
        return loaded != null;
    }
    /**
     * getLoaded will always return true (is read when saving the object)
     */
    public Boolean getLoaded() {
        return true;
    }

    /**
     * setLoaded is called when loading from neo4j
     */
    public void setLoaded(Boolean val) {
        this.loaded = val;
    }
}

This works because when saving the object "true" is returned for loaded. When the aspect looks at the object it uses isFetched() which - when the object is not yet retrieved will return null. Once the object is retrieved setLoaded is called and the loaded variable set to true.

How to prevent jackson from triggering the lazy loading?

(As an answer to the question in the comment - note that I didnt try it out yet since I did not have this issue).

With jackson I suggest to use a custom serializer (see i.e. http://www.baeldung.com/jackson-custom-serialization ). This allows you to check the entity before getting the values. You simply do a check if it is already fetched and either go on with the whole serialization or just use the id:

public class ItemSerializer extends JsonSerializer<BaseObject> {
    @Override
    public void serialize(BaseObject value, JsonGenerator jgen, SerializerProvider provider)
      throws IOException, JsonProcessingException {
        // serialize the whole object
        if(value.isFetched()) {
            super.serialize(value, jgen, provider);
            return;
        }
        // only serialize the id
        jgen.writeStartObject();
        jgen.writeNumberField("id", value.nodeId);
        jgen.writeEndObject();
    }
}

Spring Configuration

This is a sample Spring configuration I use - you need to adjust the packages to your project:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:neo4j="http://www.springframework.org/schema/data/neo4j"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd
http://www.springframework.org/schema/data/neo4j http://www.springframework.org/schema/data/neo4j/spring-neo4j-2.0.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.5.xsd">

    <context:annotation-config/>
    <context:spring-configured/>

    <neo4j:repositories base-package="my.dao"/> <!-- repositories = dao -->

    <context:component-scan base-package="my.controller">
        <context:exclude-filter type="annotation" expression="org.springframework.stereotype.Controller"/> <!--  that would be our services -->
    </context:component-scan>
    <tx:annotation-driven mode="aspectj" transaction-manager="neo4jTransactionManager"/>    
    <bean class="corinis.util.aspects.Neo4jFetchAspect" factory-method="aspectOf"/> 
</beans>

AOP config

this is the /META-INF/aop.xml for this to work:

<!DOCTYPE aspectj PUBLIC
        "-//AspectJ//DTD//EN" "http://www.eclipse.org/aspectj/dtd/aspectj.dtd">
    <aspectj>
        <weaver>
            <!-- only weave classes in our application-specific packages -->
            <include within="my.model.*" />
        </weaver>
        <aspects>
            <!-- weave in just this aspect -->
            <aspect name="my.util.aspects.Neo4jFetchAspect" />
        </aspects>
    </aspectj>
3

There are 3 answers

1
Niko On BEST ANSWER

Found the answer to all the questions myself:

@Iterable: yes, iterable can be used for readonly

@load on access: per default nothing is loaded. and automatic lazy loading is not available (at least as far as I can gather)

For the rest: When I need a relationship I either have to use @Fetch or use the neo4jtemplate.fetch method:

@NodeEntity
public class User {
    @GraphId Long nodeId;
    @RelatedTo(type="user", direction = Direction.INCOMING)
    private Iterable<Worker> worker;
    @Fetch Unit currentUnit;

    String name;

}

class GetService {
  @Autowired private Neo4jTemplate template;

  public void doSomethingFunction() {
    User u = ....;
    // worker is not avaiable here

    template.fetch(u.worker);
    // do something with the worker
  }  
}
2
s_t_e_v_e On

Not transparent, but still lazy fetching.

template.fetch(person.getDirectReports());

And @Fetch does the eager fetching as was already stated in your answer.

3
Samuel Kerrien On

I like the aspect approach to work around the limitation of the current spring-data way to handle lazy loading.

@niko - I have put your code sample in a basic maven project and tried to get that solution to work with little success:

https://github.com/samuel-kerrien/neo4j-aspect-auto-fetching

For some reasons the Aspect is initialising but the advice doesn't seem to get executed. To reproduce the issue, just run the following JUnit test:

playground.neo4j.domain.UserTest