elastic4s: How to automagically read document id into case class instance?

Question

196 views Asked by aebblcraebbl At 06 May 2021 at 11:48

Using elastic4s 7.12.1 with spray-json 1.3.6 (and scala 2.13.5):
Is there a way to read the _id of an Elasticsearch document into a field, e.g. id, of a case class instance,
using only an implicit spray-json RootJsonFormat, i. e. without writing a custom HitReader for elastic4s, and if so, how?
The same goes for writing documents: Is there a way to insert an instance of a case class without serializing the id field (i. e. without making it part of the _source in ES), using only the aforementioned RootJsonFormat and without writing a custom Indexable?
According to the elastic4s documentation, this should be possible using jackson, which I want to avoid because of its numerous critical security issues that come up all the time.

Consider this case class, which should be indexed into ES:

case class Foo(id: String, name: String)

Using spray-json, I would only need to define a RootJsonFormat:

implicit val fooJsonFormat: RootJsonFormat[Foo] = jsonFormat2(Foo)

And could use elastic4s this way to index and search Foos:

val someFoo = Foo("idWhichShouldBeOverwrittenByES", "someName")
client.execute {
  indexInto("foos").doc(someFoo)
}

val result: Response[SearchResponse] = client.execute {
  search("foos").query {
    boolQuery().must {
      matchQuery("name", "someName")
    }
  }
}.await

result match {
  case RequestSuccess(_, _, _, result) => result.to[Foo].foreach(println)
  case RequestFailure(_, _, _, error) => println(error.toString)
}

However, there are major problems with this approach:

1. I need to provide an id when creating a Foo, while I actually want ES to generate the _id for me when indexing the document. This is of course primarily caused by using a case class.
2. When loading a Foo document, its id field contains the (meaningless) dummy value I used when I indexed it, not the actual _id under which it's stored inside the ES node.

To solve these problems (the first one only partially), I could of course write my own Indexable and HitReader like this:

  implicit object FooHitReader extends HitReader[Foo] {
    override def read(hit: Hit): Try[Foo] = Try({
      val source = hit.sourceAsMap
      Foo(
        id = hit.id,
        name = source("name").toString
      )
    })
  }

  implicit object FooIndexable extends Indexable[Foo] {
    override def json(t: Foo): String =
      JsObject(
        "name" -> JsString(t.name),
      ).compactPrint
  }

This doesn't look too terrible in a small example, but I think it's obvious that this approach scales horribly, provides no type safety and is a refactoring nightmare, since the names of the fields (e. g. "name") need to be specified manually.

Bottom line: Is there a better way to achieve a spring-data-elasticsearch-like experience, or is elastic4s with spray-json just not suited for this task?

edit: Another possibility would be to remove the id field from Foo, introduce a wrapper case class, e.g. FooResultWrapper, which stores Foo search results by _id in a Map[String, Foo], use a RootJsonFormat[Foo] and HitReader[FooResultWrapper] that converts the _source to Foo and stores it by hit.id. But that's also not very satisfying.

Original Q&A

There is 1 answer

**aebblcraebbl** · Answer 1 · 2021-05-06T14:33:21+00:00

Behold the brilliant solution I came up with (basically what I suggested in the edit of the question):
I removed the id field from my domain case class (e. g. Foo) and introduced a generic case class that wraps each result together with its id. One object per domain case class then implements elastic4s' read for that wrapper:

case class ESResultWrapper[T](id: String, result: T)

along with a generic trait, which contains the implementation for wrapping results of type T in ESResultWrapper instances:

trait ESResultWrapperHitReader[T] extends HitReader[ESResultWrapper[T]] {
  def readInternal(hit: Hit)(implicit reader: HitReader[T]): Try[ESResultWrapper[T]] = Try({
    ESResultWrapper(
      id = hit.id,
      result = hit.to[T]
    )
  })
}

Now all that's left for actual "domain" classes is to extend the ESResultWrapperHitReader[T] trait with the specific case class (for which a RootJsonFormat also exists) and to delegate read to readInternal, thereby implicitly providing a HitReader[T] through the RootJsonFormat[T]:

  implicit object FooResultWrapperHitReader extends ESResultWrapperHitReader[Foo] {
    override def read(hit: Hit): Try[ESResultWrapper[Foo]] = readInternal(hit)
  }

Usage is pretty simple (sticking with the example from the question):

result match {
  case RequestSuccess(_, _, _, result) => result.to[ESResultWrapper[Foo]].foreach(println)
  case RequestFailure(_, _, _, error) => println(error.toString)
}

leads to e. g.: ESResultWrapper(-XMSQXkB-5ze1JvrVWup,Foo(someName))
And the best part: Changing the wrapping implementation doesn't impact the domain classes.
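
A possible variation on the wrapper approach: the one-object-per-domain-class step could probably be replaced by a single implicit def that derives the wrapper's reader from whatever reader already exists for T. The sketch below is not tested against elastic4s itself. To stay self-contained it uses simplified stand-ins for Hit and HitReader (only an id, a string map as source, and a read method returning Try), and the names WrapperSketch, fooReader, wrapperReader and readAs are hypothetical.

```scala
import scala.util.Try

object WrapperSketch {
  // Simplified stand-ins for elastic4s' Hit and HitReader (assumptions for this
  // sketch; the real Hit has many more members than id and source).
  final case class Hit(id: String, source: Map[String, String])

  trait HitReader[T] {
    def read(hit: Hit): Try[T]
  }

  // Domain class without an id field, plus the wrapper case class.
  final case class Foo(name: String)
  final case class ESResultWrapper[T](id: String, result: T)

  // Stand-in for the reader that would normally come from the RootJsonFormat[Foo].
  implicit val fooReader: HitReader[Foo] = new HitReader[Foo] {
    def read(hit: Hit): Try[Foo] = Try(Foo(hit.source("name")))
  }

  // Derived once for every T that already has a HitReader,
  // so no per-class object is needed.
  implicit def wrapperReader[T](implicit inner: HitReader[T]): HitReader[ESResultWrapper[T]] =
    new HitReader[ESResultWrapper[T]] {
      def read(hit: Hit): Try[ESResultWrapper[T]] =
        inner.read(hit).map(result => ESResultWrapper(hit.id, result))
    }

  // Hypothetical helper that looks up the implicit reader for T and applies it.
  def readAs[T](hit: Hit)(implicit reader: HitReader[T]): Try[T] = reader.read(hit)
}
```

With `import WrapperSketch._` in scope, `readAs[ESResultWrapper[Foo]](Hit("abc123", Map("name" -> "someName")))` resolves wrapperReader[Foo] implicitly, and a hit without a "name" entry comes back as a Failure instead of throwing. Whether the same derivation coexists cleanly with the implicits from the elastic4s spray-json module would need to be checked.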

I applaud myself for coming up with this on my 3rd day of using Scala. Good job.
