Experiencing deadlocks when using the Hikari transactor for Doobie with ZIO


I'm using Doobie in a ZIO application, and sometimes I get deadlocks (a total freeze of the application). This can happen if I run my app on only one core, or if I reach the maximum number of parallel connections to the database.

My code looks like:

  def mkTransactor(cfg: DatabaseConfig): RManaged[Blocking, Transactor[Task]] =
    ZIO.runtime[Blocking].toManaged_.flatMap { implicit rt =>
      val connectEC = rt.platform.executor.asEC
      val transactEC = rt.environment.get.blockingExecutor.asEC

      HikariTransactor
        .fromHikariConfig[Task](
          hikari(cfg),
          connectEC,
          Blocker.liftExecutionContext(transactEC)
        )
        .toManaged
    }

  private def hikari(cfg: DatabaseConfig): HikariConfig = {
    val config = new com.zaxxer.hikari.HikariConfig

    config.setJdbcUrl(cfg.url)
    config.setSchema(cfg.schema)
    config.setUsername(cfg.user)
    config.setPassword(cfg.pass)

    config
  }

I also set the leak detection parameter on Hikari (config.setLeakDetectionThreshold(10000L)), and I get leak errors that are not explained by the time taken to process the DB queries.

Answer by fanf42:

There is a good explanation in the Doobie documentation about the execution contexts and the expectations for each: https://tpolecat.github.io/doobie/docs/14-Managing-Connections.html#about-transactors

According to the docs, the "execution context for awaiting connection to the database" (connectEC in the question) should be bounded.

ZIO, by default, has only two thread pools:

  1. zio-default-async: bounded
  2. zio-default-blocking: unbounded

So it is quite natural to believe that we should use zio-default-async since it is bounded.

Unfortunately, zio-default-async assumes that its operations never, ever block. This is extremely important because it is the execution context that the ZIO interpreter (its runtime) itself runs on. If you block on it, you can stall the evaluation of the whole ZIO program. This happens more often when only one core is available.

The problem is that the execution context for awaiting a DB connection is meant to block while it waits for a free connection in the Hikari connection pool. So we should not use zio-default-async for this execution context.
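To make the failure mode concrete, here is a minimal sketch that uses only the Scala and Java standard libraries, with no ZIO or Doobie involved. The semaphore is a stand-in for an exhausted connection pool, and `StarvationDemo` and `completes` are names made up for this illustration. A task that blocks on a bounded pool can starve the very task that would unblock it:

```scala
import java.util.concurrent.{Executors, Semaphore}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

object StarvationDemo {
  // Returns true if both tasks finish within one second on a pool of `threads` threads.
  def completes(threads: Int): Boolean = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

    // Stand-in for a connection pool with no free connection left.
    val freeConnections = new Semaphore(0)

    // Task 1 blocks its thread until a "connection" is released.
    val waiter = Future { freeConnections.acquire() }
    // Task 2 would release one, but it first needs a thread to run on.
    val releaser = Future { freeConnections.release() }

    val finished = Try(Await.result(waiter.zip(releaser), 1.second)).isSuccess
    pool.shutdownNow() // interrupt any blocked task so the JVM can exit
    finished
  }

  def main(args: Array[String]): Unit = {
    println(s"1 thread:  ${completes(1)}")
    println(s"2 threads: ${completes(2)}")
  }
}
```

With a single thread the waiter occupies the only thread and the releaser never runs, so the first call reports false; with two threads both tasks finish. The same shape of problem appears when the blocked thread belongs to the pool that the ZIO runtime needs in order to make progress.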

The next question is: does it make sense to create a new thread pool and a corresponding execution context just for connectEC? Nothing forbids you from doing so, but it is likely not necessary, for three reasons:

  • You want to avoid creating thread pools, especially since you likely already have several from your web framework, DB connection pool, scheduler, etc. Each thread pool has a cost:

    • More for the JVM to manage
    • More OS resources consumed
    • More switching between threads, which is expensive in terms of performance
    • A more complex application runtime to understand (complex thread dumps, etc.)
  • ZIO's thread pool ergonomics are becoming well optimized for their usage

  • At the end of the day, you will have to manage your timeouts somewhere, and the connection layer is not the part of the system most likely to have enough information to know how long it should wait: different interactions (i.e., in the outer parts of your app, nearer to the points of use) may require different timeout/retry logic.
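As an illustration of that last point, here is a hedged sketch of choosing the timeout at the call site instead of in the connection layer. It assumes ZIO 1.x with zio-interop-cats and a Doobie version compatible with the code in the question. The `users` table, the `UserQueries` object, and the durations are hypothetical and only serve the example:

```scala
import java.util.concurrent.TimeoutException

import doobie._
import doobie.implicits._
import zio._
import zio.clock.Clock
import zio.duration._
import zio.interop.catz._

object UserQueries {
  // Hypothetical query, for illustration only.
  val countUsers: ConnectionIO[Int] =
    sql"select count(*) from users".query[Int].unique

  // An interactive caller gives up quickly.
  def countForUi(xa: Transactor[Task]): ZIO[Clock, Throwable, Int] =
    countUsers
      .transact(xa)
      .timeoutFail(new TimeoutException("user count took too long for the UI"))(2.seconds)

  // A batch job can afford to wait much longer for the same query.
  def countForBatch(xa: Transactor[Task]): ZIO[Clock, Throwable, Int] =
    countUsers
      .transact(xa)
      .timeoutFail(new TimeoutException("user count took too long for the batch"))(2.minutes)
}
```

Both callers share one transactor and one connection pool; only the caller knows how long waiting is acceptable. Whether an in-flight JDBC call is actually cancelled when the timeout fires depends on the driver, so treat this as a sketch of where the decision lives, not as a guarantee of cancellation.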

All that being said, we found a configuration that works very well in an application running in production:

// zio.interop.catz._ provides a `zioContextShift`

  val xa = (for {
    // our transaction EC: waits to acquire/release connections, so it must accept blocking operations
    te <- ZIO.access[Blocking](_.get.blockingExecutor.asEC)
  } yield {
    Transactor.fromDataSource[Task](datasource, te, Blocker.liftExecutionContext(te))
  }).provide(ZioRuntime.environment).runNow

  def transactTask[T](query: Transactor[Task] => Task[T]): Task[T] = {
    query(xa)
  }
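For comparison, here is how the same idea might look when applied to the mkTransactor from the question. This is a sketch that reuses the calls already present in the question and assumes the same library versions. The `Db` object is a hypothetical wrapper, and `hikariConfig` is assumed to be built as in the question's `hikari(cfg)` helper. The only real change is that the blocking executor is used for both execution contexts:

```scala
import cats.effect.Blocker
import com.zaxxer.hikari.HikariConfig
import doobie.hikari.HikariTransactor
import doobie.util.transactor.Transactor
import zio._
import zio.blocking.Blocking
import zio.interop.catz._

object Db {
  def mkTransactor(hikariConfig: HikariConfig): RManaged[Blocking, Transactor[Task]] =
    ZIO.runtime[Blocking].toManaged_.flatMap { implicit rt =>
      // Waiting for a free connection blocks, so it must not run on zio-default-async.
      val blockingEC = rt.environment.get.blockingExecutor.asEC

      HikariTransactor
        .fromHikariConfig[Task](
          hikariConfig,
          blockingEC,                              // connectEC: awaiting a connection
          Blocker.liftExecutionContext(blockingEC) // transactEC: running JDBC operations
        )
        .toManaged
    }
}
```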

I made a drawing of how the Doobie and ZIO execution contexts map to each other: https://docs.google.com/drawings/d/1aJAkH6VFjX3ENu7gYUDK-qqOf9-AQI971EQ4sqhi2IY

UPDATE: I created a repository with three examples of this pattern (mixed app, pure app, ZLayer app) here: https://github.com/fanf/test-zio-doobie Any feedback is welcome.