Is there a SoftHashMap in Scala?

1k views Asked by At

I'm aware of this question for java, but none of those implementations seem to play well with scala.collection.JavaConversions.

I'm looking for something simple (e.g. single file, not a whole library) that implements SoftHashMap such that it plays well with Scala Map (i.e. supports getOrElseUpdate, unzip, and the remaining Scala Map methods).

4

There are 4 answers

0
dsg On BEST ANSWER

Implementation inspired by this java WeakHashMap:

import scala.collection.mutable.{Map, HashMap}
import scala.ref._


class SoftMap[K, V <: AnyRef] extends Map[K, V]
{
  class SoftValue[K, +V <: AnyRef](val key:K, value:V, queue:ReferenceQueue[V]) extends SoftReference(value, queue)

  private val map = new HashMap[K, SoftValue[K, V]]

  private val queue = new ReferenceQueue[V]

  override def += (kv: (K, V)): this.type =
  {
    processQueue
    val sv = new SoftValue(kv._1, kv._2, queue)
    map(kv._1) = sv
    this
  }

  private def processQueue
  {
    while (true)
    {
      queue.poll match
      {   
        case Some(sv:SoftValue[K, _]) => map.remove(sv.key)
        case _ => return
      }
    }
  }


  override def get(key: K): Option[V] = map.get(key) match
  {
    case Some(sv) => sv.get match
      { case v:Some[_] => v
        case None => {map.remove(key); None} }
    case None => None
  }



  override def -=(key: K):this.type =
  {
    processQueue
    map.remove(key)
    this
  }

  override def iterator: Iterator[(K, V)] =
  {
    processQueue
    map.iterator.collect{ case (key, sv) if sv.get.isDefined => (key, sv.get.get) }
  }

  override def empty:SoftMap[K, V] = new SoftMap[K, V]

  override def size = {processQueue; map.size}
}
2
Blaisorblade On

As other people have observed, SoftReferences are usually not The Right Thing to build a cache. However, some libraries provide better replacements. While the OP requires not using a library, I still think this is the best answer possible. Plus, with SBT downloading the library is quite simple.

In build.sbt, assuming you're building your project with SBT >= 0.10 (tested with 0.12), add:

libraryDependencies += "com.google.guava" % "guava" % "13.0"

libraryDependencies += "com.google.code.findbugs" % "jsr305" % "1.3.9" //Needed by guava, but marked as optional; at least Scalac 2.10 however does require it to parse annotations.

In client code, you can build your map as follows (look into CacheBuilder's option for the meaning of the various parameters; the ones below are the ones I chose for my use case):

For Scala 2.10:

import com.google.common.cache.CacheBuilder
import collection.JavaConversions._

def buildCache[K <: AnyRef, V <: AnyRef]: collection.concurrent.Map[K, V] =
  CacheBuilder.newBuilder()
    .maximumSize(10000).expireAfterAccess(10, TimeUnit.MINUTES)
    .concurrencyLevel(2)
    .build[K, V]().asMap()

For Scala 2.9 (deprecated/not compiling in 2.10):

import com.google.common.cache.CacheBuilder
import collection.{JavaConversions, mutable}
import JavaConversions._

def buildCache2_9[K <: AnyRef, V <: AnyRef]: mutable.ConcurrentMap[K, V] =
  CacheBuilder.newBuilder()
    .maximumSize(10000).expireAfterAccess(10, TimeUnit.MINUTES)
    .concurrencyLevel(2)
    .build[K, V]().asMap()

To make it work for both versions, call the implicit conversion explicitly - it's JavaConversions.asScalaConcurrentMap. I've reported the problem on the scala-language mailing list (and will also report it on the bug tracker), so I hope that the 2.9 code will at least compile in 2.10 (while still causing deprecation warning): https://groups.google.com/d/topic/scala-language/uXKRiGXb-44/discussion.

0
iron9light On

I found one in liftweb.

I do not use it yet, so check it yourself please.

http://scala-tools.org/mvnsites/liftweb-2.4-M5/net/liftweb/util/SoftReferenceCache.html

4
Jed Wesley-Smith On

Occasionally you get asked a question like, "what's the best way to poke your eye out with a stick" that you can go to great lengths answering how you should harden and sterilise the stick after carving a 1 inch hook into the end, and follow up with where the stick should be inserted and so on. Really though, the best answer is probably not exactly what was asked – but the question as to why on earth you'd want to do this in the first place!

This is one of those questions.

SoftReferences are something that initially sound like something you might want. A reference that does not get garbage collected until there is GC pressure. Presumably, you'd use this to cache something that was worth caching, usually because it was expensive to create in the first place.

The problem is, SoftRefs clear almost exactly when you don't want them to, when the GC is under pressure! This means that they will need to be recreated (expensive op) right when the VM is already busy and under GC pressure.

Furthermore, there is no way to hint the VM as to the priority of objects that are soft-referenced. The particular algorithm used for selecting which objects to clear is unspecified and VM dependent.

Essentially, SoftReferences are a misguided attempt to off-load an application level concern (caching) to the garbage-collector. You really should never* actually use them.

*never, modulo some very small and very specialised use-cases