What are the similarities and differences between Scala Transducers and Clojure Transducers?


Paul Chiusano and Rúnar Óli have written a fantastic book, Functional Programming in Scala. In it they mention a little-referenced concept in the Scala community: transducers.

Scala Transducers in the book Functional Programming In Scala

In the Clojure community, transducers get a little more press.

My question is: **What are the similarities and differences between Scala Transducers (from the book Functional Programming in Scala) and Clojure Transducers?**

Assumptions:

I'm aware that

  1. The word transducer is common parlance, owing to its concept in Electrical Engineering

  2. There is a pre-existing concept in Computer Science called a Finite State Transducer

  3. There is precedent in Biology and Psychology for adopting the word transduction

  4. There is already a history of other technical books like SICP adopting the word Transducers.


There are 2 answers

Answer by Steven Shaw (accepted)

The stream transducers from the book Functional Programming in Scala (FPiS) and Clojure's transducers are quite similar. They are a generalisation of the idea of having a "machine" (a step function) that processes an input stream into an output stream. FPiS's transducers are called Processes. Rich Hickey also uses the term process in his introductory talk on transducers in Clojure.

Origins

The design of FPiS's transducers is based on Mealy machines. Mealy machines are said to have:

transition function T : (S, I) -> S
output function     G : (S, I) -> O

These functions can be fused together to form:

step: (S, I) -> (S, O)

Here it's easy to see that the step function operates on the current state of the machine and the next input item to produce the next state of the machine and the next output item.
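
To make that concrete, here is a minimal sketch (my own toy example, not from the book) of such a fused step function in Scala, with state S = a running total, input I = Int, and output O = the total so far:

object MealyStepSketch {
  // Fused Mealy step: takes the current state and the next input,
  // returns the next state and the output item.
  def runningTotal(state: Int, input: Int): (Int, Int) = {
    val next = state + input
    (next, next) // (next state, output)
  }

  // Driving the step function by hand over a list of inputs:
  val (finalState, outputs) =
    List(1, 2, 3).foldLeft((0, Vector.empty[Int])) { case ((s, out), i) =>
      val (s2, o) = runningTotal(s, i)
      (s2, out :+ o)
    }
  // outputs    == Vector(1, 3, 6)
  // finalState == 6
}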

One of the combinators from FPiS uses such a step function:

// On the companion object of trait Process[I, O]:
object Process {
  ...
  def loop[S, I, O](z: S)(f: (I, S) => (O, S)): Process[I, O]
  ...
}

This loop function is essentially the seeded left reduction that Hickey talks about in this slide.
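
For illustration, a running sum could be written with loop like this (a hedged sketch; it assumes loop lives on the Process companion object as shown above, and that f takes (input, state) and returns (output, next state)):

// Hypothetical usage of loop: a running sum as a Process[Int, Int], seeded with 0.
val runningSum: Process[Int, Int] =
  Process.loop(0)((i: Int, s: Int) => (i + s, i + s))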

Context agnostic

Both can be used in many different contexts (such as lists, streams, channels, etc.).

In FPiS, the transducer (process) type is:

trait Process[I, O]

All it knows about are its input elements and its output elements.

In Clojure, it's a similar story. Hickey calls this "fully decoupled".

Composition

Both types of transducers can be composed.

FPiS uses a "pipe" operator

map(labelHeavy) |> filter(_.nonFood)

Clojure uses comp

(comp
  (filtering non-food?)
  (mapping label-heavy))

Representation

In Clojure:

reducer:    (whatever, input) -> whatever
transducer: reducer -> reducer
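
Rendered in Scala for comparison (illustrative type aliases only, not from either library):

object ClojureShapes {
  // (whatever, input) -> whatever
  type Reducer[X, A] = (X, A) => X
  // reducer -> reducer; note that a real Clojure transducer is additionally
  // polymorphic in the accumulator type X, which a plain alias cannot express.
  type Transducer[X, A, B] = Reducer[X, A] => Reducer[X, B]
}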

In FPiS:

// The main type is
trait Process[I, O]

// Many combinators have the type
Process[I, O] ⇒ Process[I, O]

However, FPiS's representation isn't just a function under the hood. It's an algebraic data type (a sealed trait with three case-class variants): Await, Emit, and Halt.

case class Await[I,O](recv: Option[I] => Process[I,O]) extends Process[I,O]
case class Emit[I,O](head: O, tail: Process[I,O] = Halt[I,O]()) extends Process[I,O]
case class Halt[I,O]() extends Process[I,O]

  • Await plays the part of the reducer->reducer function from Clojure.
  • Halt plays the part of reduced in Clojure.
  • Emit stands in for calling the next step function in Clojure (the driver sketch below shows how each variant is consumed).
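
To show how a driver consumes those three variants, here is a hedged sketch that feeds a LazyList of inputs through a Process; it mirrors the book's Process#apply, written here as a standalone function:

def drive[I, O](p: Process[I, O], in: LazyList[I]): LazyList[O] = p match {
  case Halt()      => LazyList.empty                     // done: no more output
  case Emit(h, t)  => h #:: drive(t, in)                 // output h, continue with the tail
  case Await(recv) => in match {
    case i #:: rest => drive(recv(Some(i)), rest)        // feed the next input item
    case _          => drive(recv(None), LazyList.empty) // signal end of input
  }
}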

Early termination

Both support early termination. Clojure does it by wrapping the accumulator in a special reduced value, which can be tested for via the reduced? predicate.

FPiS uses a more statically typed approach. A Process can be in one of 3 states (Await, Emit or Halt), and when a "step function" returns a process in the Halt state, the driving function knows to stop.
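
For example, a take(n) process (essentially the book's version, using the await and emit helper combinators that appear with the filter example further down) terminates early by becoming Halt once it has passed n elements through, so whatever is driving it stops pulling input:

// Early termination as a value: after n elements this process becomes Halt.
def take[I](n: Int): Process[I, I] =
  if (n <= 0) Halt()
  else await[I, I](i => emit(i, take[I](n - 1)))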

Efficiency

On this point they are again similar. Both types of transducers are demand-driven and don't generate intermediate collections. However, I'd imagine that FPiS's transducers are not as efficient when pipelined/composed, since the internal representation is more than "just a stack of function calls", as Hickey puts it. I'm only guessing here about the efficiency/performance though.

Look into fs2 (previously scalaz-stream) for a perhaps more performant library that is based on the design of the transducers in FPiS.

Example

Here's an example of filter in both implementations:

Clojure, from Hickey's talk slides:

(defn filter
  ([pred]
    (fn [rf]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
          (if (pred input)
            (rf result input)
            result)))))
  ([pred coll]
    (sequence (filter pred) coll)))

In FPiS, here's one way to implement it:

def filter[I](f: I ⇒ Boolean): Process[I, I] =
  await[I, I](i ⇒ if (f(i)) emit(i, filter(f))
                  else filter(f))

As you can see, filter is built up here from other combinators such as await and emit.
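
For reference, await and emit are thin wrappers over the Await and Emit constructors; roughly (as in the book):

// emit wraps the Emit constructor, defaulting the tail to Halt.
def emit[I, O](head: O,
               tail: Process[I, O] = Halt[I, O]()): Process[I, O] =
  Emit(head, tail)

// await wraps Await, routing Some(i) to f and None (end of input) to a
// fallback process that defaults to Halt.
def await[I, O](f: I => Process[I, O],
                fallback: Process[I, O] = Halt[I, O]()): Process[I, O] =
  Await[I, O] {
    case Some(i) => f(i)
    case None    => fallback
  }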

Safety

There are a number of places where you have to be careful when implementing Clojure transducers. This seems to be a design trade-off favouring efficiency. However, this downside would seem to affect mostly library authors rather than end users/consumers.

  • If a transducer gets a reduced value from a nested step call, it must never call that step function again with input.
  • Transducers that require state must create unique state and may not be aliased.
  • All step functions must have an arity-1 variant that does not take an input.
  • A transducer's completion operation must call its nested completion operation, exactly once, and return what it returns.

The transducer design from FPiS favours correctness and ease of use. The pipe composition and flatMap operations ensure that completion actions occur promptly and that errors are handled appropriately, so these concerns are not a burden on implementors of transducers. That said, I imagine that the library may not be as efficient as the Clojure one.

Summary

Both Clojure and FPiS transducers have:

  • similar origins
  • the ability to be used in different contexts (list, streams, channels, file/network io, database results)
  • demand-driven / early termination
  • finalisation/completion (for resource safety)
  • tasty :)

They differ somewhat in their underlying representation. Clojure style transducers seem to favour efficiency whereas FPiS transducers favour correctness and compositionality.

Answer by metasoarous

I am not particularly familiar with Scala's concept of transducers or how ubiquitous that terminology is, but from the snippet of text you've posted above (and my knowledge of transducers), here's what I can say:

  • They are very different
  • They are similar only in that they both relate to how one transforms one collection into another

What I can tell about Scala Transducers:

It would seem from the definition above that a Scala transducer is any function or callable with a signature roughly along the lines of

Stream[A] -> Stream[B]

So, for instance, a mapping function that worked on streams would count as a transducer in this case.

That's it; pretty simple really.
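
For instance, something as plain as this would fit that description (a hypothetical example; LazyList is the modern name for Scala's Stream):

// A "transducer" in the loose Stream[A] -> Stream[B] sense described above:
// any ordinary stream-to-stream function qualifies.
def doubledEvens(in: LazyList[Int]): LazyList[String] =
  in.filter(_ % 2 == 0).map(n => s"doubled: ${n * 2}")

// doubledEvens(LazyList(1, 2, 3, 4)).toList
// => List("doubled: 4", "doubled: 8")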

Clojure Transducers:

A Clojure transducer is a function that transforms one reducing function into another. A reducing function is one that can be used with reduce. That is, if Clojure had type signatures, a reducing function would have the signature

(x, a) -> x

In English: given some accumulated value x (typically the collection being built) and "the next thing" a from the collection being reduced, our reducing function returns the next accumulated value ("the next iteration of the collection being built").

So, if this is the signature of a reducing function, a transducer has signature

((x, a) -> x) -> ((x, b) -> x)
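
If we sketch those shapes in Scala (the names here are purely illustrative, not from either library), we get something like:

object TransducerShapes {
  // (x, a) -> x
  type Reducer[X, A] = (X, A) => X

  // ((x, a) -> x) -> ((x, b) -> x), for every accumulator type x
  trait Transducer[A, B] {
    def apply[X](rf: Reducer[X, A]): Reducer[X, B]
  }

  // Example: a "mapping" transducer built from f: B => A. It rewrites each
  // incoming b with f before handing it to the underlying reducer.
  def mapping[A, B](f: B => A): Transducer[A, B] = new Transducer[A, B] {
    def apply[X](rf: Reducer[X, A]): Reducer[X, B] =
      (acc, b) => rf(acc, f(b))
  }

  // Usage: double each element before summing it into the accumulator.
  //   val sum: Reducer[Int, Int] = _ + _
  //   List(1, 2, 3).foldLeft(0)(mapping((b: Int) => b * 2)(sum)) // => 12
}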

The reason transducers were added to Clojure is that, with the addition of core.async channels, Rich Hickey and friends found themselves reimplementing all of the standard collection functions to work with channels (map, filter, take, etc.). RH wondered if there wasn't a better way, and went to work thinking about how to decomplect the logic of these various collection-processing functions from the mechanics of the collection types at hand. Explaining exactly how transducers do this is, I think, outside the scope of this question, so I'll get back to the point. But if you're interested, there is a lot of literature on this that's easy to find and work through.

So how do these things relate?

Clearly, these are very different concepts, but here's how I see them relating:

While Scala transducers are collection-processing functions for Streams (as opposed to other Scala collections), Clojure's transducers are really a mechanism for unifying the implementation of collection-processing functions across different collection types. So one way of putting it would be that if Scala had Clojure's notion of transducers, Scala's notion of transducers could be implemented in terms of Clojure's, which are more abstract/general processing functions reusable across multiple collection types.