Creating a Semigroup data type instance in Haskell

460 views Asked by At

May goal concerned with creating a new Semigroup type class instance for newly defined datatype in Haskell (For those who knows the "Get programming with Haskell" book by Will Kurt, I may refer you to the page 428,i.e. the end of the capstone project 5 with exercise extension).

There is a newly defined datatype:

data HINQ m a b = HINQ (m a -> m b) (m a) (m a -> m a)
                | HINQ_ (m a -> m b) (m a)

This datatype designates the SQL-like query, where m defines the context (Monad or Alternative), (m a -> m b) is the function whose goal is similar to SQL function SELECT,i.e. defines the property kind one wants to see in a database, (m a) is a "table" to which applies the previous function (similar to SQL's table_name) and finally (m a -> m a) filters out the property one is seeking for (similar to SQL's WHERE).

My goal is to make this datatype an instance of a Semigroup (and finally a Monoid). It is worth mentioning that all needed Semigroup instances for a, b etc are assumed.

instance (Semigroup a, Semigroup (m a), Semigroup b,...) => 
          Semigroup (HINQ m a b) where
  (<>) (HINQ func1 start1 test1)
       (HINQ func2 start2 test2) =

So the rough idea (on the background to see it clearer) of it is to make it possible to compose several different queries to a database into a single query, but I couldn't come up with an idea how to merge two different functions of type (m a -> m b) into one at the same time merging two tables (m a)... The first idea was to combine them into lists but then the type signature changes I haven't found yet the solution to this.

1

There are 1 answers

4
Daniel Wagner On BEST ANSWER

I think you do not want a Semigroup. It doesn't make sense to compose all pairs of queries -- only ones where the output type of the one matches up sensibly with the input type of the other! Luckily, we have a concept that corresponds to a "typed" variant of Semigroup (actually, a typed Monoid, but close enough): Category.

Also, I think it's a design mistake to couple the query with the table you are querying. They are notionally separate concepts; indeed, when composing two queries, you still have just one table, not two. So:

data HINQ m a b = HINQ (m a -> m b) (m a -> m a)

instance Category (HINQ m) where
    id = HINQ id id
    HINQ slct whr . HINQ slct' whr' = HINQ (slct . whr . slct') whr'

The identity laws are pretty clear, but the asymmetry between the uses of the right and left WHERE clauses looks a bit suspicious, so we should check the associativity law carefully:

(HINQ s0 w0 . HINQ s1 w1) . HINQ s2 w2
= HINQ (s0 . w0 . s1) w1 . HINQ s2 w2
= HINQ (s0 . w0 . s1 . w1 . s2) w2
= HINQ s0 w0 . HINQ (s1 . w1 . s2) w2
= HINQ s0 w0 . (HINQ s1 w1 . HINQ s2 w2)

Looks good!

EDIT

Err, hmm... maybe the x . id = x law isn't so clear after all. Yikes! This probably isn't fixable, unless you only consider equality up to the composition of the contained functions, in which case why not just use the Category instance for functions directly? The other option you have, of course, is not to demand the identity laws hold. That's a bit unusual, but I guess it depends a lot on your use case whether this is reasonable or not.

Composition might be easier if you represent your filtering more explicitly, e.g. as a -> Bool or a -> m Bool instead of m a -> m a. This gives you more of a chance to combine the two filters in your (.) implementation, rather than rolling one of the filters into the select operation as the instance above does.