How to process a recursive GADT with kind :: '[SomeDataKind]

311 views Asked by At

I think this is the furthest into the Haskell type system extensions that I've even been and I've run into an error that I haven't been able to figure out. Apologies in advance for the length, it's the shortest example I could create that still illustrates the issue I'm having. I have a recursive GADT whose kind is a promoted list, something like the following:

GADT Definition

{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

data DataKind = A | B | C

-- 'parts' should always contain at least 1 element which is enforced by the GADT.
-- Lets call the first piece of data 'val' and the second 'subdata'.
-- All data constructors have these 2 fields, although some may have
-- additional fields which I've omitted for simplicity.
data SomeData (parts :: [DataKind]) where
    MkA :: Maybe Char -> Maybe (SomeData subparts) -> SomeData ('A ': subparts)
    MkB :: Maybe Char -> Maybe (SomeData subparts) -> SomeData ('B ': subparts)
    MkC :: Maybe Char -> Maybe (SomeData subparts) -> SomeData ('C ': subparts)
deriving instance Show (SomeData parts)

Problem

What I'm trying to do is traverse the data and perform some operations, for instance propagate the first Just Char found up to the top.

Annoying missing features - needed for the next section

Now, because there is apparently no record syntax support for GADTs (https://ghc.haskell.org/trac/ghc/ticket/2595) you need to write them manually, so here they are:

getVal :: SomeData parts -> Maybe Char
getVal (MkA val _) = val
getVal (MkB val _) = val
getVal (MkC val _) = val

-- The lack of record syntax for GADTs is annoying.
updateVal :: Maybe Char -> SomeData parts -> SomeData parts
updateVal val (MkA _val sub) = MkA val sub
updateVal val (MkB _val sub) = MkB val sub
updateVal val (MkC _val sub) = MkC val sub

-- really annoying...
getSubData :: SomeData (p ': rest) -> Maybe (SomeData rest)
getSubData (MkA _ sub) = sub
getSubData (MkB _ sub) = sub
getSubData (MkC _ sub) = sub

Test Data

So, the thing I want to do. Walk down the structure from the top until I find a value that is Just. So given the following initial values:

a :: SomeData '[ 'A ]
a = MkA (Just 'A') Nothing

b :: SomeData '[ 'B ]
b = MkB (Just 'B') Nothing

c :: SomeData '[ 'C ]
c = MkC (Just 'C') Nothing

bc :: SomeData '[ 'B, 'C ]
bc = MkB Nothing (Just c)

abc :: SomeData '[ 'A, 'B, 'C ]
abc = MkA Nothing (Just bc)

Expected Result

I would like have something like this:

> abc
MkA Nothing (Just (MkB Nothing (Just (MkC (Just 'C') Nothing))))
> propogate abc
MkA (Just 'C') (Just (MkB (Just 'C') (Just (MkC (Just 'C') Nothing))))

Previous Attempts

I took a few stabs at it, first with a regular function:

propogate sd =
    case getVal sd of
        Just _val ->
            sd
        Nothing ->
            let
                newSubData = fmap propogate (getSubData sd)
                newVal = join . fmap getVal $ newSubData
            in
                updateVal newVal sd

This gives the error:

Broken.hs:(70,1)-(81,35): error: …
    • Occurs check: cannot construct the infinite type: rest ~ p : rest
      Expected type: SomeData rest -> SomeData (p : rest)
        Actual type: SomeData (p : rest) -> SomeData (p : rest)
    • Relevant bindings include
        propogate :: SomeData rest -> SomeData (p : rest)
          (bound at Broken.hs:70:1)
Compilation failed.

I also tried a typeclass and attempting to match on the structure:

class Propogate sd where
    propogateTypeClass :: sd -> sd

-- Base case: We only have 1 element in the promoted type list.
instance Propogate (SomeData parts) where
    propogateTypeClass sd = sd    

-- Recursie case: More than 1 element in the promoted type list.
instance Propogate (SomeData (p ': parts)) where
    propogateTypeClass sd =
        case getVal sd of
            Just _val ->
                sd
            Nothing ->
                let
                    -- newSubData :: Maybe subparts
                    -- Recurse on the subdata if it exists.
                    newSubData = fmap propogateTypeClass (getSubData sd)
                    newVal = join . fmap getVal $ newSubData
                in
                    updateVal newVal sd

This results in the error:

Broken.hs:125:5-26: error: …
    • Overlapping instances for Propogate (SomeData '['A, 'B, 'C])
        arising from a use of ‘propogateTypeClass’
      Matching instances:
        instance Propogate (SomeData parts)
          -- Defined at Broken.hs:91:10
        instance Propogate (SomeData (p : parts))
          -- Defined at Broken.hs:95:10
    • In the expression: propogateTypeClass abc
      In an equation for ‘x’: x = propogateTypeClass abc
Compilation failed.

I've also tried combinations of matching on SomeData '[] and SomeData '[p] to no avail.

I'm hoping I'm missing something simple but I haven't found documentation anywhere on how to process a structure like this and I'm at the limits of my understand if the Haskell type system, for now anyways :). Once again, sorry for the long post and any help would be greatly appreciated :)

2

There are 2 answers

3
user2407038 On BEST ANSWER

The error you get is a red herring - GHC cannot infer the types of functions on GADTs. You must give it a type signature to see the real mistake:

propogate :: SomeData x -> SomeData x 
....

This gives the error:

* Couldn't match type `x' with `p0 : parts0'
  `x' is a rigid type variable bound by
    the type signature for:
      propogate :: forall (x :: [DataKind]). SomeData x -> SomeData x
  Expected type: SomeData (p0 : parts0)
    Actual type: SomeData x

If it is not clear, this error message says that we claim that the type arguement to SomeData is just a variable (that is, we know nothing about it, except its kind) but the use of some functions inside propogate require it to be of the form p0 : parts0 for some types p0 : parts0.

However, you state at the beginning:

'parts' should always contain at least 1 element which is enforced by the GADT.

but the compiler doesn't know this! You must tell it, and typically this is done by pattern matching on a GADT constructor which brings a type equality into scope. Your only issue here is you would have to match on three constructors, all of which would have identical code. The solution is to remove the duplicate from your datatype:

data SomeData (parts :: [DataKind]) where
  Mk :: SDataKind s -> Maybe Char -> Maybe (SomeData subparts) -> SomeData (s ': subparts)

data SDataKind (x :: DataKind) where 
  SA :: SDataKind A 
  SB :: SDataKind B 
  SC :: SDataKind C 

Packages like singletons will generate the SDataKind type and related functions on the type for you automatically.

All of your 'record' functions are greatly simplified:

getVal :: SomeData parts -> Maybe Char
getVal (Mk _ v _) = v 

updateVal :: Maybe Char -> SomeData parts -> SomeData parts
updateVal val (Mk t _val sub) = Mk t val sub

getSubData :: SomeData (p ': rest) -> Maybe (SomeData rest)
getSubData (Mk _ _ sub) = sub

And now, in order to get your function to typecheck (since it is indeed correct semantically), all you must do is pattern match on the constructor:

propogate :: SomeData x -> SomeData x 
propogate sd@Mk{} =
    .... -- unchanged

Note that the code is identical, other than the addition of a type signature and @Mk{} after the variable pattern sd.

Finally, trying to use typeclasses here does not give you anything - you already have an indexed type which contains all the (type) information which you need - you obtain this information by pattern matching on the constructors of that type.


Also note you do not lose any generality in SomeData - if you want particular indices to have extra information, you simply add another field indexed on the part. You can even embed complex logic inside of the extra field, as the SDataKind field allows you to pattern match on the index at will:

data SomeData (parts :: [DataKind]) where
  Mk :: SDataKind s 
     -> DataPart s 
     -> Maybe Char 
     -> Maybe (SomeData subparts) 
     -> SomeData (s ': subparts) 

type family (==?) (a :: k) (b :: k) :: Bool where 
  a ==? a = 'True 
  a ==? b = 'False 

-- Example - an additional field for only one index, without duplicating 
-- the constructors for each index
data DataPart x where 
  Nil :: ((x ==? 'A) ~ 'False) => DataPart x 
  A_Data :: Integer -> DataPart A

Finally, if you wish to trudge along the original course, you can do so, but the source of the code duplication should become very apparent:

partsUncons :: SomeData ps0 -> (forall p ps . (ps0 ~ (p : ps)) => r) -> r 
partsUncons MkA{} x = x 
partsUncons MkB{} x = x 
partsUncons MkC{} x = x 

propogate :: SomeData x -> SomeData x 
propogate sd = partsUncons sd $ 
    .... -- unchanged 
1
Cactus On

This might seem a bit boilerplate-ish, but if you just write out the three cases of propagate, then it can be made simple (if a bit verbose):

import Control.Monad (mplus)
import Control.Arrow (second)

propagate' :: (Maybe Char -> Maybe (SomeData ps) -> SomeData ps')
           -> Maybe Char -> Maybe (SomeData ps) -> (Maybe Char, SomeData ps')
propagate' mk c ds = (c'', mk c'' ds')
  where
    (c', ds') = maybe (Nothing, Nothing) (second Just . propagate) ds
    c'' = c `mplus` c'

propagate :: SomeData ps -> (Maybe Char, SomeData ps)
propagate (MkA c ds) = propagate' MkA c ds
propagate (MkB c ds) = propagate' MkB c ds
propagate (MkC c ds) = propagate' MkC c ds