How to create a lattice-type data structure in Haskell?

304 views Asked by At

I am trying to build a lattice-type of FCA-type of data structure in Haskell where I could check if two entities have a join or not. Actually, I'm not even sure that the lattice is the right structure as it might be a bit "too much".

Here is the context. In a COBOL program analyzer, I have an ontology of data set names, files, records and fields. A data set name can have multiple file names depending on the program, a file can have multiple records and a record can have multiple fields. I'd like this hierarchy to be reflected in the Haskell data structure. But I'd like also to be able to have a relation inherited for file1 and file2 such that I can check if file1 and file2 belong to the same data set name. Actually, this relation could almost be that of "==". But it could simply be that they do have a join in dsn0, for instance.

I have other ontologies in that context that would benefit from a lattice or FCA data structure. For example, I have programs that belongs to job steps and job steps that belong to jobs. If I could easily figure out if two programs belong to the same job, that would be great. Here also, it seems like a "join" operator. Getting the extension (code) of a certain entity would be useful too.

I'm still a bit new to Haskell. I tried to look at the Lattice library but I'm not sure where to go from there, concretely. Any idea of how to get started? A small example of a lattice in Haskell would be very helpful. Thank you very much for your help (and patience).

UPDATE: The Lattice might not be the best formalism for this as mentioned in the comments. I realize I might just have to use a regular class type of data structure along those lines:

data DSN = DSN {
    programFiles :: [ProgramFile]
    name :: String
    ddn :: DDN
}
data ProgramFile = ProgramFile {
    records :: [Record]
    name :: String
}
data Record = Record {
    fields :: [Field]
    name :: String
}
data Field = Field {
    name :: String
    order :: Int
}

I guess my initial intention behind using a tree/lattice/FCA type of structure is to take full advantage of the functor potential in Haskell which should lead to interesting lattice operations including geting all the extension of a concept, checking that two concepts belong to the same higher-level concept, checking equality '==' of two files through their DSN,...

Maybe a non-binary tree structure would be better? Is it easy to do that in Haskell?

1

There are 1 answers

5
Daniel Wagner On BEST ANSWER

I recommend making an abstract data type to represent one-to-many relationships. It might look something like this:

module OneToMany (OMRel, empty, insert, delete, source, targets) where

import Data.Map (Map)
import Data.Set (Set)
import qualified Data.Map.Strict as M
import qualified Data.Set as S

data OMRel a b = OMRel
    { oneToMany :: Map a (Set b)
    , manyToOne :: Map b a
    } deriving (Eq, Ord, Read, Show)

empty :: OMRel a b
empty = OMRel M.empty M.empty

insert :: (Ord a, Ord b) => a -> b -> OMRel a b -> OMRel a b
insert a b (OMRel otm mto) = OMRel
    { oneToMany = M.insertWith S.union a (S.singleton b) $
        case M.lookup b mto of
            Just oldA -> M.adjust (S.delete b) oldA otm
            Nothing -> otm
    , manyToOne = M.insert b a mto
    }

delete :: (Ord a, Ord b) => a -> b -> OMRel a b -> OMRel a b
delete a b (OMRel otm mto) = OMRel (M.adjust (S.delete b) a otm) (M.delete b mto)

source :: Ord b => b -> OMRel a b -> Maybe a
source b = M.lookup b . manyToOne

targets :: Ord a => a -> OMRel a b -> Set b
targets a = M.findWithDefault S.empty a . oneToMany

(Of course you can flesh out the API with more efficient bulk operations like merging, bulk insert, sequential composition, etc. But this is sort of the minimal construct/consume API that gets you where you need to go.)

Then you need a couple data types to represent your various ontological entries:

newtype Dataset = Dataset { dataset :: String }
newtype Record = Record { record :: String }
newtype Field = Field { order :: Int }

From there you can use values with types like OMRel Dataset FilePath to represent the fact that datasets "contain" files. For querying containment equality, you can write this once and for all via the OMRel API above:

sameSource :: (Eq a, Ord b) => OMRel a b -> b -> b -> Bool
sameSource rel b b' = source b rel == source b' rel

(You may need an extra clause if two missing targets should be considered inequal.) Then, e.g., this can be specialized to

sameSource :: OMRel Dataset FilePath -> FilePath -> FilePath -> Bool

and friends.

You will not be able to make a Functor (or Bifunctor) instance for OMRel because of the Ord constraints. It's not clear to me that fmap/bimap make a TON of sense for this particular data structure, though. For example, if we have associations x :: a <-> y :: b and x' :: a <-> y' :: b, and f b = f b', then should fmap f associate f b with a or a'? If they do turn out to have a sensible interpretation and be useful, you could either pursue the constrained-monads approach or simply offer a function named bimap from the OneToMany module with the appropriate type, but without making it an instance method.