I would like to write a safe zip function in coq that accepts the argument length equality as an argument.
Fixpoint zip {b a:Type} (proof : length l1 = length l2) (l1 : list a) (l2 : list b) : list (a * b) :=
match l1,l2 with
| nil,nil => nil
| cons a a',cons b b' => cons (a,b) (zip a' b')
| _,_ => (* never reached *)
end.
what is the general approach to this kind of problem? I would appreciate comments and resources on using refinement types in the context of coq functions generally.
Here is the best approach to this particular problem, in my opinion:
In words, we first define a version of
zip
that does not care about lengths, and then we use that to define the function you're looking for.This might feel like cheating; after all, the
zip
function doesn't even use its proof argument! Here is another version that is perhaps closer to what you were originally looking for:Unlike
zip
,zip'
uses its proof argument in two ways. In the contradictory cases, it invokes some tactic code (ltac:(easy)
) to argue that this case cannot arise. In the recursive case, it needs to find a proof oflength xs = length ys
to apply the recursive call; for this, it uses thecongruence
tacitc.Why is
zip
better thanzip'
? Its code is shorter and easier to read. In particular, note howzip'
has a match returning a function. This idiom, known as the convoy pattern, is needed whenever we need to refine the type of an argument in a pattern matching branch. Actually,zip'
is even worse than what you might think, because the tactics that discharge the proof obligations generate code. Try printingzip'
to see what the definition really looks like! Sadly, this ugliness is not just cosmetic: these more complicated definitions are much harder to reason about. For instance, it is possible to prove thatzip
andzip'
always produce the same outputs. Try it to see how fun it is!To be fair, there are Coq plugins that make it easier to write this sort of code (e.g. the Equations plugin). But at the end of the day they will still generate code that is equivalent to
zip'
. And in this case, the length hypothesis doesn't buy us much.In general, one is better off avoiding dependent types in Coq, unless there is a strong argument that justifies the additional complexity. For instance, in the case of
zip
, suppose that you have some code that uses a lot of different lists of the same lengthn
. You might want to argue thatzip
has an inverse:It is not possible to prove this result unless we know that
xs
andys
have the same length. We can add an additional hypothesis to our lemma, or we can work with length-indexed lists. Here is one possible definition:The
{.. | ..}
is Coq's notation for refinement, or subset types. We can then repackage some of the functions over lists to work overvec
. For instance, we can show thatmap
takesvec A n
tovec B n
. This approach pays off if you don't use many functions that require you to change the length indexn
, because in those cases you need to reason about the equality of complicated length expressions on types, which Coq is not very good at. Forvec
in particular, I would recommend you to have a look at thetuple
library of mathcomp (available here), which provides a good example of how this pattern can be used at scale.Edit
More on the
easy
tactic: https://coq.inria.fr/refman/proofs/automatic-tactics/auto.html#coq:tacn.easyCalling tactic code to build terms: https://coq.github.io/doc/V8.11.1/refman/language/gallina-extensions.html#solving-existential-variables-using-tactics