In what range does caml_hash_variant return values?


The subchapter "20.3.6 Polymorphic variants" describes how to identify a polymorphic variant value in C. (It contains a mistake: the function should be caml_hash_variant, not hash_variant.)

I want to use those hash values directly as error codes in C++. Something like this:

archive.mli:

...
type t = ...
type err = [`File_not_found | `Archive_is_corrupted]
val opena : string -> (t, err) result
... 

archive.ml:

...
let () = Callback.register "open archive" opena
...

archive.cpp:

...
const int Error::File_not_found = caml_hash_variant("File_not_found");
const int Error::Archive_is_corrupted = caml_hash_variant("Archive_is_corrupted");

int Archive::open(char* path) {
  // Look up the registered OCaml closure once and cache it.
  static const value* f = nullptr;
  if (f == nullptr)
      f = caml_named_value("open archive");

  value result = caml_callback(*f, caml_copy_string(path));
  if (Tag_val(result) == 0) { // Result.Ok
    archive = Field(result, 0);
    return ??????
  } else { // Result.Error
    return Int_val(Field(result, 0));
  }
}
...

Returning an error code and comparing it is no problem:

if (x.open(path) == Error::Archive_is_corrupted) {
...
}

But I don't know what to return as the OK status. 0? -1?
Is there any value that caml_hash_variant is guaranteed never to return?


There is 1 answer

Jeffrey Scofield On BEST ANSWER

Immediate values in the usual OCaml implementation have the low bit set, and the variant hash is an immediate value. So if you're looking at variant hash values in C++ you can be sure that the value 0 will never be returned by caml_hash_variant.

If you look at the code, the final value is generated either by Val_int() or Val_long(). In the definitions of these macros you'll see that they guarantee that the low bit is set.
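To make the low-bit guarantee concrete, here is a minimal standalone sketch of the tagging scheme. It does not include the OCaml headers; the `sketch_` names are hypothetical stand-ins that mirror what Val_long, Long_val and Is_long do in the usual implementation (shift left by one and set the low bit), so treat it as an illustration rather than the runtime's actual source.

```cpp
#include <cstdint>

// Hypothetical stand-ins for OCaml's intnat / uintnat types.
using sketch_intnat = std::intptr_t;
using sketch_uintnat = std::uintptr_t;

// Tag an integer the way Val_long does: shift left by one, set the low bit.
constexpr sketch_intnat sketch_val_long(sketch_intnat x) {
    return static_cast<sketch_intnat>((static_cast<sketch_uintnat>(x) << 1) + 1);
}

// Untag the way Long_val does: arithmetic shift right by one.
constexpr sketch_intnat sketch_long_val(sketch_intnat v) {
    return v >> 1;
}

// An immediate value is recognised by its low bit being set.
constexpr bool sketch_is_long(sketch_intnat v) {
    return (v & 1) != 0;
}

// Whatever integer goes in, the tagged result is odd, so it is never 0.
static_assert(sketch_val_long(0) == 1, "OCaml 0 is represented as 1");
static_assert(sketch_val_long(5) == 11, "OCaml 5 is represented as 11");
static_assert(sketch_is_long(sketch_val_long(-7)), "low bit set for negatives too");
static_assert(sketch_long_val(sketch_val_long(-7)) == -7, "round trip");
```

Since every tagged result has the form 2n + 1, the raw value 0 cannot come out of this encoding.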

I haven't done any kind of analysis of the code, but the value -1 is at least superficially possible as a hash value, since its low bit is set.
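The two candidates from the question can be checked against the low-bit rule with a small sketch. The names `looks_like_immediate`, `kStatusOk` and `is_ok` are hypothetical, and the check only applies to the raw tagged value as returned by caml_hash_variant; once a value has been untagged with Int_val, the low bit no longer carries this guarantee.

```cpp
#include <cstdint>

// Hypothetical helper: true when `raw` has the bit pattern of a tagged
// OCaml immediate, i.e. its low bit is set.
constexpr bool looks_like_immediate(std::intptr_t raw) {
    return (raw & 1) != 0;
}

// Hypothetical convention: 0 means success, any other status is a raw
// (still tagged) variant hash.
constexpr std::intptr_t kStatusOk = 0;

constexpr bool is_ok(std::intptr_t status) {
    return status == kStatusOk;
}

// 0 has its low bit clear, so it cannot collide with a tagged hash.
static_assert(!looks_like_immediate(kStatusOk), "0 is not a tagged immediate");

// -1 is all ones in two's complement, so its low bit is set and the
// low-bit rule alone does not exclude it.
static_assert(looks_like_immediate(-1), "-1 has the low bit set");
```

Under this assumed convention, a caller would compare the returned status against `kStatusOk` first and only then against the raw hash constants.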

Update

The low bit is set on immediate values as a marker for the garbage collector. So it's a convention that must be followed strictly. (IMHO it's one of many really nice design tradeoffs in the OCaml implementation.)