TypeScript: Hacking around unsoundness in a Map `getOrCreate` helper function

165 views Asked by At

I have a helper function for getting an entry from a map, adding it if it's not present already.

export function mapGetOrCreate<K, V>(map: Map<K, V>, key: K, valueFn: (key: K) => V): V {
    let value = map.get(key);
    if (value === undefined) {
        value = valueFn(key);
        assert(value !== undefined);
        map.set(key, value);
    }
    return value;
}

(Full code in TS playground)

TypeScript's unsoundness around generic type variance ends up causing this practical issue for me:

type A = {a: number};
type B = A & {b: string};
type C = B & {c: boolean};

declare function createA(): A;
declare function createB(): B;
declare function createC(): C;

function f(m: Map<string, B>, element: Base) {
    m.set('1', createA()); // TS error (good)
    m.set('1', createB());
    m.set('1', createC());

    mapGetOrCreate(m, '1', createA); // missing error!
    mapGetOrCreate(m, '1', createB);
    mapGetOrCreate(m, '1', createC);
}

I've found that adding another type parameter (V2) to the function signature "fixes" the issue:

export function mapGetOrCreate2<K, V, V2 extends V>(map: Map<K, V>, key: K, valueFn: (key: K) => V2): V {
    let value = map.get(key);
    if (value === undefined) {
        value = valueFn(key);
        assert(value !== undefined);
        map.set(key, value);
    }
    return value;
}

It's still unsound, but at least TypeScript can't automatically infer the types, which is a slight improvement.

Questions:

  1. Is there a better way to accomplish what I'm doing?
  2. Are there ways in which mapGetOrCreate2 is worse than the original?
1

There are 1 answers

0
jcalz On BEST ANSWER

You are looking for noninferential type parameter usage as requested in microsoft/TypeScript#14829. There is no official support for this, but there are a number of techniques to get this effect, one of which is what you are using already.


Just to be clear for anyone who comes across this question later, the unsoundness here is that TypeScript allows Map<K, U> to be assignable to Map<K, T> whenever U is assignable to T:

function technicallyUnsound<K, T, U extends T>(mapU: Map<K, U>) {
    const mapT: Map<K, T> = mapU;
}

This is actually completely reasonable as long as you're only reading from the maps, but when you write to them you can end up getting in trouble:

function technicallyUnsound<K, T, U extends T>(mapU: Map<K, U>, k: K, t: T) {
    mapU.set(k, u); // error, as desired
    const mapT: Map<K, T> = mapU;
    mapT.set(k, t); // no error, uh oh
}

const m = new Map<string, Date>();
technicallyUnsound(m, "k", {});
m.get("k")?.getTime(); // compiles okay, but
// RUNTIME ERROR: u is not defined

This is just the way that TypeScript is; it has a set of useful features, such as object mutability, subtyping, aliasing, and method bivariance, that allow for increased developer productivity but can be used in unsafe ways. Anyway, see this SO answer for more details around such soundness issues.

There's really no way to completely prevent this; no matter what you do, even if you can harden mapGetOrCreate(), you can always use such aliasing to get around it:

mapGetOrCreate2(m, "", createA) // error, but
const n: Map<string, A> = m;
mapGetOrCreate2(n, "", createA) // no error

With that caveat in mind, though, what are the options for hardening mapGetOrCreate()?


The real issue you're having with mapGetOrCreate() is TypeScript's generic type parameter inference algorithm. Let's boil down mapGetOrCreate() to a function g() where we forget about the key type K (using just string). In the following call:

declare function g<T>(map: Map<string, T>, valueFn: (key: string) => T): T;
g(m, createA); // no error, not good
// function g<A>(map: Map<string, A>, valueFn: (key: string) => A): A

The compiler infers that the type parameter T should be specified by the type A, because valueFn returns an A, and a Map<string, B> is also considered a valid Map<string, A>.

Ideally, you'd like the compiler to infer T from map alone, and then check that valueFn is assignable to (key: string) => T for that T. In valueFn you'd just like to use T and not infer it.

Put in other words, you're looking for noninferential type parameter usage, as requested in microsoft/TypeScript#14829. And as I said in the beginning, there's no official way of doing this.


Let's look at the unofficial ways:

One unofficial way is to use an additional type parameter U which is constrained to the original type parameter T. Since U will be inferred separately from T, it will look "non-inferential" from the point of view of T. This is what you've done with mapGetOrCreate2:

declare function g<T, U extends T>(map: Map<string, T>, valueFn: (key: string) => U): T;
g(m, createA); // error! A is not assignable to B
// function g<B, B>(map: Map<string, B>, valueFn: (key: string) => B): B
g(m, createB); // okay
g(m, createC); // okay

Another unofficial way is to intersect the "non-inferential" locations with {}, which "lowers the priority" of that inference site. This is less reliable and only works for types X where X & {} does not narrow X (so X cannot have undefined or null in it), but it also works for this case:

type NoInfer<T> = T & {};
declare function g<T>(map: Map<string, T>, valueFn: (key: string) => NoInfer<T>): T;
g(m, createA); // error! A is not assignable to type NoInfer<B>
// function g<B>(map: Map<string, B>, valueFn: (key: string) => NoInfer<B>): B
g(m, createB); // okay
g(m, createC); // okay

And the final way I know is to use the fact that evaluation of conditional types is deferred for as-yet unresolved type parameters. So NoInfer<T> will eventually evaluate to T, but this will happen after type inference has occurred. And it also works here:

type NoInfer<T> = [T][T extends any ? 0 : never];
declare function g<T>(map: Map<string, T>, valueFn: (key: string) => NoInfer<T>): T;
g(m, createA); // error! A is not assignable to type B
// function g<B>(map: Map<string, B>, valueFn: (key: string) => B): B
g(m, createB); // okay
g(m, createC); // okay

All three of these approaches are workarounds and do not work in all cases. You can read through ms/TS#14829 if you're interested in details and discussion about it. My main point here is that if your technique works for your use cases, then it's probably fine and I don't know of any obviously superior technique.

The only way I'd say a modified version is worse than the original is that it is more complicated and requires more testing. The problem you are trying to avoid doesn't actually seem to come up incredibly often in practice (which is why method bivariance is part of the language); since you are actually running into the problem, then you should probably actually solve it, so the added complexity is worthwhile. But since such hardening is fundamentally impossible in the face of an unsound type system, there will quickly be a point of diminishing returns, after which is better to just embrace the unsoundness and write some more defensive runtime checking, and give up trying to carve out a territory of pure soundness.

Playground link to code