Retrieve some meta information from the arbitrary JavaScript object monomorphically (v8)?

319 views Asked by At

Question for v8 experts.

Recently, I've discovered the situation with polymorphism in v8. It is that polymorphism is well-optimized only up to the 4 object "shapes", after which the performance degrades significantly. Classes and object inheritance is ignored. This was a very frustrating discovery, because, with such limitations, well-structured code won't be performant. It seems it is a well-known problem since 2017 and it is very unlikely anything will change in near future.

So, I'm willing to implement better polymorphism in user-space: https://github.com/canonic-epicure/monopoly

This is not a novel problem, it is already solved in pretty much any other language out there, with vtables, code specialization etc. Any suggestions how to do it in JavaScript are very welcome.

The problem I'm currently trying to solve is to retrieve some meta information from the arbitrary object monomorphically. This meta information will be a vtable analog, containing the information for methods dispatch. This is to avoid an extra box, containing this information.

The meta information (some object that is shared by many other objects) in JS naturally maps to prototype, so the 1st step is to get the prototype of the object. This can be done monomorphically with Object.getPrototypeOf(). But then, it seems whatever you try, you loose the monomorphicity.

For example, in the following code, access to the constructor of the object will be megamorphic:

class HasVTable {}

HasVTable.prototype.vtable = {}

class Class1 extends HasVTable {}
class Class2 extends HasVTable {}
class Class3 extends HasVTable {}
class Class4 extends HasVTable {}
class Class5 extends HasVTable {}

function access(obj) {
    console.log(Object.getPrototypeOf(obj).constructor.name);
}

%OptimizeFunctionOnNextCall(access);

access(new Class1);
access(new Class2);
access(new Class3);
access(new Class4);
access(new Class5);

So question is, how to store some information in the prototype and then retrieve it, w/o loosing monomorphicity? Perhaps, some "well-known" symbols can help here? Or is there some other solution?

Thank you!


For example, I've just tried using the iterator symbol with no luck - access to proto in iterator position is still megamorphic:

class HasVTable {}

class Class1 extends HasVTable {
    *[Symbol.iterator] () {
        yield 'Class1'
    }
}
class Class2 extends HasVTable {
    *[Symbol.iterator] () {
        yield 'Class2'
    }
}
class Class3 extends HasVTable {
    *[Symbol.iterator] () {
        yield 'Class3'
    }
}
class Class4 extends HasVTable {
    *[Symbol.iterator] () {
        yield 'Class4'
    }
}
class Class5 extends HasVTable {
    *[Symbol.iterator] () {
        yield 'Class5'
    }
}

function access(obj) {
    const proto = Object.getPrototypeOf(obj)

    let res

    for (res of proto) break

    console.log(res)
}

%OptimizeFunctionOnNextCall(access);

access(new Class1);
access(new Class2);
access(new Class3);
access(new Class4);
access(new Class5);

UPDATE 2020/10/21

I use an excellent deoptigate tool to track the code de-optimizations:

npx deoptigate --allow-natives-syntax -r esm src_js/draft3.js

2

There are 2 answers

7
jmrk On

polymorphism is well-optimized only up to the 4 object "shapes", after which the performance degrades significantly.

Not quite. The "polymorphic" approach is fast for low numbers of shapes, but doesn't scale well. The "megamorphic" approach works better for large numbers of shapes seen at the same site. Whether the threshold should be 3, 4 (as it currently is), 5, 6, or something else, depends a lot on the specific benchmark you're looking at; I've seen cases where a limit of 8 would work better, but on other cases the current limit of 4 is better. Higher values also cost more memory, which doesn't matter for a small stand-alone test, but is a significant consideration for large apps.

how to store some information in the prototype and then retrieve it, w/o loosing monomorphicity?

If you want monomorphic accesses, stick with one object shape. That's really all there is to it; you simply have to figure out how to apply that principle to your situation. You'll likely end up with object pairs to represent each high-level object: one part for the monomorphic, type-independent shared part, one part (linked from the first) for the things that are specific to each class. Maybe roughly:

// Generic part of the object pair:
class HasVTable {
  constructor(vtable, name, object_data) {
    this.vtable = vtable;
    // This is just an example of the simplest possible way to implement
    // properties that are guaranteed to be present in all objects.
    // Totally optional; `vtable` and `object_data` are the ideas that matter.
    this.name = name;
    this.object_data = object_data;
  }

  // Note that `doStuff` might not exist on all "subclasses". Just don't
  // attempt to call it where it would be invalid. Or add a check, if you
  // want to pay a (small) performance cost for better robustness.
  doStuff(...) {
    // The fact that "doStuff" is the first method is hardcoded here and
    // elsewhere. That hardcoding is what makes vtables fast.
    return this.vtable[0](this.object_data, ...);

    // Alternatively, you could get fancy and use:
    return this.vtable[0].call(this.object_data, ...);
    // And then `Class1_doStuff` below could use `this` like normal.
    // I'm not sure which is faster, you'd have to measure.
  }
  getName() {
    // Fields that are guaranteed to be present on all objects can be
    // stored and retrieved directly:
    return this.name;
  }
}

// The following is the class-specific part of the object pair.
// I'm giving only Class1 here for brevity.
// This does *not* derive from anything, and is never constructed directly.
class Class1 {
  constructor(...) {
    this.class1_specific_field = "hello world";
  }
}

function Class1_doStuff(this_class1) {
  // Use `this_class1` instead of `this` in here.
  console.log(this_class1.class1_specific_field);
}

// Note that only one instance of this exists, it's shared by all class1 objects.
let Class1_vtable = [
  Class1_doStuff,  // doStuff is hardcoded to be the first method.
]

// Replaces `new Class1(...)` in traditional class-based code.
// Could also make this a static function: `Class1.New = function(...) {...}`.
function NewClass1(...) {
  return new HasVTable(Class1_vtable, "Class1", new Class1(...));
}

This technique can result in significant speedups, at least for fairly simple object hierarchies. So experimenting with this may well give you beneficial results. Good luck!

I don't have a suggestion for how you'd implement multiple inheritance (as your github project description says you'd like to do) or mixins; fast multiple inheritance is a hard problem, especially in dynamic languages.

Speaking of dynamic languages: if you assume that code might randomly mutate prototype chains, then making sure everything doesn't just fall apart will also be very difficult (I have no idea how you might accomplish that). This, btw, is one reason why JavaScript engines can't just perform such a transformation under the hood: they must be 100% spec compliant, and they must work well in a wide variety of situations. When you build your own system, you can choose to impose certain limitations that you feel are acceptable (e.g.: modifying prototypes is forbidden), or you can choose to optimize for specific patterns that you know are important to you.

the 1st step is to get the prototype of the object. [...] This is to avoid an extra box.

The prototype is also an "extra box", so there's no reason to prefer prototypes here (and, in fact, as you already noticed, prototypes don't accomplish what you're aiming for).

%OptimizeFunctionOnNextCall(access);

That line in your code accomplishes exactly nothing useful. Please don't just copy-paste things you've seen elsewhere without understanding what they do. Firstly, optimization has nothing to do with polymorphism, so it's entirely irrelevant for your issue. Secondly, optimization in JavaScript only makes sense when type feedback from previous, non-optimized runs is available -- if optimizing functions on their very first execution were a good idea, engines would do it. The only reason engine developers spend many person-years of effort on writing non-optimizing execution tiers is because it makes sense to have them.

0
CanonicEpicure On

Trying to answer my own question, inspired by the https://stackoverflow.com/a/17111430/365104. Not quite sure yet, but it seems the trick is just to make the prototype of the object callable.

As simple as:

function HasVTable(arg) {
    return function () { return arg }
}

const Class1 = function () {}
Class1.prototype = HasVTable('class1_vtable')
Class1.prototype.some1 = function () {}

const Class2 = function () {}
Class2.prototype = HasVTable('class2_vtable')
Class2.prototype.some2 = function () {}

const Class3 = function () {}
Class3.prototype = HasVTable('class3_vtable')
Class3.prototype.some3 = function () {}

const Class4 = function () {}
Class4.prototype = HasVTable('class4_vtable')
Class4.prototype.some4 = function () {}

const Class5 = function () {}
Class5.prototype = HasVTable('class5_vtable')
Class5.prototype.some5 = function () {}


function access(obj) {
    console.log(Object.getPrototypeOf(obj)());
}

%OptimizeFunctionOnNextCall(access);

%OptimizeFunctionOnNextCall(Class1);
%OptimizeFunctionOnNextCall(Class2);
%OptimizeFunctionOnNextCall(Class3);
%OptimizeFunctionOnNextCall(Class4);
%OptimizeFunctionOnNextCall(Class5);

access(new Class1);
access(new Class2);
access(new Class3);
access(new Class4);
access(new Class5);

Deoptigate report does not show any megamorphic inline caches.

I wonder if its possible to avoid function call as well and perform some simple property access.