transpiler battle: breaking out of nested function, with vs without throw

1.2k views Asked by At

I have just finished writing "version 0" of my first (toy) transpiler. It works. It turns a string of "pseudo JavaScript" (JavaScript with an additional feature) into a string of runnable JavaScript. Now, I want to improve it.

The work area possibly most interesting for other SO users is this: The compiled code (i.e., output of my transpiler) does not heed a coding style recommendation as given in an accepted answer to some earlier SO question. If I would have at my hands a second transpiler where that coding style recommendation is heeded, I could make an informed decision regarding which branch is more promising to continue to develop on - I'd like to compare the 2 braches regarding performance, development time needed, amount of bugs, and so on, and decide based on that.

Let me tell you about the "additional JS feature" my transpiler deals with: "nested return". Consider closures / nested functions like so

function myOuterFunc(){
    ... code ...
    function innerFunc(){
        ... code ...
    }
    ... code ...
}

(note that above '...code...' is supposed to include every possible JS code including more nested function declarations, so myOuterFunc is not necessarily the direct parent of innerFunc)

In above situation, suppose you desire to return a result from myOuterFunc from somewhere inside - not necessarily directly inside - innerFunc

With "nested return" implemented, you could then write simply

return.myOuterFunc result

Here is an exmalpe of a (not-runnable) function using this feature and doing something meaningful

function multiDimensionalFind(isNeedle, haystack) {
    // haystack is an array of arrays
    // loop (recursively) through all ways of picking one element from each array in haystack
    // feed the picked elements as array to isNeedle and return immediately when isNeedle gives true
    // with those picked elements being the result, i.e. the 'found needle'
    var LEVEL = haystack.length;
    function inner(stack) {
        var level = stack.length;
        if (level >= LEVEL) {
            if (isNeedle(stack)) return.multiDimensionalFind stack;
        } else {
            var arr = haystack[level];
            for (var i = 0; i < arr.length; i++) {
                inner(stack.concat([arr[i]]));
            }
        }
    }
    inner([]);
    return 'not found'
}

And here is the (runnable) code my transpiler automatically produces from that (comments were added/remove later, obviously), followed by some code testing if that function does what it claims to do (and it does, as you can convince yourself.)

///////////// the function /////////////////
function multiDimensionalFind(isNeedle, haystack) {
    try {
        var LEVEL = haystack.length;
        function inner(stack) {
            var level = stack.length;
            if (level >= LEVEL) {
                if (isNeedle(stack)) throw stack;
            } else {
                var arr = haystack[level];
                for (var i = 0; i < arr.length; i++) {
                    inner(stack.concat([arr[i]]));
                }
            }
        }
        inner([]);
        return 'not found'
    } catch(e){
        // make sure "genuine" errors don't get destroyed or mishandled
        if (e instanceof Error) throw e; else return e;
    }
}
////////////////// test it //////////////////
content = document.getElementById('content');
function log2console(){
  var digits = [0,1];
  var haystack = [digits,digits,digits,digits,digits];
  var str = '';
  function isNeedle(stack){
    str = str + ', ' + stack.join('')
    return false;
  }
  multiDimensionalFind(isNeedle, haystack);
  content.textContent = str;
}
function find71529(){ // second button
  var digits = [0,1,2,3,4,5,6,7,8,9]
  var haystack = [digits,digits,digits,digits,digits]
  function isNeedle(stack){
    return stack.reduce(function(b,i){ return 10*b+i; }, 0) === 71529
    // returns true iff the stack contains [7,1,5,2,9]
  }
  content.textContent = multiDimensionalFind(
    isNeedle, haystack
  ).join('_')
}
<button onclick='log2console()'>print binary numbers with 5 digits</button>
<br>
<button onclick='find71529()'>find something is 5d space</button>
<div id='content'></div>

You can play around with my transpiler in this fiddle here. I'm using the esprima library, the escodegen.js library on top of esprima, a teeny tiny work in progress abstract syntax tree generation library of my own (see the script tags in the fiddle). The code which is not library, and not UI code, i.e. the "real meat" of the transpiler has less than 100 lines (see function transpile). So this might be a lot less complicated than you thought.

I can't remember where I had seen the style recommendation, but I am certain that it was actually in multiple places. If you know or come across one such question, I invite you to be so kind to put the link into a comment under the question, I will mark useful. So far, there is one link, thank you Barmar.

You might ask why I even bothered to write a "non-compliant" transpiler first, and did not go for the "compliant" version straight away. That has to do with the estimated amount of work. I estimate the amount of work to be much larger for the "compliant" version. So much so, that it didn't really seem worthwhile to embark on such an endeavor. I would very much like to know whether this assessment of the amount of work is correct or incorrect. Thus the question. Please do not insinuate a rhetorical, or even a dishonest motive for the question; however weird it might sound to some, it really is the case that I'd like to be proven wrong, so please be so kind not to assume I'm "just saying that" for whatever reason, you'd be doing me an injustice. This is, of all the questions I asked on SO so far, by far the one I've put the most work into. And, if you ask me, it is by far the best question I have ever asked here.

Apart from someone assisting me in writing the "compliant" version of the transpiler, I'm also interested (albeit to a lesser degree) in anything objectively demonstrable which stands a chance to convince me that the "non-compliant" way is the wrong way. Speed tests (with links to jsperf), reproducible bug reports, things of that sort.

I should mention the speed tests I have done myself so far:

first test, second test

loosely related question

4

There are 4 answers

0
T.J. Crowder On

The better way really is to use return (if you can't completely refactor the tower). It makes it clear what you're doing and when you're returning a value from the intermediate functions; you can tell by looking at those functions where the result may be provided. In contrast, using throw is invisible in those intermediate layers; it magically bypassing them in a way designed for error conditions.

If you don't want to do that, I don't think you have a reasonable alternative other than throw. I wondered about going down the route of generator functions or similar, but I think that would just complicate things more, not less.

Using return doesn't markedly complicate the example code, and does (to my eye) make it clearer what's going on and when results are potentially expected.

function wrapper(){
    function first(){
        function second(){
            function third(){
                doStuff4();
                some loop {
                    var result = ...
                    if (something) return result;
                }
            }
            doStuff2();
            let result = third();
            if (result) {
                return result;
            }
            doStuff3();
            return third();
        }
        doStuff1();
        return second();
    }
    return first() || "not found";
}

(In the above, result is tested for truthiness; substitute something else if appropriate.)

2
mathheadinclouds On

Sometime later, when I get around to it, I'll add instructions here on how to use my abstract syntax tree generation library which I used for my transpiler, maybe even start a little towards the other version, maybe explaining in more detail what the things are which make me think that it is more work.

old version follows (will be deleted soon)

If the feature is used multiple times we'll need something like this:

function NestedThrowee(funcName, value){
    this.funcName = funcName;
    this.value = value;
}

return.someFunctionName someReturnValue (before compilation) will give (after compliation) something like

var toBeThrown = new NestedThrowee("someFunctionName", someReturnValue);

And inside the generated catch block

if (e instanceof Error){
    throw e;   // re-throw "genuine" Error
} else {
    if (e instance of NestedThrowee){
    if (e.funcName === ... the name of the function to which
                       this catch block here belongs ...) return e.value;
    throw new Error('something happened which mathheadinclouds deemed impossible');
}

In the general case, it is necessary to wrap the result (or 'throwee') like shown above, because there could be multiple, possibly nested "nested returns", and we have to take care that the catch phrase and caught object of type NestedThrowee match (by function name).

5
TheMaster On

I think you should use arrays;

const funcs = [first, second, third];
for(let i = 0; i < funcs.length; ++i){
 const result = funcs[i]();
 if (result) break;
}

You should return only from function that has the result.

0
tarkh On

Ok, here is another approach with use of all async power of JavaScript... So basically I've recreated your nested functions but with use of Promise/await technique. You will get result only once, the first time you'll resolve it from any level of nested functions. Everything else will be GC. Check it out:

// Create async function
(async () => {
  const fnStack = (val) => {
    return new Promise((resolve, reject) => {
      // Worker functions.
      // Could be async!
      const doStuff1 = (val) => val + 1;
      const doStuff2 = (val) => val * 2;
      const doStuff3 = (val) => val * -1; // This will not affect result
      const doStuff4 = (val) => val + 1000;
      
      // Nested hell
      function first() {
        function second() {
          function third() {
            val = doStuff4(val);
            // Some loop
            for(let i = 0; i < 1000; i++) {
              if(i === 500) {
                // Here we got our result
                // Resolve it!
                resolve(val);
              }
            }
          }
          val = doStuff2(val);
          third();
          // Below code will not affect
          // resolved result in third() above
          val = doStuff3(val);
          third();
        }
        val = doStuff1(val);
        second();
      }
      
      //
      first();
    });
  }
  
  // Run and get value
  const val = await fnStack(5);
  
  // We get our result ones
  console.log(val);
})();