Evaluating javascript using AngleSharp


I'm trying to evaluate the javascript on a page before I do a query because the html that I'm looking for doesn't exist in the AngleSharp document.

There is a method: document.ExecuteScript(string)

But I don't know how to use it, compared to how I've seen other libraries used. For example, some Python code looks like this...

wait.until(presence_of_element_located((By.ID, "class-name")))

Which, I guess, just pauses the code until the entire page is evaluated. Elements can then be searched.

In AngleSharp it looks like I have to run the ExecuteScript method to do the same thing. But it just throws an exception (Jint.Runtime.JavaScriptException: 'results is not defined'), and it returns a plain object as a result, which is opaque and not helpful at all.

What do I do so that my next command:

IHtmlCollection<IElement> cells = document.QuerySelectorAll(s);

actually looks through the entire document and not just the initial HTML?

There is 1 answer
Florian Rappl

I think there is more than one question here, so I'll break it down.

First, some basics:

  1. AngleSharp is a browser core. It's not a browser, and it's not a JS engine.
  2. AngleSharp can be extended with, for instance, a JS engine. AngleSharp.Js is such an engine, based on Jint. However, it's still experimental, and complicated scripts will definitely not run.
  3. Before getting into the event loop and async loading details, I'd recommend making sure that whatever script you expect to run really does run.

Now to the specifics:

ExecuteScript is a little helper from AngleSharp.Js that runs a piece of JS code that you provide. I guess it's not at all what you want.

If you just want to "wait" until something is there you can do that with a few lines of C# code:

var maxTime = TimeSpan.FromSeconds(1.5);
var pollTime = TimeSpan.FromMilliseconds(25);
var totalTime = TimeSpan.Zero;

while (totalTime < maxTime)
{
    // Pass the TimeSpan itself: pollTime.Milliseconds is only the
    // milliseconds component (0-999), so it breaks for intervals >= 1 s.
    await Task.Delay(pollTime);

    // check condition
    if (document.QuerySelector(".foo.bar") != null)
    {
        // element found: run the code that depends on it
        break;
    }

    totalTime += pollTime;
}
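
If you need this in more than one place, the loop above could be wrapped in a small helper. The following is only a sketch: the names Polling and WaitUntilAsync are made up for illustration and are not part of AngleSharp or AngleSharp.Js. It uses nothing but the .NET base library, and it measures elapsed time with a Stopwatch rather than summing the poll intervals, so time spent evaluating the condition counts towards the timeout as well.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Polling
{
    // Hypothetical helper (not an AngleSharp API): evaluates `condition`
    // every `pollInterval` until it returns true or `timeout` has elapsed.
    // Returns true if the condition was met, false on timeout.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        var elapsed = Stopwatch.StartNew();

        while (true)
        {
            // The condition is checked at least once, even with a zero timeout.
            if (condition())
            {
                return true;
            }

            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }

            await Task.Delay(pollInterval);
        }
    }
}
```

With a document at hand, a call might look like: var found = await Polling.WaitUntilAsync(() => document.QuerySelector(".foo.bar") != null, TimeSpan.FromSeconds(1.5), TimeSpan.FromMilliseconds(25)); and the query is then only run when found is true.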

AngleSharp.Js actually has various methods to get into the event loop, i.e., to wait until JS has completed the current work.

For instance, WaitUntilAvailable() can be used to wait until the load event (and related) has been handled.

To enqueue an action, the Then() extension method was added. All of these extension methods are available directly on the IDocument instance.