PuppeteerSharp "Request interception is not enabled!" when trying to intercept a GET request


I am currently building a web scraper that downloads media files from a webpage. It navigates to each page, intercepts the media GET request, and then downloads the audio file from the URL in that request.

However, when I run the program it gets as far as printing the correct URL it extracted from the GET request, and then gives me an error (see below). I am confused by this error because this is the only part of my code that uses request interception, and I enable and disable it in the correct (I think) places when I call the method.

PuppeteerSharp.PuppeteerException: Request Interception is not enabled!
   at PuppeteerSharp.Request.ContinueAsync(Payload overrides) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Request.cs:line 108
   at Scraper.<>c__DisplayClass1_0.<<requestInterceptAsync>b__0>d.MoveNext() in C:\Users\iangr\Desktop\scraperV2\scraper.cs:line 83
Unhandled exception. PuppeteerSharp.PuppeteerException: Request Interception is not enabled!
   at PuppeteerSharp.Request.ContinueAsync(Payload overrides) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Request.cs:line 108
   at Scraper.<>c__DisplayClass1_0.<<requestInterceptAsync>b__0>d.MoveNext() in C:\Users\iangr\Desktop\scraperV2\scraper.cs:line 83
System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetException(Exception exception)
   at Scraper.<>c__DisplayClass1_0.<<requestInterceptAsync>b__0>d.MoveNext() in C:\Users\iangr\Desktop\scraperV2\scraper.cs:line 96

I do not understand why this error occurs: once the program has the URL of the file, it should no longer need request interception.

Here is how I call my request interception method:

await page.SetRequestInterceptionAsync(true);
TaskCompletionSource<string> tcs = new TaskCompletionSource<string>();
requestInterceptAsync(page, tcs);
await page.GoToAsync(links[i], new NavigationOptions { Timeout = 5000, WaitUntil = new[] { WaitUntilNavigation.DOMContentLoaded } });
string audioUrl = await tcs.Task;
await page.SetRequestInterceptionAsync(false);

Here is my request interception method:

    public static async Task requestInterceptAsync(IPage page, TaskCompletionSource<string> tcs)
    {
        try
        {
            page.Request += async (sender, e) =>
            {
                try
                {
                    if (e.Request.ResourceType != ResourceType.Media)
                    {
                        // Forward the intercepted request
                        await e.Request.ContinueAsync();
                    }
                    else
                    {
                        // Extract the URL from the intercepted request
                        string audioUrl = e.Request.Url;

                        tcs.SetResult(audioUrl);
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.ToString());
                    tcs.SetException(ex);
                }
            };

            // await Task.Delay(0);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
            throw;
        }
    }

The first thing I tried was restructuring the code. My request interception method used to be a Task, but I switched to TaskCompletionSource so I could run it asynchronously without awaiting it. I consulted ChatGPT multiple times, but all it produced was the same couple of unhelpful responses over and over.

I have been working on this error for so long. I am at my wits' end.

There is 1 answer

Answered by hardkoded

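The problem is that you are still receiving Request events after you call await page.SetRequestInterceptionAsync(false). The handler you attached with page.Request += is never removed, so it keeps calling e.Request.ContinueAsync() on requests that are no longer being intercepted, and that call throws "Request Interception is not enabled!".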

You could do two things:

  1. Do not call await page.SetRequestInterceptionAsync(false);
  2. Refactor, a little more :), your code so you can remove the event listener. Something like:
void RequestEventListener(object sender, RequestEventArgs e) => SomeCode();

page.Request += RequestEventListener;
// Do something
page.Request -= RequestEventListener;
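
To make the second option more concrete, here is a minimal sketch of how the question's code could be reshaped so the handler is detached once the URL has been captured. It is not a definitive implementation: the class name AudioUrlSniffer and the method name CaptureAudioUrlAsync are hypothetical, the navigation options are copied from the question, and the order of the cleanup steps in the finally block is an assumption. It also swaps SetResult/SetException for TrySetResult/TrySetException, which do nothing when the task has already completed, so a late request cannot trigger the "transition a task to a final state" error shown in the stack trace.

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class AudioUrlSniffer // hypothetical name, for illustration only
{
    // Hypothetical helper: navigates to `url` and returns the URL of the
    // first request whose resource type is Media.
    public static async Task<string> CaptureAudioUrlAsync(IPage page, string url)
    {
        var tcs = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // Keep the handler in a variable so the same delegate can be
        // removed with -= once we are done.
        EventHandler<RequestEventArgs> onRequest = async (sender, e) =>
        {
            try
            {
                if (e.Request.ResourceType == ResourceType.Media)
                {
                    // No-op if a result or exception was already set.
                    tcs.TrySetResult(e.Request.Url);
                }

                // Every intercepted request has to be resolved somehow;
                // this sketch simply forwards all of them.
                await e.Request.ContinueAsync();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
                // No-op if the task has already completed.
                tcs.TrySetException(ex);
            }
        };

        await page.SetRequestInterceptionAsync(true);
        page.Request += onRequest;
        try
        {
            await page.GoToAsync(url, new NavigationOptions
            {
                Timeout = 5000,
                WaitUntil = new[] { WaitUntilNavigation.DOMContentLoaded }
            });

            // Waits until a media request has been seen.
            return await tcs.Task;
        }
        finally
        {
            // Detach first so no handler runs after interception is off.
            page.Request -= onRequest;
            await page.SetRequestInterceptionAsync(false);
        }
    }
}
```

Inside the loop from the question this would be called as string audioUrl = await AudioUrlSniffer.CaptureAudioUrlAsync(page, links[i]);. Because a fresh handler and a fresh TaskCompletionSource are created and removed on every call, handlers from earlier iterations no longer pile up on the page. Note that if a page never issues a media request, await tcs.Task will not complete, so you may want to add your own timeout around it.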