I am currently building a web scraper that downloads media files from a webpage. It navigates to each page, intercepts the media GET request, and then downloads the audio file from the URL in that request.
However, when I run the program it gets as far as printing the correct URL it extracted from the GET request, and then gives me an error (see below). I am confused by this error because this is the only part of my code that uses request interception, and I enable and disable it in the correct (I think) places when I call the method.
PuppeteerSharp.PuppeteerException: Request Interception is not enabled!
at PuppeteerSharp.Request.ContinueAsync(Payload overrides) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Request.cs:line 108
at Scraper.<>c__DisplayClass1_0.<<requestInterceptAsync>b__0>d.MoveNext() in C:\Users\iangr\Desktop\scraperV2\scraper.cs:line 83
Unhandled exception. PuppeteerSharp.PuppeteerException: Request Interception is not enabled!
at PuppeteerSharp.Request.ContinueAsync(Payload overrides) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Request.cs:line 108
at Scraper.<>c__DisplayClass1_0.<<requestInterceptAsync>b__0>d.MoveNext() in C:\Users\iangr\Desktop\scraperV2\scraper.cs:line 83
System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
at System.Threading.Tasks.TaskCompletionSource`1.SetException(Exception exception)
at Scraper.<>c__DisplayClass1_0.<<requestInterceptAsync>b__0>d.MoveNext() in C:\Users\iangr\Desktop\scraperV2\scraper.cs:line 96
I do not understand why this error occurs, because once the program has the URL of the file it shouldn't need request interception anymore.
Here is how I call my request interception method:
await page.SetRequestInterceptionAsync(true);
TaskCompletionSource<string> tcs = new TaskCompletionSource<string>();
requestInterceptAsync(page, tcs);
await page.GoToAsync(links[i], new NavigationOptions { Timeout = 5000, WaitUntil = new[] { WaitUntilNavigation.DOMContentLoaded } });
string audioUrl = await tcs.Task;
await page.SetRequestInterceptionAsync(false);
Here is my request interception method:
public static async Task requestInterceptAsync(IPage page, TaskCompletionSource<string> tcs)
{
    try
    {
        page.Request += async (sender, e) =>
        {
            try
            {
                if (e.Request.ResourceType != ResourceType.Media)
                {
                    // Forward the intercepted request
                    await e.Request.ContinueAsync();
                }
                else
                {
                    // Extract the URL from the intercepted request
                    string audioUrl = e.Request.Url;
                    tcs.SetResult(audioUrl);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
                tcs.SetException(ex);
            }
        };
        // await Task.Delay(0);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
        throw;
    }
}
The first thing I tried was restructuring the code. My request interception method used to be a Task, but I switched to TaskCompletionSource so I could run it asynchronously without awaiting it. I consulted ChatGPT multiple times, but all it coughed up was the same couple crackhead responses over and over.
I have been working on this error for so long. I am at my wits' end.
The problem is that you are still getting Request events after you call await page.SetRequestInterceptionAsync(false). You could do two things:
await page.SetRequestInterceptionAsync(false);
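As a rough sketch of how the late Request events could be handled (not necessarily the two options meant above): keep the handler in a variable so it can be detached from page.Request before interception is switched off, and complete the TaskCompletionSource with TrySetResult / TrySetException so that a second media request or a late failure cannot trigger the "already completed" InvalidOperationException from the trace. The helper name GetFirstMediaUrlAsync and the decision to continue the media request instead of leaving it paused are assumptions made for illustration.

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class InterceptSketch
{
    // Hypothetical helper: returns the URL of the first media request seen while loading `url`.
    public static async Task<string> GetFirstMediaUrlAsync(IPage page, string url)
    {
        var tcs = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);

        // Store the handler so the same delegate can be unsubscribed later.
        EventHandler<RequestEventArgs> handler = async (sender, e) =>
        {
            try
            {
                if (e.Request.ResourceType == ResourceType.Media)
                {
                    // TrySetResult is a no-op if the task is already completed,
                    // so a second media request does not throw.
                    tcs.TrySetResult(e.Request.Url);
                }

                // Assumption: every intercepted request, media included, is forwarded.
                await e.Request.ContinueAsync();
            }
            catch (Exception ex)
            {
                // TrySetException never throws when the task has already finished.
                tcs.TrySetException(ex);
            }
        };

        await page.SetRequestInterceptionAsync(true);
        page.Request += handler;
        try
        {
            await page.GoToAsync(url, new NavigationOptions
            {
                Timeout = 5000,
                WaitUntil = new[] { WaitUntilNavigation.DOMContentLoaded }
            });
            return await tcs.Task;
        }
        finally
        {
            // Detach first, so no new Request event reaches ContinueAsync
            // once interception is disabled.
            page.Request -= handler;
            await page.SetRequestInterceptionAsync(false);
        }
    }
}
```

It would be called as string audioUrl = await InterceptSketch.GetFirstMediaUrlAsync(page, links[i]); in place of the original calling code. A handler that is already running when interception is disabled can still fail inside ContinueAsync; in this sketch that failure is caught and ignored once the URL has been captured.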