I'm running into a problem with Puppeteer: inside a for loop I navigate to another page to get data, and when I go back I get this error:

Error "We have an error Error: the execution context was destroyed, probably because of a navigation."

It's a directory page that contains 15 companies per page and then I want to visit each company to get information.

try {
    const browser = await puppeteer.launch({
        headless: false,
        devtools: true,
        defaultViewport: {
            width: 1100,
            height: 1000
        }
    });

    const page = await browser.newPage();
    await page.goto('MyLink');

    await page.waitForSelector('.list-firms');

    for (var i = 1; i < 10; i++) {

        const listeCompanies = await page.$$('.list-firms > div.firm');

        for (const companie of listeCompanies) {

            const name = await companie.$eval('.listing-body > h3 > a', name => name.innerText);
            const link = await companie.$eval('.listing-body > h3 > a', link => link.href);

            await Promise.all([
                page.waitForNavigation(),
                page.goto(link),
                page.waitForSelector('.firm-panel'),
            ]);

            const info = await page.$eval('#info', e => e.innerText);

            const data = [{
                name: name,
                information: info,
            }];

            await page.goBack();

        }
        await Promise.all([
            page.waitForNavigation(),
            page.click('span.page > a[rel="next"]')
        ]);
    }
} catch (e) {
    console.log('We have error', e);
}

I managed to only get the data of the first company.

1 Answer

Thomas Dondorf On Best Solutions

Problem

The error means that you are accessing data that has become stale because of a navigation. In your script, the error refers to the element handles stored in the variable listeCompanies:

const listeCompanies = await page.$$('.list-firms > div.firm');

You first use this variable in a loop, then you navigate via page.goto, and after that your loop tries to get the next item out of listeCompanies. Once the navigation has happened, the element handles in that variable no longer point to anything on the current page, so the error is thrown. That's also why the first iteration works.

Solution

There are multiple ways to fix this.

  1. Extract the data from your page at once (before using the loop)
  2. Use a second page to do the "loop navigation" so that your main page does not need to navigate
  3. "Refresh" your variable by re-executing the selector after calling page.goBack

Option 1: Extract the data before entering the loop

This is the cleanest way to do it. You extract the information in the first page at once and then iterate over your extracted data. The nameLinkList will be an array with the name and link values (e.g. [{name: '..', link: '..'}, {name: '..', link: '..'}]). There is also no need to call page.goBack at the end of the loop as the data is already extracted.

const nameLinkList = await page.$$eval(
    '.list-firms > div.firm',
    (firms => firms.map(firm => {
        const a = firm.querySelector('.listing-body > h3 > a');
        return {
            name: a.innerText,
            link: a.href
        };
    }))
);

for (const {name, link} of nameLinkList) {
    await Promise.all([
        page.waitForNavigation(),
        page.goto(link),
        page.waitForSelector('.firm-panel'),
    ]);

    const info = await page.$eval('#info', e => e.innerText);

    const data = [{
        name: name,
        information: info,
    }];
}
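
One thing to watch with Option 1: after the loop, the page is sitting on the last company's detail page, so the "next" link from the question's outer loop is no longer there to click. A minimal sketch of one way to combine Option 1 with the pagination, assuming the listing can simply be reloaded from its URL. The names scrapeListing and maxPages are made up for illustration; the selectors are the ones from the question.

```javascript
// Sketch only: `scrapeListing` and `maxPages` are illustrative names, not Puppeteer APIs.
// `page` is assumed to be a Puppeteer Page that already shows the first listing page.
async function scrapeListing(page, maxPages) {
    const results = [];

    for (let pageNo = 0; pageNo < maxPages; pageNo++) {
        // Remember the listing URL so we can come back after visiting the companies.
        const listingUrl = page.url();

        // Plain objects (not element handles), so they survive any navigation.
        const nameLinkList = await page.$$eval(
            '.list-firms > div.firm',
            firms => firms.map(firm => {
                const a = firm.querySelector('.listing-body > h3 > a');
                return { name: a.innerText, link: a.href };
            })
        );

        for (const { name, link } of nameLinkList) {
            await page.goto(link); // resolves once the navigation has finished
            await page.waitForSelector('.firm-panel');
            const info = await page.$eval('#info', e => e.innerText);
            results.push({ name, information: info });
        }

        // Go back to the listing; stop when there is no "next" link.
        await page.goto(listingUrl);
        const next = await page.$('span.page > a[rel="next"]');
        if (!next) break;

        await Promise.all([
            page.waitForNavigation(),
            next.click(),
        ]);
    }

    return results;
}
```

Called as `const data = await scrapeListing(page, 10);` this would cover the ten listing pages from the question, at the cost of one extra request per listing page to reload it.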

Option 2: Use a second page

In this case your browser will have two open pages. The first one will only be used to read the data, the second one is used for navigation.

const page2 = await browser.newPage();
for (const companie of listeCompanies ){
    const name = await companie.$eval('.listing-body > h3 > a', name => name.innerText);
    const link = await companie.$eval('.listing-body > h3 > a', link => link.href);

    await Promise.all([
        page2.goto(link),
        page2.waitForSelector('.firm-panel'),
    ]);

    const info = await page2.$eval('#info', e => e.innerText);
    // ...
}
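
As a rough sketch, the same idea wrapped in a function that also closes the second page when it is done, so tabs don't pile up across listing pages. The function name scrapeWithSecondPage is hypothetical, and the selectors are taken from the question.

```javascript
// Sketch only: `scrapeWithSecondPage` is an illustrative name, not a Puppeteer API.
// `page` stays on the listing the whole time, so its element handles remain valid.
async function scrapeWithSecondPage(browser, page) {
    const results = [];
    const detailPage = await browser.newPage();

    try {
        const listeCompanies = await page.$$('.list-firms > div.firm');

        for (const companie of listeCompanies) {
            // Read name and link in a single round trip to the browser.
            const { name, link } = await companie.$eval(
                '.listing-body > h3 > a',
                a => ({ name: a.innerText, link: a.href })
            );

            await detailPage.goto(link);
            await detailPage.waitForSelector('.firm-panel');
            const info = await detailPage.$eval('#info', e => e.innerText);
            results.push({ name, information: info });
        }
    } finally {
        // Close the extra tab even if one of the detail pages fails.
        await detailPage.close();
    }

    return results;
}
```

Because the main page never navigates here, clicking the "next" link afterwards works exactly as in the original script.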

Option 3: "Refresh" selectors

Here you simply re-execute your selector after going back to your "main page". Note that the for..of loop has to be changed to an index-based loop, as we are replacing the array on every iteration.

let listeCompanies  = await page.$$('.list-firms > div.firm');
for (let i = 0; i < listeCompanies.length; i++){
    const companie = listeCompanies[i]; // handle from the most recent query
    // ...

    await page.goBack();
    listeCompanies = await page.$$('.list-firms > div.firm');
}
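
Spelled out in full, Option 3 could look roughly like this. It is a sketch under the assumption that the listing shows the same companies in the same order after going back; scrapeWithRefresh is a made-up name, and the extra wait for .list-firms is a precaution rather than something the original script did.

```javascript
// Sketch only: `scrapeWithRefresh` is an illustrative name, not a Puppeteer API.
async function scrapeWithRefresh(page) {
    const selector = '.list-firms > div.firm';
    const results = [];
    let listeCompanies = await page.$$(selector);

    for (let i = 0; i < listeCompanies.length; i++) {
        // Always index into the most recently queried array of handles.
        const companie = listeCompanies[i];
        const { name, link } = await companie.$eval(
            '.listing-body > h3 > a',
            a => ({ name: a.innerText, link: a.href })
        );

        await page.goto(link);
        await page.waitForSelector('.firm-panel');
        const info = await page.$eval('#info', e => e.innerText);
        results.push({ name, information: info });

        // Back on the listing, the old handles are stale: query fresh ones.
        await page.goBack();
        await page.waitForSelector('.list-firms');
        listeCompanies = await page.$$(selector);
    }

    return results;
}
```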

I recommend going with option 1, as it also reduces the number of necessary navigation requests and will therefore speed up your script.