Execute a page script, which was inserted in the page from a content script, prior to other JavaScript in the page


I am trying to analyze some JavaScript code for which I make use of function rewriting so that calls to a JavaScript library go through my JavaScript code. My JavaScript code is part of a Chrome Extension. From a Chrome extension content script, I install/inject the code into the target page's DOM.

This works fine for functions that are invoked after the page has loaded; those library calls go through my function. However, some JavaScript runs while the page is still loading (probably while the DOM is being built), before my custom script is injected. Library calls made during that window never pass through my function, so they are lost to me.

I make use of Content Script to actually inject other JavaScript by appending to the DOM as mentioned in the following Stack Exchange question:

Insert code into the page context using a content script

I know I can cause the loading time of Content Script to be at the start/end of the DOM but this is another script file that I append to the DOM of the target page. I do not seem to understand how to control it.

The problem described in "Is it possible to run a script in context of a webpage, before any of the webpage's scripts run, using a chrome extension?" is exactly the same, but the solution given there does not seem to work for me. My intention is to make the injected script execute before any of the webpage's JavaScript executes. By specifying document_start in manifest.json, I can make the content script run before the webpage's scripts, but not the script that I inject through the content script (injected as explained in the first link). The injected script does not run in any predictable order relative to the webpage's scripts.

Manifest.json:
Manifest file has the content script content.js added at document_start, so content.js is run before the target webpage (underlying page) runs.

"content_scripts":[
    {
    "matches":["<all_urls>"],            
    "js":["content.js"],
    "run_at":"document_start",
    "all_frames":false
    }
],

content.js:

content.js contains the code below, which adds main.js to the DOM so that I can interact with the JavaScript in the target page's environment. I do this from a separate file attached to the DOM because a content script cannot interact with the target page's JavaScript directly: content scripts run in an isolated world, so the two cannot see each other's variables or functions.

To explain further, main.js has some JavaScript that intercepts JavaScript calls during the execution of JavaScript in target page. JavaScript in target page makes calls to a library and I intend just to write a wrapper on those library functions.

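// Build a <script> element that loads main.js into the page context.
var u = document.createElement('script');
// chrome.runtime.getURL() replaces the deprecated chrome.extension.getURL().
u.src = chrome.runtime.getURL('main.js');
// Attach the handler before inserting; remove the element once it has loaded.
u.onload = function() {
    u.parentNode.removeChild(u);
};
(document.head||document.documentElement).appendChild(u);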
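For illustration only, the kind of wrapper that main.js installs could look roughly like the sketch below. The names `lib`, `greet`, `callLog` and `wrapMethod` are hypothetical stand-ins and not part of the real page or library; the sketch only shows the general function-rewriting idea of replacing a method with one that records the call and then delegates to the original.

```javascript
// Hypothetical stand-in for the library that the target page calls.
const lib = {
  greet(name) {
    return 'hello ' + name;
  }
};

// Records every intercepted call (hypothetical log structure).
const callLog = [];

// Replace obj[name] with a wrapper that logs the call and then
// forwards it, unchanged, to the original implementation.
function wrapMethod(obj, name, log) {
  const original = obj[name];
  obj[name] = function (...args) {
    log.push({ name: name, args: args });
    return original.apply(this, args);
  };
}

wrapMethod(lib, 'greet', callLog);
```

Any call made before `wrapMethod()` has run goes straight to the original function, which is why the time at which main.js executes matters.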

I expect main.js to be available in the target page's context before any of the page's scripts run, since I inject it through a content script that runs at document_start.

Assume my target page's HTML has a call to a JavaScript function like the one below, where someJSCall() is defined by the target page:

<html onLoad="someJSCall( )">

In this scenario, main.js (code injected through my Chrome extension) is already available. So calls to the JavaScript library from someJSCall() function go through main.js wrapper functions. This works fine.

The problem arises when the target page's JavaScript contains IIFEs (immediately invoked function expressions). If an IIFE makes library calls, those calls do not go through my main.js interceptions. Looking at the files loaded in the browser through Chrome DevTools, I see that main.js has still not loaded while the IIFEs are executing.

I hope I have explained the problem in detail.

Makyen On

Based on the additional information you added to the question about 2.5 weeks after I answered, you are adding code to the page context by including a "main.js", which is a separate file in your extension, using a <script> that looks something like:

<script src="URL_to_file_in_extension/main.js"></script>

However, when you do that you introduce an asynchronous delay between when the <script> is inserted into the page and when the "main.js" is fetched from the extension and executed in the page context. You will not be able to control how long this delay is and it may, or may not, result in your code running prior to any particular code in the page. It will probably run prior to code that has to be fetched from external URLs, but may not.

In order to guarantee that your code runs synchronously, you must insert it in a <script> tag as actual code, not using the src attribute to pull in another file. That means the code which you want to execute in the page must exist within the content script file you are loading into the page.
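A minimal sketch of that approach follows, assuming the logic that currently lives in main.js is moved into a function inside content.js. The names `pageMain`, `toInlineSource`, `injectInline` and `__wrapperInstalled` are hypothetical. The function is converted to source text and placed in the element's `textContent`, so nothing has to be fetched. Whether a page's Content Security Policy permits such an inline script depends on the browser and is not addressed here.

```javascript
// Hypothetical stand-in for the code that used to live in main.js.
// It is only ever executed in the page context, never in the content script.
function pageMain() {
  window.__wrapperInstalled = true; // install the library wrappers here
}

// Turn a function into source text that invokes itself when evaluated.
function toInlineSource(fn) {
  return '(' + fn.toString() + ')();';
}

// Insert the code as the text of a <script> element instead of using src,
// so it runs as soon as the element is added to the document.
function injectInline(fn, doc) {
  const script = doc.createElement('script');
  script.textContent = toInlineSource(fn);
  (doc.head || doc.documentElement).appendChild(script);
  script.remove(); // the code has already run; the element is no longer needed
}

// Only touch the DOM when one exists (i.e. when run as a content script).
if (typeof document !== 'undefined') {
  injectInline(pageMain, document);
}
```

Because the function is passed as text, it cannot see variables from the content script's scope; anything it needs has to be defined inside it.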

Needing to execute code in the page context is a fairly common requirement. I've needed to do so in browser extensions (e.g. Chrome, Firefox, Edge, etc.) and in userscripts. I've also wanted to be able to pass data to such code, so I wrote a function called executeInPage(), which will take a function defined in the current context, convert it to text, insert it into the page context and execute it while passing any arguments you have for it (of most types). If interested, you can find executeInPage() in my answer to "Calling webpage JavaScript methods from browser extension" and my answer to "How to use cloneInto in a Firefox web extension?".

The following is my original answer based on the original version of the question, which did not show when the content script was being executed, or explain that the code being added to the page was in a separate file, not in the actual content script.

You state in your question that you "can handle the loading time of Content Script to be at the start/end of the DOM", but you don't make clear why you are unable to resolve your issue by executing your content script at document_start.

You can have your script injected prior to the page you are injecting into being built by specifying document_start for the run_at property in your manifest.json content_scripts entry, or for the runAt option passed to chrome.tabs.executeScript(). If you do this, then your script will start running when document.head and document.body are both null. You can then control what gets added to the page.
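As a rough sketch of the second option, a background script could request injection with the `runAt` option as shown below. This assumes the Manifest V2 `chrome.tabs.executeScript()` API; the helper name `injectAtDocumentStart` is hypothetical, and the API object is passed in as a parameter only so the helper can be exercised outside a browser.

```javascript
// Hypothetical helper: asks the tabs API to inject `file` into `tabId`
// as early as possible. `tabsApi` is expected to be chrome.tabs (Manifest V2).
function injectAtDocumentStart(tabsApi, tabId, file) {
  return new Promise(function (resolve) {
    tabsApi.executeScript(
      tabId,
      { file: file, runAt: 'document_start' },
      resolve // called with the injection results once the script has run
    );
  });
}

// Possible usage from a background script (assumption: the extension has
// the "webNavigation" permission and host permissions for the page):
//
// chrome.webNavigation.onCommitted.addListener(function (details) {
//   if (details.frameId === 0) {
//     injectAtDocumentStart(chrome.tabs, details.tabId, 'content.js');
//   }
// });
```

The `runAt` value only sets the earliest point at which the script may run; the actual timing still depends on when the call is made relative to the page load.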

For chrome.tabs.executeScript(), exactly when your script runs depends on when you call chrome.tabs.executeScript() relative to the page-loading process. Due to the asynchronous nature of the processing (your background script is usually running in a different process), it is difficult to get your script consistently injected when document.head and document.body are both null. The best I've accomplished is to have the script injected sometimes when that is the case, and sometimes after the page is populated, but prior to any other resources being fetched. This timing will work for most things, but if you really need to have your script run prior to the page existing, then you should use a manifest.json content_scripts entry.

With your content script running prior to the existence of the head and body, you can control what gets inserted first. Thus, you can insert your <script> prior to anything else on the page. This should make your script execute prior to any other script in the page context.