Proxy uses too much data when opening one simple website

I am using my own phone as a proxy through an application called iproxy. Everything works fine, except for one problem:

When my Python Selenium script opens a "What is my IP address" website, it uses around 50 MB of data, which is crazy.

But when I do the same thing manually, without my code, it only costs 1-3 MB of data.

Some notes:

I am using this code to configure proxy authentication, which requires a username and password: https://stackoverflow.com/a/55582859

I have not set up Wi-Fi split for my proxy yet, but I am pretty sure this is not the problem.

My code does almost nothing: it only opens the website, so I am not sure why it uses so much data.

This is my code:

import os
import zipfile
from selenium.webdriver.chrome.service import Service
from selenium import webdriver
import time

# Your proxy details
PROXY_HOST = 'hidden'  # rotating proxy or host
PROXY_PORT = 'hidden'  # port (redacted here; an integer in the real script)
PROXY_USER = 'hidden' # username
PROXY_PASS = 'hidden'  # password


manifest_json = """
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version":"22.0.0"
}
"""

# Replace the placeholders in the background.js string with your actual proxy details
background_js = """
var config = {
        mode: "fixed_servers",
        rules: {
        singleProxy: {
            scheme: "http",
            host: "%s",
            port: parseInt(%s)
        },
        bypassList: ["localhost"]
        }
    };

chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

function callbackFn(details) {
    return {
        authCredentials: {
            username: "%s",
            password: "%s"
        }
    };
}

chrome.webRequest.onAuthRequired.addListener(
            callbackFn,
            {urls: ["<all_urls>"]},
            ['blocking']
);
""" % (PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS)


def get_chromedriver(use_proxy=False, user_agent=None):
    path = os.path.dirname(os.path.abspath(__file__))
    
    # Create a Service object with the path to the chromedriver executable
    service = Service(executable_path=os.path.join(path, 'chromedriver.exe'))

    # Initialize ChromeOptions
    chrome_options = webdriver.ChromeOptions()
    
    # Add proxy and user agent settings if necessary
    if use_proxy:
        pluginfile = 'proxy_auth_plugin.zip'
        with zipfile.ZipFile(pluginfile, 'w') as zp:
            zp.writestr("manifest.json", manifest_json)
            zp.writestr("background.js", background_js)
        chrome_options.add_extension(pluginfile)

    if user_agent:
        chrome_options.add_argument('--user-agent=%s' % user_agent)

    # Initialize the Chrome WebDriver with the Service object and ChromeOptions
    driver = webdriver.Chrome(service=service, options=chrome_options)

    return driver


def main():
    driver = get_chromedriver(use_proxy=True)
    driver.get('https://www.google.com/search?q=my+ip+address')
    driver.get('https://httpbin.org/ip')

if __name__ == '__main__':
    main()

I am trying to reduce the proxy's data usage.

There is 1 answer

Answer by r-log

Well, there are a few things I should point out:

Background Services: Selenium may be starting a fresh instance of the browser with each execution, which could trigger updates or data synchronization processes in the background that do not occur when you manually browse.

Web Extensions: The method you're using to handle proxy authentication involves creating a Chrome extension on the fly. This extension is packed and installed each time the Selenium script runs, which could involve additional data exchanges that you wouldn't encounter in manual browsing.

Caching: Browsers use caching to reduce data usage by storing resources locally. Selenium, particularly when set up to use a fresh profile for each run, may not utilize caching, causing all resources to be downloaded every time.

Additional Requests: Selenium might be making additional HTTP requests that you're not making manually. This could be due to different behaviors in the automated browser environment.

Proxy Configuration: Your proxy configuration might be different when used with Selenium. It could be that the proxy is serving higher-quality content or not compressing the data when requests come from Selenium.
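To see which of these causes actually applies, it may help to count bytes per host for the page's own requests. The following is a minimal sketch, not part of the original code: it assumes Chrome's "performance" log is enabled and that the log carries the DevTools events Network.requestWillBeSent and Network.loadingFinished. The function name bytes_per_host is made up for illustration.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def bytes_per_host(performance_entries):
    """Sum transferred bytes per host from Chrome 'performance' log entries.

    Each entry is expected to be a dict whose "message" value is a JSON
    string wrapping one DevTools event (assumption: Chrome log format).
    """
    url_by_request = {}
    totals = Counter()
    for entry in performance_entries:
        event = json.loads(entry["message"])["message"]
        method = event.get("method")
        params = event.get("params", {})
        if method == "Network.requestWillBeSent":
            # Remember which URL belongs to which request id.
            url_by_request[params["requestId"]] = params["request"]["url"]
        elif method == "Network.loadingFinished":
            # encodedDataLength is the number of bytes received over the wire.
            url = url_by_request.get(params["requestId"], "")
            host = urlsplit(url).hostname or "(unknown)"
            totals[host] += int(params.get("encodedDataLength", 0))
    return totals
```

With Selenium this could be fed by calling chrome_options.set_capability("goog:loggingPrefs", {"performance": "ALL"}) before creating the driver, then passing driver.get_log("performance") to bytes_per_host after the page loads. As far as I know this log only covers requests made by the page itself, so if the per-host total stays far below what the proxy reports, that would point toward browser background traffic rather than the website.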

You could try the following:

Monitor Network Traffic: Use tools like the browser's Developer Tools Network tab to monitor the network traffic generated by Selenium and compare it with manual browsing.

HAR Files: You can export the network traffic as HAR files and analyze them for any differences in the content size.

Reduce Resources: Modify the browser settings in Selenium to block images, JavaScript, or CSS to see if that reduces data usage.

Check Proxy Logs: Since you're using your phone as a proxy, check if the proxy software on your phone provides logs to see what might be causing the extra data usage.

Persistent Profile: Use a persistent profile for Selenium that allows caching between runs.

Headless Mode: Run the browser in headless mode to potentially reduce overhead.

Extension Caching: Instead of creating the proxy extension on the fly every time, you could create it once, save it, and reuse it across Selenium sessions to see if that reduces data usage.

Direct Proxy Configuration: If possible, configure the proxy directly in Selenium without using an extension, which might reduce overhead.
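As a rough illustration of the "Reduce Resources" and "Persistent Profile" suggestions, here is a sketch of Chrome settings that might lower data usage. None of these flags or the image preference appear in the original code; they are assumptions to test one at a time, and the helper names data_saving_chrome_settings and apply_settings are made up for illustration. Headless mode is left out because older headless Chrome did not load extensions, which the proxy authentication here relies on.

```python
import os


def data_saving_chrome_settings(profile_dir, block_images=True):
    """Return (arguments, prefs) intended to cut Chrome's data usage."""
    arguments = [
        # Reuse one profile directory so the HTTP cache survives between runs.
        "--user-data-dir=" + os.path.abspath(profile_dir),
        # Switches meant to quiet browser background traffic (assumption:
        # whether they help in this setup needs to be measured).
        "--disable-background-networking",
        "--disable-component-update",
        "--disable-sync",
        "--disable-default-apps",
        "--no-first-run",
    ]
    prefs = {}
    if block_images:
        # 2 means "block" for this content setting.
        prefs["profile.managed_default_content_settings.images"] = 2
    return arguments, prefs


def apply_settings(chrome_options, arguments, prefs):
    """Copy the settings onto a selenium ChromeOptions-like object."""
    for argument in arguments:
        chrome_options.add_argument(argument)
    if prefs:
        chrome_options.add_experimental_option("prefs", prefs)
    return chrome_options
```

In get_chromedriver this could be called right after chrome_options is created, for example apply_settings(chrome_options, *data_saving_chrome_settings("selenium_profile")), where "selenium_profile" is a placeholder directory name. Comparing the proxy's data counter before and after each change should show which setting, if any, makes the difference.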