C# WebClient.DownloadString to access US site of a URL


I am passing a URL to WebClient.DownloadString("http://someurl.com") to download the page's HTML, but it always downloads my country's version of the page (e.g. http://someurl.com/en-cn). I need to download the US version of the page.

Here's the function I call to download the HTML:

// Requires: using System.Net;
public static string GetHtmlStringWC(string url)
{
    // The using block disposes the client on exit, and exceptions
    // propagate to the caller unchanged.
    using (WebClient webClient = new WebClient())
    {
        webClient.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15";

        // A WebProxy with no address means the request is sent directly.
        webClient.Proxy = new WebProxy();

        string htmlString = webClient.DownloadString(url);

        // Strip carriage returns, newlines, and tabs from the HTML.
        return htmlString.Replace("\r", string.Empty).Replace("\n", string.Empty).Replace("\t", string.Empty);
    }
}

Am I missing a parameter? What do I need to pass to have it always download the US-version of the page?

There is 1 answer

user5010130

If the server responds with a localized version of the page, your options are:

  1. Try including the locale you want in the URL, e.g. http://someurl.com/en-us.
  2. Check whether the website has an option to set the language and, if it does, set it the same way from your program.
  3. Try using a proxy, a VPN, Tor, or similar so that the request appears to come from the US.
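
A rough sketch of how the three options might look with WebClient. Everything site-specific here is an assumption and not something the question confirms: the "/en-us" path layout, the "locale" cookie name, and the optional US proxy are placeholders that would need to be checked against the real site. The Accept-Language header is a standard HTTP request header, but a server is free to ignore it and pick the page by IP address instead.

```csharp
using System;
using System.Net;

public static class LocalizedDownload
{
    // Option 1: append a locale segment to the site root.
    // The "/<locale>" URL layout is an assumption about the site.
    public static string WithLocale(string baseUrl, string locale)
    {
        return baseUrl.TrimEnd('/') + "/" + locale;
    }

    // usProxy is optional; pass null to connect directly.
    public static string DownloadUsPage(string baseUrl, IWebProxy usProxy = null)
    {
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.UserAgent] =
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";

            // Ask for US English; the server may or may not honor this.
            client.Headers[HttpRequestHeader.AcceptLanguage] = "en-US,en;q=0.9";

            // Option 2: replay whatever the site's language switcher sets.
            // "locale=en-us" is a hypothetical cookie, not a known one.
            client.Headers[HttpRequestHeader.Cookie] = "locale=en-us";

            // Option 3: route the request through a proxy located in the US.
            if (usProxy != null)
            {
                client.Proxy = usProxy;
            }

            return client.DownloadString(WithLocale(baseUrl, "en-us"));
        }
    }
}
```

Called as `LocalizedDownload.DownloadUsPage("http://someurl.com")`, or with something like `new WebProxy("http://us-proxy.example:8080")` (a made-up address) as the second argument. It is probably worth trying the options one at a time to see which one the site actually respects.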