Intermittent crashes in Azure Web Application


Our Web Application has begun crashing for no apparent reason, and at the moment I have no clue what the cause could be.

We use Basic Authentication for the SOAP services and ADFS for the main web application. The crashes can occur at any time of day. It is a test environment with fairly low traffic. Below are some logs extracted from when the crashes were detected.

<Event>
    <System>
      <Provider Name="ASP.NET 4.0.30319.0"/>
      <EventID>1309</EventID>
      <Level>2</Level>
      <Task>0</Task>
      <Keywords>Keywords</Keywords>
      <TimeCreated SystemTime="2015-06-12T11:23:21Z"/>
      <EventRecordID>274964734</EventRecordID>
      <Channel>Application</Channel>
      <Computer>RD0003FF410F64</Computer>
      <Security/>
    </System>
    <EventData>
      <Data>3001</Data>
      <Data>The request has been aborted.</Data>
      <Data>6/12/2015 11:23:21 AM</Data>
      <Data>6/12/2015 11:23:21 AM</Data>
      <Data>b1c5d35e8a26444ba38a8c6a0af0236f</Data>
      <Data>1305</Data>
      <Data>4</Data>
      <Data>0</Data>
      <Data>/LM/W3SVC/698610343/ROOT-1-130784515189471125</Data>
      <Data>Full</Data>
      <Data>/</Data>
      <Data>D:\home\site\wwwroot\</Data>
      <Data>RD0003FF410F64</Data>
      <Data></Data>
      <Data>6384</Data>
      <Data>w3wp.exe</Data>
      <Data>IIS APPPOOL\xxxx-test</Data>
      <Data>HttpException</Data>
      <Data>
        Request timed out.

      </Data>
      <Data>https://xxx.yy:443/</Data>
      <Data>/</Data>
      <Data>111.11.11.11</Data>
      <Data></Data>
      <Data>False</Data>
      <Data></Data>
      <Data>IIS APPPOOL\xxxx</Data>
      <Data>963</Data>
      <Data>IIS APPPOOL\xxxx</Data>
      <Data>False</Data>
      <Data>
      </Data>
    </EventData>
  </Event>
</Events>


 <EventData>
      <Data>3005</Data>
      <Data>An unhandled exception has occurred.</Data>
      <Data>6/18/2015 5:43:35 AM</Data>
      <Data>6/18/2015 5:43:35 AM</Data>
      <Data>ff2588624f0f47bc86f14cb636d4ca12</Data>
      <Data>1759</Data>
      <Data>3</Data>
      <Data>0</Data>
      <Data>/LM/W3SVC/1001219836/ROOT-1-130789123624036190</Data>
      <Data>Full</Data>
      <Data>/</Data>
      <Data>D:\home\site\wwwroot\</Data>
      <Data>RD0003FF410F64</Data>
      <Data></Data>
      <Data>6988</Data>
      <Data>w3wp.exe</Data>
      <Data>IIS APPPOOL\xxx__70d6</Data>
      <Data>WebException</Data>
      <Data>
        Unable to connect to the remote server
        at System.Net.HttpWebRequest.GetResponse()
        at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)

        An attempt was made to access a socket in a way forbidden by its access permissions 111.11.11.111:443
        at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
        at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket&amp; socket, IPAddress&amp; address, ConnectSocketState state, IAsyncResult asyncResult, Exception&amp; exception)

      </Data>
      <Data>https://111.111.11.11:443/</Data>
      <Data>/</Data>
      <Data>111.111.11.11</Data>
      <Data></Data>
      <Data>False</Data>
      <Data></Data>
      <Data>IIS APPPOOL\xxx__70d6</Data>
      <Data>1116</Data>
      <Data>IIS APPPOOL\xxx__70d6</Data>
      <Data>False</Data>
      <Data>
        at System.Net.HttpWebRequest.GetResponse()
        at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)
      </Data>
    </EventData>
  </Event>

There are 2 answers

Puneet Gupta

Azure Web Apps limit the number of TCP connections that can be open simultaneously, and the error you are getting, "An attempt was made to access a socket in a way forbidden...", typically occurs when this limit is hit. The limit is higher on Large instances and lower on Small instances (I think 4000 for Small, but I may be wrong). You may run into it if you are not closing TCP connections to external services properly, or if you are opening thousands of connections within a few minutes. Most of the time, the cause is connections not being closed properly.

Isolating which site is opening the connections can be challenging if you have many sites hosted in the same app hosting plan. If you have just a few sites in one hosting plan, you can collect a dump using DAAS (Diagnostics as a Service) WHEN THE ISSUE IS HAPPENING, download the dump locally, and open it in a tool like WinDBG to see how many System.Net.Sockets.Socket objects there are. If you can, you may want to isolate the site responsible for opening too many connections by splitting the sites across different app hosting plans, or simply scale to a larger instance to allow more TCP connections.

Troubleshooting this is a bit tricky, so you can engage Microsoft Support and they can assist, but I hope this gives you a starting point. If you need further assistance, please email me at puneetg[at]Microsoft.com; we can try a few things and then share our findings here with the community. I am trying to see how we can make troubleshooting this scenario easier in the future.
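To illustrate what "closing connections properly" can look like in code: the stack trace in the question goes through `HttpWebRequest.GetResponse()`, and a response that is never closed or disposed keeps holding its connection. The following is only a minimal sketch, not the asker's actual code; the class name `ExternalCall`, the method `FetchBody`, and the timeout value are hypothetical.

```csharp
using System;
using System.IO;
using System.Net;

static class ExternalCall
{
    // Hypothetical helper: downloads a response body and releases the
    // underlying connection even if reading the body throws.
    public static string FetchBody(Uri uri)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.Timeout = 10000; // milliseconds; illustrative value only

        // Disposing the response (and its stream) releases the connection
        // for reuse instead of leaving it open until finalization.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (var reader = new StreamReader(body))
        {
            return reader.ReadToEnd();
        }
    }
}
```

On newer .NET versions `HttpWebRequest` is marked obsolete, and a single shared `HttpClient` instance is the usual replacement; the same rule of disposing each response still applies there.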

EDIT - December 4, 2017

As of now, you can monitor TCP connections for your Web App by going to the "Diagnose and Solve" blade and clicking on TCP Connections. Quick screenshots are available at https://twitter.com/puneetguptams/status/936669451931459584

Mikael Nyborg

I tried running the crash dumps through WinDBG, with varying results. It was hard to get any real information out of WinDBG, as I had a hard time getting all the symbols to load correctly. So I built a Windows console app instead, deployed both my application and the console app to the same Azure Cloud Service, and collected information about open TCP ports. The result was very clear: my Redis cache never (or very seldom) closed its TCP ports, and I soon had more than 3000 connections and the server crashed. I refactored my code to use Table storage instead, and now it seems to work. I am attaching my little console app for anyone interested in testing their own apps for leaking TCP ports.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    namespace tcp_ports
    {
        using System.Net.NetworkInformation;
        using System.Threading;

        class Program
        {
            static void Main(string[] args)
            {
                do
                {
                    IPGlobalProperties properties = IPGlobalProperties.GetIPGlobalProperties();
                    TcpConnectionInformation[] connections = properties.GetActiveTcpConnections();
                    Dictionary<String, int> ips = new Dictionary<string, int>();
                    Dictionary<String, String> ipsLocal = new Dictionary<String, String>();

                    Console.Clear();
                    Console.WriteLine("Number of open TCP Connections = {0}", connections.Count());
                    Console.WriteLine("=========================================");

                    foreach (TcpConnectionInformation c in connections)
                    {
                        String ip = c.RemoteEndPoint.Address.ToString();
                        if (ips.ContainsKey(ip))
                        {
                            ips[ip]++;
                            ipsLocal[ip] = c.LocalEndPoint.Address.ToString();
                        }
                        else
                        {
                            ips.Add(ip, 1);
                            ipsLocal.Add(ip, c.LocalEndPoint.Address.ToString());
                        }
                    }

                    var sortedIPs = from entry in ips orderby entry.Value descending select entry;

                    // Show only the 20 remote addresses with the most connections.
                    foreach (var ip in sortedIPs.Take(20))
                    {
                        Console.WriteLine("{0} <==> {1} = {2}", ip.Key, ipsLocal[ip.Key], ip.Value);
                    }

                    Thread.Sleep(1000);

                } while (true);

            }
        }
    }
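When hunting a leak with a tool like the one above, the connection state can be as telling as the remote address. Many `Established` entries toward one host suggest connections that are being held open, while a pile of `CloseWait` entries suggests sockets the application never closed after the remote side did. The sketch below is an assumed extension, not part of the original console app; the names `TcpStateSummary`, `CountByState`, and `PrintSnapshot` are hypothetical, and `GetActiveTcpConnections` may be blocked in sandboxed hosts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;

static class TcpStateSummary
{
    // Hypothetical helper: counts how many connections are in each TCP state.
    public static Dictionary<TcpState, int> CountByState(IEnumerable<TcpState> states)
    {
        return states.GroupBy(s => s).ToDictionary(g => g.Key, g => g.Count());
    }

    // Prints one line per state for the connections open right now.
    public static void PrintSnapshot()
    {
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();

        Dictionary<TcpState, int> counts = CountByState(connections.Select(c => c.State));

        // Largest group first, so a growing state stands out between runs.
        foreach (var pair in counts.OrderByDescending(p => p.Value))
        {
            Console.WriteLine("{0,-12} {1}", pair.Key, pair.Value);
        }
    }
}
```

Calling `TcpStateSummary.PrintSnapshot()` inside the loop of the program above, next to the per-address listing, would show both views on each refresh.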