How to create an alert on an Azure IoT device on disconnect status


I have many Azure IoT Central devices, and two of them are getting disconnected frequently. Could someone please tell me if there is a way to create a genuine alert if event_status is disconnected for more than 5 minutes?

Thank you

There are 2 answers

Roman Kiss answered:

The following screenshot shows an example of using the Device Connectivity Events in the Data Export feature (mentioned in @humblejay's answer) as a device connectivity watchdog:

enter image description here

As the picture above shows, the concept is very simple: when the device is disconnected, a watchdog message (such as a CloudEvents message) with a TimeToLive of 5 minutes is sent to the watchdog queue. If the device reconnects within the watchdog time, the message is deleted from the watchdog queue; otherwise, the message is forwarded to the Alert queue when its TTL expires.

Note that device connectivity events are generated with some limitations; see more details here.
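
The forwarding step described above is not performed by the function code itself, so it presumably comes from the queue configuration. The following is a minimal sketch of one way to get that behavior, assuming the watchdog queue dead-letters expired messages and forwards its dead-letter queue to an alert queue. The class name `WatchdogQueueSetup` and the queue name `iotc-alert` are hypothetical, and the connection string needs Manage rights on the namespace.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus.Management;

public static class WatchdogQueueSetup
{
    // "iotc" is the default queue name used by the function in this answer;
    // "iotc-alert" is a hypothetical name chosen for this sketch.
    private const string WatchdogQueue = "iotc";
    private const string AlertQueue = "iotc-alert";

    public static async Task EnsureQueuesAsync(string connectionString)
    {
        var client = new ManagementClient(connectionString);
        try
        {
            // The alert queue must exist before it can be used as a forwarding target.
            if (!await client.QueueExistsAsync(AlertQueue))
            {
                await client.CreateQueueAsync(new QueueDescription(AlertQueue));
            }

            if (!await client.QueueExistsAsync(WatchdogQueue))
            {
                await client.CreateQueueAsync(new QueueDescription(WatchdogQueue)
                {
                    // Assumption: an expired watchdog is dead-lettered instead of dropped...
                    EnableDeadLetteringOnMessageExpiration = true,
                    // ...and the dead-letter queue is forwarded to the alert queue.
                    ForwardDeadLetteredMessagesTo = AlertQueue
                });
            }
        }
        finally
        {
            await client.CloseAsync();
        }
    }
}
```

With a setup like this, anything that lands in the alert queue represents a device that stayed disconnected longer than the watchdog TTL, and a Logic App or another function can consume that queue to send the notification.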

  1. Data Export:

1a. Destination - Webhook:

enter image description here

1b. Data Transform:

if .messageSource == "deviceConnectivity" then
{
  specversion:"1.0",
  id:.applicationId,
  source: .device.id,
  subject: ("/" + .applicationId + "/" + .device.id + "/" + .messageSource),
  type:.messageType,
  time:.enqueuedTime,
  dataschema:"#",
  data: {
    reportedProperties: .device.properties.reported,
    cloudProperties: .device.cloudProperties
  }
}
else
  empty
end

  2. HttpTrigger Azure Function (version 3.x):

    #r "Newtonsoft.Json"
    #r "..\\bin\\Microsoft.Azure.ServiceBus.dll"

    using System.Text;
    using System.Net;
    using Microsoft.Azure.ServiceBus;
    using Microsoft.Azure.ServiceBus.Core;
    using Microsoft.AspNetCore.Mvc;
    using Microsoft.Extensions.Primitives;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;

    public static async Task<IActionResult> Run(CloudEvent ce, IDictionary<string, string> headers, ILogger log)
    {
        log.LogInformation($"Device: {ce?.source}, Type: {ce?.type}, Subject: {ce?.subject}");

        string servicebusConnectionString = System.Environment.GetEnvironmentVariable("rk2016_SERVICEBUS"); 
        string queueName = headers.ContainsKey("queueName") ? headers["queueName"] : "iotc";
        double timeToLiveInMinutes = headers.ContainsKey("timeToLiveInMinutes") ? Convert.ToInt32(headers["timeToLiveInMinutes"]) : 5;
   
        // ignore events without a subject instead of throwing a NullReferenceException
        if(ce?.subject != null && ce.subject.LastIndexOf("/deviceConnectivity") > 0)
        {
            // delete a device watchdog message
            var receiver = new MessageReceiver(servicebusConnectionString, queueName, ReceiveMode.PeekLock);
            var messages = await receiver.ReceiveAsync(100, TimeSpan.FromSeconds(2)); 
            if(messages != null)
            {  
                log.LogInformation($"Number of watchdogs {messages.Count()} in the queue '{queueName}'");
                foreach(var message in messages) 
                {          
                    string deviceId = message.UserProperties.ContainsKey("deviceId") ? message.UserProperties["deviceId"].ToString() : "";    
                    if(message.Label != null && message.Label.StartsWith("Watchdog_") && ce.source == deviceId)  
                    {
                        await receiver.CompleteAsync(message.SystemProperties.LockToken);
                        log.LogInformation($"Watchdog deleted: Device={ce.source}, ttl={ce.time - message.SystemProperties.EnqueuedTimeUtc}.");
                        // break;
                    }
                    else
                    {
                        // release the lock so the watchdogs of other devices stay available in the queue
                        await receiver.AbandonAsync(message.SystemProperties.LockToken);
                    }
                }
            } 
            else 
            {
                log.LogInformation($"No Watchdog in the queue '{queueName}'.");
            }
            await receiver.CloseAsync();

            if(ce.type == "disconnected")
            {
                // create a device watchdog message
                var sender = new MessageSender(servicebusConnectionString, queueName);
                var message = new Message(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(ce)))
                {
                    ContentType = "application/json",
                    Label = $"Watchdog_{ce.source}",
                    TimeToLive = TimeSpan.FromMinutes(timeToLiveInMinutes)
                };
                message.UserProperties.Add("deviceId", ce.source);

                await sender.SendAsync(message);
                log.LogInformation($"Watchdog created: Device={ce.source}, time={ce.time}.");
            }
        }
        else
        {
            log.LogWarning("Wrong event message");
        }
  
        return new NoContentResult();
    }

    public class CloudEvent 
    {
        public string  specversion {get; set;}
        public string  type {get; set;}
        public string  source {get; set;}
        public string  id {get; set;}
        public DateTime  time {get; set;}
        public string  subject {get; set;}
        public JObject  data {get; set;}
    }
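
To try the function without waiting for a real device to drop, a small client can post the same kind of event that the Data Transform produces. This is only a sketch: `WatchdogSmokeTest`, the application id, the device id and the function URL are hypothetical placeholders, while the header names and the event fields are the ones used in the function and transform above.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class WatchdogSmokeTest
{
    // Builds a body with the same fields the Data Transform emits.
    public static string BuildEvent(string applicationId, string deviceId, string type)
    {
        var ce = new Dictionary<string, object>
        {
            ["specversion"] = "1.0",
            ["id"] = applicationId,
            ["source"] = deviceId,
            ["subject"] = $"/{applicationId}/{deviceId}/deviceConnectivity",
            ["type"] = type, // "disconnected" creates a watchdog; any event clears older ones
            ["time"] = DateTime.UtcNow,
            ["dataschema"] = "#",
            ["data"] = new Dictionary<string, object>()
        };
        return JsonSerializer.Serialize(ce);
    }

    // Posts the event with the two optional headers the function reads.
    public static async Task<int> PostAsync(HttpClient http, string functionUrl, string body)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, functionUrl)
        {
            Content = new StringContent(body, Encoding.UTF8, "application/json")
        };
        request.Headers.Add("queueName", "iotc");
        request.Headers.Add("timeToLiveInMinutes", "5");

        using (var response = await http.SendAsync(request))
        {
            return (int)response.StatusCode;
        }
    }
}
```

Posting a `disconnected` event should leave one watchdog message in the queue, and posting any further connectivity event for the same device within the TTL should remove it. The function answers with 204 (No Content) in both cases.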
  

I also recommend adding one more destination, such as the endpoint of an Azure Event Grid custom topic with the CloudEvents input schema; see the following screenshot:

enter image description here

Moving the above HttpTrigger Azure Function to an AEG subscriber gives your solution Pub/Sub flexibility for distributing the device connectivity events, such as filtering.

humblejay answered:

In IoT Central, rules can be applied only to telemetry values on devices targeted by their associated template and, optionally, by their reported properties. The device connection/disconnection events are available via data export, but then the rule logic needs to run outside the app, in an Azure Function, Logic App, API, etc.

Unless you have heartbeat telemetry coming from the device as a boolean or numeric value, you cannot use a time-aggregated count to infer a disconnection event with the built-in rules functionality. A hacky and non-ideal approach is to count the number of times a metric appeared in the time aggregation window, say 5 minutes, and if the count is less than 1, infer that the device is disconnected. This will not work if the interval between telemetry messages for that metric is longer than the aggregation window.

As an example, if a humidity measurement is received from the device at least once in the aggregation window, it can be used to infer a disconnected device by counting how many times it was received in the 5-minute time aggregation window.
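
The counting logic can be written out to make the caveat visible. This is a standalone illustration, not the IoT Central rules engine or any of its APIs; `HeartbeatWindow` and its parameters are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeartbeatWindow
{
    // Returns true when no telemetry timestamp falls inside the window ending at nowUtc,
    // i.e. the count for the aggregation window is less than 1.
    public static bool IsInferredDisconnected(
        IEnumerable<DateTime> telemetryTimesUtc, DateTime nowUtc, TimeSpan window)
    {
        DateTime windowStart = nowUtc - window;
        int count = telemetryTimesUtc.Count(t => t > windowStart && t <= nowUtc);
        return count < 1;
    }
}
```

A device that reports humidity every minute is flagged only after 5 silent minutes. A device that reports every 10 minutes is flagged in roughly every other 5-minute window even while connected, which is the failure case mentioned above.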

enter image description here