I am using the Azure Face API to detect faces in a video stream, but for each detected face Azure returns a unique faceId (which is exactly what the documentation says).

The problem is: let's say Mr. ABC appears in 20 video frames; 20 unique faceIds get generated. I want Azure Face to return a single faceId, or a group of faceIds, generated particularly for Mr. ABC, so that I can know it is the same person who stays in front of the camera for x amount of time.

I have read the documentation for Azure Face Grouping and Azure Find Similar, but I didn't understand how to make them work with a live video stream.

The code I am using to detect faces with Azure Face is given below:

from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials
import cv2
import os

face_key = 'XABC' #API key
face_endpoint = 'https://XENDPOINT.cognitiveservices.azure.com' #endpoint, e.g. 'https://westus.api.cognitive.microsoft.com'

credentials = CognitiveServicesCredentials(face_key)
face_client = FaceClient(face_endpoint, credentials)

camera = cv2.VideoCapture(0)
samplenum = 1
work_dir = os.getcwd()

face_ids = []

#cv2 font
font = cv2.FONT_HERSHEY_SIMPLEX

def getRectangle(faceDictionary):
    # Convert the detected face's rectangle into the two corner
    # points that cv2.rectangle expects.
    rect = faceDictionary.face_rectangle
    left = rect.left
    top = rect.top
    right = left + rect.width
    bottom = top + rect.height
    return ((left, top), (right, bottom))

while True:
    check, campic = camera.read()
    samplenum += 1
    path = os.path.join(work_dir, "live_pics", str(samplenum) + ".jpg")
    cv2.imwrite(path, campic)
    with open(path, "rb") as stream:
        detected_faces = face_client.face.detect_with_stream(
            stream,
            return_face_id=True,
            return_face_attributes=['age', 'gender', 'emotion'],
            recognition_model="recognition_03")
    for face in detected_faces:
        top_left, bottom_right = getRectangle(face)
        cv2.rectangle(campic, top_left, bottom_right, (0, 0, 170), 2)
        face_ids.append(face.face_id)
    cv2.imshow("campic", campic)
    if samplenum > 10 or cv2.waitKey(1) == ord("q"):
        break

camera.release()
cv2.destroyAllWindows()

There are 2 answers

Nicolas R On BEST ANSWER

There is no magic in the Face API: you have to process each face found in two steps.

What I suggest is to use "Find similar":

  • at the beginning, create a "FaceList"
  • then process your video:
    • run Face Detect on each frame
    • for each face found, run the Find Similar operation against the face list you created. If there is no match (with sufficient confidence), add the face to the face list.

At the end, your face list will contain all the different people found in the video.


For your realtime use case, don't use the "Identify" operation with a PersonGroup / LargePersonGroup (the choice between the two depends on the size of the group), because you will be blocked by the need to train the group. With that approach you would be doing the following:

  • Step 1, 1 time: generate the PersonGroup / LargePersonGroup for this execution
  • Step 2, N times (for each image where you want to identify the face):
    • Step 2a: face detect
    • Step 2b: face "identify" on each detected face based on the PersonGroup / LargePersonGroup
    • Step 2c: for each unidentified face, add it to the PersonGroup / LargePersonGroup.

Here the issue is that after step 2c you have to train your group again. Even though training is not very long, it is still too slow to be done inside a real-time loop.
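To make the bottleneck concrete, here is a minimal sketch of step 2c plus the mandatory retrain, assuming the same SDK as in the question. The `wait_for_training` and `enroll_and_train` helpers are hypothetical names, and comparing the status against the string `"running"` assumes the SDK's `TrainingStatusType` is a string enum, as the msrest-generated models are.

```python
import time

def wait_for_training(get_status, poll_interval=1.0, sleep=time.sleep):
    # Poll until training leaves the "running" state. `get_status` and
    # `sleep` are injected so the loop can be exercised without Azure.
    while True:
        status = get_status()
        if status != "running":
            return status
        sleep(poll_interval)

def enroll_and_train(face_client, group_id, person_id, image_path):
    # Step 2c plus the retrain: this blocking wait is what makes the
    # PersonGroup approach too slow for a realtime loop.
    with open(image_path, "rb") as stream:
        face_client.person_group_person.add_face_from_stream(
            group_id, person_id, stream)
    face_client.person_group.train(group_id)
    return wait_for_training(
        lambda: face_client.person_group.get_training_status(group_id).status)
```

Every new face would pay that polling wait before it can be identified, which is exactly why Find Similar on a face list is the better fit here.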

Stanley Gong On

Per my understanding, you want to show a person's name/identity instead of the face ID returned by the Face API.

If so, after you get face IDs via the Face Detect API, you should use the Face Identify API. If the Azure Face service recognizes a face, you get back a person ID; with this ID you can use the PersonGroup Person API to fetch that person's information.

I also wrote a simple demo for you. In this demo there is only one image; we can just imagine it as a video frame. I created a person group with one "superman" person and added some faces to him.

Here is the code:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image
import numpy as np
from azure.cognitiveservices.vision.face import FaceClient
from msrest.authentication import CognitiveServicesCredentials

imPath = "<image path>"
ENDPOINT = '<endpoint>'
KEY = '<key>'
PERSON_GROUP_ID = '<person group name>'

face_client = FaceClient(ENDPOINT, CognitiveServicesCredentials(KEY))
im = np.array(Image.open(imPath), dtype=np.uint8)

with open(imPath, 'rb') as stream:
    faces = face_client.face.detect_with_stream(
        stream, recognition_model='recognition_03')

# Create figure and axes
fig,ax = plt.subplots()

# Display the image
ax.imshow(im)

for face in faces:
    # Note: patches.Rectangle takes (x, y), width, height in that order
    rect = patches.Rectangle(
        (face.face_rectangle.left, face.face_rectangle.top),
        face.face_rectangle.width, face.face_rectangle.height,
        linewidth=1, edgecolor='r', facecolor='none')
    detected_person = face_client.face.identify([face.face_id], PERSON_GROUP_ID)[0]
    if len(detected_person.candidates) > 0:
        person_id = detected_person.candidates[0].person_id
        person = face_client.person_group_person.get(PERSON_GROUP_ID, person_id)
        plt.text(face.face_rectangle.left, face.face_rectangle.top, person.name, color='r')
    else:
        plt.text(face.face_rectangle.left, face.face_rectangle.top, 'unknown', color='r')
    ax.add_patch(rect)

plt.show()

Result: (screenshot showing the detected face framed in red and labeled with the person's name)