What are modern facial recognition technologies based on?

Face recognition is one of the main branches of computer vision. There is now a wide variety of facial recognition systems, ranging from face localization in smartphones to complex systems that can find a person in a crowd. In this article we discuss the principles on which modern facial recognition systems are based.

To explain how AI recognizes faces, we need to understand how people distinguish each other’s faces. In the picture below, you can see how someone’s eyes move when looking at a face.

This means that a person first sees the image as a whole, then identifies specific parts such as the eyes, nose, chin, forehead and ears, recognizes the features of the face, and only after that forms a holistic image and recognizes the person.

In a conversation, a person focuses intermittently on just one part of the face while perceiving the rest in less detail, and periodically shifts attention back to the whole face.

Fig. 1. Stabilized images usually fade out.

The remaining visible parts of the profile are always meaningful elements or groups of elements (face, upper half of the face).

Nevertheless, the human brain perceives the image as stable and constant, despite the movement of the eyes and the body of the viewer as well as the movement of the perceived objects.

On this basis, the following stages of recognizing a human face by a computer can be distinguished: feature extraction, preliminary analysis, hypothesis generation, and hypothesis testing (comparison of the image with a standard taken from memory).
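The four stages can be sketched as a small pipeline. This is only an illustration under stated assumptions: the "image" is already a short list of numbers, and the names MEMORY, extract_features, preliminary_analysis, hypothesize, check_hypothesis and recognize, as well as the threshold of 0.2, are hypothetical and do not come from any particular system.

```python
import math

# Hypothetical "memory": one stored reference feature vector per known person.
MEMORY = {
    "alice": [0.42, 0.31, 0.77],
    "bob": [0.10, 0.85, 0.33],
}

def extract_features(image):
    """Stage 1 (placeholder): the toy 'image' is already a list of numbers."""
    return [float(v) for v in image]

def preliminary_analysis(features, expected_len=3):
    """Stage 2: reject inputs that cannot be compared with memory at all."""
    return len(features) == expected_len

def hypothesize(features, memory):
    """Stage 3: propose the stored identity closest to the input."""
    return min(memory, key=lambda name: math.dist(features, memory[name]))

def check_hypothesis(features, reference, threshold=0.2):
    """Stage 4: accept the candidate only if it is close enough to the standard."""
    return math.dist(features, reference) <= threshold

def recognize(image, memory=MEMORY):
    features = extract_features(image)
    if not preliminary_analysis(features):
        return None
    candidate = hypothesize(features, memory)
    if check_hypothesis(features, memory[candidate]):
        return candidate
    return None  # closest match was still too far away: unknown face
```

Separating hypothesis generation from hypothesis testing matters here: the closest stored face is always found, but it is only reported when it passes the threshold.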

Ways to recognize faces in general

There are several ways to recognize faces. The first method is a geometric comparison based on the recognition of facial elements. Because it relies on the elements and their arrangement rather than on exact appearance, the overall image is still correctly recognized as a human face despite slight differences in individual elements.
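One possible way to make a geometric comparison concrete is to describe a face by distances between its elements, divided by the distance between the eyes so that image scale does not matter. The landmark names, the choice of three distances and the tolerance of 0.1 are assumptions made for this sketch, and the landmark coordinates are assumed to be already known.

```python
import math

def geometric_signature(landmarks):
    """Distances between facial elements, divided by the inter-eye distance
    so that the signature does not depend on image scale."""
    left, right = landmarks["left_eye"], landmarks["right_eye"]
    eye_dist = math.dist(left, right)
    eye_mid = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    return (
        math.dist(eye_mid, landmarks["nose"]) / eye_dist,
        math.dist(eye_mid, landmarks["mouth"]) / eye_dist,
        math.dist(landmarks["nose"], landmarks["mouth"]) / eye_dist,
    )

def same_geometry(a, b, tolerance=0.1):
    """True if every normalized distance differs by at most `tolerance`."""
    sig_a, sig_b = geometric_signature(a), geometric_signature(b)
    return all(abs(x - y) <= tolerance for x, y in zip(sig_a, sig_b))

face = {"left_eye": (30, 40), "right_eye": (70, 40),
        "nose": (50, 60), "mouth": (50, 80)}
# The same face photographed at twice the size.
face_scaled = {name: (2 * x, 2 * y) for name, (x, y) in face.items()}
# A face with a noticeably longer nose.
other = dict(face, nose=(50, 75))
```

With these toy values, face and face_scaled produce the same signature, while other does not pass the comparison.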

The second method is a reference comparison. The image, represented as an array of bytes, is compared with a standard (a reference template) of the whole face. Several standards are used so that faces can be recognized from different angles.
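A minimal sketch of a reference comparison, assuming grayscale images stored as equal-length byte arrays and using the mean absolute pixel difference as the score. The score and the names mean_abs_difference and best_standard are choices made for this illustration; practical systems often use other measures, such as normalized correlation.

```python
def mean_abs_difference(image: bytes, standard: bytes) -> float:
    """Average per-pixel difference; 0.0 means the arrays are identical."""
    if len(image) != len(standard):
        raise ValueError("image and standard must have the same size")
    return sum(abs(a - b) for a, b in zip(image, standard)) / len(image)

def best_standard(image: bytes, standards: dict):
    """Compare the image with every standard (for example, one per viewing
    angle) and return the label and score of the closest one."""
    scores = {label: mean_abs_difference(image, s) for label, s in standards.items()}
    label = min(scores, key=scores.get)
    return label, scores[label]

# Toy 2x2 "images", flattened to four bytes each.
standards = {
    "frontal": bytes([10, 200, 10, 200]),
    "profile": bytes([200, 10, 200, 10]),
}
image = bytes([12, 198, 9, 205])
```

Here best_standard(image, standards) picks the "frontal" standard, because its pixel values are far closer to the input than those of the "profile" standard.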

The third approach represents the face as a set of small standards rather than a single whole-face one.

A more complex method uses a single standard together with a basic face model, which makes it possible to estimate how the face being recognized is transformed when the viewing angle changes. That is, the computer first identifies the elements of the face (eyes, nose, mouth) in the image by reference comparison, and then normalizes the image in scale and orientation.
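The normalization step can be illustrated with a simplified two-dimensional case: once the eyes have been located, a rotation and a scaling bring them to fixed positions. This sketch handles only in-plane rotation and scale, not a full change of viewing angle, and the function name and target frame are assumptions made for the example.

```python
import math

def normalization_transform(left_eye, right_eye, target_eye_dist=1.0):
    """Return a function that maps image points into a normalized frame where
    the left eye is at (0, 0) and the right eye is at (target_eye_dist, 0)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = math.atan2(dy, dx)                    # in-plane tilt of the face
    scale = target_eye_dist / math.hypot(dx, dy)  # undo the image scale
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)

    def transform(point):
        # Shift so the left eye is the origin, then rotate by -angle and scale.
        x = point[0] - left_eye[0]
        y = point[1] - left_eye[1]
        return (scale * (x * cos_a - y * sin_a),
                scale * (x * sin_a + y * cos_a))

    return transform

# A face tilted by 90 degrees: the eyes lie on a vertical line, 20 pixels apart.
to_normalized = normalization_transform(left_eye=(10, 10), right_eye=(10, 30))
```

After this step every face has its eyes in the same place, so it can be compared with a single standard regardless of its original size and tilt.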

Another way of recognizing faces uses neural networks. Trained on a large amount of data, neural networks are able to recognize a person’s face from different angles and under different lighting. In its simplest form, however, this method is only effective for recognizing the particular set of faces the network was trained on.