Face Recognition with Dlib in Python

2021. 2. 4. 08:23

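Dlib is a powerful library that, like OpenCV, is widely adopted in the image processing community. Researchers mostly use its face detection and alignment modules. Beyond these, dlib also provides a powerful, out-of-the-box face recognition module. Although it is written in C++, it has a Python interface as well. This post shows how to apply face recognition with Dlib in Python.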


Person of interest (2011)

Face recognition pipeline

A modern face recognition pipeline consists of four stages: detection, alignment, representation and verification. Dlib's implementation covers all of them.

Vlog

The following video explains how to apply face recognition with dlib.

Model

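Dlib's face recognition model is mainly inspired by the ResNet-34 model. Davis E. King modified the regular ResNet structure, dropped some layers, and rebuilt a neural network with 29 convolution layers. The network takes 150 x 150 x 3 inputs and represents each face image as a 128-dimensional vector.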


ResNet-34

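He then retrained the model on various datasets, including FaceScrub and VGGFace2. In other words, the model learned how to represent faces from about 3M samples. He tested it on the LFW (Labeled Faces in the Wild) dataset, which face recognition researchers accept as a baseline, and it reached 99.38% accuracy. Humans barely reach 97.53% on the same dataset. This means the dlib face recognition model can compete with other state-of-the-art face recognition models and with humans.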

Prerequisites

Dlib needs the facial landmark detector and ResNet model files. You can download and extract them manually, or use the code below, which downloads and decompresses each file if it is not already in the current directory.


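import bz2
import os

import gdown  # third-party downloader: pip install gdown


def unzip_bz2_file(zipped_file_name):
    # decompress the archive and write it next to the original, without the .bz2 extension
    newfilepath = zipped_file_name[:-4]
    with bz2.BZ2File(zipped_file_name) as zipfile, open(newfilepath, 'wb') as newfile:
        newfile.write(zipfile.read())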


if not os.path.isfile('shape_predictor_5_face_landmarks.dat'):
    print("shape_predictor_5_face_landmarks.dat is going to be downloaded")

    url = "http://dlib.net/files/shape_predictor_5_face_landmarks.dat.bz2"
    output = url.split("/")[-1]
    gdown.download(url, output, quiet=False)

    unzip_bz2_file(output)

if not os.path.isfile('dlib_face_recognition_resnet_model_v1.dat'):
    print("dlib_face_recognition_resnet_model_v1.dat is going to be downloaded")

    url = "http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2"
    output = url.split("/")[-1]
    gdown.download(url, output, quiet=False)

    unzip_bz2_file(output)
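The helper above reads the whole archive into memory before writing it out. A minimal sketch of a streaming alternative, using only the standard library, might look like the following. The function name `decompress_bz2` and the `chunk_size` parameter are assumptions made for illustration; they are not part of dlib or gdown.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(archive_path, chunk_size=1024 * 1024):
    """Decompress a .bz2 file next to itself and return the new path.

    Hypothetical helper: streams the data in chunks instead of
    loading the whole model file into memory at once.
    """
    archive_path = Path(archive_path)
    if archive_path.suffix != ".bz2":
        raise ValueError(f"expected a .bz2 file, got {archive_path.name}")

    # "model.dat.bz2" -> "model.dat": only the trailing suffix is dropped
    target_path = archive_path.with_suffix("")

    with bz2.open(archive_path, "rb") as source, open(target_path, "wb") as target:
        shutil.copyfileobj(source, target, chunk_size)

    return target_path
```

Checking the suffix first avoids silently truncating a file name that does not end in .bz2, which slicing off the last four characters would do.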

Loading pre-trained models

Now let's load the pre-trained models.


import dlib

detector = dlib.get_frontal_face_detector()
sp = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")
facerec = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

Face detection and alignment

The following code covers the image loading, detection and alignment steps. The aligned face has the shape (150, 150, 3).


#load images
img1 = dlib.load_rgb_image("img1.jpg")
img2 = dlib.load_rgb_image("img2.jpg")

#detection
img1_detection = detector(img1, 1)
img2_detection = detector(img2, 1)

img1_shape = sp(img1, img1_detection[0])
img2_shape = sp(img2, img2_detection[0])

#alignment
img1_aligned = dlib.get_face_chip(img1, img1_shape)
img2_aligned = dlib.get_face_chip(img2, img2_shape)
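The code above always takes the first detection, so it raises an IndexError when no face is found and picks an arbitrary face when there are several. One possible way to handle both cases is sketched below. The helper name `pick_largest_face` is an assumption for illustration; it relies only on the `width()` and `height()` methods of the rectangles returned by the detector.

```python
def pick_largest_face(detections):
    """Return the detection with the largest area, or None if there is none.

    Hypothetical helper: `detections` is expected to be the result of
    detector(img, 1), i.e. a sequence of rectangles exposing
    width() and height().
    """
    if len(detections) == 0:
        return None
    return max(detections, key=lambda rect: rect.width() * rect.height())


# Possible usage with the objects defined earlier in this post:
# face = pick_largest_face(detector(img1, 1))
# if face is None:
#     raise ValueError("no face found in img1.jpg")
# img1_shape = sp(img1, face)
```

Choosing the largest rectangle is only a heuristic that assumes the person of interest is closest to the camera.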

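Note that you do not have to use dlib's own face detector here, since it is not the best of the open-source solutions.

Face detection can be done with many solutions, such as OpenCV, Dlib and MTCNN. OpenCV offers haar cascade and SSD (Single Shot Multibox Detector), while Dlib offers HoG (Histogram of Oriented Gradients) and MMOD (Max-Margin Object Detection). MTCNN is a popular solution in the open-source community. Of these, SSD, MMOD and MTCNN are modern deep learning based approaches, whereas haar cascade and HoG are legacy methods. SSD is also the fastest. The following video shows how each method performs.

The video below shows how to use the different face detectors in Python.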

A more sensitive approach

Face detection does not have to be limited to a rectangular region. Dlib's facial landmark detection makes it more sensitive. Facial landmark detection can locate 68 facial landmarks, covering the jaw, chin, eyes, eyebrows, the inner and outer areas of the lips, and the nose.
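As a rough illustration, the 68 points are usually grouped by index range. The sketch below assumes the common 68-point annotation scheme and a shape produced by a 68-point predictor (not the 5-point model loaded in this post). The names `LANDMARK_GROUPS` and `landmark_points` are made up for this example; left and right refer to the subject's own left and right.

```python
# Index ranges (0-based, end-exclusive) of the 68-point scheme.
# Assumption: the shape comes from a 68-point shape predictor.
LANDMARK_GROUPS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "outer_lip": range(48, 60),
    "inner_lip": range(60, 68),
}


def landmark_points(shape, group):
    """Return the (x, y) tuples of one landmark group.

    `shape` is expected to expose part(i) with .x and .y attributes,
    as the object returned by a dlib shape predictor does.
    """
    return [(shape.part(i).x, shape.part(i).y) for i in LANDMARK_GROUPS[group]]
```

With these points, a face region can be cropped along the jaw line or around the eyes instead of along the detector's rectangle.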

You can dig deeper in facial landmarks detection with dlib.

Representation

Passing the aligned faces to the ResNet model represents each face as a 128-dimensional vector.


img1_representation = facerec.compute_face_descriptor(img1_aligned)
img2_representation = facerec.compute_face_descriptor(img2_aligned)

Dlib returns the representation as a dlib.vector, but the code below easily converts it to a numpy array.


import numpy as np

img1_representation = np.array(img1_representation)
img2_representation = np.array(img2_representation)

Euclidean distance

Davis King recommends Euclidean distance for face verification, because he found a [tuned threshold](https://sefiks.com/2020/05/22/fine-tuning-the-threshold-in-face-recognition/) for it.


def findEuclideanDistance(source_representation, test_representation):
    euclidean_distance = source_representation - test_representation
    euclidean_distance = np.sum(np.multiply(euclidean_distance, euclidean_distance))
    euclidean_distance = np.sqrt(euclidean_distance)
    return euclidean_distance
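For reference, the function above computes the standard Euclidean (L2) distance between the two 128-dimensional representations. Writing the source and test vectors as $a$ and $b$ (symbols introduced here only for illustration):

```latex
d(a, b) = \lVert a - b \rVert_2 = \sqrt{\sum_{i=1}^{128} (a_i - b_i)^2}
```

The three lines of the function map onto this directly: the element-wise difference, the sum of its squares, and the square root.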

Verification

We now have the representations of the face image pair and know how to find the distance between the representation vectors. In addition, King has shared the tuned threshold.


distance = findEuclideanDistance(img1_representation, img2_representation)
threshold = 0.6 #distance threshold declared in dlib docs for 99.38% confidence score on LFW data set

if distance < threshold: 
    print("they are same")
else: 
    print("they are different")
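The decision above can also be wrapped in a small reusable function. The sketch below is one possible shape for it, not part of dlib: the name `verify` and the returned dictionary layout are assumptions, and it uses `math.dist` from the standard library (Python 3.8+) in place of the numpy helper.

```python
import math


def verify(source_representation, test_representation, threshold=0.6):
    """Compare two face representations by Euclidean distance.

    Hypothetical helper: accepts any two equal-length sequences of
    numbers and returns the distance together with the decision.
    The default threshold is the value used earlier in this post.
    """
    distance = math.dist(source_representation, test_representation)
    return {"distance": distance, "verified": distance < threshold}


# Toy 2-dimensional vectors, only to show the decision rule;
# real dlib representations have 128 dimensions.
result = verify([0.0, 0.0], [0.3, 0.4])
print("they are same" if result["verified"] else "they are different")
```

Returning the distance along with the decision makes it easier to inspect borderline pairs or to experiment with a different threshold.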

Tests

I tested dlib's face recognition module on several image pairs. The code below plots an image pair side by side. Some of deepface's unit test images are used here.


import matplotlib.pyplot as plt


def plotPairs(img1, img2):
    # draw the two images side by side with the axes hidden
    fig = plt.figure()
    ax1 = fig.add_subplot(1, 2, 1)
    ax1.imshow(img1)
    ax1.axis('off')
    ax2 = fig.add_subplot(1, 2, 2)
    ax2.imshow(img2)
    ax2.axis('off')
    plt.show()

The results are very satisfactory, as shown below.


Conclusion

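We have seen how to use dlib's out-of-the-box face recognition module. Dlib appears to ship with a robust face recognition service, and it also covers every stage of a modern face recognition pipeline. Importing dlib alone is enough to apply face verification.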

The source code used here can be found here.

You can also use deepface with a few lines of code, as shown below.


Dlib ResNet model in deepface package

Dataset downloads: 1, 2, 3
Source code download
shape_predictor_5_face_landmarks.dat
dlib_face_recognition_resnet_model_v1.dat : zip, 01

Attribution, NonCommercial, ShareAlike
