Abduallah ELmaraghy, Ganna Ayman, Mohamed Khaled, Sara Tarek, Maha Sayed, Mennat Allah Hassan, Yomna M.I. Hassan
January 6, 2022
Dental pathology is a broad field of study that spans several stages of diagnosis and treatment. This project aims to assist orthodontists in classifying dental occlusion and measuring the facial asymmetry it causes. The system takes a 2D facial image as input and uses it to reconstruct a 3D model, because 3D models lose less information than 2D images and are therefore more accurate. Then, it uses a deep learning model to detect 3D facial landmarks on a 2D image to measure facial asymmetry. The challenges in this approach include achieving the highest possible accuracy in the reconstruction process and detecting 3D landmarks on the 3D facial model.
1.1 Purpose of this document
The purpose of this document is to illustrate and outline the requirements of our project (Automatic facial profile detection and occlusion classification for dental purposes). In addition, the documentation will serve as guidelines for developers and as a record of product approval for the required functions. It provides a better understanding of our project and what we plan to achieve.
1.2 Scope of this document
This document provides an overview of the project. It explains the software characteristics, functionalities, problem statement, and software design constraints. It also explains the data design and operational scenarios so that all of the software requirements are covered.
1.3 System Overview
The proposed system consists of the following:
• Image Preprocessing: This phase ensures that the input image is ready for 3D reconstruction. It begins by detecting the patient’s face using OpenCV. The system then detects any distortion or blur in the image by applying a Fast Fourier Transform (FFT). The FFT decomposes the image into its frequency components, and the amount of high-frequency content indicates whether the image is sharp or blurred: when the measured frequency level falls below an established threshold, the image is declared blurred. Finally, the patient’s distance from the camera lens is determined using triangle similarity.
• 3D Model Reconstruction: In this phase, a deep learning model called Deep3DFaceRecon_pytorch reconstructs a 3D facial model from a single 2D facial image. It is implemented in PyTorch and trained with a weakly supervised CNN learning algorithm. Data augmentation is used in the training process, including random image shifting, scaling, rotation, and flipping. The training process used 300,000 images. The model takes the supplied image and preprocesses it through image matching, feature extraction such as landmark detection, and feature matching. Some mathematical computations, such as depth calculations, are also performed. Then, the model creates the 3D facial model by predicting each vertex of the mesh and creating the texture. This method relies on several supporting components. Nvdiffrast is a PyTorch library that provides high-performance primitive operations for rasterization-based rendering. The ArcFace model, a state-of-the-art face recognition model, is also used. For the final representation of 3D models, the Basel Face Model 2009 (BFM09) is used.
• Landmark Detection and Facial Analysis: This phase outputs a complete facial analysis report after detecting the 3D facial landmarks on the facial image using the MediaPipe face landmark model. The analysis is computed from the detected 3D facial landmarks. First, the system measures the patient’s facial proportions to quantify asymmetry. Then, it measures the patient’s facial profile and, finally, evaluates the facial aesthetics required by clinicians involved in the treatment of dentofacial deformity.
• Classification of Occlusion: In this phase, MeshCNN is used to classify the class of occlusion (Normal, Retrognathic, Prognathic). MeshCNN studies the vertices and their connections, or edges, jointly by treating the 3D model as a graph or manifold. This method defines convolution and pooling layers over the edges of 3D meshes, which allows most of the standard CNN toolset to be applied to mesh data.
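The blur check and distance estimate described in the preprocessing phase can be sketched as follows. This is a minimal illustration in Python with NumPy, not the project's implementation: the function names, the radius of the suppressed low-frequency region, the threshold value, and the calibration numbers in the usage note are all assumptions that would need tuning on real images.

```python
import numpy as np


def high_frequency_score(gray, radius=8):
    """Mean log-magnitude of the image after low frequencies are removed.

    `gray` is a 2D array of pixel intensities. `radius` (an assumed value)
    sets the half-size of the low-frequency square that is zeroed out.
    """
    h, w = gray.shape
    cy, cx = h // 2, w // 2
    # Move the zero-frequency term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(float)))
    # Suppress the central (low-frequency) region so only fine detail remains.
    y0, x0 = max(cy - radius, 0), max(cx - radius, 0)
    spectrum[y0:cy + radius + 1, x0:cx + radius + 1] = 0
    detail = np.fft.ifft2(np.fft.ifftshift(spectrum))
    # Small epsilon avoids log(0) for perfectly flat images.
    return float(np.mean(20 * np.log(np.abs(detail) + 1e-8)))


def is_blurred(gray, threshold=10.0, radius=8):
    """Declare the image blurred when its detail score is below `threshold`.

    The default threshold is an assumption, not a value from this document.
    """
    return high_frequency_score(gray, radius) < threshold


def distance_to_camera(known_width_cm, focal_length_px, width_px):
    """Triangle similarity: distance = real width * focal length / pixel width."""
    return known_width_cm * focal_length_px / width_px
```

For example, with an assumed focal length of 700 px and an assumed real face width of 14 cm, a face that spans 200 px in the image would be estimated at 14 × 700 / 200 = 49 cm from the lens. The focal length itself would have to be calibrated once from a reference photo taken at a known distance.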
1.4 System Scope
Face analyzer 3D system shall:
• Process a 2D facial image to check whether it is valid for reconstruction.
• Reconstruct a 3D facial model using a single 2D image.
• Detect all 3D facial landmarks of the patient’s face.
• Output a full facial analysis report measured using the detected 3D landmarks.
• Classify the class of occlusion of the reconstructed 3D model.
• Provide a web-based interface for the system.
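As an illustration of the facial analysis requirement, one simple way to turn detected 3D landmarks into an asymmetry number is to mirror each right-side landmark across the facial midline and measure how far it lands from its left-side counterpart. The sketch below is hypothetical: this document does not specify which asymmetry measure the system uses, and the function name, the landmark pairing, and the assumption that the midline is a vertical plane at a fixed x coordinate are all illustrative.

```python
import math


def asymmetry_score(pairs, midline_x):
    """Mean distance between left landmarks and mirrored right landmarks.

    `pairs` is a list of (left, right) tuples, each an (x, y, z) point.
    `midline_x` is the x coordinate of the assumed vertical midline plane.
    A perfectly symmetric face gives 0.0; larger values mean more asymmetry.
    """
    total = 0.0
    for left, right in pairs:
        # Reflect the right-side point across the plane x = midline_x.
        mirrored = (2 * midline_x - right[0], right[1], right[2])
        total += math.dist(left, mirrored)
    return total / len(pairs)
```

The score is expressed in the same units as the landmark coordinates, so it would need to be normalized (for example by face width) before comparing patients.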