Anonymizing Videos & Accuracy Prediction
This project was produced as the final-year research project for the BSc (Honours) in Computing with Data Analytics (2017-2018) at ITT Tallaght, Dublin, Ireland, under the supervision of Keith Quille, AI lecturer. After submission, it was reviewed and published as a paper co-authored with Keith Quille, Jelena Vasic and Sean McHugh.
Abstract
This document (2018) explores the possibility of anonymizing sports videos and assessing the accuracy of the outputs with Machine Learning. Carried out in collaboration with the NCCA, it addresses their specific business case: preserving the identity of their students before the examination videos submitted for grading are viewed by the organisation's staff, in order to comply with the new GDPR. Combining development and data analytics perspectives, the project is divided into two phases.
In the first phase, a Video Anonymizer was developed in C# and delivered as an executable file (.exe). Supported by two different libraries (Accord and Emgu), it takes one input video and aims to generate an output video in which every visible student face has been detected and blurred. 100 input videos were used in this phase, producing 200 outputs (one per library).
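The blurring step described above can be sketched in a few lines. This is only an illustrative sketch in Python, not the project's C# code and not the Accord or Emgu API: it assumes a face detector has already returned bounding boxes, treats a frame as a plain grayscale grid, and uses a simple box blur. The names `blur_region` and `anonymize` are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale frame: rows of 0-255 pixel values
Box = Tuple[int, int, int, int]  # (x, y, width, height) of a detected face


def blur_region(frame: Frame, box: Box, radius: int = 1) -> Frame:
    """Return a copy of `frame` with a box blur applied inside `box` only."""
    x0, y0, w, h = box
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]  # pixels outside the box stay untouched
    for y in range(max(y0, 0), min(y0 + h, height)):
        for x in range(max(x0, 0), min(x0 + w, width)):
            # Average the neighbourhood, clipped to the frame borders.
            ys = range(max(y - radius, 0), min(y + radius + 1, height))
            xs = range(max(x - radius, 0), min(x + radius + 1, width))
            total = sum(frame[j][i] for j in ys for i in xs)
            out[y][x] = total // (len(ys) * len(xs))
    return out


def anonymize(frame: Frame, faces: List[Box], radius: int = 1) -> Frame:
    """Blur every detected face box in a single frame (boxes are assumed given)."""
    for box in faces:
        frame = blur_region(frame, box, radius)
    return frame
```

A full anonymizer would apply this to every frame of the video, so the quality of the output depends mainly on whether the detector finds each face in each frame.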
The second component was developed using WEKA and aims to predict the accuracy of the outputs of the first component, based on attributes manually extracted from them, such as camera quality, camera angle, sport, and others that can be found in the sections below. This model also hints at the limitations and boundaries of the libraries used.
This project not only demonstrates the capabilities and limitations of the two libraries used by the first component by stressing them under a wide range of environments and conditions, including multiple sports (soccer, karate, boxing, etc.), but also lays the groundwork for NCCA research, automation and improvement towards a full video anonymization system combined with a Machine Learning component that aims to predict the quality of the blurred videos. The resulting model achieved 94% accuracy using a Naïve Bayes classifier, with 95.2% specificity, meaning it may be possible to predict, given a video, whether it presents a scenario that is difficult for automated blurring.
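For readers unfamiliar with the two metrics quoted above, the sketch below shows how accuracy and specificity are derived from a binary confusion matrix. It is written in Python rather than WEKA, the `ConfusionMatrix` class is hypothetical, and the counts in the usage lines are made-up illustrative values, not the project's results. Which class the project treats as positive is not stated here, so the labels are kept generic.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # positives predicted as positive
    fn: int  # positives predicted as negative
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative

    def accuracy(self) -> float:
        """Share of all predictions that are correct."""
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total

    def specificity(self) -> float:
        """Share of actual negatives that are correctly predicted as negative."""
        return self.tn / (self.tn + self.fp)


# Illustrative counts only (assumed, not taken from the paper).
cm = ConfusionMatrix(tp=45, fn=4, fp=3, tn=48)
print(f"accuracy:    {cm.accuracy():.3f}")
print(f"specificity: {cm.specificity():.3f}")
```

A high specificity means few negative cases are misclassified as positive, which is why it is reported alongside overall accuracy.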
Available Downloads
- Original Paper (7 pages) (Author César Marrades Cortés)
- Full Project document (20 pages)
- Project Presentation (PDF)
- Published Paper (PDF) (Co-authors: César Marrades Cortés, Keith Quille, Jelena Vasic and Sean McHugh)
- Polibits Magazine Cover
Main Points
- Two anonymizing libraries analyzed and stressed under multiple sport conditions.
- Limitations of these libraries detected when recognizing faces.
- Analytics and prediction of the video anonymizing accuracy of these libraries.
- Project based on a business case presented by the NCCA.