Semi-automated labeling of video using active learning for object detection

Labeling video sequences is a critical task that is required for a wide range of supervised learning applications. In general, manually labeling videos is an extremely repetitive and time-consuming task. Often, the process is sped up by sharing the workload across multiple workers, but this can create other problems, such as varying quality and consistency of labels. Meanwhile, the area of active learning has been proposed for assisting in the labeling of images for classification and object detection tasks. However, minimal prior work is centered around the utility of active learning for video labeling. In this thesis, we attempt to address the gap in prior efforts by proposing a Semi-Automated Labeling of Video (SALV) framework using active learning to support supervised object detection applications. Firstly, we propose a general architecture for the SALV framework that is built on intra-video training and testing. The proposed SALV architecture exploits the fact that labeling video provides a unique opportunity where training and testing can be performed on consecutive frames that contain highly correlated information. Secondly, we incorporate traditional active learning methods that utilize the confidence values produced by detections to select important frames for the next iteration. Thirdly, we propose two strategies for active learning of video labeling: minimal-Distance Iterative Active Learning (min-DIAL) and maximal-Distance Iterative Active Learning (max-DIAL). Lastly, we explore information theory to select frames with the most diversity using the Jensen-Shannon divergence to calculate the difference between certain frames based on the location of detections. We analyze the performance of the proposed SALV architecture in terms of the time taken to complete the labeling of the video sequences and present our results using the popular KITTI Tracking dataset. We show that our proposed max-DIAL framework is the most efficient method and can reduce the time taken to label video by a factor of 10.
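
To make the diversity criterion mentioned in the abstract concrete, the following is a minimal sketch of how the Jensen-Shannon divergence could compare two frames by the location of their detections. It is not the thesis implementation: the grid histogram over bounding-box centers, the smoothing constant, and the function names location_histogram, js_divergence, and pick_most_diverse are all assumptions made for illustration, and the abstract does not say how detection locations are turned into distributions.

```python
import numpy as np

def location_histogram(boxes, frame_w, frame_h, bins=(4, 4)):
    """Turn detections into a probability distribution over a coarse grid.

    boxes: iterable of (x1, y1, x2, y2) in pixels (assumed format).
    Returns a flattened, normalized histogram of box centers.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2.0
    cy = (boxes[:, 1] + boxes[:, 3]) / 2.0
    hist, _, _ = np.histogram2d(
        cx, cy, bins=bins, range=[[0, frame_w], [0, frame_h]]
    )
    # Small constant (an assumption) keeps every cell positive, so the
    # logarithms below are defined even for frames with few detections.
    hist = hist.ravel() + 1e-9
    return hist / hist.sum()

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical inputs, at most 1."""
    m = 0.5 * (p + q)

    def kl(a, b):
        return float(np.sum(a * np.log2(a / b)))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def pick_most_diverse(labeled_hists, candidate_hists):
    """Index of the candidate farthest (by its minimum JS divergence)
    from every already-labeled frame. This selection rule is hypothetical."""
    scores = [
        min(js_divergence(c, l) for l in labeled_hists)
        for c in candidate_hists
    ]
    return int(np.argmax(scores))

# Two toy frames: one detection in the top-left, one in the bottom-right.
p = location_histogram([(0, 0, 10, 10)], frame_w=100, frame_h=100)
q = location_histogram([(90, 90, 100, 100)], frame_w=100, frame_h=100)
```

With base-2 logarithms the divergence is symmetric and bounded by 1, which makes scores comparable across frame pairs regardless of how many detections each frame contains.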


In Collections: Electronic Theses & Dissertations

Copyright Status: In Copyright

Material Type: Theses

Authors: Muntaner Whitley, Roberto

Thesis Advisors: Radha, Hayder

Committee Members: Morris, Daniel
Bopardikar, Shaunuk

Date Published: 2023

Subjects: Computer science
Engineering

Program of Study: Electrical and Computer Engineering - Master of Science

Degree Level: Masters

Language: English

Pages: 43 pages

ISBN: 9798379593278

Permalink: https://doi.org/doi:10.25335/6nj2-xt64
