Our paper "DeepAbstraction: 2-Level Prioritization for Unlabeled Test Inputs in Deep Neural Networks" has been accepted for publication at AITest 2022, the 4th International Conference on Artificial Intelligence Testing.
The abstract of our paper is below:
Deep learning systems have recently achieved unprecedented success across various industries. However, DNNs still exhibit erroneous behaviors, which can lead to catastrophic results. More data should therefore be collected to cover more corner cases. On the other hand, labeling a massive amount of data requires more human annotators (oracles), which increases the labeling budget and time. We propose an effective test prioritization technique, called DeepAbstraction, that prioritizes the instances most likely to expose errors within the entire unlabeled test dataset. The ultimate goal of our framework is to reduce the labeling cost and to surface potential corner cases early, before production. Unlike existing work, DeepAbstraction leverages runtime monitors. In the literature, runtime monitors are primarily used to supervise the predictions of a neural network: the monitor issues a verdict for each prediction, namely acceptance, rejection, or uncertainty. During training, the monitor quantifies the acquired knowledge into box abstractions, where each box abstraction groups instances that share similar high-level features. At test time, the monitor's verdict depends on which box abstraction a test instance resides in.
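For intuition, here is a minimal sketch of such a box-abstraction monitor, assuming the boxes are per-class interval hulls over hidden-layer features (e.g., penultimate-layer activations). The class name `BoxMonitor` and the exact verdict rules are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

class BoxMonitor:
    """Sketch of a per-class box-abstraction runtime monitor."""

    def __init__(self, num_classes):
        # One (lower, upper) interval hull per class, built during training.
        self.boxes = {c: None for c in range(num_classes)}

    def fit(self, features, labels):
        """Tighten each class's box around its training feature vectors."""
        for c in np.unique(labels):
            class_feats = features[labels == c]
            self.boxes[c] = (class_feats.min(axis=0), class_feats.max(axis=0))

    @staticmethod
    def _inside(x, box):
        lo, hi = box
        return bool(np.all(x >= lo) and np.all(x <= hi))

    def verdict(self, x, predicted_class):
        """One plausible rule set (an assumption): accept if x lies in the
        predicted class's box, reject if it lies only in other classes'
        boxes, and report uncertainty if it lies in no box at all."""
        box = self.boxes.get(predicted_class)
        in_predicted = box is not None and self._inside(x, box)
        in_others = any(
            self._inside(x, b)
            for c, b in self.boxes.items()
            if c != predicted_class and b is not None
        )
        if in_predicted:
            return "acceptance"
        if in_others:
            return "rejection"
        return "uncertainty"
```

Under this reading, rejected and uncertain instances are natural candidates to rank first for labeling, since the monitor has little evidence that the network's prediction is trustworthy.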
Moreover, we study extensively where corner cases can reside in the feature space: in near-boundary regions or in near-centroid regions. Existing test prioritization techniques can prioritize many near-boundary instances but only a few near-centroid instances, whereas DeepAbstraction effectively prioritizes numerous instances from both regions. Accordingly, our evaluation shows that DeepAbstraction outperforms state-of-the-art test prioritization techniques.
This is joint work with Hamzah Al-Qadasi, Changshun Wu, and Saddek Bensalem.