Uncertainty Estimation in Deep Neural Networks for Dermoscopic Image Classification (2)

Machine learning algorithms have demonstrated high performance on the task of skin lesion classification over the past few years. However, real-world implementations are still scarce.

One of the reasons could be that most methods are not easily interpretable; to be used in real clinical practice, algorithms need to provide some form of interpretability, as required by the EU Medical Device Regulation (MDR).

Despite the rapid growth of deep learning research in medical applications, with potential use cases demonstrated across different healthcare domains, only a handful of these methods are currently deployed in clinical practice. The limited adoption of such systems is related to several factors: ethical and regulatory aspects, data availability and variability, and the interpretability issues inherent to AI solutions.

Deep neural networks are often black boxes: they receive an input, process it, and output a result without further detail on why the architecture produced a specific prediction. Nothing prevents querying the weights of the network; still, the complexity of the architecture makes it practically impossible to explain how the network arrived at a specific prediction. Interpretability of deep neural networks is a vast and active research field that tries to offer mathematical interpretations or visualizations of how the network's parameters translate into a prediction.

Several methods have been developed since the late 2000s to provide meaningful explanations of neural network predictions. Perturbation-based methods alter the input features (e.g., the pixels of an image) and then perform a forward pass of the model, measuring the difference with respect to the unaltered input to compute the attribution of each feature. These perturbations typically involve removing, masking, or altering the original information. Back-propagation methods, on the other hand, compute the attributions for every feature in a single forward and backward pass.

Simonyan et al. proposed visualizing the class appearance models learned by CNNs as a way of understanding which patterns the network has learned, by querying the spatial support of a particular class in a given image. Grad-CAM, introduced by Selvaraju et al., produces a coarse localization map of the critical regions in the image using the class-specific gradient information flowing into the final convolutional layer of a CNN. Some of the latest interpretability methods include SHAP (SHapley Additive exPlanations) values, a game-theoretic approach to explaining a machine learning model's output through optimal credit allocation.
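As a concrete illustration of a perturbation-based method, the sketch below occludes square patches of an image with a gray value and records how much the predicted probability of the target class drops relative to the unaltered input; the patch size, stride, and gray fill value are arbitrary choices, and `model` and `image` are placeholders for any PyTorch image classifier and preprocessed input tensor.

```python
import torch

def occlusion_attribution(model, image, target_class, patch=16, stride=16):
    """Slide a gray patch over a 1xCxHxW image and record how much the
    target-class probability drops relative to the unaltered input."""
    model.eval()
    with torch.no_grad():
        baseline = torch.softmax(model(image), dim=1)[0, target_class].item()
        _, _, h, w = image.shape
        rows = range(0, h - patch + 1, stride)
        cols = range(0, w - patch + 1, stride)
        heatmap = torch.zeros(len(rows), len(cols))
        for i, top in enumerate(rows):
            for j, left in enumerate(cols):
                occluded = image.clone()
                occluded[..., top:top + patch, left:left + patch] = 0.5  # mask
                prob = torch.softmax(model(occluded), dim=1)[0, target_class]
                heatmap[i, j] = baseline - prob.item()  # big drop => important
    return heatmap
```

For the back-propagation family, a minimal Grad-CAM sketch follows, assuming a torchvision ResNet-18 whose last convolutional block (`layer4`) is hooked; in practice, the hooked layer and the preprocessing pipeline depend on the actual dermoscopy classifier being explained.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained backbone used only for illustration; a dermoscopy classifier
# would be loaded here instead.
model = models.resnet18(weights="IMAGENET1K_V1").eval()

activations, gradients = {}, {}
model.layer4.register_forward_hook(
    lambda m, i, o: activations.update(value=o.detach()))
model.layer4.register_full_backward_hook(
    lambda m, gi, go: gradients.update(value=go[0].detach()))

def grad_cam(image, class_idx=None):
    """Coarse localization map (Grad-CAM) for `class_idx` on a 1xCxHxW tensor."""
    logits = model(image)                          # forward pass
    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))      # explain the top prediction
    model.zero_grad()
    logits[0, class_idx].backward()                # class-specific gradients

    # One weight per channel: global-average-pool the gradients, then combine
    # the activation maps and keep only positive evidence for the class.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    return (cam / (cam.max() + 1e-8)).squeeze()    # normalized heatmap in [0, 1]

# heatmap = grad_cam(preprocessed_image)  # overlay on the dermoscopic image
```

The resulting heatmaps can be overlaid on the dermoscopic image so that clinicians can check whether the regions driving a prediction correspond to clinically meaningful structures.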

In recent years, and with the advent of machine learning methods, the European Medical Device Regulation has been enforcing the use of interpretable machine learning to obtain CE marking as a medical device. In the EU project iToBoS, funded with 12 million euros, an international consortium of 19 academic institutions and technical partners will address the use of interpretability techniques in the problem of skin lesion classification and segmentation.

The iToBoS project will produce machine learning tools integrated into a "cognitive assistant" for clinicians to use in clinical studies with patients at three university clinical centers in Europe (Hospital Clinic of Barcelona, Spain; University of Trieste, Italy) and Australia (University of Queensland).

Authors:

Marc Combalia [1,4,5]; Josep Malvehy [1,2,3,4,5]

[1] FCRB and IDIBAPS, Barcelona, Spain. [2] Hospital Clinic of Barcelona. [3] University of Barcelona. [4] Athena Tech. [5] iToBoS.