ADA Library Digital Repository

Refining Neural Network Interpretability through Activation Modification Techniques: An Exploration of Threshold-Based Approaches

Show simple item record

dc.contributor.author Mammadova, Nigar
dc.date.accessioned 2025-08-26T08:01:20Z
dc.date.available 2025-08-26T08:01:20Z
dc.date.issued 2025-04
dc.identifier.uri http://hdl.handle.net/20.500.12181/1437
dc.description.abstract Interpretability in deep learning models has recently emerged as a major and growing concern, especially in high-stakes settings such as medical diagnostics. This thesis addresses the problem of designing genuinely post-hoc modifiable Deep Neural Networks (DNNs) that can match or exceed state-of-the-art performance while providing the increased transparency needed to understand how their predictions are reached. Existing interpretability techniques mostly concentrate on inspecting neuron activations as they are. Here, we study controlled adjustments of neuron activations during inference and examine whether these adjustments can improve the explainability and generalization of Fully Connected Neural Networks (FCNNs) without retraining. Our research uses a publicly available benchmark brain tumor classification dataset divided into four classes: glioma, meningioma, pituitary tumor, and no tumor. A baseline FCNN was constructed and assessed, and an interpretable framework was created in which activation patterns along the layers were visualized. We then further studied the activation dynamics through experiments with partial network connections, underfitting, and overfitting, in order to investigate the relationship among sparsity, generalization, and interpretability of the network. Based on these results, the study introduces three activation adaptation strategies: a thresholding method based on qualitative analysis of heatmaps, a robust Linear moments (L-moments) method, and a probabilistic Gaussian Mixture Model (GMM) threshold determination method. All of them systematically adjust neuron activations according to individual activation magnitude, which tends to make the latent feature representation more significant in the inference phase.
Experimental results show that classification accuracy can improve significantly, both on misclassified samples and in overall model performance, with gains of up to 14% achieved without retraining. The proposed approaches provide realistic post-deployment methodologies for enhancing model performance without significant computational overhead or regulatory liability. Furthermore, the improvements in interpretability achieved through activation visualization and correction provide useful information about how deep neural networks arrive at their decisions, building trust with users. The thesis presents this “post-hoc” activation manipulation as a promising, scalable path toward improving the interpretability and usability of Deep Learning (DL) models, especially in sensitive domains where not only performance but also transparent decision-making is paramount. en_US
dc.language.iso az en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Deep learning (Machine learning) -- Interpretability en_US
dc.subject Neural networks (Computer science) -- Transparency en_US
dc.subject Artificial intelligence -- Medical applications en_US
dc.subject Medical imaging -- Data processing en_US
dc.subject Brain -- Tumors -- Diagnosis -- Data processing en_US
dc.title Refining Neural Network Interpretability through Activation Modification Techniques: An Exploration of Threshold-Based Approaches en_US
dc.type Thesis en_US
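The GMM-based threshold determination mentioned in the abstract can be sketched roughly as follows. This is a hypothetical illustration of the general idea (fit a mixture to a layer's activation magnitudes, derive a threshold separating weak from strong units, and suppress sub-threshold activations at inference), not the thesis's actual implementation; the function names, the midpoint thresholding rule, and the synthetic data are all assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_threshold(activations, seed=0):
    """Fit a 2-component GMM to activation magnitudes and return the
    midpoint between the two component means as a threshold separating
    'weak' from 'strong' activations. (Midpoint rule is an assumption.)"""
    a = np.abs(np.asarray(activations)).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=seed).fit(a)
    means = np.sort(gmm.means_.ravel())
    return float(means.mean())

def adjust_activations(activations, threshold):
    """Zero out activations below the threshold -- a post-hoc
    adjustment applied at inference, with no retraining."""
    out = np.asarray(activations).copy()
    out[np.abs(out) < threshold] = 0.0
    return out

# Synthetic layer activations: mostly weak noise plus a strong cluster.
rng = np.random.default_rng(0)
acts = np.concatenate([rng.normal(0.05, 0.02, 900),
                       rng.normal(0.80, 0.10, 100)])
t = gmm_threshold(acts)
sparse = adjust_activations(acts, t)
```

In this toy setting the GMM separates the two activation regimes, so the adjusted layer keeps only the roughly 100 strongly activated units, yielding the sparser, more interpretable latent representation the abstract refers to.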




