Brito Chacón, Eduardo Alfredo: Explainable Resource-Aware Representation Learning via Semantic Similarity. - Bonn, 2023. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online-Ausgabe in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-72981
@phdthesis{handle:20.500.11811/11174,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-72981},
doi = {https://doi.org/10.48565/bonndoc-173},
author = {Brito Chacón, Eduardo Alfredo},
title = {Explainable Resource-Aware Representation Learning via Semantic Similarity},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2023,
month = dec,

note = {The rapid advancement of artificial intelligence (AI) systems in recent years is largely due to the impressive capabilities of artificial neural networks. Their performance in natural language understanding and computer vision has paved the way for the wide adoption of AI solutions. However, these models often demand significant computational resources and operate as "black boxes", limiting their utility in sensitive domains, such as finance and healthcare, where strict personal data protection regulations apply.
This thesis addresses the triadic trade-off between accuracy, explainability, and resource consumption in the context of supervised learning, with an emphasis on representation learning for text applications. It starts by presenting three use cases: semantic segmentation for autonomous driving, sentiment analysis via language models, and text summary evaluation. These cases not only underscore the need for robust evaluation techniques to enhance system trustworthiness but also highlight their limitations, motivating the development of RatVec, an explainable, resource-efficient framework leveraging kernel PCA and k-nearest neighbors, which is presented subsequently. RatVec demonstrates competitive performance under certain conditions, especially when tasks can be represented as sequence similarity problems, e.g., protein family classification. For situations where RatVec is less suitable, such as text classification, the thesis proposes an analogous pipeline using Transformer-based text representations. This approach, when fine-tuned, approximates the accuracy of purely neural models while maintaining architectural explainability, and enables granular explanations of semantic similarity via a novel technique of pairing contextualized best-matching tokens.
In sum, this thesis advances the pursuit of trustworthy AI systems by introducing RatVec, a resource-efficient, explainable framework best suited to settings that are naturally translatable to sequence similarity problems, and by proposing an explainable Transformer-based pipeline for text classification tasks. These advancements address some of the challenges of deploying AI in sensitive domains and suggest several promising avenues for future research.},

url = {https://hdl.handle.net/20.500.11811/11174}
}
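
The abstract above describes RatVec as a combination of kernel PCA and k-nearest neighbors over sequence similarity. The following is a minimal sketch of such a pipeline, assuming scikit-learn, a toy string-similarity kernel, and invented example sequences; it illustrates the general idea only and is not the thesis implementation.

# Minimal illustrative sketch (not the thesis code) of a RatVec-style pipeline:
# kernel PCA over a precomputed sequence-similarity matrix, followed by
# k-nearest-neighbor classification. The similarity function and toy
# sequences below are assumptions made for demonstration only.
from difflib import SequenceMatcher

import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import KNeighborsClassifier


def similarity(a: str, b: str) -> float:
    # Simple string-similarity stand-in for a domain-specific sequence kernel.
    return SequenceMatcher(None, a, b).ratio()


# Toy "protein-like" sequences with two hypothetical family labels.
train_seqs = ["MKTAYIAK", "MKTAYLAK", "GGSGGSGG", "GGSGGTGG"]
train_labels = [0, 0, 1, 1]
test_seqs = ["MKTAYIAR", "GGSGGSGA"]

# Precompute the train/train and test/train similarity (kernel) matrices.
K_train = np.array([[similarity(a, b) for b in train_seqs] for a in train_seqs])
K_test = np.array([[similarity(a, b) for b in train_seqs] for a in test_seqs])

# Kernel PCA maps the pairwise similarities to low-dimensional vectors ...
kpca = KernelPCA(n_components=2, kernel="precomputed")
X_train = kpca.fit_transform(K_train)
X_test = kpca.transform(K_test)

# ... and k-NN predicts by proximity, so each decision can be explained by
# pointing to the most similar training sequences.
knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, train_labels)
print(knn.predict(X_test))  # likely [0 1] for these toy sequences

Because the classifier's decision reduces to nearest neighbors in a space derived from pairwise similarities, an individual prediction can be justified by showing the training sequences it is closest to, which matches the kind of architectural explainability the abstract refers to.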

The following license files are associated with this item:

InCopyright