Low-Quality Position-Agnostic Image Recognition Using Sequence Models

Jamalova, Narmin

Library MyADA ADA University

Home
→
CB5. ADA Theses, Dissertations and Final Projects
→
School of Information Technologies and Engineering
→
View Item

Low-Quality Position-Agnostic Image Recognition Using Sequence Models

Jamalova, Narmin

URI: http://hdl.handle.net/20.500.12181/732

Date: 2022-04

Abstract:

Information contained in printed or digitized chemical structures is required to research and develop new chemical products. Currently, few automatic recognition and translation systems of structural formulas exist in the industry leading to a lot of manual effort spent on their identification and analysis. Many older printed publications remain on paper due to the amount of effort required to accurately translate them to a computer-friendly format. Machine learning solutions are in development to address this problem yet such issues as drawing style variations and low-quality are not well-accounted for in existing research. This leads to unstable model predictions of incoming data from older sources. Hence, the purpose of this study is to develop a low-quality and position-agnostic chemical structure recognition model to address the problem. The study analyzes and develops several feature extraction methods, custom augmentations that preserve correct textual orientations, applications of random noise to translate images to sequences using LSTM networks with attention. The results show that InceptionV3 extraction method performs significantly better than Autoencoders due to its depth and several differently scaled filters. The baseline image-to-sequence model achieves a minimum Levenshtein score of circa 19 characters on the validation set, which constitutes approximately a 10% error rate. Custom augmentations and lowering of image quality do not significantly impact the score, which can be due to text ordering, random placement of noise and model overfitting on the original dataset.

Show full item record