Low-Quality Position-Agnostic Image Recognition Using Sequence Models

Jamalova, Narmin

dc.contributor.author	Jamalova, Narmin
dc.date.accessioned	2023-10-16T11:51:57Z
dc.date.available	2023-10-16T11:51:57Z
dc.date.issued	2022-04
dc.identifier.uri	http://hdl.handle.net/20.500.12181/732
dc.description.abstract	Information contained in printed or digitized chemical structures is required to research and develop new chemical products. Currently, few systems for automatically recognizing and translating structural formulas exist in industry, so much manual effort is spent on their identification and analysis. Many older printed publications remain on paper because of the effort required to translate them accurately into a computer-friendly format. Machine learning solutions are being developed to address this problem, yet issues such as drawing-style variation and low image quality are not well accounted for in existing research. This leads to unstable model predictions on incoming data from older sources. Hence, the purpose of this study is to develop a chemical structure recognition model that is robust to low image quality and agnostic to position. The study develops and analyzes several feature extraction methods, custom augmentations that preserve correct text orientation, and the application of random noise, and translates images to sequences using LSTM networks with attention. The results show that the InceptionV3 extraction method performs significantly better than autoencoders, owing to its depth and its several differently scaled filters. The baseline image-to-sequence model achieves a minimum Levenshtein score of circa 19 characters on the validation set, which constitutes approximately a 10% error rate. Custom augmentations and lowering of image quality do not significantly impact the score, which may be due to text ordering, random placement of noise, and model overfitting on the original dataset.	en_US
dc.language.iso	en	en_US
dc.publisher	ADA University	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Image-to-sequence.	en_US
dc.subject	Encoder.	en_US
dc.subject	Decoder.	en_US
dc.subject	LSTM.	en_US
dc.subject	CNN.	en_US
dc.subject	InceptionV3.	en_US
dc.title	Low-Quality Position-Agnostic Image Recognition Using Sequence Models	en_US
dc.type	Thesis	en_US
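
The abstract reports model quality as a Levenshtein score in characters and as an approximate error rate. The sketch below illustrates one way such figures could be computed. It is not the thesis code: the function names `levenshtein` and `error_rate` are hypothetical, and dividing the edit distance by the reference length is an assumption, since this record does not state how the error rate was normalized.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def error_rate(predicted: str, reference: str) -> float:
    """Edit distance divided by reference length (assumed normalization)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return levenshtein(predicted, reference) / len(reference)
```

Under this assumed normalization, a score of about 19 characters at roughly a 10% error rate would correspond to target sequences of around 190 characters.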