Abstract:
The problem of optical chemical structure recognition has been tackled by various researchers using both rule-based and machine learning approaches. However, there is still no viable end-to-end pipeline with sufficiently high accuracy. The approaches tried in this research include Transformer-based models as well as image manipulation techniques. The research focuses on applying the attention mechanism of the Transformer architecture, together with transfer learning, to arrive at results with a low Levenshtein distance, a measure of the difference between the actual and predicted labels of chemical images. The label for each image in this study is its InChI (International Chemical Identifier) string.
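For concreteness, the Levenshtein distance between a predicted and a ground-truth InChI string can be computed with the classic dynamic-programming recurrence. The Python sketch below is an illustrative reference implementation, not the study's evaluation code:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Classic dynamic programming, keeping only one row of the table.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Example: one substituted character gives distance 1.
# levenshtein("InChI=1S/CH4/h1H4", "InChI=1S/CH3/h1H4") -> 1
```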
Several setups were tried, including Vision Transformers combined with vanilla decoders, and an EfficientNetV2 backbone with a Transformer encoder and decoder. The study suggests that EfficientNetV2 coupled with the Transformer architecture produces the best results for the chemical images in the Bristol-Myers Squibb dataset, published electronically in 2021.
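The best-performing setup can be sketched in PyTorch. This is a hypothetical reconstruction assuming only the components named above (a pretrained EfficientNetV2 backbone via timm, plus a Transformer encoder and decoder); the model variant, dimensions, and layer counts are illustrative assumptions, and positional encodings are omitted for brevity:

```python
import torch.nn as nn
import timm

class OCSRModel(nn.Module):
    """Sketch: CNN features -> Transformer encoder -> autoregressive decoder."""
    def __init__(self, vocab_size, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        # Pretrained EfficientNetV2 backbone (transfer learning); with
        # features_only=True, calling it returns a list of feature maps.
        self.backbone = timm.create_model(
            "tf_efficientnetv2_s", pretrained=True, features_only=True)
        feat_dim = self.backbone.feature_info.channels()[-1]
        self.proj = nn.Linear(feat_dim, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
            num_layers)
        self.embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True),
            num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, images, tokens):
        # images: (B, 3, H, W); tokens: (B, T) shifted-right InChI token ids.
        feats = self.backbone(images)[-1]           # (B, C, h, w)
        mem = feats.flatten(2).transpose(1, 2)      # (B, h*w, C) token grid
        mem = self.encoder(self.proj(mem))
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(
            tokens.size(1)).to(tokens.device)
        out = self.decoder(self.embed(tokens), mem, tgt_mask=mask)
        return self.head(out)                       # (B, T, vocab_size) logits
```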
Additionally, resizing with padding instead of stretching produces significantly better results because it prevents information loss. Background and foreground inversion also appears to improve the results.
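Both preprocessing steps are simple to express; the sketch below assumes OpenCV and grayscale inputs, with the target size and padding value chosen for illustration:

```python
import cv2
import numpy as np

def resize_with_padding(img: np.ndarray, size: int = 384) -> np.ndarray:
    """Scale the longer side to `size` and pad the rest, so the aspect
    ratio of the drawing is preserved instead of stretched."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    img = cv2.resize(img, (int(w * scale), int(h * scale)))
    h, w = img.shape[:2]
    top, left = (size - h) // 2, (size - w) // 2
    return cv2.copyMakeBorder(img, top, size - h - top, left, size - w - left,
                              cv2.BORDER_CONSTANT, value=255)  # white padding

def invert(img: np.ndarray) -> np.ndarray:
    """Swap background and foreground: dark strokes on a white page
    become bright strokes on a black background."""
    return 255 - img
```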
Further work is recommended to increase the number of training epochs and to generalize the results to the full dataset rather than the sample used in this study.