dc.contributor.author | Dadash-zada, Samir | |
dc.date.accessioned | 2023-10-16T09:51:39Z | |
dc.date.available | 2023-10-16T09:51:39Z | |
dc.date.issued | 2022-04 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12181/721 | |
dc.description.abstract | This thesis discusses and presents a range of solutions for the data collection, augmentation, and generation stages of an automatic sign-language-to-text translation system. Although these stages were implemented for an Azerbaijani Sign Language (AzSL) dataset, all of them can be applied to a different set of sign language images and videos with minimal adjustment to the parameters of the corresponding models. To reach as many subscribers as possible and collect an original set of images and videos from the public, a dedicated Telegram bot was developed and run over the course of seven months. It was shared on various social networks to maximize the variance of features for each individual sign. Multiple fixes and updates were applied to the bot to guide subscribers toward uploading images and videos of under-represented signs. The original dataset was validated by running several clustering algorithms. The clustering results revealed many similar signs, some anomalies, and feature patterns in the dataset, and led to several important takeaways. Because the dataset is intended mainly as input for machine learning models, the number of original images and videos was insufficient, especially after invalid media files were discarded. This was the main motivation for the data augmentation and generation stages. The image and video augmentation techniques used in the thesis were selected primarily for their feature preservation and speed. In addition to these augmentation techniques, a state-of-the-art approach to generating synthetic images, the Generative Adversarial Network (GAN), is also used. StyleGAN, a GAN variant designed for generating photo-realistic, high-quality images, is slightly customized to the dataset. | en_US |
dc.language.iso | en | en_US |
dc.publisher | ADA University | en_US |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
dc.subject | Data collection | en_US |
dc.subject | Azerbaijani Sign Language | en_US |
dc.subject | Generative Adversarial Network | en_US |
dc.title | Data collection, augmentation, classification, and generation for a sign language | en_US |
dc.type | Thesis | en_US |