ADA Library Digital Repository

Data collection, augmentation, classification, and generation for a sign language


dc.contributor.author Dadash-zada, Samir
dc.date.accessioned 2023-10-16T09:51:39Z
dc.date.available 2023-10-16T09:51:39Z
dc.date.issued 2022-04
dc.identifier.uri http://hdl.handle.net/20.500.12181/721
dc.description.abstract This thesis presents a wide range of solutions for the data collection, augmentation, and generation stages of an automatic sign-language-to-text translation system. Although these stages were implemented for the Azerbaijani Sign Language (AzSL) dataset, all of them can be applied to a different set of sign language images and videos with minimal adjustment to the parameters of the corresponding models. To reach as many contributors as possible and collect an original set of images and videos from the public, a dedicated Telegram bot was developed and operated over the course of seven months. It was shared on various social networks to maximize the variance of features for each individual sign. Multiple fixes and updates were applied to the bot to steer subscribers toward uploading images and videos of the scarcer signs. The original dataset was validated by running several clustering algorithms; the clustering results revealed many similar signs, some anomalies, and feature patterns in the dataset, and led to several important takeaways. Since the dataset is mainly intended to feed machine learning models, the number of original images and videos was insufficient, especially after invalid media files were dropped. This was the main motivation for the data augmentation and generation stages. The image and video augmentation techniques used in the thesis were selected primarily for their feature preservation and speed. In addition to these augmentation techniques, a state-of-the-art technique for generating synthetic images, the Generative Adversarial Network (GAN), is also used. A specific GAN variant called StyleGAN, which is mainly used for generating photo-realistic high-quality images, is slightly customized to the dataset. en_US
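The abstract does not give the exact augmentation pipeline; as a minimal sketch of the kind of feature-preserving, fast transforms it describes, the following pure-Python example applies a horizontal flip plus brightness jitter to a grayscale image represented as a list of rows of pixel values (the function name and parameters are illustrative, not from the thesis):

```python
import random

def augment(image, brightness_range=(0.8, 1.2)):
    """Return a horizontally flipped, brightness-jittered copy of a
    grayscale image (list of rows of 0-255 pixel values).

    Both transforms preserve the hand shape that identifies a sign
    while adding variance to the training data.
    """
    factor = random.uniform(*brightness_range)
    return [
        [min(255, int(pixel * factor)) for pixel in reversed(row)]
        for row in image
    ]

# With the brightness factor pinned to 1.0, only the flip is visible.
sample = [[10, 200], [30, 120]]
print(augment(sample, brightness_range=(1.0, 1.0)))  # [[200, 10], [120, 30]]
```

Real pipelines would typically add rotations, crops, and color shifts as well, subject to the same constraint that the sign's distinguishing features survive the transform.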
dc.language.iso en en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Data collection. en_US
dc.subject Azerbaijani Sign Language. en_US
dc.subject Generative Adversarial Network. en_US
dc.title Data collection, augmentation, classification, and generation for a sign language en_US
dc.type Thesis en_US


