ADA Library Digital Repository

Machine Learning Repository

dc.contributor.author Maharramli, Gunay
dc.date.accessioned 2025-02-27T11:01:49Z
dc.date.available 2025-02-27T11:01:49Z
dc.date.issued 2024-04
dc.identifier.uri http://hdl.handle.net/20.500.12181/1017
dc.description.abstract This thesis proposes the design and practical implementation of a database system for storing and managing descriptive data drawn from various repositories. The project aims to give researchers, data scientists, and enthusiasts a single place to access, explore, and analyze data that is spread across different repositories. The main goal is a powerful, easy-to-use database solution that can hold a wide range of data descriptions and provide efficient storage, retrieval, and querying. MongoDB, a NoSQL database, was chosen for the implementation for the following reasons. MongoDB offers a schema-free framework, so documents with differing data types can be stored without conforming to a predefined schema. This is necessary because the structure of dataset schemas differs considerably between data repositories. In addition, MongoDB's flexibility and high performance make it a reliable choice for handling large volumes of data and for supporting concurrent access by many users. The UCI Machine Learning Repository, Kaggle, and OpenML are the main data repositories covered by this database. In the initial phase, the MongoDB database contains 100 documents. MongoDB also supports several querying tools, such as the Mongo Shell and MongoDB Compass, a graphical user interface (GUI), so users can interact with the database according to their preference, skill, and expertise. Additionally, a new GUI for querying dataset information was built in Python using Flask; it improves the overall usability and accessibility of the database system by giving users an easy-to-use interface to search, filter, and view dataset descriptions.
Broadly, the database system designed and described in this thesis is a convenient asset for the data science community. en_US
dc.language.iso en en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Database systems -- Design en_US
dc.subject Data management -- Research en_US
dc.subject Information retrieval -- Databases en_US
dc.subject Global -- Data science community en_US
dc.subject Global -- Machine learning repositories en_US
dc.title Machine Learning Repository en_US
dc.type Thesis en_US


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
