RELIABILITY OF SUPERVISED TOPIC MODELS OVER UNSUPERVISED TOPIC MODELS FOR THE PREDICTION TASK
Abstract
This study investigated the capacity of machine learning to perform prediction tasks. It used textual data, specifically news articles reporting the daily activity of cryptocurrency (Bitcoin) traders. This data was chosen because news articles capture crowd knowledge about trading that affects the market price trend. For prediction, 4,073 news articles scraped from the markets section of the CNBC website were pre-processed and analysed using the Latent Dirichlet Allocation (LDA) model and its variant, the Supervised Latent Dirichlet Allocation (sLDA) model. The models were trained and tested on the document-term matrix with values of "k" (the number of topics) ranging from 3 to 200. Because of the multinomial classification method, the study used four evaluation metrics: mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and R². The results showed that, for predicting labels of new unlabeled documents, the sLDA model outperformed the LDA model combined with a separate classification or regression model. The response variable, tagged "users' or traders' interest," was the daily closing price corresponding to each document.
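As an illustration of the four evaluation metrics named in the abstract, the following is a minimal sketch using their standard textbook definitions. It is not the authors' code: the function name `regression_metrics` and the sample prices are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent) and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; it uses the standard definitions,
    not an implementation taken from the study.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        # MAPE divides by the true value, so zeros are not allowed.
        raise ValueError("MAPE is undefined when a true value is zero")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")

    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Made-up closing prices and predictions, not data from the study.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE, RMSE, and MAPE indicate a closer fit, while R² closer to 1 indicates that more of the variance in the response is explained, so a comparison such as sLDA against LDA plus a downstream model would read all four together.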
License
Copyright (c) 2024 Science World Journal
This work is licensed under a Creative Commons Attribution 4.0 International License.