Construction of a news article evaluation model utilizing high-frequency data and a large-scale language generation model

Authors
  • Nishi, Yoshihiro1
  • Suge, Aiko1
  • Takahashi, Hiroshi1
  • 1 Keio University, Hiyoshi 4-1-1, Kohoku-ku, Yokohama-shi, Kanagawa-ken, Japan
Type
Published Article
Journal
SN Business & Economics
Publisher
Springer International Publishing
Publication Date
Jul 21, 2021
Volume
1
Issue
8
Identifiers
DOI: 10.1007/s43546-021-00106-0
Source
Springer Nature
Disciplines
  • Original Article
License
Yellow

Abstract

News articles have a significant impact on asset prices in financial markets, and many studies have attempted to ascertain how they influence stock prices. News articles have been reported to contain sentiment and fundamental information that affects stock price fluctuations, and many studies have used them as analytical data to evaluate those fluctuations. However, the limited amount of available data is usually a hurdle for model accuracy. This study aims to improve the analytical model's accuracy by generating news articles with language generation technology. We tested whether a model trained with the generated data performed better than a model trained with real-world data. The model constructed in this research evaluates news articles distributed to financial markets based on the rate of stock price fluctuation, and predicts and evaluates stock price fluctuations. We labeled news articles based on high-frequency trading data and generated news articles using a large-scale language generation model (GPT-2), then analyzed and verified the effect. We succeeded in generating news articles with the large-scale language generation model and in improving classification accuracy. The method proposed in this paper has great potential to improve text analysis accuracy in various areas.
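
The labeling step mentioned in the abstract, assigning each news article a class from the price fluctuation rate observed in high-frequency data, could look roughly like the sketch below. The abstract does not give the details, so everything specific here is an assumption for illustration: the function name `label_article`, the 60-second window, the 0.1% threshold, and the three class names are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def label_article(ticks, published_at, window=60.0, threshold=0.001):
    """Label one article from the price change after its publication.

    ticks        -- list of (timestamp_seconds, price), sorted by timestamp
    published_at -- publication time of the article, in the same time unit
    window       -- assumed horizon after publication (hypothetical value)
    threshold    -- assumed minimum absolute fluctuation rate (hypothetical)
    """
    times = [t for t, _ in ticks]
    # Index of the last tick at or before the publication time.
    start = bisect_right(times, published_at) - 1
    # Index of the last tick at or before the end of the window.
    end = bisect_right(times, published_at + window) - 1
    if start < 0 or end <= start:
        return None  # no price observed before, or none after, publication

    base_price = ticks[start][1]
    later_price = ticks[end][1]
    rate = (later_price - base_price) / base_price  # price fluctuation rate

    if rate > threshold:
        return "positive"
    if rate < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sample = [(0, 100.0), (30, 100.5), (60, 101.0), (120, 99.0)]
    print(label_article(sample, published_at=0))
    print(label_article(sample, published_at=60))
```

Labels produced this way would serve as training targets for a classifier over the article text. The generated (GPT-2) articles described in the abstract would be added to the training set as extra examples, while evaluation would still use real articles.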
