
Unsupervised generation of tradable topic indices through textual analysis

[Figure: A schematic representation of the topic index algorithm]

FAYETTEVILLE, GA, UNITED STATES, April 27, 2025 /EINPresswire.com/ -- Researchers have developed a new way of investing in stocks using natural language processing. Using dynamic topic modeling, a variant of Latent Dirichlet Allocation, the new model uncovers hidden risk factors directly from company reports and translates them into tradable indices with minimal human oversight. Investors can trade these risk factors directly and track industry trends using only the information contained in the text of the reports.

A recent article in The Journal of Finance and Data Science introduces a method for constructing investment instruments directly from financial reports, without the need for human intervention.

The approach employs dynamic topic modeling (DTM), a variant of Latent Dirichlet Allocation (LDA), to analyze companies' annual and quarterly reports, uncovering hidden risk factors and translating them into tradable indices.
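The release does not spell out how topic loadings become index weights. As a minimal sketch, assume each company's report has already been scored against each topic by a topic model, and that an index weights companies in proportion to their loading on one topic. The helper name `topic_index_weights`, the `min_loading` cutoff, and the toy loadings are illustrative assumptions, not the authors' method.

```python
from typing import Dict


def topic_index_weights(
    loadings: Dict[str, Dict[str, float]],
    topic: str,
    min_loading: float = 0.1,
) -> Dict[str, float]:
    """Turn per-company topic loadings into index weights for one topic.

    Hypothetical helper: companies whose loading on `topic` is below
    `min_loading` are excluded; the rest are weighted proportionally
    so that the weights sum to 1.
    """
    selected = {
        company: topics.get(topic, 0.0)
        for company, topics in loadings.items()
        if topics.get(topic, 0.0) >= min_loading
    }
    total = sum(selected.values())
    if total == 0:
        return {}  # no company is exposed to this topic
    return {company: value / total for company, value in selected.items()}


# Made-up loadings: share of each report attributed to each topic.
toy_loadings = {
    "AlphaBank": {"credit_risk": 0.6, "energy": 0.05},
    "BetaOil": {"credit_risk": 0.05, "energy": 0.8},
    "GammaGrid": {"credit_risk": 0.2, "energy": 0.2},
}

weights = topic_index_weights(toy_loadings, "energy")
print(weights)
```

In this toy case AlphaBank falls below the cutoff for the "energy" topic, so the index holds only BetaOil and GammaGrid. A real index would also need rebalancing rules and liquidity constraints, which are outside the scope of this sketch.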

"The beauty of this method lies in its simplicity and transparency; it combines several established algorithms to achieve what was previously not possible," says co-author Marcel Lee. "By automating the process, we eliminate biases and provide a cost-effective alternative to traditional index construction."

This unsupervised technique selects its own parameters automatically and discovers implicit risk factors through semantic analysis of corporate publications, creating a new class of investment instruments: thematic indices.

The study shows that the model can track economic and industrial trends over time, and that sectors often considered static are in reality constantly evolving. The method captures this fluid nature of industries more accurately than traditional static classifications such as GICS or ICB.

"We're observing the industrial landscape through a much sharper and multicoloured lens, enabling investors to tap into nuanced market themes and risk factors previously inaccessible," adds co-author Alan Spark.

In several cases, the research demonstrated that these newly created thematic indices closely mimic established indices, yet are derived without the predefined biases of manual classification systems. “This not only paves the way for a more unbiased benchmarking tool but also reveals industry trends and vocabulary shifts over time, offering a fresh perspective on sectoral dynamics,” says Lee.
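The release does not say how closeness to an established index was measured. One common way to quantify it, shown here purely as an assumed illustration, is the Pearson correlation between the two return series. The return figures below are made up for the example and are not results from the study.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length return series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up periodic returns for a thematic index and a benchmark index.
thematic = [0.010, -0.004, 0.007, 0.002, -0.006]
benchmark = [0.009, -0.005, 0.008, 0.001, -0.007]

rho = pearson(thematic, benchmark)
print(f"correlation: {rho:.3f}")
```

A value near 1 would indicate that the two indices move together; it says nothing about whether their levels or volatilities match, so a fuller comparison would also look at tracking error.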

One notable limitation acknowledged by the researchers is the approach’s reliance on a ‘bag-of-words’ model, which makes large datasets tractable but ignores word order and therefore overlooks the nuanced relationships between words. “Future iterations of this work aim to incorporate more complex models that capture these subtleties, potentially enhancing the predictive power of thematic indices on corporate actions and industry shifts,” says Spark.
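The bag-of-words limitation can be illustrated with a small sketch. The tokenizer below is a deliberately simple assumption (lowercase alphabetic tokens only), not the preprocessing used in the study, and the two sentences are invented for the example.

```python
from collections import Counter
import re


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens, discarding order and punctuation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


a = "The bank acquired the supplier"
b = "The supplier acquired the bank"

# The sentences describe opposite transactions, yet their word counts match.
print(bag_of_words(a) == bag_of_words(b))  # True: word order is lost
```

Because the two counts are identical, any model built only on them cannot tell which company was the acquirer, which is the kind of relationship the researchers say future models should capture.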

References
DOI: 10.1016/j.jfds.2025.100149

Original Source URL: https://doi.org/10.1016/j.jfds.2025.100149

Lucy Wang
BioDesign Research


