Machine learning methodology - predicting pharmaceutical market response to clinical trial releases
Pharmaceutical companies operate in a tightly regulated, high-risk environment where even a minor misstep can have significant financial consequences. As a result, announcements of clinical trial results are closely watched by the public, because they can strongly affect subsequent developments.
Most studies analyze the impact of announcements on company stock prices after the fact and do not treat the problem from a predictive perspective. This study aims to close that gap by presenting a framework that predicts the numerical value of the stock price change caused by an announcement. In essence, the problem is to forecast the effect of a specific event on its associated time series.
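To make the notion of "the effect of an event on its associated time series" concrete, the following is a minimal sketch of one possible target quantity: the simple return over a short window around the announcement. The function name `event_window_return`, the window lengths `pre` and `post`, and the toy prices are illustrative assumptions, not the definition used in the study.

```python
from typing import Sequence


def event_window_return(prices: Sequence[float], event_idx: int,
                        pre: int = 1, post: int = 1) -> float:
    """Simple return from `pre` steps before the event to `post` steps after it.

    The window lengths are assumptions chosen for illustration only.
    """
    start, end = event_idx - pre, event_idx + post
    if start < 0 or end >= len(prices):
        raise ValueError("event window falls outside the price series")
    return prices[end] / prices[start] - 1.0


# Toy daily closing prices; the announcement is assumed to land on index 3.
closes = [100.0, 101.0, 100.0, 90.0, 88.0, 89.0]
window_return = event_window_return(closes, event_idx=3)
```

Here the window runs from the close before the announcement to the close after it, so the toy series yields a return of roughly -12%. A predictive framework would try to estimate such a value before it is observed.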
The framework combines a BERT model to extract the sentiment polarity of announcements, a Temporal Fusion Transformer to forecast the expected return, a graph convolution network to capture links between events, and gradient boosting to predict price changes. We process a dataset from the Food and Drug Administration (FDA) that is among the largest by size: it comprises 5436 clinical trial releases from 681 distinct companies, covering the period from 2018 to 2022. The study yields several results and domain-specific insights. First, we gather statistical evidence on the impact of publicizing clinical results on the market value of pharmaceutical companies. Second, we observe distinct response patterns for positive and negative announcements, with a stronger and more pronounced reaction to negative clinical news.
In addition, we identify two factors that are essential in a predictive framework: the size of the company's drug portfolio, where limited diversification across drug products implies greater vulnerability to announcements; and the announcement network, which improves predictive accuracy by exploiting the links between events within the same company or medical field. Finally, we demonstrate the effectiveness of the forecasting setup by achieving scores that are mostly above 0.7 when classifying price changes using historical data. We emphasize the applicability and generality of the framework to other datasets and domains, provided that two components are available: events and their corresponding time series.
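As an illustration of the announcement network, the sketch below links two announcements whenever they share a company or a medical field, which is the relation described above. The `Announcement` record, the function `build_announcement_graph`, and the sample events are hypothetical; the actual graph construction fed to the graph convolution network may differ (for example, it may weight or time-order the edges).

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Set


class Announcement(NamedTuple):
    event_id: str
    company: str
    medical_field: str


def build_announcement_graph(events: List[Announcement]) -> Dict[str, Set[str]]:
    """Undirected adjacency: two events are linked if they share a company or a field."""
    graph: Dict[str, Set[str]] = {e.event_id: set() for e in events}
    for a, b in combinations(events, 2):
        if a.company == b.company or a.medical_field == b.medical_field:
            graph[a.event_id].add(b.event_id)
            graph[b.event_id].add(a.event_id)
    return graph


# Hypothetical events, for illustration only.
events = [
    Announcement("e1", "FirmA", "oncology"),
    Announcement("e2", "FirmA", "cardiology"),
    Announcement("e3", "FirmB", "oncology"),
    Announcement("e4", "FirmC", "neurology"),
]
graph = build_announcement_graph(events)
```

In this toy example, `e1` is connected to `e2` through the shared company and to `e3` through the shared medical field, while `e4` remains isolated. An adjacency structure of this kind is the usual input from which a graph convolution layer aggregates information across neighboring events.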
Extracting sentiment polarity from clinical releases. Detecting sentiment polarity is crucial for predicting the change in market value caused by a public clinical announcement. We consider three polarity classes: the set of positive announcements, the set of negative announcements, and the set of neutral announcements.
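The three-class assignment can be sketched as follows, assuming that some upstream classifier (such as a fine-tuned BERT model, not shown here) has already produced one score per class for an announcement. The function `polarity_from_scores` and the example scores are hypothetical and only illustrate the final labeling step.

```python
from typing import Dict

POLARITY_CLASSES = ("positive", "negative", "neutral")


def polarity_from_scores(scores: Dict[str, float]) -> str:
    """Return the polarity class with the highest score.

    Ties are resolved in the order of POLARITY_CLASSES (an arbitrary choice).
    """
    missing = [c for c in POLARITY_CLASSES if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return max(POLARITY_CLASSES, key=lambda c: scores[c])


# Hypothetical class scores for a single announcement.
label = polarity_from_scores({"positive": 0.1, "negative": 0.7, "neutral": 0.2})
```

With these scores, the announcement would be placed in the negative set.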
Following common practice in machine learning, we aim to build a feature space that combines valuable information from multiple data sources in order to obtain a high-quality predictive solution. Within our framework, the feature space is defined as the set of parameters that influence the change in stock price following an announcement. We therefore construct new features belonging to one of three domains: market, company, or announcement. Market and company features cover the business and financial aspects of the company's operations. Announcement features include sentiment polarity and attributes related to the medical field.
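One way to organize such a feature space is to keep the three domains as separate records and flatten them into a single row per announcement. The sketch below assumes this layout; all field names (`index_return_30d`, `market_cap_usd`, and so on) are hypothetical placeholders rather than the features actually used in the study.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class MarketFeatures:
    index_return_30d: float      # hypothetical market-level feature


@dataclass(frozen=True)
class CompanyFeatures:
    market_cap_usd: float        # hypothetical financial feature
    drug_portfolio_size: int     # number of drug products in the portfolio


@dataclass(frozen=True)
class AnnouncementFeatures:
    polarity: str                # "positive", "negative", or "neutral"
    medical_field: str           # hypothetical medical attribute


def to_feature_row(market: MarketFeatures, company: CompanyFeatures,
                   announcement: AnnouncementFeatures) -> Dict[str, object]:
    """Flatten the three feature groups into one row with prefixed column names."""
    row: Dict[str, object] = {}
    groups = (("market", market), ("company", company), ("announcement", announcement))
    for prefix, group in groups:
        for name, value in asdict(group).items():
            row[f"{prefix}.{name}"] = value
    return row


row = to_feature_row(
    MarketFeatures(index_return_30d=0.02),
    CompanyFeatures(market_cap_usd=1.5e9, drug_portfolio_size=3),
    AnnouncementFeatures(polarity="negative", medical_field="oncology"),
)
```

Prefixing each column with its domain keeps the origin of every feature visible, which helps when inspecting which group contributes most to a prediction.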