Stock Price Manipulation in the Iran Stock Market Using VAE-LSTM Hybrid Model

Document Type : Original Article

Authors

1 Ph.D. Candidate in Financial Engineering, Qom branch, Islamic Azad University, Qom, Iran.

2 Assistant Prof., Department of System and Productivity Management, Tarbiat Modares University, Tehran, Iran

3 Assistant Prof., Department of Accounting, Qom branch, Islamic Azad University, Qom, Iran.

4 Assistant Prof., Department of Financial Management, University of Kharazmi, Tehran, Iran.

10.48308/jfmp.2024.105131

Abstract

Purpose: The stock market, as one of the main sectors of a country's economy, plays an important role in the development and expansion of economic activity. With advances in technology and increasingly complex trading algorithms, stock manipulation has become easier, which makes it inevitable that supervisory institutions use tools such as artificial intelligence and deep learning to identify it. The aim of this research is to identify stock manipulation in the Iran stock market. For this purpose, data on 73 stocks from 19 industries listed on the stock exchange during 1398 to 1402, approximately 71,300 trading days in total, were used.
Method: Identifying manipulation in stock transactions is a significant challenge because stock price data are temporally correlated and dynamic. The challenge is exacerbated by the unavailability of labeled data. Because the supervisory authority of the Iran stock market does not announce which stocks have been manipulated, the data were labeled in two ways: 1) Manipulated stocks and the exact dates of manipulation were determined using statistical tests such as abnormal returns. 2) Random data simulating the stock manipulation pattern were injected into the time series of stocks that, according to an expert questionnaire, had with high confidence not been manipulated. In the next step, the VAE-LSTM algorithm, which combines a variational autoencoder (VAE) with long short-term memory (LSTM) and calculates the probability of stock manipulation, was designed and compared with machine learning models such as decision tree, random forest, and logistic regression.
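The two labeling steps described in the Method can be illustrated with a minimal Python sketch. It is not the authors' procedure: the triangular price profile, the function names (`inject_pump_and_dump`, `abnormal_return_flags`), and the parameters (`ramp_days`, `peak_gain`, `window`, `threshold`) are all assumptions made for illustration, and the abnormal-return test is a simple trailing z-score rather than whatever statistical test the study used.

```python
from statistics import mean, stdev


def simple_returns(prices):
    """Daily simple returns, padded with 0.0 so indices align with prices."""
    return [0.0] + [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def inject_pump_and_dump(prices, start, ramp_days=5, peak_gain=0.30):
    """Overlay a hypothetical rise-then-fall pattern on a clean price series.

    Returns the modified prices and a 0/1 label per day (1 = inside the
    injected window). The triangular shape is an illustrative assumption.
    """
    out = list(prices)
    labels = [0] * len(prices)
    for k in range(2 * ramp_days):
        t = start + k
        if t >= len(out):
            break
        if k < ramp_days:
            frac = (k + 1) / ramp_days                    # rising leg
        else:
            frac = (2 * ramp_days - 1 - k) / ramp_days    # falling leg
        out[t] = prices[t] * (1 + peak_gain * frac)
        labels[t] = 1
    return out, labels


def abnormal_return_flags(returns, window=20, threshold=3.0):
    """Flag days whose return lies more than `threshold` standard deviations
    from the mean of the preceding `window` returns."""
    flags = []
    for t, r in enumerate(returns):
        history = returns[max(0, t - window):t]
        if len(history) < window:
            flags.append(False)  # not enough history to judge
            continue
        mu, sigma = mean(history), stdev(history)
        flags.append(sigma > 0 and abs(r - mu) > threshold * sigma)
    return flags


# Deterministic toy series standing in for a stock judged "clean".
clean = [100 + 0.1 * ((i * 7) % 5 - 2) for i in range(60)]
manipulated, labels = inject_pump_and_dump(clean, start=40)
flags = abnormal_return_flags(simple_returns(manipulated))
```

Here `labels` plays the role of the synthetic ground truth, while `flags` shows how an abnormal-return rule might pick up the first day of the injected pattern on an otherwise quiet series.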
Findings: After running the models, the accuracy and recall indices and the F1 and F2 scores were calculated. Because the classification of manipulated and unmanipulated stocks is not of equal importance in the capital market, the F2 performance evaluation index was used to rank the models. In descending order of performance, the models ranked as follows: VAE-LSTM, decision tree, random forest, multilayer neural network, support vector machine, and logistic regression, with approximate F2 values of 72%, 69%, 50%, 41%, 40%, and 26%, respectively. The decision tree model, ranked immediately after the VAE-LSTM hybrid model, also shows a good balance between the accuracy and recall indices. This suggests that one of the most effective ways to identify manipulation may be to use predetermined rules that are extracted by decision tree models and can be updated at different time intervals.
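To make the choice of F2 concrete, the following Python sketch computes the F-beta score in its standard form, which is defined from precision and recall and weights recall more heavily when beta is greater than 1. The abstract does not state the formula it used, so this standard definition is an assumption, and the input values below are made up for illustration rather than taken from the study.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta = 2 gives the F2 score.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical classifiers with mirrored precision and recall.
high_recall = f_beta(precision=0.5, recall=1.0, beta=2.0)
high_precision = f_beta(precision=1.0, recall=0.5, beta=2.0)

# With beta = 1 the two are indistinguishable; with beta = 2 they are not.
f1_a = f_beta(0.5, 1.0, beta=1.0)
f1_b = f_beta(1.0, 0.5, beta=1.0)
```

Under F1 the two hypothetical classifiers tie, whereas under F2 the one that misses fewer manipulated stocks scores higher, which matches the abstract's reasoning that missing a manipulation is the costlier error.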
Conclusion: Based on the F2 performance evaluation index, the proposed model showed a better ability to detect manipulation than the other models. The other machine learning models used in this study also performed well, especially on the accuracy index, but they performed poorly on the more important recall index. After selecting the proposed model, and based on the Tehran Stock Exchange's total index, we treated the period from 1398/12/01 to 1399/05/31 as the capital market's bullish period, the period from 1399/05/21 to 1399/08/20 as its bearish period, and the year 1400 as its equilibrium period. As expected, the probability of manipulation is highest in the bullish market, followed by the balanced market and then the bearish market. These results are generally consistent with previous studies and are conceptually consistent with reality. Since short selling is not possible in the Iranian capital market, manipulators can profit only by “raising the price and emptying” (pump and dump); the method of “lowering the price and buying back” is not available to them. Therefore, in a bear market, changing the trend requires a lot of resources, which reduces the incentive to manipulate a share.

Keywords

