Published on: 2019-09-09 | Updated on: 2019-09-09

Journal of Autonomous Intelligence

Article: A Method to identify anomalies in stock market trading based on Probabilistic Machine Learning.


Financial operations involve a significant amount of resources and can directly or indirectly affect the lives of virtually all people. To keep these operations efficient and transparent, it is essential to identify financial crimes and punish those responsible. However, the large number of operations makes analysis performed exclusively by humans infeasible.

Thus, the application of automated data analysis techniques is essential. Within this scenario, this work presents a method that identifies anomalies that may be associated with stock exchange operations prohibited by law. Specifically, the authors seek to find patterns related to insider trading. These types of operations can generate large losses for investors.

Paulo Andre Lima de Castro and Anderson R.B. Teodoro present a new method to identify anomalies in stock market trading in a paper published in the Journal of Autonomous Intelligence.

In this paper, they use publicly available information from the SEC and CVM, based on real cases on the BOVESPA, NYSE and NASDAQ stock exchanges, as a training base. The method includes the creation of several candidate variables and the identification of the most relevant ones. With this definition, classifiers based on decision trees and Bayesian networks are constructed, evaluated and then selected. The computational cost of performing such tasks can be quite significant, and it grows quickly with the amount of data analyzed. For this reason, the method considers the use of machine learning algorithms distributed across a computational cluster. To perform these tasks, they use the WEKA framework with modules that allow the processing load to be distributed across a Hadoop cluster. The use of computational clusters to run learning algorithms on large amounts of data has been an active area of research.
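
The step of creating candidate variables and keeping the most relevant ones can be illustrated with a minimal sketch. It assumes information gain as the relevance criterion, which is a common choice when building decision trees; this summary does not state which criterion the authors actually used. The variable names (volume_spike, price_jump, weekday) and the toy records are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Reduction in label entropy obtained by splitting on one discrete variable."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_variables(rows, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    names = list(rows[0].keys())
    gains = {n: information_gain([r[n] for r in rows], labels) for n in names}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one row per trading window before a disclosure.
rows = [
    {"volume_spike": "yes", "price_jump": "yes", "weekday": "mon"},
    {"volume_spike": "yes", "price_jump": "yes", "weekday": "tue"},
    {"volume_spike": "yes", "price_jump": "no", "weekday": "mon"},
    {"volume_spike": "yes", "price_jump": "yes", "weekday": "tue"},
    {"volume_spike": "no", "price_jump": "no", "weekday": "mon"},
    {"volume_spike": "no", "price_jump": "no", "weekday": "tue"},
    {"volume_spike": "no", "price_jump": "yes", "weekday": "mon"},
    {"volume_spike": "no", "price_jump": "no", "weekday": "tue"},
]
# Hypothetical labels: whether the window was later tied to a confirmed case.
labels = ["suspicious"] * 4 + ["normal"] * 4

ranking = rank_variables(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy data, volume_spike separates the two classes completely and therefore ranks first, while weekday carries no information about the label and ranks last. A real pipeline would apply the same idea to many more candidate variables before training the decision tree and Bayesian network classifiers.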

This work contributes to the analysis of data in the specific context of financial operations. The obtained results show the feasibility of the approach, although the quality of the results is limited by the exclusive use of publicly available data.

For more information, please visit: