Hybrid Forex prediction model using multiple regression, simulated annealing, reinforcement learning and technical analysis

ABSTRACT

The exchange rates of the major currencies on Tuesday, February 5th, 2019 at 15:13 UTC

STATE OF THE ART
The Forex market is a dynamic environment, sensitive to any event that can disrupt its stability through sudden changes in currency exchange rates. Speculation is therefore a risky operation that can hold unexpected surprises, and prediction is an essential tool for obtaining an overview of the market trend over the following hours or days: it tells the trader when and how to react in order to avoid losing money and to maximize the profit of an investment in Forex. The literature contains a large body of work that applies different techniques and methods with success [4]-[6]. The model proposed in this paper uses several complementary techniques with the aim of obtaining good prediction accuracy for currency exchange rates. Its added value consists in combining several types of machine learning (supervised learning and reinforcement learning (RL)) [7], [8], an optimization method [9], and technical analysis [10], [11] through the relative strength index (RSI) indicator. The proposed model not only helps to make an investment decision but also calculates the investment profit for a specific capital. This section provides an overview of the literature related to currency exchange rate prediction and Forex investment.

Investing in Forex
Making decisions in business and finance always involves the problem of risk. Reliability and risk are frequently invoked together when analyzing processes, system conditions and development sustainability. However, these categories are often conceived as opposites: when risk increases, reliability decreases, and conversely, when risk decreases, reliability increases [12]. Because trading in the Forex market is a risky business, an investment in this market must begin with a market study that includes three essential elements: money management, fundamental analysis, and technical analysis.
The first element, money management, helps to determine investment risks in advance, to develop and improve discipline, and to take trading to the next level. It protects the investment portfolio and refers to the ability to manage earnings and trades so as not to take risks outside the investment strategy. The second element of a market study is fundamental analysis. It is based on five major factors that have a direct influence on currency exchange rates and that can help to anticipate the Forex trend and make a reasonably fair prediction: i) economic growth, ii) geopolitics or political stability, iii) monetary policy, iv) imports and exports, and v) interest rates. It is very important to consult the economic and political calendar in order to establish a global view of the fundamental factors that influence the economic market and, consequently, the currency pair to invest in. The economic calendar is based on the GDP, economic growth and inflation rate of the countries concerned: if there is good economic growth, an increase in the exchange rate can be anticipated, and vice versa. In addition, the interest rate of a country is the most important fundamental factor in determining the exchange rate of its currency, because when a country raises its interest rate, investors expect the currency to appreciate and therefore buy more of it. Fundamental, or macro-economic, analysis of the Forex market makes it possible to assess currency prices and anticipate future market trends by relying on economic publications from countries around the world. It is used as a tool to predict prices in the medium and long term [13], which is not the horizon of our study, so it is not used in the proposed model.
Indeed, several times a day, economic data is published by financial institutions, governments, central banks or even certain private organizations, sometimes having significant impacts on the evolution of financial market prices.
The third element, technical analysis, appeared in Japan during the 16th century, where it aimed to forecast the evolution of the price of rice. Its rise was reinforced in the 1970s by the appearance of indicators and techniques and by the development of computers, which makes modeling more and more sophisticated. In the case of Forex, technical analysis is used to examine changes in the market by analyzing signals and then interpreting those signals in order to open, close or modify a buy or sell transaction.
Technical analysis is a method that tries to predict Forex movements by looking at historical market data, including prices, traded volumes and open interest, and by identifying trends and significant price levels with a high probability of rebounds, such as supports and resistances. It rests essentially on the idea that certain market configurations are cyclical and that price action repeats over time. Technical analysts and traders use technical and mathematical indicators to make their investment decisions. These indicators are viewed in real time on charts that are interpreted to identify buying or selling opportunities.
In contrast to fundamental analysis, which examines economic factors and long-term trends, technical analysis is the method most used by Forex traders to predict short-term currency price movements, such as weekly, daily or even hourly predictions [13]. Technical analysis can help to identify the trend and to assess its strength and stability over time. It can also increase discipline and decrease the influence of emotions in a trading plan. While no system can guarantee that the Forex trend is identified with 100% certainty, technical analysis can help the trader create a trading plan and follow it more objectively. Forecasting in Forex is a statistical process that helps to inform decisions on currency rate prediction and, therefore, on short-, medium- and long-term investment in Forex.
Exchange rate forecasting has a direct effect on the rates themselves. If a well-publicized forecast announces a rise in an exchange rate, investors will immediately adjust the price they are willing to pay, and the forecast will come true on its own. In a sense, exchange rates become their own predictions. Predicting whether the exchange rate will rise or fall tomorrow is often said to be about as reliable as predicting whether a coin toss will land heads or tails: either way, the prediction is right about 50% of the time. It is also said that predictions are not possible in a changing environment. However, every environment changes, and a good forecasting model captures how things will change.
Two kinds of methods are used to predict currency exchange rates and thereby speculate well in the Forex market. The first kind, qualitative methods, is used when no data is available or when the existing data is not relevant for forecasting. The second kind, quantitative methods, exploits historical numerical information (collected at regular intervals in time) while assuming that past behavior will recur in the future. This study adopts the quantitative method, because we have a database containing the exchange rates of the EUR/USD pair.

Related work
Many papers in the literature propose various methodologies and techniques [10], [14], [15] for modeling the prediction of exchange rates in the Forex market. Yong et al. [13] studied the effects of different types of inputs, including the close price and various technical indicators derived from it, on the Forex trend predicted by an intelligent machine learning module. They found that the type of input data used for Forex price prediction is a crucial element that cannot be taken lightly. This means that incorporating trading rules and technical analysis, as performed by a technical analyst, in the initial phase of the forecasting algorithm helps to increase the accuracy of Forex price forecasting. Samad et al. [16] use financial information and historical price data and model the stock price prediction problem as a data mining task with machine learning algorithms, namely the support vector machine and random forest. Their analysis reached more than 60% accuracy for textual analysis (financial information) and 90% for numerical analysis (historical price data). Fattah et al. [17] used a deep neural network (DNN) to predict whether or not the closing price reaches the profit target set by the investor, and to improve the accuracy of the prediction. Particle swarm optimization (PSO) and automated machine learning (AutoML) are used as optimizers with DNNs. Based on the experimental results, deep learning with AutoML had the best accuracy, ranging from 81% to 92% across all companies, while accuracy after DNN optimization using PSO varied from 73% to 82% across all companies.
Rassetiadi and Suharjito [18] examine the external factors that can influence forecast results, looking for the relationship between the value of indices such as the S&P 500 (Standard and Poor's) and the value of commodities such as gold and silver in the process of EUR/USD prediction. When comparing the mean squared error (MSE) values, the best combination turned out to be the FTSE100 (Financial Times Stock Exchange) and natural gas values. Meng and Khushi [19] reviewed recent stock/Forex prediction and trading articles that used RL as their primary machine learning method and concluded that RL in stock/Forex trading is still in its early development and that further research is needed to make it a reliable method in this domain. Carapuço et al. [20] developed a new system for short-term speculation in the foreign exchange market, based on recent RL developments. They concluded that learning on the training dataset is stable and that a number of validation learning curves show that the Q-network is indeed capable of finding relationships in financial data that translate to out-of-sample decision making. They also showed that the model obtained from the training procedure can then be harnessed for profitable trading on a test dataset. Rundo [8] proposes an algorithm based both on supervised deep learning and on an RL algorithm for forecasting the short-term trend in the Forex currency market, in order to maximize the return on investment of a high-frequency trading (HFT) algorithm. The trading system was validated over several financial years on the EUR/USD cross, confirming its high performance in terms of return on investment (98.23%) together with a reduced drawdown (15.97%), which confirms its financial sustainability.

PROPOSED SYSTEM
Following most of the conclusions reached in the bibliographic research, the combination of two or more techniques can show better results [8], [9], [21] in terms of exchange rate prediction accuracy. The model proposed in this paper therefore combines more than one method to predict exchange rates:
− Multiple regression, for its ability to determine the relative influence of one or more predictor variables on the value of the criterion.
− Simulated annealing, to avoid getting locked in a local optimum.
− RL, which has enormous capacities for decision-making through an intelligent agent and thus helps to structure efficient trading systems.
− The RSI technical indicator, which assesses overbought or oversold conditions of the exchange rate.

The dataset
The preparation of the data used in our research went through three main processes. The first is the collection of data from a specific time period. The second is data refinement, which eliminates data that is unnecessary for our research. The last is the calculation of new variables from the existing data. These steps are necessary to obtain good prediction results, which in turn allow the trader to make correct investment decisions.

Data collection
The data used in this paper were collected from the site https://eatradingacademy.com [22]. These data represent the daily exchange rates of the EUR/USD pair recorded between 10/25/2016 and 09/27/2020, that is, about 1,227 records. We chose the EUR/USD pair because it accounts for around 25% of the trading volume in Forex, which makes it ideal for short-term prediction. The dataset contains the daily values of the open, close, high and low prices. The period 2016-2020 was chosen because it was characterized by several world events that influenced the Forex market, in particular the election of Trump as president of the United States of America for the 2017-2021 term, with all his decisions that marked the American economy and consequently the world economy. This period was also marked by the health crisis linked to the coronavirus disease 2019 (COVID-19) pandemic and the restrictions it entailed, such as the contraction of the economy and of per capita income, which was expected to push millions of people into extreme poverty. The dataset is separated into two parts, a training set (80%) and a testing set (20%). The literature review revealed that the testing or validation set should be approximately one fourth to one eighth of the training set [23].
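The 80%/20% separation can be sketched as follows. This is a minimal illustration that assumes the split is chronological (earlier records for training, later records for testing), which the text does not state explicitly; the function name `chronological_split` and the placeholder rates are hypothetical.

```python
def chronological_split(records, train_fraction=0.8):
    """Split time-ordered records into (train, test) without shuffling.

    Assumption: the split is chronological, so the test set only
    contains days that come after every training day.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Placeholder closing rates, made up for illustration only.
closes = [1.10 + 0.001 * i for i in range(10)]
train, test = chronological_split(closes)
```

With an 80%/20% split the testing set is one fourth of the training set, which sits at the upper end of the range quoted from [23].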

Data refinement
The dataset used in this research consists of eight different variables that represent closing exchange rates over a one-month time frame. Figure 2 represents the variables making up the dataset: the closing exchange rate for the current day D and for days D-1, D-2, D-3, D-4, D-5 and D-6, the average exchange rate over the last two weeks preceding the current day D, and the average value over the last month preceding day D. These variables are used to calculate other variables that are introduced into the dataset to help improve prediction accuracy.
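The variables listed above could be derived from a series of daily closing rates roughly as in the sketch below. The window lengths are assumptions, since the text only says "two weeks" and "month": 10 and 22 trading days are used here. The function name `build_rows` is hypothetical.

```python
from statistics import mean


def build_rows(closes, two_weeks=10, month=22):
    """Build one feature row per day D from time-ordered closing rates.

    Each row holds the close of days D, D-1, ..., D-6, followed by the
    average close of the two weeks and of the month preceding day D.
    Window lengths are in trading days and are assumptions.
    """
    if month < two_weeks or month < 6:
        raise ValueError("month window must cover the lags and the two-week window")
    rows = []
    for d in range(month, len(closes)):
        lags = [closes[d - k] for k in range(7)]        # D, D-1, ..., D-6
        avg_two_weeks = mean(closes[d - two_weeks:d])   # days strictly before D
        avg_month = mean(closes[d - month:d])
        rows.append(lags + [avg_two_weeks, avg_month])
    return rows


# Synthetic series used only to show the shape of the output.
rows = build_rows([float(i) for i in range(30)])
```

The first `month` days produce no row because the monthly average is not yet defined for them.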

Calculation of variables
Since the use of technical analysis in Forex speculation has produced good results [13], we introduced the normalized RSI technical indicator as a variable in our dataset to design the proposed prediction model. The RSI indicator was chosen because it allows traders to see quickly whether the market is overbought or oversold, which makes it easy to interpret. Based on the literature review of the prediction techniques used in Forex to improve exchange rate prediction and reach a fairly reliable accuracy, we propose an investing strategy for the trading market. The proposed strategy (Figure 3) is composed of several blocks that are combined to yield a better prediction of the exchange rates of the EUR/USD pair. The first block collects historical exchange rate data, prepares the data by refining it and removing unnecessary records, and then divides the database into two parts, the training set and the testing and validation set, with 80% and 20% of the data respectively. The second block implements supervised learning through the multiple regression algorithm to predict the exchange rates from day D until day D+14. The third block optimizes the results obtained by the multiple regression algorithm to achieve a better fit between the actual values of the exchange rates of the EUR/USD pair and the values predicted by the multiple regression.
This optimization is performed by the simulated annealing metaheuristic, which takes the results of the multiple regression as input variables. The objective function is the mean squared error (MSE) between the real values and the values predicted by the multiple regression. The optimization is successful when the MSE reaches a minimum, in which case the values predicted by the multiple regression are replaced by the optimized values. The fourth block uses RL to train the model through exploration of the environment, which in this case consists of the exchange rates of the EUR/USD pair. The agent tries to maximize its rewards in order to achieve a good result: it first initializes the state s(t), then chooses an action a(t) at random, and then generates the next state s(t+1). The fifth block exploits the optimized predicted values obtained by multiple regression and simulated annealing, using RL to arrive at initial Forex investment decisions. The sixth block calculates the normalized RSI technical indicator, which is used in combination with the results of block five to improve the accuracy of prediction and reach a final investment decision.

MODEL TECHNIQUES

Multiple regression
Multiple regression is a family of statistical techniques used to investigate the relationship between a set of predictors and a criterion (dependent) variable. This procedure is applicable in a variety of research contexts and data structures [10], [24]-[27], including the model proposed in this study. The regression algorithm in this model takes as input values the exchange rates of days D, D-1, D-2, D-3, D-4, D-5 and D-6, the average of the last two weeks and the average of the last month.
The advantages of multiple regression include its ability to determine the relative influence of one or more predictor variables on the value of the criterion, its simplicity of interpretation and its ease of calculation. Figures 4 to 12 illustrate a shift between the predicted values and the real values of the exchange rates. This shift widens as the prediction day moves away from day D, and it is clearly reflected in the coefficient of determination R², which measures the agreement between the real values and the predicted values. Figure 13 represents the coefficient of determination calculated for the predicted days. The prediction of the exchange rate values using multiple regression shows a high accuracy of more than 99% for day D, but this precision decreases from day D+1 to day D+14. We therefore propose to integrate an optimization solution, the simulated annealing metaheuristic, to adjust the predicted values to the real values.
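To make the fitting step and the R² measure concrete, the following sketch fits a multiple regression by ordinary least squares and computes the coefficient of determination, R² = 1 − SS_res / SS_tot. It is a dependency-free illustration, not the implementation used in the study: the helper names (`fit_ols`, `predict`, `r_squared`) and the toy data are hypothetical, and solving the normal equations directly can be numerically fragile when predictors are strongly correlated, as lagged exchange rates usually are.

```python
def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bp] via the normal equations."""
    A = [[1.0] + list(row) for row in X]            # prepend the intercept column
    p = len(A[0])
    # Augmented system (A^T A | A^T y).
    M = [[sum(r[i] * r[j] for r in A) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coef, X):
    """Apply fitted coefficients to each feature row."""
    return [coef[0] + sum(b * x for b, x in zip(coef[1:], row)) for row in X]


def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted values."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy data with a known linear relationship, for illustration only.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [0.5 + 2.0 * a - 1.0 * b for a, b in X]
coef = fit_ols(X, y)
score = r_squared(y, predict(coef, X))
```

In the setting described above, one such model would be fitted per prediction horizon (day D to day D+14), and R² would be evaluated on the testing set rather than on the training data.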

Simulated annealing
Simulated annealing is a metaheuristic method used mostly in industry. It is based on the process of heating metals and then cooling them slowly or rapidly, which yields completely different properties [28]. It is intended to solve optimization problems, and it is used in this model to optimize the EUR/USD prediction results obtained with the multiple regression algorithm. The principle is to take the EUR/USD exchange rate predictions produced by the multiple regression algorithm as input variables for the simulated annealing, together with other parameters such as the temperature and a limit threshold, in order to establish an optimization algorithm (Figure 14) that adjusts the predicted values to the actual values. Figures 15 to 23 illustrate the results obtained by integrating the optimization metaheuristic.
The results of integrating the simulated annealing metaheuristic (Table 1) show a prediction improvement over the results obtained with the multiple regression algorithm alone; the smaller the prediction error (MSE) in Table 1, the better. In other words, the use of several techniques shows more efficiency and precision than the use of a single technique [9]. Hence, we introduced another type of learning, RL, to predict an action (decision) to be taken (buy, sell or hold) using the results obtained by multiple regression and simulated annealing.
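The text does not specify which quantities the annealing adjusts, so the sketch below makes an explicit assumption: it searches for a scale and a shift applied to the regression predictions so that the MSE against the actual values is minimized. The function names, the Gaussian proposal step, the geometric cooling schedule and all parameter values are hypothetical, and `t_min` plays the role of the limit threshold mentioned above.

```python
import math
import random


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def anneal_correction(actual, predicted, t_start=1e-3, t_min=1e-9,
                      cooling=0.95, steps_per_temp=50, seed=0):
    """Search (scale, shift) minimizing MSE(actual, scale * predicted + shift)."""
    rng = random.Random(seed)

    def cost(params):
        scale, shift = params
        return mse(actual, [scale * p + shift for p in predicted])

    current = (1.0, 0.0)                    # start from the raw predictions
    current_cost = cost(current)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_min:
        for _ in range(steps_per_temp):
            candidate = (current[0] + rng.gauss(0.0, 0.01),
                         current[1] + rng.gauss(0.0, 0.01))
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept a worse candidate with
            # probability exp(-delta / T) so the search can leave local optima.
            if delta < 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling              # geometric cooling
    return best, best_cost


# Made-up rates: the "regression" output is biased upward by 0.02.
actual = [1.10, 1.12, 1.11, 1.13, 1.15]
predicted = [1.12, 1.14, 1.13, 1.15, 1.17]
params, optimized_mse = anneal_correction(actual, predicted)
```

Because the best solution seen so far is tracked separately from the current one, the returned MSE can never exceed the MSE of the unadjusted predictions. The starting temperature should be of the same order of magnitude as typical cost differences; otherwise nearly every candidate is accepted in the early iterations.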

Reinforcement learning
Scientific research in the field of artificial intelligence is largely devoted to developing programs that solve complex problems without human intervention. These programs are designed using learning algorithms that extract lessons from repeated experiments through trial and error. This is the case of RL, which allows an autonomous agent to learn from experience which actions to take in order to optimize a quantitative reward over time, and which makes it possible to develop intelligent solutions to complex management problems [8], [19], [20]. RL is, after supervised and unsupervised learning, the third possibility for teaching algorithms to make decisions for themselves. Its major advantage is that it does not need any data for conditioning, unlike supervised learning, which must be fed with data beforehand. RL allows the machine to learn through interactions with the environment and to apply what it has learned to solve complex problems without a human having to enter data or intervene in any way. RL models are inspired by the behaviorist approach to psychology and learning. At the turn of the 20th century, researchers such as Watson, Thorndike, Skinner and Pavlov rejected the study of mental processes through introspection [29]. They stated that such processes cannot be studied objectively and that only measurable and observable behaviors can and should be. Their work on learning sheds light on the links between a situation (or stimulus), a behavioral response and its consequence.

Stimulus⟶Response⟶Consequence
Thorndike postulates the existence of learning laws. On the one hand, the law of exercise states that the link between stimulus and response is strengthened by exercise and that the probability of the response increases with the number of trials performed. On the other hand, the law of effect states that the link between the stimulus and the response is strengthened or weakened by the immediate effect of its consequences.
RL differs from supervised learning because it does not need labeled inputs and outputs to learn; it is the agent that decides what to do in a specific situation. In the proposed model, the environment is the Forex market, the actions are buy, sell and hold, and the state is represented by the exchange rates predicted by the combination of the multiple regression algorithm and the simulated annealing metaheuristic (outputs from block 3). Figures 24 to 32 represent the initial decisions obtained from the exchange rate predictions for days D, D+1, D+2, D+3, D+4, D+5, D+6, D+7 and D+14. The model also calculates the total gains and the percentage return on investment according to the predictions obtained; Table 2 and Table 3 present the results for investments of €10,000 and €50,000.
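The environment, actions and state described above can be illustrated with a small tabular Q-learning sketch. The text does not name the RL algorithm, the state encoding or the reward, so everything here is an assumption: the state is the predicted direction of the next move (up, down or flat), the reward is the realized one-day change for a buy, its negative for a sell and zero for a hold, and the learning parameters are arbitrary. Since the chosen action does not influence the next state in this simplified setting, it behaves much like a contextual bandit; a realistic trading environment would also track open positions and costs.

```python
import random

ACTIONS = ("hold", "buy", "sell")      # ties resolve to "hold"
STATES = ("up", "down", "flat")


def direction(current_rate, forecast, tol=1e-4):
    """Discretize a forecast into a state label."""
    change = forecast - current_rate
    if change > tol:
        return "up"
    if change < -tol:
        return "down"
    return "flat"


def reward(action, realized_change):
    """Assumed reward: gain of a one-unit long or short position."""
    if action == "buy":
        return realized_change
    if action == "sell":
        return -realized_change
    return 0.0


def train_q_table(actual, predicted, episodes=200, alpha=0.1,
                  gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning over the states s(t), actions a(t) and next states s(t+1)."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    states = [direction(actual[t], predicted[t + 1])
              for t in range(len(actual) - 1)]
    for _ in range(episodes):
        for t, s in enumerate(states):
            if rng.random() < epsilon:                      # explore
                a = rng.choice(ACTIONS)
            else:                                           # exploit
                a = max(ACTIONS, key=lambda act: q[s][act])
            r = reward(a, actual[t + 1] - actual[t])
            future = max(q[states[t + 1]].values()) if t + 1 < len(states) else 0.0
            q[s][a] += alpha * (r + gamma * future - q[s][a])
    return q


def decide(q, state):
    """Greedy action for a state."""
    return max(ACTIONS, key=lambda act: q[state][act])


# Made-up rates with perfect forecasts, for illustration only.
actual = [1.10, 1.12, 1.11, 1.13, 1.12, 1.14, 1.13, 1.15, 1.14, 1.16, 1.15, 1.17]
q_table = train_q_table(actual, predicted=actual)
initial_decisions = [decide(q_table, direction(actual[t], actual[t + 1]))
                     for t in range(len(actual) - 1)]
```

The list `initial_decisions` corresponds to the initial buy, sell or hold decisions of block five, before they are confronted with the RSI indicator.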

RSI indicator
The RSI (relative strength index) is an indicator used by traders to measure the magnitude of price changes and to highlight areas where exchange rates are overbought or oversold. The graphical representation of the RSI makes it possible to distinguish three situations (Figure 33):
− An extreme band from 0 to 30, symbolizing an oversold zone.
− A zone of neutrality, with a low median band from 30 to 50 and a high median band from 50 to 70.
− An extreme band from 70 to 100, symbolizing an overbought zone.
When the RSI indicates an overbought situation, the trader can go short. When it indicates an oversold market, the trader can go long. The idea underlying this strategy is that an overbought market is ready to enter a selling phase, and an oversold market is ready to enter a buying phase.
The RSI indicator was chosen because it shows quickly whether the market is overbought or oversold, which makes it easy to interpret. By combining the RL outputs with the results of the RSI technical indicator, the proposed model can take investment decisions using the rules given in Table 4. In the proposed scenario, the trader makes the investment decision when the result of the RL algorithm is identical to the result obtained from the RSI indicator; otherwise, the trader must put the investment on hold.
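The zones above can be computed with a short sketch. It uses the common definition RSI = 100 − 100 / (1 + RS), where RS is the average gain divided by the average loss over a look-back period. The 14-day period and the use of simple averages (rather than Wilder's smoothing) are assumptions, as the text does not state them, and the function names are hypothetical. Dividing the result by 100 would be one possible way to obtain a normalized value in [0, 1].

```python
def rsi(closes, period=14):
    """RSI of the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses: fully overbought, or neutral if prices did not move.
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_zone(value, oversold=30.0, overbought=70.0):
    """Map an RSI value to the three zones described in the text."""
    if value <= oversold:
        return "oversold"
    if value >= overbought:
        return "overbought"
    return "neutral"


# Synthetic series used only to exercise the three zones.
rising = [1.0 + 0.01 * i for i in range(15)]
falling = list(reversed(rising))
alternating = [1.0 if i % 2 == 0 else 2.0 for i in range(15)]
zones = [rsi_zone(rsi(s)) for s in (rising, falling, alternating)]
```

Whether the boundary values 30 and 70 belong to the extreme bands or to the neutral zone is a convention; this sketch assigns them to the extreme bands.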
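The agreement rule described in this paragraph can be sketched as follows. Table 4 is not reproduced here, so the mapping is an interpretation of the text: an oversold RSI is read as a buy signal, an overbought RSI as a sell signal, and a neutral RSI as hold. The function names and thresholds are hypothetical.

```python
def rsi_signal(rsi_value, oversold=30.0, overbought=70.0):
    """Translate an RSI value into a trading signal (assumed mapping)."""
    if rsi_value <= oversold:
        return "buy"      # oversold market: go long
    if rsi_value >= overbought:
        return "sell"     # overbought market: go short
    return "hold"


def final_decision(rl_action, rsi_value):
    """Act only when the RL action and the RSI signal agree; otherwise hold."""
    return rl_action if rl_action == rsi_signal(rsi_value) else "hold"


decisions = [final_decision("buy", 25.0),    # both say buy
             final_decision("buy", 80.0),    # disagreement
             final_decision("sell", 80.0)]   # both say sell
```

Requiring agreement makes the final decision more conservative than either signal alone, since any disagreement results in holding.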

CONCLUSION
This paper continues a previously published work that proposed a model combining two techniques, multiple regression and simulated annealing; its results showed that the combination of the two techniques gives good prediction precision. Building on that conclusion, we proposed an improvement of the model through the introduction of two further techniques, reinforcement learning and the RSI indicator. The aim is to harness the strengths of these techniques to refine investment decision-making, which must be based on the combination of the four techniques. It should be noted that the purpose of this model is not only to maximize investment profits for traders but also to limit the risks associated with prediction. This is achieved through the number and diversity of the techniques used, namely supervised learning, metaheuristics, RL and technical analysis, which together ensure better prediction than a single technique. As future work, and given the current political and economic situation, which is characterized by its disturbed and non-stationary nature, we aim to extend the application domain of this model to the oil market in order to predict its daily movements and analyze its influence on the global economy.