Basic Market Forecasting with Encog Neural Networks

In this article I will show how to apply a neural network to financial prediction. This program is implemented as a Microsoft WPF application using C#. For neural network processing, it makes use of the Encog Artificial Intelligence framework. This application attempts to use some of the same principles that technical market analysts use.

Technical analysts attempt to forecast future security direction through the analysis of past market data. Pure technical analysts are primarily interested in price and volume information, rather than any sort of fundamental information about the company, such as earnings, debt ratios, or product offerings. The theory is that all such fundamental data is already factored into the price and volume of the underlying security. Many investment disciplines advocate the use of both technical and fundamental components. However, the focus of this article is the use of pure technical data.

Pure technical analysis is not necessarily the investing discipline that I advocate, and some of the research I have conducted uses fundamental data as well. However, to keep this article at an introductory level, only technical data will be used. This greatly simplifies the implementation of the application.

The program presented in this article is meant primarily as a starting point for further exploration. A neural network is a tool that recognizes patterns, much as one of the primary functions of the human brain is to form memories and recognize patterns. The real trick in using neural networks for market prediction is representing the market data in a way that captures the underlying patterns in a form the neural network can recognize. This program demonstrates a simple way to represent technical market data for recognition.

This program is by no means meant to be taken as an investment strategy. It is for educational purposes only, and you should not base an investment direction on it. However, this program is a very good starting point for further exploration in the application of neural networks to market forecasting. This is an area of ongoing research for me, and I frequently use this program as a starting point. It has built-in capabilities for Encog, charting, and accessing market data from Yahoo Finance. From here you can adjust how the market data is presented to the neural network, and how the results are interpreted. My personal goal in the area of market forecasting is to try a number of neural network architectures and perhaps ultimately produce a book on these findings.

The examples presented here make use of the stock market. However, the same technical analysis principles are often applied to currency exchange markets (FOREX). This program could easily be adapted to work with currency pairings, which is another area I intend to research.

Candlestick Charts

We will begin by looking at one of the most common tools of the technical trader: the candlestick chart. Many neural network examples for market forecasting simply make use of the daily closing price of a security and attempt to predict patterns using this data alone. While this can give some insight, technical analysts typically use more data. The following five pieces of information are of particular interest to the technical analyst.

    * Opening Price
    * Closing Price
    * Day High
    * Day Low
    * Volume

Using more than just the daily closing price allows the analyst to capture the “emotion” of the market. This is how the fundamental data is factored into the purely technical model. The first four data items are used to create something called a candlestick chart. The candlestick chart is a hybrid of the line and bar charts. An example of a candlestick chart is shown here.

The chart is made up of “candlesticks”. Each candlestick shows one day’s worth of activity. It is important to understand how to read each individual candlestick. This is illustrated below.

Candles are made up of bodies, with shadows (or wicks) at each end. Candles are either white or black. A white candle indicates a day where the closing price was higher than the opening price. The stock price increased on a day that had a white candle. A black candle indicates a day where the closing price was lower than the opening price. The stock price decreased on a day that had a black candle.
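The anatomy described above can be written down in a few lines of C#. This is only a minimal sketch: the Candle type and its member names are hypothetical and are not part of Encog or of the example program.

```csharp
using System;

// Hypothetical type for illustration; not part of Encog or the example program.
public readonly struct Candle
{
    public double Open { get; }
    public double High { get; }
    public double Low { get; }
    public double Close { get; }

    public Candle(double open, double high, double low, double close)
    {
        Open = open;
        High = high;
        Low = low;
        Close = close;
    }

    // White: the day closed above its open. Black: it closed below its open.
    public bool IsWhite => Close > Open;
    public bool IsBlack => Close < Open;

    // Height of the body, regardless of color.
    public double Body => Math.Abs(Close - Open);

    // Shadows (wicks) run from the edge of the body to the day's high and low.
    public double UpperShadow => High - Math.Max(Open, Close);
    public double LowerShadow => Math.Min(Open, Close) - Low;
}
```

Note that a day where the open and close are equal is neither white nor black in this sketch.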

Candlestick Patterns

These candles are used to form patterns. There are several levels to these patterns. At the lowest level, the individual candles have names. The following chart shows some of the more common names for the various candlesticks.

You will notice that many of these candlesticks have Japanese names. Candlestick charts originated in Japan. A Japanese rice trader, who is credited with the invention of the candlestick chart, made his fortune using these charts to trade rice back in the eighteenth century. Because the original creator used candlestick charts in a commodities market, these same techniques can be used in markets other than the stock market.

It can be difficult to classify which pattern is which. For example, how long do the legs of a long-legged doji actually need to be? How big does the body of a spinning top become before it is a white or black candlestick? The program presented in this example uses a simple object I created to classify them. However, it picks arbitrary sizes for the body and shadows to do so. One area that I want to experiment with in the future is using another neural network to figure out which of these micro patterns each candlestick falls into.
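To give a feel for what "arbitrary sizes" means, here is a minimal sketch of a threshold-based classifier. It is not the classifier used by the example program. The CandleShape enum, the CandleSketch class, the reduced set of shapes, and every cut-off value are assumptions chosen only for illustration.

```csharp
using System;

// Hypothetical names and thresholds; this is not the program's own classifier.
public enum CandleShape
{
    Unknown, Doji, SpinningTop, WhiteMarubozu, BlackMarubozu, WhiteCandle, BlackCandle
}

public static class CandleSketch
{
    // Arbitrary cut-offs, expressed as fractions of the day's high-low range.
    const double DojiBody = 0.05;   // a body this small counts as "no body"
    const double SmallBody = 0.30;  // a body this small may be a spinning top
    const double NoShadow = 0.05;   // a shadow this small counts as "no shadow"

    public static CandleShape Classify(double open, double high, double low, double close)
    {
        double range = high - low;
        if (range <= 0) return CandleShape.Unknown; // flat day, nothing to measure

        double body = Math.Abs(close - open) / range;
        double upper = (high - Math.Max(open, close)) / range;
        double lower = (Math.Min(open, close) - low) / range;
        bool white = close > open;

        if (body <= DojiBody) return CandleShape.Doji;
        if (upper <= NoShadow && lower <= NoShadow)
            return white ? CandleShape.WhiteMarubozu : CandleShape.BlackMarubozu;
        if (body <= SmallBody && upper > NoShadow && lower > NoShadow)
            return CandleShape.SpinningTop;
        return white ? CandleShape.WhiteCandle : CandleShape.BlackCandle;
    }
}
```

Moving any of the three constants changes which days fall into which category, which is exactly the difficulty described above.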

Each of these patterns has a meaning. For example, the white marubozu is a bullish symbol, whereas the black marubozu is bearish. Some symbols, such as the hammer, can be bullish or bearish depending on what is around them. These individual patterns are often grouped into much larger patterns. One classic pattern is the cup and handle pattern, popularized by William J. O’Neil. It typically signals an upward trend. The following diagram shows one type of cup and handle.

Just as it can be difficult to identify the individual candlestick patterns, it can be even more difficult to identify the larger patterns. You can see the cup and handle above. This is a more V-shaped cup. The cup is followed by a shallow dip, which is the handle. After the shallow dip the security experiences an upward movement.

Using the Encog Candlestick Example

This example program works by first obtaining data and then training the neural network. To obtain data choose the “Obtain Data” item from the “Neural Network” menu. This will present you with the following seven options. All are required fields.

    * Symbol
    * Starting Date
    * Ending Date
    * Prediction Window (in days)
    * Bull/Bear Window (in days)
    * Bearish Percent
    * Bullish Percent

Once you have entered these values, click the “Obtain Data” button. The program will pause briefly, while the data is downloaded from Yahoo Finance. The symbol is the company you are using for training. Any company can be used. Further, the training from one company can be used to predict for another. A future enhancement to this program will likely be the ability to use a whole series of companies to train the neural network.

The starting and ending dates define the range of data to train on. Usually you do not want to train all the way up to the current date. Leave a few years of data to evaluate the neural network with, so that you can test the network on data it was never trained with and see how effective it really is.

The prediction window is the number of prior days that the neural network uses to make a bullish or bearish determination. The bull/bear window is the number of days forward that the program looks to decide whether the preceding prediction window should be considered an indication of a bullish or bearish period. For example, if you choose 30 days as the bull/bear window, that period will be bullish if the security price rises above the bullish percent, or bearish if it falls below the bearish percent. As the program moves over the date range, it creates many training examples of prediction windows that resulted in either a bullish or a bearish period.

Once the data has been gathered, the neural network can be trained. While gathering data is relatively quick, training can be very slow. A neural network is made up of layers. Data is fed to the input layer and predictions come from the output layer. There are also hidden layers between the input and output layers. The following diagram shows a simple neural network.

The network is made up of connections between the neurons. The training process adjusts those connections so that the inputs produce the desired outputs. Training is an iterative process. To begin training, select the “Train” item from the “Neural Network” menu. Once training begins you will see an error percent. The goal is to minimize that error. If the error never drops much, the data you gathered is not conducive to determining patterns. Training can go on for hours or days. Once you are done training you can click the button to stop the training. Your neural network is now ready to predict.

Enter a stock symbol and starting date at the top of the window and click “Chart/Predict”. You will see something similar to the following.

Here you see one segment of time. The neural network shows its predictions with red or green bars. Red indicates bearish sentiment, whereas green is bullish. The above chart shows a time slice of Apple Computer being predicted with a neural network trained on GE. The choice of these two companies was purely arbitrary. GE is very representative of the broader market, so I often use it as a baseline. Ideally you would pick a handful of companies to form the baseline. This would be a good enhancement to this program.

You can see above that the program started to get bullish about Apple just as the upward trend began. Then the program became neutral as the price leveled off. Two bearish lines were drawn as it began to drop again. It is not always this accurate, but this demonstrates the type of data it returns.

Implementing the Application

This application makes use of Encog. Encog is an open source neural network framework released under the GNU Lesser General Public License (LGPL). There are versions of Encog available for Java, .Net, and Silverlight. Encog is hosted at Google Code. Download links, as well as more information on Encog, can be found at the following URL: http://www.heatonresearch.com/encog

The application must create training data for the neural network. This training data is gathered by the GatherUtil class included as part of the program. This is the class that you would modify to enhance the training strategy. It is used both to create training data for a large block of data, as well as querying the neural network for a prediction. We will first examine how the training data is generated. It is generated in a method called LoadCompany. This method begins by creating a YahooFinanceLoader. The YahooFinanceLoader is provided by Encog.

IMarketLoader loader = new YahooFinanceLoader();
TickerSymbol ticker = new TickerSymbol(symbol);

We must create a list to tell Encog what market data we are interested in. We request the open, close, high, and low values for each day, which make up the candlestick, along with the adjusted close, which is used later to measure price changes across days. To keep this example simple, we are not using volume. Volume, however, can be very useful for predicting trends and is considered in many trading disciplines.

IList<MarketDataType> dataNeeded = new List<MarketDataType>();
dataNeeded.Add(MarketDataType.ADJUSTED_CLOSE);
dataNeeded.Add(MarketDataType.CLOSE);
dataNeeded.Add(MarketDataType.OPEN);
dataNeeded.Add(MarketDataType.HIGH);
dataNeeded.Add(MarketDataType.LOW);

All of the requested data is then loaded into a large list and sorted.

List<LoadedMarketData> results = (List<LoadedMarketData>)loader.Load(ticker, dataNeeded, from, to);
results.Sort();

Bullish or Bearish Trend

We will now loop over all of the data and build training elements. We need to peek backwards from the present day over the prediction window. So we begin our loop far enough into the list to have at least one prediction window to look back on. We end just short of an evaluation window. The evaluation window is used to determine if a bullish or bearish trend started after the prediction window.

for (int index = PredictWindow; index < results.Count - EvalWindow; index++){

The market data is obtained and we create two flags to determine if this is a bullish or bearish region of the chart.

    LoadedMarketData data = results[index];

    // determine bull or bear position, or neither
    bool bullish = false;
    bool bearish = false;

To make the bullish or bearish determination we loop over the evaluation period and look at the percent changes.

    for (int search = 1; search <= EvalWindow; search++)
    {
        LoadedMarketData data2 = results[index + search];
        double priceBase = data.GetData(MarketDataType.ADJUSTED_CLOSE);
        double priceCompare = data2.GetData(MarketDataType.ADJUSTED_CLOSE);
        double diff = priceCompare - priceBase;
        double percent = diff / priceBase;

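If the percent change is greater than the bullish percent, then this is a bullish region. If it is less than the bearish percent, it is a bearish region. Otherwise neither flag is set for that day of the evaluation window.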

        if (percent > BullPercent)
        {
            bullish = true;
        }
        else if (percent < BearPercent)
        {
            bearish = true;
        }
    }

Now that we know if this region was bullish or bearish we can create training data. If it was neither bullish nor bearish, no training data will be created. We do not use the entire chart for training.

    INeuralDataPair pair = null;
    if (bullish)
    {
        pair = CreateData(results, index, true);
    }
    else if (bearish)
    {
        pair = CreateData(results, index, false);
    }

If a training pair was created, add it to the Encog training data.

    if (pair != null)
    {
        training.Add(pair);
    }
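The labeling logic from this loop can also be tried on its own, without Encog or downloaded data. The following is a minimal sketch over a plain array of closing prices. The LabelSketch class and its Label method are hypothetical names, and the sketch assumes that the bullish and bearish percents are expressed as fractions, with the bearish value negative (for example 0.05 and -0.05).

```csharp
using System;

// Hypothetical helper for illustration; not part of Encog or the example program.
public static class LabelSketch
{
    // Returns +1 (bullish), -1 (bearish) or 0 (neither) for the day at 'index'
    // by comparing that day's close with each close in the following
    // 'evalWindow' days. As in the walkthrough, bullish wins if both flags are set.
    // Assumes bullPercent and bearPercent are fractions, with bearPercent negative.
    public static int Label(double[] closes, int index, int evalWindow,
                            double bullPercent, double bearPercent)
    {
        bool bullish = false;
        bool bearish = false;
        double priceBase = closes[index];

        for (int search = 1; search <= evalWindow; search++)
        {
            double percent = (closes[index + search] - priceBase) / priceBase;
            if (percent > bullPercent)
            {
                bullish = true;
            }
            else if (percent < bearPercent)
            {
                bearish = true;
            }
        }

        if (bullish) return 1;
        if (bearish) return -1;
        return 0;
    }
}
```

Because the flags are set if any single day in the evaluation window crosses a threshold, a volatile window can trigger both. In that case the bullish branch is checked first, so the window is treated as bullish.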

You will notice above that a method called CreateData was called to create the input data to the neural network. This method is used both to generate the input side of the training pair, as well as to create input to the neural network for a prediction. This method looks at the evaluation window and creates the input to the neural network. This very important function is analyzed here.

This function begins by creating a BasicNeuralData object. This is what Encog uses for all neural network input and output. There are 14 elements, which correspond to the 14 basic candlestick patterns seen earlier. We are going to use a very simple representation: we simply count how many times each of the 14 candlestick patterns occurs. This is a very simple approach, but it can produce some good results. To do something more advanced, just think about how you want to represent the evaluation window to the neural network as an array of floating point numbers. The input and output of a neural network are always arrays of floating point numbers.

Another advantage of this very simple approach is that the candlesticks are atomic. We are only considering the movement within a given day. We are not considering the actual price change day-to-day. This means the code does not need to worry about stock splits, which removes another layer of complexity from the neural network. When I do compare prices day-to-day, I need to use the "adjusted close" field, which is corrected for stock splits.

INeuralData neuralData = new BasicNeuralData(14);
int totalPatterns = 0;
int[] patternCount = new int[14];

We begin by looping over the entire evaluation window.

for (int i = 0; i < EvalWindow; i++)
{
    LoadedMarketData data = marketData[(marketDataIndex - EvalWindow) + i];

We use the provided IdentifyCandleStick class to determine what sort of a candlestick this is.

    IdentifyCandleStick candle = new IdentifyCandleStick();
    candle.SetStats(data);
    int pattern = candle.DeterminePattern();

The pattern variable now holds a value between zero and thirteen that indicates which candlestick this was. We keep a count of each type.

    if (pattern != IdentifyCandleStick.UNKNOWN)
    {
        totalPatterns++;
        patternCount[pattern]++;
    }
}

Not every possible combination of shadow and body produces a named candlestick. If there were no named candlesticks in the entire evaluation window, then it will not be considered.

if (totalPatterns == 0)
    return null;

The input and output floating point numbers for a neural network are usually between zero and one. To accommodate this we represent each of the 14 candlestick patterns as a fraction of the named candlesticks found in the window. These become the values of the 14 input neurons.

for (int i = 0; i < 14; i++)
{
    neuralData[i] = ((double)patternCount[i]) / ((double)totalPatterns);
}
return neuralData;
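The counting and scaling steps can be tried without Encog by working on a plain array of pattern numbers. This is a minimal sketch under stated assumptions: PatternHistogram and ToFractions are hypothetical names, and any value outside the valid range (such as -1) stands in for an unnamed candlestick.

```csharp
using System;

// Hypothetical helper for illustration; not part of Encog or the example program.
public static class PatternHistogram
{
    // 'patterns' holds one pattern number per day. Values outside
    // 0..patternTypes-1 (for example -1) stand for an unnamed candlestick
    // and are skipped. Returns null when no day matched a named pattern.
    public static double[] ToFractions(int[] patterns, int patternTypes)
    {
        int total = 0;
        int[] counts = new int[patternTypes];

        foreach (int p in patterns)
        {
            if (p < 0 || p >= patternTypes) continue;
            counts[p]++;
            total++;
        }

        if (total == 0) return null;

        // Scale each count into the range zero to one.
        double[] fractions = new double[patternTypes];
        for (int i = 0; i < patternTypes; i++)
        {
            fractions[i] = (double)counts[i] / total;
        }
        return fractions;
    }
}
```

Since every named candlestick falls into exactly one slot, the returned values always add up to one.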

The neural network is trained to accept the above input and produce 0.9 for bullish and 0.1 for bearish. When we use the network for prediction, we feed the same pattern fractions for the current window into the network and use the output to make a forecast. We consider an output of 0.8 or higher to be bullish, 0.2 or lower to be bearish, and anything in between to mean the network is neutral on the current day.
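As a minimal sketch, the mapping from the single output value to a forecast could look like the following. The Forecast enum and the ForecastSketch class are hypothetical names. The numeric values are the ones given above.

```csharp
// Hypothetical names for illustration; not part of Encog or the example program.
public enum Forecast { Bearish, Neutral, Bullish }

public static class ForecastSketch
{
    // Training targets from the text.
    public const double BullishTarget = 0.9;
    public const double BearishTarget = 0.1;

    // Cut-offs from the text: 0.8 or higher is bullish, 0.2 or lower is bearish,
    // and anything in between is neutral.
    public static Forecast Interpret(double output)
    {
        if (output >= 0.8) return Forecast.Bullish;
        if (output <= 0.2) return Forecast.Bearish;
        return Forecast.Neutral;
    }
}
```

Training toward 0.9 and 0.1 while deciding at 0.8 and 0.2 leaves a small margin, so the network does not have to hit its training target exactly to register a forecast.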

Conclusion

This article served as a very simple introduction to using neural networks to assist in technical market analysis. It can be used as a starting point for exploring other technical or fundamental strategies. Areas for further exploration are volume, day-to-day price changes, and better ways of representing the patterns presented to the neural network. For more information about Encog, and other examples, visit the Encog homepage at http://www.heatonresearch.com/encog.
