Machine learning is that mysterious field where the computer can supposedly learn on its own and doesn’t need us lowly software engineers crafting algorithms and data structures and instructing it how to go about the day. As I will explain, this is not the case at all (so far). Machine learning has connections to many fields such as artificial intelligence, data science, data mining, pattern recognition, statistical computing and more. All of them overlap to some degree and the exact boundaries are not very well defined.
At its core, machine learning is about a piece of code looking at a lot of data that represents many instances of some concept (for example: many credit card transactions) and building some representation of this data that can be used to make predictions about future instances (e.g. when a new transaction comes in, is it valid or fraudulent?). How can the code know that without a programmer writing explicit rules? That’s the power of statistics. The basic idea is that things in the world usually change slowly and follow patterns that can be discovered by looking at (enough) past behavior. When this is not the case, machine learning is used to detect anomalies, which are deviations from the learned pattern (a person wins the lottery and his or her credit transactions suddenly take a huge bump).
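To make the "learn the pattern, flag the deviation" idea concrete, here is a minimal sketch in Python. It is deliberately naive: the "model" is just the mean and standard deviation of past amounts, and the transaction history, the function names and the threshold of 3 standard deviations are all assumptions I made up for illustration. Real fraud detection uses far richer features and models.

```python
from statistics import mean, stdev

def fit(amounts):
    """Learn a tiny 'model' of past behavior: the mean and standard deviation."""
    return mean(amounts), stdev(amounts)

def is_anomaly(model, amount, threshold=3.0):
    """Flag an amount lying more than `threshold` standard deviations from the mean."""
    mu, sigma = model
    return abs(amount - mu) > threshold * sigma

# Hypothetical past transaction amounts for one card holder (made-up data).
history = [12.5, 40.0, 23.9, 31.0, 18.2, 27.4, 35.1, 22.0]
model = fit(history)

print(is_anomaly(model, 29.99))   # a typical purchase
print(is_anomaly(model, 5000.0))  # a sudden huge bump
```

Nobody wrote a rule saying "5000 is suspicious". The cutoff falls out of the data the code has seen.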
Machine learning is typically incorporated into a traditional software system and provides some special sauce that is difficult to do with traditional techniques. For example, the recommendations for similar items you see on pretty much every web site are the product of recommendation engines that use machine learning techniques.
What Problems Can Machine Learning Solve?
There are a huge number of areas in which machine learning is employed successfully. Here are some examples: machine vision, speech recognition, algorithmic trading, recommendations, spam filtering, path planning, scheduling, packing, natural language processing, sentiment analysis, anomaly detection and weather prediction.
Machine Learning Techniques
There are many different techniques that can be applied to different types of problems or different types of domains. Below is a very truncated list. Some of these techniques are more general principles and some of them are more specific (and may use a combination of the principles):
- Supervised learning
- Unsupervised learning
- Reinforcement Learning
- Ensemble Learning
- SVM (Support Vector Machines)
- RBM (Restricted Boltzmann Machines)
- MCMC (Markov Chain Monte Carlo)
- HTM (Hierarchical Temporal Memory)
- Deep Learning
- Recurrent Neural Networks
- PCA (Principal Component Analysis)
What you should take out of it is that there are a bunch of tools and methods and that some of them have pretty cool names. Use them to impress your boss and peers. Under no circumstances mention them to your significant other.
How to Build Your Own Machine Learning System
First of all, don’t! If you have no background or expertise in at least several of the areas I describe below, then just don’t do it. You would be better served by focusing on your core competency. There are a lot of options for incorporating machine learning into your system without building it yourself. Many companies offer various services, consulting and integration options. The developer experience on your side may be as simple as streaming or uploading a bunch of data and getting back a stream of predictions or some type of report. That’s the best-case scenario. In practice, it will often take much more than that. Let’s explore the steps involved and why it might be difficult for some third-party service or off-the-shelf product to provide you the insights you need without a lot of work and customization on your end.
Let’s see what it takes to build your own machine learning system from scratch. Many of these steps will still be necessary even if you go with a third-party service.
Understand Your Problem
This is a crucial piece. What are you trying to do? Do you really have to resort to machine learning? Maybe there is a traditional technique that solves the problem. Do you have enough data available to make machine learning practical? Can you afford some mistakes?
At this stage you will often engage in exploratory data analysis, which involves playing with some subset of the data, visualizing it in different ways and trying to get a sense of what you have on your plate.
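As a rough illustration of what "playing with a subset of the data" can look like, here is a small Python sketch that computes summary statistics and prints a crude text histogram. The sample values and the helper names (`summarize`, `bin_counts`, `print_histogram`) are hypothetical. In practice you would more likely reach for a plotting or data-frame library.

```python
from statistics import mean, median

def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }

def bin_counts(values, bins=5):
    """Count how many values fall into each of `bins` equal-width buckets."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / span * bins), bins - 1)
        counts[index] += 1
    return counts

def print_histogram(values, bins=5):
    """Print one row of '#' characters per bucket."""
    for i, count in enumerate(bin_counts(values, bins)):
        print(f"bucket {i}: {'#' * count}")

# Hypothetical sample: most values are small, one is an outlier.
sample = [1, 2, 2, 3, 10]
print(summarize(sample))
print_histogram(sample, bins=3)
```

Even this tiny sample shows why it pays to look first: the mean is pulled well above the median by a single outlier, which is exactly the kind of thing you want to know before picking a technique.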
Choose the Right Technique/System/Framework
Here you’ll need to acquire some expertise and a broad general understanding of the various techniques and how well they suit your problem and your data. A great starting point is the flow chart found here.
Pre-process the Data
This is where the rubber meets the road. It’s not as glamorous as playing around with a bunch of cool algorithms, but this is what will make or break your system. Machine learning algorithms need relevant and clean data to produce good results. Often, your machine learning algorithm will work pretty well, but unfortunately pretty well is not good enough. Consider a speech recognition system that is accurate 95% of the time. That percentage sounds very impressive until you realize that it also means an error every 20 words on average. Try dictating sometime when you need to correct every 20th word. You won’t last long.
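The arithmetic behind that 95% figure is worth spelling out. The sketch below assumes, for simplicity, that every word is recognized independently with the same accuracy. That is my simplifying assumption, not a property of any real speech recognizer.

```python
def expected_errors(word_accuracy, n_words):
    """Average number of misrecognized words in a passage of n_words."""
    return (1 - word_accuracy) * n_words

def error_free_probability(word_accuracy, n_words):
    """Chance that a passage of n_words comes out with no errors at all,
    assuming each word is recognized independently."""
    return word_accuracy ** n_words

# 95% accuracy works out to about one error per 20 words on average.
print(expected_errors(0.95, 20))
# The chance of getting a whole 20-word sentence right is much lower than 95%.
print(error_free_probability(0.95, 20))
```

Under this assumption, only about a third of 20-word sentences come through untouched, which is why "pretty well" feels so much worse in practice than the headline number suggests.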
If we continue with the speech recognition example, you may need to invest in sound proofing, good microphones and some standard signal processing methods before you feed the data to your fancy machine learning algorithm.
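"Standard signal processing" covers a lot of ground. As a toy illustration only, here are two of the simplest clean-up steps in Python: smoothing with a moving average and scaling the signal to a fixed peak level. The function names and sample values are made up, and a real speech pipeline would use considerably more sophisticated filtering.

```python
def moving_average(samples, window=3):
    """Smooth a signal by averaging each sample with its neighbors
    (a crude low-pass filter; edges use a shorter window)."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed

def normalize_peak(samples):
    """Scale the signal so that its largest absolute value is 1.0."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    return [s / peak for s in samples]

# Hypothetical signal with a single noisy spike in the middle.
raw = [0, 0, 9, 0, 0]
print(moving_average(raw))         # the spike is spread out and flattened
print(normalize_peak([2, -4, 1]))  # loudest sample becomes -1.0
```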
Model, Train, Test, Tweak
Programming a machine learning system is very different from normal programming. You have a very simple API that you can feed training data, as well as configure a (sometimes overwhelming) number of parameters. What comes out is typically a model, which is really a black box. You have no idea how it operates, but you can feed it new examples (test data) and get back predictions, recommendations, classifications, etc.
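To show the shape of such an API, here is a minimal sketch of a classifier with a `fit` method for training data, a `predict` method for new examples and a single parameter `k`. I picked k-nearest neighbors because it fits in a few lines. The class name, the features (amount, distance from home) and the data are all hypothetical, and a model this simple is not much of a black box, unlike most of the techniques listed earlier.

```python
import math
from collections import Counter

class NearestNeighbors:
    """Toy k-nearest-neighbors classifier: predict by majority vote
    among the k training examples closest to the new point."""

    def __init__(self, k=3):
        self.k = k  # the one tunable parameter of this toy model

    def fit(self, examples, labels):
        # "Training" here is just memorizing the data.
        self.examples = list(examples)
        self.labels = list(labels)
        return self

    def predict(self, point):
        ranked = sorted(
            zip(self.examples, self.labels),
            key=lambda pair: math.dist(pair[0], point),
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]

# Hypothetical training data: (amount, distance from home in km).
examples = [(20, 1), (35, 3), (15, 2), (900, 400), (1200, 800), (700, 650)]
labels = ["valid", "valid", "valid", "fraud", "fraud", "fraud"]

model = NearestNeighbors(k=3).fit(examples, labels)
print(model.predict((25, 2)))
print(model.predict((1000, 500)))
```

From the caller's side, the whole interaction is "feed data in, ask for predictions". Everything interesting is hidden behind those two methods.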
With most machine learning systems, choosing those parameters is a black art. Some systems try to do it for you under the covers: they ask you for only very rough guidelines and then engage in a huge game of trial and error with different parameter sets and options. This also brings out the over-fitting monster, where your algorithm is optimized superbly to the training data, but when you feed it real-world data it fails miserably.
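Here is a small, contrived Python sketch of both the trial-and-error game and the over-fitting monster. The data is made up: the "true" rule is that the label is 1 when x is at least 5, and two training points are deliberately mislabeled to act as noise. The model is a toy k-nearest-neighbors classifier, and the "parameter search" just tries a few values of `k` and keeps the one that does best on data held out from training.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of (x, label) pairs in `data` that are predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

# Made-up training data; (3.1, 1) and (7.1, 0) are deliberately mislabeled noise.
train = [(1.0, 0), (2.0, 0), (3.1, 1), (4.0, 0),
         (5.9, 1), (7.1, 0), (8.0, 1), (9.0, 1)]
# Made-up held-out data that follows the true rule (label is 1 when x >= 5).
held_out = [(1.4, 0), (2.7, 0), (3.3, 0), (4.4, 0),
            (5.6, 1), (6.8, 1), (7.3, 1), (8.6, 1)]

candidates = (1, 3, 5)
train_scores = {k: accuracy(train, train, k) for k in candidates}
held_out_scores = {k: accuracy(train, held_out, k) for k in candidates}

for k in candidates:
    print(f"k={k}: training accuracy {train_scores[k]:.2f}, "
          f"held-out accuracy {held_out_scores[k]:.2f}")

best_k = max(candidates, key=lambda k: held_out_scores[k])
print("best k on held-out data:", best_k)
```

With `k=1` the model memorizes the training set, noise included, so it scores perfectly on the data it has already seen and only gets half of the held-out examples right. Larger values of `k` smooth over the noisy points and do better on new data. One caveat: a parameter chosen this way is tuned to the held-out set, so an honest final score needs yet another batch of data that was never used for tuning.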
Machine learning is great. Every day it accomplishes amazing feats that seemed beyond the capabilities of machines not long ago. At its core it is just a lot of data with some fancy statistics. There is no intelligence there. Alternatively, this is all there is to intelligence, and our brains are also just a “machine learning” system that sifts through an abundance of data.