For those of you who would like a simple and easy-to-understand machine learning primer, here you go.
If you use Google’s search engine, then you’ve seen machine learning in action. In this case, it’s the computing model behind the suggestions that fill in the rest of the words in the search box or offer prompts such as “People also ask.”
YouTube, Facebook, Netflix, and many other services we use today also leverage machine learning to ensure you see videos, posts, ads, and entertainment suggestions that interest you. Each of these platforms uses machine learning to collect as much data about you and your actions as possible to determine your likes and dislikes, as well as what you want to see next.
But how exactly do they do this? To answer this question, we need to first look at the definition of machine learning.
Machine Learning Primer
Machine Learning – A Definition:
According to IBM, machine learning is a form of artificial intelligence that focuses on developing applications that learn from data and improve the accuracy of their output over time—without humans explicitly programming them to do so.
At its most basic, machine learning consists of two main elements: an algorithm and data. The algorithm is “trained” to recognize features and patterns in large datasets to make predictions and decisions based on the input of new data. The stronger the algorithm, the more accurate the predictions and decisions become as it processes more data.
In short, machine learning is a powerful form of computing that’s transforming many aspects of our lives—from information, entertainment, and shopping to cybersecurity, financial services, and healthcare.
How does machine learning work?
Let’s take a more in-depth look at how machine learning works. There are four steps involved:
Step 1: Preparing a training dataset
The training dataset is similar to the datasets the machine learning model processes when working on the problem the data scientist wants it to solve. This data can be “labeled data,” highlighting features and classifications the model is designed to recognize. It can also be “unlabeled data” from which the model will have to extract features to assign classifications by itself.
Before engaging the computing model, the data scientist needs to de-dupe and randomize the training data and review it for imbalances that could impact the training. They also need to divide it into a training subset—which they use to train the model—and an evaluation subset, which they use to evaluate and improve the model.
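Here is a minimal sketch of that preparation step, assuming a pandas DataFrame and scikit-learn as the tooling (the file name and column names are hypothetical, not something the steps above prescribe):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical labeled dataset: feature columns plus a "label" column.
df = pd.read_csv("candidates.csv")

# De-dupe and randomize (shuffle) the training data.
df = df.drop_duplicates().sample(frac=1.0, random_state=42)

# Quick check for class imbalances that could skew the training.
print(df["label"].value_counts(normalize=True))

# Divide the data into a training subset and an evaluation subset.
train_df, eval_df = train_test_split(df, test_size=0.2, random_state=42)
```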
Step 2: Selecting an algorithm for the training dataset
The type of algorithm the data scientist uses depends on whether the dataset is labeled or unlabeled, how large it is, and what kind of problem they’re looking to solve.
If they’re using a labeled dataset, they can choose from several types of algorithms:
- Decision tree: A decision tree leverages labeled data to make decisions or recommendations based on a set of decision rules. For example, a decision tree that recommends candidates for job placements could use data about the candidates’ GPA, years of experience, and skill levels and apply rules to that data to pinpoint high-potential job applicants (a short code sketch of this example follows the list).
- Regression algorithm: Common regression algorithms include logistic regression, linear regression, and support vector machines. The data scientist will use logistic regression when the dependent variable is binary—A or B. With linear regression, the value of a dependent variable is predicted based on the value of an independent variable. Finally, when the dependent variable is more difficult to classify, the data scientist can use a support vector machine.
- Instance-based algorithm: An instance-based algorithm (such as k-nearest neighbors) classifies a new data point by evaluating how likely it is to belong to one group or another based on its proximity to other data points.
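As a concrete illustration of the decision-tree item above, here is a minimal sketch using scikit-learn. The feature values and labels are made up for the hypothetical job-candidate example; the library choice is an assumption:

```python
from sklearn.tree import DecisionTreeClassifier

# Each row: [GPA, years of experience, skill level (1-5)];
# each label: 1 = high-potential candidate, 0 = not.
X = [[3.8, 4, 5], [2.9, 1, 2], [3.5, 6, 4], [2.5, 0, 1]]
y = [1, 0, 1, 0]

# Fit a shallow tree that learns decision rules from the labeled data.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Apply the learned rules to a new applicant.
print(tree.predict([[3.6, 3, 4]]))  # e.g., [1]
```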
If the data scientist is using an unlabeled dataset, they can choose from the following types of algorithms:
- Association algorithm: An association algorithm pinpoints relationships and patterns in data and identifies “association rules” based on frequent “if…, then…” relationships.
- Clustering algorithm: A clustering algorithm identifies groups of similar records and labels them accordingly—without the data scientist informing it about the groups and their characteristics ahead of time (a short clustering sketch follows this list).
- Neural network: A neural network involves a layered network of calculations. It has an input layer into which the data scientist feeds data, one or more hidden layers where it performs calculations, and an output layer where the algorithm assigns a probability to the outcomes it calculates.
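To make the clustering item above concrete, here is a minimal sketch using k-means from scikit-learn on a tiny unlabeled dataset; the specific algorithm, library, and data are assumptions for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: each row is a record with two numeric features.
X = np.array([[1.0, 2.0], [1.2, 1.8], [8.0, 9.0],
              [7.8, 9.2], [0.9, 2.1], [8.1, 8.9]])

# Ask for two groups; the algorithm decides which records belong together.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_)           # cluster assignment for each record
print(kmeans.cluster_centers_)  # the centers of the discovered groups
```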
Step 3: Training the algorithm to develop the model
Training the algorithm is an iterative process. It involves feeding in data, applying the algorithm, comparing the output—i.e., the solution, decision, or recommendation—to the one it should have produced, adjusting the model’s weights (w) and biases (b) to generate a more precise output, and repeating the process until the algorithm almost always returns an accurate output. Once trained, the algorithm is the machine learning model.
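A minimal sketch of that loop, assuming a tiny linear model trained with plain gradient descent (the data, learning rate, and iteration count are illustrative assumptions):

```python
# Training data: inputs x and the outputs y the model should produce (y = 2x + 1).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0   # start with arbitrary weight and bias
lr = 0.05         # learning rate

for step in range(2000):
    # Compare the model's output (w*x + b) to the expected output
    # and compute mean-squared-error gradients.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Alter w and b to generate a more precise output, then repeat.
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # approaches 2.0 and 1.0
```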
Step 4: Leveraging and refining the model
The data scientist can then feed new data into the trained machine learning model and see its recommendations or decisions. As the model processes more data, it continues to become more effective and accurate.
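One way to picture this step is with a model that can be updated incrementally as new data arrives. The sketch below uses scikit-learn’s SGDClassifier and its partial_fit method as an assumed choice; not every model can be refined this way:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=0)

# Initial training on the prepared training subset.
X_train = np.array([[0.0], [1.0], [5.0], [6.0]])
y_train = np.array([0, 0, 1, 1])
model.partial_fit(X_train, y_train, classes=[0, 1])

# Later, feed in new data to get predictions, then refine the model
# once the verified outcomes become known.
X_new, y_new = np.array([[0.5], [5.5]]), np.array([0, 1])
print(model.predict(X_new))
model.partial_fit(X_new, y_new)
```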
The three machine learning methods
There are three distinct machine learning methods: supervised machine learning, semi-supervised learning, and unsupervised machine learning. Each has its pros and cons, as well as specific uses.
- Supervised machine learning: With supervised machine learning, the model trains itself on a labeled—and sometimes already classified—dataset. For instance, the data scientist would likely train a computer vision model developed to identify 1950s automobiles on a dataset of labeled car images. One advantage of supervised machine learning is that it’s faster and easier than other machine learning methods because the data scientist can compare the model’s results to the labeled results. However, a disadvantage is that it takes time—and resources—to label data. Additionally, there’s the risk of overfitting: creating a model so closely tied to the training data that it cannot handle variations in new datasets with any accuracy.
- Semi-supervised learning: Semi-supervised learning leverages a small, labeled dataset to guide feature extraction and classification of a large, unlabeled dataset (see the sketch after this list). The main advantage of this method is that data scientists can use it when there isn’t sufficient labeled data to train a supervised machine learning model. Nevertheless, it has a significant disadvantage: the model can’t rectify its own errors, so even one incorrect prediction or recommendation can bias the entire model.
- Unsupervised machine learning: With unsupervised learning, the algorithm processes large amounts of unlabeled data and extracts features that enable it to classify the data, without any direction from the data scientist. Data scientists use it for applications that involve vast datasets, such as genetics and cybersecurity. The main advantage of unsupervised machine learning is that models can identify relationships and patterns in data that humans would miss, ranging from fraudulent activities to the operational drivers behind events in manufacturing processes. Recognizing new relationships and patterns can also uncover new business opportunities. The key challenge of unsupervised machine learning for enterprises is cost: organizations may need to bring experts on board to determine how useful the output is.
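Here is a minimal sketch of the semi-supervised item above: a handful of labeled points guide the classification of unlabeled points (marked with -1). The self-training approach and scikit-learn’s SelfTrainingClassifier are assumptions for illustration, not the only way to do semi-supervised learning:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = np.array([[0.0], [0.5], [5.0], [5.5], [0.2], [5.2], [0.3], [5.3]])
# Only the first four points are labeled; -1 marks unlabeled records.
y = np.array([0, 0, 1, 1, -1, -1, -1, -1])

# The small labeled set trains a base classifier, which then pseudo-labels
# the unlabeled records it is confident about and retrains on them.
model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y)

print(model.predict([[0.1], [5.4]]))  # e.g., [0 1]
```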
Reinforcement machine learning
Reinforcement machine learning is a more recently developed method of machine learning. It’s different from the three methods above because it’s a behavioral machine learning model in which the algorithm doesn’t process a training dataset to learn. Instead, it learns as it makes decisions to proceed towards the objective. The data scientist inputs unlabeled data and either rewards or penalizes the algorithm depending on the accuracy of the outcome.
Data scientists use reinforcement machine learning to teach a computer new actions by penalizing it for incorrect responses and rewarding it for correct responses, for example, in games—like the famous AlphaGo—and personalized recommendations, like on e-commerce websites or smart TV channels.
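A minimal sketch of that reward-and-penalty idea, using tabular Q-learning on a tiny one-dimensional corridor where the agent is rewarded for reaching the goal cell; the environment, reward values, and hyperparameters are illustrative assumptions:

```python
import random

n_states, goal = 5, 4
actions = [-1, +1]                        # move left or right
Q = [[0.0, 0.0] for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2     # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state != goal:
        # Explore occasionally; otherwise pick the best-known action.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = Q[state].index(max(Q[state]))
        nxt = min(max(state + actions[a], 0), n_states - 1)
        # Reward reaching the objective, penalize wandering.
        reward = 1.0 if nxt == goal else -0.01
        Q[state][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][a])
        state = nxt

# Learned policy: 1 (move right) for every cell before the goal.
print([q.index(max(q)) for q in Q[:goal]])
```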
Reinforcement machine learning has the advantage that a data scientist can use it to solve highly complex problems to which more conventional techniques don’t apply. Also, because the model consistently corrects its errors, it can ultimately create a perfect process. And due to the speed at which reinforcement machine learning models can learn while processing large amounts of data, they can outperform humans.
As Towards Data Science reports, there are some significant disadvantages of reinforcement machine learning:
- Creating the simulation environment may be difficult, depending on the function of the model. For example, if the function is to play a game like chess, it’s relatively easy to prepare the simulation environment. But if the function of the model is to drive an autonomous vehicle, it becomes quite a bit more complicated to create a simulation environment in which the model needs to know when it has to drive faster, stop, or brake suddenly.
- It can also be challenging to modify and scale the model’s neural network. Because the only way to connect with the network is with penalties and rewards, acquiring new knowledge sometimes involves deleting some of the previously learned knowledge from the network.
- Another challenge involves the way a model reaches an outcome. Sometimes the model does arrive at the correct solution, but not in the required way. This can result in key data being overlooked or misclassified.
Deep learning
A subset of machine learning, deep learning uses an artificial neural network modeled on the structure of the mammalian brain and imitates the way the human brain works—and learns.
A deep learning model requires a vast dataset, which the algorithm processes through layers of calculations. It applies weights (w) and biases (b) to the data in one layer, then passes the result to the next layer, where the algorithm repeats the process with revised weights and biases. In this way, the model generates progressively more refined outputs.
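A minimal sketch of that layered computation in NumPy, with randomly initialized weights and biases standing in for learned ones (the layer sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    # One layer: weighted sum plus bias, followed by a ReLU activation.
    return np.maximum(0.0, x @ w + b)

x = rng.normal(size=(1, 4))                      # input layer: 4 features
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # hidden layer 1
w2, b2 = rng.normal(size=(8, 8)), np.zeros(8)    # hidden layer 2
w3, b3 = rng.normal(size=(8, 3)), np.zeros(3)    # output layer: 3 outcomes

h1 = layer(x, w1, b1)                            # each layer refines the previous one
h2 = layer(h1, w2, b2)
logits = h2 @ w3 + b3
probs = np.exp(logits) / np.exp(logits).sum()    # assign a probability to each outcome
print(probs)
```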
Deep learning is currently used in several exciting research areas, including natural language processing and computer vision. Its primary advantage is that a deep learning model is flexible, allowing the data scientist to tweak it and apply it to new problems. However, it has the disadvantage of requiring many resources, which makes it a more expensive method than the previously mentioned ones.
Machine learning platforms
Scientists have been researching machine learning for decades, so it should be no surprise that the number of platforms is growing. Some are open source and used by a range of interested people, from students to scientists. Others are geared towards enterprise use.
Machine learning platforms include Alteryx, Anaconda, DataRobot, Google Cloud AI, H2O.ai, IBM Watson Studio, KNIME, Azure Machine Learning, and SAS Visual Machine Learning.
Roadmap to leveraging machine learning to address issues in the enterprise
Due to its ability to process enormous datasets and recognize patterns or solve problems that would take humans exponentially longer to address, machine learning holds a lot of promise for organizations. Here’s a basic overview of how you can leverage machine learning to address issues in the enterprise:
- Identify the problem you need to solve and delineate it clearly.
- Build your team. It’s advisable to include data scientists, business experts, and developers.
- Select the right algorithm for the task.
- Gather a vast amount of the right kind of data and prepare it.
- Deploy the algorithm and start building the model.
- Based on the model, develop an application that addresses the problem.
- Verify the model to ensure it consistently generates accurate outcomes.
The future of machine learning
To a large extent, the computing power we can provide currently defines the capabilities of machine learning. But although our computers are becoming increasingly powerful, there’s a limit to what they’ll be able to do.
The solution to this dilemma—and the key to seeing truly great progress with machine learning, as well as other forms of AI—is the adoption of quantum computing, according to Microsoft CEO Satya Nadella in Yahoo! Finance.
Instead of merely using binary bits of zeroes and ones, quantum computing uses qubits, which can represent more than one value at the same time and can therefore perform larger and more complex computations in a short time. Combining machine learning and quantum computing could thus create machine learning models that are significantly faster and more accurate, with far greater potential than the models in use today.
Would you mind letting us know what you think of the machine learning primer?