Table of Contents
- Machine Learning Scales Predictions of Unknown Situations to Make Decisions
- Steps in Creating a Machine Learning Use Case
- Types of ML Problems
- Discovering Machine Learning Use Cases
- Take Away
How can ML improve business processes? What kinds of problems can we solve using ML?
Machine learning is a class of algorithms that learn from data examples to solve an AI problem. It uses standard algorithms to analyze data and produce predictions or insights that support decisions.
An algorithm is a mathematical procedure that models data and learns patterns and behaviors in order to make predictions. Once an algorithm has been trained on data, the result is a trained model.
The same algorithm can be trained on different data to solve different problems and make different predictions. To train it for a given machine learning use case, we must provide a large number of data samples specific to that use case (labeled data).
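As a rough illustration of one algorithm producing different trained models, the sketch below fits the same least-squares line to two small datasets. The function name `fit_line` and both datasets are made up for this example; a real use case would need far more data.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (the 'algorithm')."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    # The returned function is the "trained model".
    return lambda x: slope * x + intercept


# Hypothetical data: advertising spend -> sales
sales_model = fit_line([1, 2, 3, 4], [12, 14, 16, 18])
# Hypothetical data: house size -> heating cost
heating_model = fit_line([50, 100, 150], [40, 70, 100])

print(sales_model(5))      # prediction from the first trained model
print(heating_model(200))  # prediction from the second trained model
```

The algorithm is identical in both calls; only the training data, and therefore the resulting model, differs.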
You can teach a computer to learn from video, images, audio, and tabular data to analyze and create a solution or solve a business challenge.
The quality and volume of the data used for training are critical for a successful model. Use cases that are fed with more, and better, data examples generally produce more accurate predictions, sometimes by using several machine learning models working together.
Similarly, Deep Learning is a subset of ML: a specific class of algorithms, called neural networks, that work well with unstructured data. So, when we refer to Deep Learning, we are also referring to machine learning.
Machine Learning Scales Predictions of Unknown Situations to Make Decisions
Use cases with large volumes of data and a high frequency of situations or instances are well suited to building prediction services or advanced data insights that solve a problem.
Using ML well means combining data with domain expertise.
Steps in Creating a Machine Learning Use Case
Define the Problem (Problem Statement).
Identify potential ML use cases by defining and analyzing a clear business problem whose solution is feasible but still challenging to achieve. The goal is to use machine learning to improve a business process or solve a domain-specific problem, creating a new product or service or improving an existing one in a way that has an impact on the business.
Brainstorm by answering some questions that affect the business objective: What is the benefit of solving this problem? What prediction would benefit your business? What benefits will ML bring to your core services? How can you enable a new scenario? What data do you have available?
ML problem assessment helps to determine the type of data to collect. First, establish a clear goal and assess the problem before discussing it with an ML engineer.
Data is the most important aspect of an ML solution. You need good data to make good predictions, so an ML project needs data whose quality fits that purpose. Data is considered suitable if it is clean (free of inconsistencies), covers the possible or related scenarios (data coverage), and is complete for the scenarios of the machine learning use case or domain.
Explore the data and gain insights into the relationship, distribution, and patterns to identify outliers, missing or irrelevant data, and other issues that need to be addressed.
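A first pass at this kind of exploration can be sketched with the standard library alone. The helper `profile_column` and the sample `order_values` are hypothetical, and the 1.5 × IQR rule used to flag outliers is one common convention, not the only option.

```python
import statistics


def profile_column(values):
    """Summarize one numeric column: missing values, center, and outliers."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        # Values far outside the interquartile range are flagged for review.
        "outliers": [v for v in present if v < low or v > high],
    }


# Hypothetical order values; None marks a missing entry.
order_values = [20, 22, None, 19, 21, 23, 500, 18, None, 24]
report = profile_column(order_values)
print(report)
```

A large gap between the mean and the median, as in this sample, is often a hint that outliers are present and worth inspecting before training.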
When collecting data, watch for bias that may originate in the data: misrepresentation (skewed datasets), unfairness (unfair bias), and unconscious human bias in collection and labeling procedures can all end up reflected in the datasets.
The aim is to build models whose decisions lead to fair outcomes. They should not prioritize information or rely on sensitive characteristics (race, ethnicity, social class, etc.) in ways that harm users, for example by benefiting one specific type of user at the expense of others.
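One simple, partial check for skew is to compare how each group is represented in the dataset and how often it receives the positive label. The sketch below assumes a hypothetical dataset with a `group` column and an `approved` label; it is a starting point for investigation, not a full fairness audit.

```python
from collections import defaultdict


def label_rate_by_group(rows, group_key, label_key):
    """Return each group's share of the data and its positive-label rate."""
    counts = defaultdict(lambda: [0, 0])  # group -> [examples, positives]
    for row in rows:
        counts[row[group_key]][0] += 1
        counts[row[group_key]][1] += row[label_key]
    return {
        group: {"share": n / len(rows), "positive_rate": positives / n}
        for group, (n, positives) in counts.items()
    }


# Hypothetical labeled rows: group A is over-represented.
rows = (
    [{"group": "A", "approved": 1}] * 6
    + [{"group": "A", "approved": 0}] * 2
    + [{"group": "B", "approved": 1}] * 1
    + [{"group": "B", "approved": 0}] * 1
)
stats = label_rate_by_group(rows, "group", "approved")
print(stats)
```

Large differences in share or positive rate between groups do not prove unfairness on their own, but they indicate where the collection and labeling process deserves a closer look.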
When preparing the data for building labeled datasets, data engineers think about data and attributes. To train ML models, you need a lot of data examples or labeled datasets.
Each data example is described by its relevant attributes or characteristics (input features) and is paired with a label (the desired prediction outcome). The labeled examples are then used to train a model that predicts the label for new data.
You need a labeled dataset to train a custom ML model. You can build one from historical company data, from a proxy label, by creating a labeling system to collect labels, or by using a labeling service that follows your criteria.
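As a minimal sketch of building a labeled dataset from historical data with a proxy label, the example below turns hypothetical customer records into (features, label) pairs. The field names and the 90-day churn rule are assumptions chosen for illustration.

```python
def build_labeled_dataset(history):
    """Turn raw customer records into (features, label) pairs."""
    dataset = []
    for record in history:
        features = {
            "monthly_spend": record["monthly_spend"],
            "support_tickets": record["support_tickets"],
        }
        # Proxy label (assumption): 90+ days without a purchase counts as churn.
        label = 1 if record["days_since_last_purchase"] >= 90 else 0
        dataset.append((features, label))
    return dataset


# Hypothetical historical records.
history = [
    {"monthly_spend": 120.0, "support_tickets": 0, "days_since_last_purchase": 12},
    {"monthly_spend": 15.0, "support_tickets": 4, "days_since_last_purchase": 140},
]
dataset = build_labeled_dataset(history)
print(dataset)
```

A proxy label like this is only as good as the rule behind it, so it is worth validating the rule with people who know the domain.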
Feature engineering is performed after cleaning the data and before defining the machine learning model.
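Two common feature engineering steps are scaling numeric values and encoding categories as numbers. The sketch below shows min-max scaling and one-hot encoding in plain Python; the helper names and sample values are hypothetical, and real projects typically use a library for this.

```python
def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no signal
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode each category as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


scaled = min_max_scale([10, 20, 30])
encoded = one_hot(["web", "store", "web"])  # positions: store, web
print(scaled)
print(encoded)
```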
You can use pre-trained models (accessed through API calls) to help build labeled datasets, classifying your data into categories with either the model's labels or your own labels.
Data features and labels (data collections) are stored in a data warehouse so that ML models can access them. Datasets can be connected from cloud storage to an AI platform such as Vertex AI.
Because machine learning models consume high volumes of data, they work well with cloud solutions, where you can implement your machine learning use cases using fully managed tools.
Define Product Objectives and Metrics.
It's important to choose objectives that align the model's predictions with the expected outcomes. The model is optimized with a loss function and evaluated with metrics specific to the machine learning use case. The business objectives should also be taken into account when evaluating the model.
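To make the difference between a loss function and a metric more concrete, the sketch below computes log loss (a common loss for classification) and accuracy (a common metric) for two hypothetical sets of predicted probabilities. The function names and numbers are illustrative assumptions.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of examples whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    return sum(p == y for p, y in zip(predictions, y_true)) / len(y_true)


y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]    # hypothetical model A
hesitant = [0.6, 0.4, 0.55, 0.45]   # hypothetical model B

print(accuracy(y_true, confident), log_loss(y_true, confident))
print(accuracy(y_true, hesitant), log_loss(y_true, hesitant))
```

Both models reach the same accuracy here, but the loss separates them, which is why training optimizes a loss while reporting to the business usually relies on metrics.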
Train an ML Model.
Choosing the correct algorithm for your machine learning use case is critical for creating a successful prediction service. You also need enough computing power to scale training, so consider using accelerators such as GPUs and TPUs.
Consider the objectives and metrics when training ML models. Training uses the largest share of your available data (often around 70 to 80%), with the rest held out for validation and testing. Consider creating ML models that can be explained and whose outcomes are understandable or easily interpreted by their users.
Evaluation of ML Models.
ML models are evaluated on a portion of the labeled data that was held out when splitting the dataset and has not been used for training. A confusion matrix counts how many times the model predicts each class correctly or incorrectly, and metrics such as accuracy, precision, and recall are calculated from it. For example, consider using the F1 score and AUC if you have an imbalanced dataset, because accuracy alone can be misleading there. Evaluating the ML model is critical before deployment.
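The sketch below derives accuracy, precision, recall, and F1 from confusion-matrix counts for a hypothetical imbalanced dataset (5 positives out of 100). The function names and the two sets of predictions are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(t == 1 and p == 1 for t, p in pairs),
        "fp": sum(t == 0 and p == 1 for t, p in pairs),
        "fn": sum(t == 1 and p == 0 for t, p in pairs),
        "tn": sum(t == 0 and p == 0 for t, p in pairs),
    }


def scores(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from the counts."""
    c = confusion_counts(y_true, y_pred)
    accuracy = (c["tp"] + c["tn"]) / len(y_true)
    precision = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
    recall = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100                      # never predicts the rare class
some_signal = [1, 1, 1, 0, 0] + [1, 1] + [0] * 93

print(scores(y_true, always_negative))
print(scores(y_true, some_signal))
```

The model that never predicts the rare class still scores high on accuracy while its F1 is zero, which is the situation the F1 score is meant to expose.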
If you have collected a good amount of data, consider splitting it into training, validation, and test sets. A validation set helps you detect whether the model is overfitting the training data. When validating the model's effectiveness, look for bias-related issues and determine whether the model is fair (level of fairness), has amplified a pre-existing bias, or is implicitly biased.
Before the final deployment, it is necessary to test (with cross-validation, error analysis, and data validation) that the input data is consistent with what the model expects and that the model's predictions are as accurate as possible.
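A three-way split can be sketched as follows. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions for illustration; the right proportions depend on how much data you have.

```python
import random


def split_dataset(examples, train_share=0.7, validation_share=0.15, seed=42):
    """Shuffle the examples and split them into train/validation/test."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_share)
    n_validation = round(len(shuffled) * validation_share)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


examples = list(range(100))  # stand-in for 100 labeled examples
train, validation, test = split_dataset(examples)
print(len(train), len(validation), len(test))
```

The test set should be used only once, at the end, so that it gives an honest estimate of performance on unseen data.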
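Cross-validation repeatedly holds out a different slice of the data for evaluation. The sketch below only generates the index splits for k folds; the function name `k_fold_indices` is hypothetical, and training and scoring a model on each fold is left out.

```python
def k_fold_indices(n_examples, k):
    """Yield (training_indices, held_out_indices) for each of k folds."""
    indices = list(range(n_examples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        held_out = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, held_out
        start += size


folds = list(k_fold_indices(10, 3))
for training, held_out in folds:
    print(len(training), held_out)
```

Each example is held out exactly once, so averaging the metric over the folds gives a more stable estimate than a single split. In practice the data is usually shuffled before the folds are formed.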
Monitoring the Model.
Once the model is in production, the monitoring phase tracks its performance and helps to discover errors, detect degradation, and keep inference data and metrics consistent with the business objectives. Instrument the serving code so that model performance can be checked live.
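One simple way to detect degradation is to check whether the data seen in production has drifted away from the training data. The sketch below measures how far the live mean of a feature has moved, in units of the training standard deviation. The function name, the sample values, and the alert threshold of 3 are all assumptions; production systems usually track several statistics per feature.

```python
import statistics


def drift_score(training_values, live_values):
    """Shift of the live mean, measured in training standard deviations."""
    mu = statistics.mean(training_values)
    sigma = statistics.stdev(training_values)
    return abs(statistics.mean(live_values) - mu) / sigma


ALERT_THRESHOLD = 3  # assumption: tune per feature and use case

training = [100, 102, 98, 101, 99]   # hypothetical feature values at training
live_stable = [99, 101, 100]         # production data that looks similar
live_shifted = [110, 112, 108]       # production data that has moved

for name, live in [("stable", live_stable), ("shifted", live_shifted)]:
    score = drift_score(training, live)
    print(name, round(score, 2), "ALERT" if score > ALERT_THRESHOLD else "ok")
```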
Deploy ML Model.
Deployment is when the model prediction service goes live. Depending on the machine learning use case, prediction services can work in batch mode or online through an API endpoint. The model can generate live predictions, or its predictions can be saved to be consumed by downstream services later.
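The difference between online and batch prediction can be sketched with a stand-in model. Everything here is hypothetical: the `predict` function uses made-up weights in place of a trained model, and `online_prediction` only mimics what an API endpoint handler would return.

```python
def predict(features):
    """Stand-in for a trained model (hypothetical weights): 1 = likely churn."""
    score = 0.02 * features["monthly_spend"] - 0.3 * features["support_tickets"]
    return 1 if score < 0 else 0


def online_prediction(request):
    """Handle a single request, as an API endpoint would."""
    return {"customer_id": request["customer_id"],
            "churn": predict(request["features"])}


def batch_prediction(records):
    """Score many records at once; results can be stored for later use."""
    return [online_prediction(record) for record in records]


records = [
    {"customer_id": "c1", "features": {"monthly_spend": 10, "support_tickets": 2}},
    {"customer_id": "c2", "features": {"monthly_spend": 100, "support_tickets": 1}},
]
print(online_prediction(records[0]))  # one live prediction
print(batch_prediction(records))      # predictions for a whole batch
```

Online prediction suits use cases that need an answer immediately, while batch prediction suits cases where results can be computed on a schedule and read later by downstream services.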
Types of ML Problems
- Supervised Learning (Prediction): The output is known, and training uses labeled data. You must have many examples or labeled datasets to train an ML model.
  - Regression: predict numeric values
  - Classification: predict categories
- Unsupervised Learning: Discover data structure and patterns.
Learn more about the types of machine learning before creating your ML use case.
Discovering Machine Learning Use Cases
One way to discover ML use cases is to look at company processes and find rule-based systems that could be replaced or simplified. Such systems are fixed in nature and interdependent, so they present limitations or require human intervention. Machine learning solutions are more flexible or adaptive and can process data in real time.
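As a toy illustration of the contrast, the sketch below compares a fixed, hand-written rule with a threshold learned from labeled examples. The fraud scenario, the amounts, and both function names are hypothetical, and a single learned threshold is far simpler than a real ML model.

```python
def fixed_rule(amount):
    """Hand-written rule: flag any transaction above a fixed amount."""
    return amount > 1000


def learn_threshold(amounts, is_fraud):
    """Pick the cut-off that misclassifies the fewest labeled examples."""
    best_threshold, best_errors = None, len(amounts) + 1
    for candidate in sorted(set(amounts)):
        errors = sum((a >= candidate) != bool(f)
                     for a, f in zip(amounts, is_fraud))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Hypothetical labeled history: fraud starts at much lower amounts
# than the fixed rule assumes.
amounts = [50, 120, 300, 450, 600, 900, 1500]
is_fraud = [0, 0, 0, 1, 1, 1, 1]

threshold = learn_threshold(amounts, is_fraud)
rule_errors = sum(fixed_rule(a) != bool(f) for a, f in zip(amounts, is_fraud))
print("learned threshold:", threshold)
print("errors made by the fixed rule:", rule_errors)
```

When behavior changes, the learned threshold can be refreshed by retraining on new data, whereas the fixed rule has to be noticed and rewritten by hand.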
Another way to improve the business is by automating operational processes, which lets you process more work at lower cost. There is also a lot of opportunity in understanding and integrating relevant unstructured data from videos, audio, images, and text analyzed by ML models.
By examining current business processes and products and analyzing their variables and scenarios, you can come up with tailored, creative ideas for improving them and developing a unique customer experience.
Finally, advances in ML frameworks such as TensorFlow, the availability of pre-defined ML models, and hardware advances such as TPUs and GPUs make it easier for businesses to create machine learning use cases and take advantage of these technologies.