2022 put AI in the public spotlight like no other year, driven by the virality of image generation tools like DALL-E and Midjourney, text compositions from ChatGPT, and the increasing sophistication of humanoid robots. These systems raise interesting societal questions around plagiarism and what “art” is, the ethics of AI content generation, and who actually benefits from generalized task automation, among many others.

However, behind this controversial leading edge, the AI toolbox continues to evolve and provide helpful building blocks for solving challenging problems and automating error-prone workflows. Machine learning (ML), in particular, continues to provide valuable solutions to tricky issues across many sectors.

Technology teams looking to implement machine learning solutions can roll their own using the many available open-source components, which offer good portability between deployment options. However, given the wide range of commodity solutions available, it is difficult to justify the time and effort required to prepare a data set and then train and evaluate a model before solution integration can even start. These commodity solutions range from individual building blocks to complete ready-to-run solutions for common workflow problems.

What type of problems can machine learning help solve?

Many industries need to classify data or extract features based on specific rules, for example, when processing tabular data, documents, or images containing pertinent information. Some systems still need to support paper trails for various reasons, while others accept input that is convenient for users but requires additional processing before data can be extracted from it, such as a photo of an invoice. Machine learning has helped elevate OCR and related machine vision tasks, including processing video to track features over time and discern motion. Similar techniques apply to audio, for example to extract speech.

Making sense of natural language is another common problem. Extracting individual characters from an image or words from audio and assembling them into text is just one step – making sense of what the text means and using that meaning to drive further rounds of feedback are more significant challenges. Examples of natural language processing (NLP) include entity extraction (isolating names of people and organizations in text) and identifying a multi-line rectangle in an image as an address. You can get a good picture of this spectrum’s current upper end by conversing with ChatGPT or similar AI bots.

Another everyday use case is understanding time series data and spotting outliers that deviate from the regular flow. Examples include analyzing product user actions to spot usage trends or similar marketing opportunities, or to detect out-of-place interactions that can indicate fraud or other security issues. Outliers in monitoring events can help spot issues before they negatively impact users or budgets. Many teams also need to forecast the future needs of their products as they grow, based on a wide variety of inputs beyond just data storage usage.
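
To make the outlier idea concrete, here is a minimal sketch in Python using only the standard library. It flags a data point when it sits more than a chosen number of standard deviations away from the mean of the preceding window. The function name, the window size, the threshold, and the sample login counts are all assumptions made for illustration; real products typically use far more robust models that account for seasonality and trends.

```python
from statistics import mean, stdev


def find_outliers(values, window=5, threshold=3.0):
    """Return the indices of values that deviate strongly from recent history.

    A value is flagged when it lies more than `threshold` sample standard
    deviations from the mean of the `window` values immediately before it.
    """
    outliers = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as an outlier.
            if values[i] != mu:
                outliers.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            outliers.append(i)
    return outliers


# Hypothetical daily login counts with one obvious spike at index 6.
daily_logins = [100, 102, 98, 101, 99, 100, 250, 101, 97, 103]
print(find_outliers(daily_logins))  # [6]
```

Note that the spike inflates the statistics of the windows that follow it, which is one reason a simple rolling z-score like this is only a starting point.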

What options are available?

Cloud

AWS

Amazon provides a broad set of prebuilt ML services. Examples include vision analysis through Rekognition, advanced OCR capabilities through Textract, and natural language processing (including entity extraction) with Comprehend. Most services can be used with either off-the-shelf or custom-trained models and often integrate with prebuilt workflow tools through Augmented AI to handle scenarios such as manually reviewing low-confidence predictions and auditing model accuracy over time. Alongside these generalized capabilities, specific services also target industrial manufacturing, health, and business metrics use cases. SageMaker is Amazon’s umbrella capability for training custom ML models and other end-to-end MLOps activities.
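
The idea of manually reviewing low-confidence predictions is not specific to any one vendor. The following Python sketch shows the general pattern under simple assumptions: each prediction carries a confidence score between 0 and 1, and anything below a chosen threshold is set aside for a person to check. The `Prediction` class, the `route_predictions` function, and the 0.8 threshold are hypothetical names and values for illustration only. This is not the Augmented AI API.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """A single model output (hypothetical structure for illustration)."""
    document_id: str
    label: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def route_predictions(predictions, min_confidence=0.8):
    """Split predictions into auto-accepted and human-review queues."""
    accepted, needs_review = [], []
    for prediction in predictions:
        if prediction.confidence >= min_confidence:
            accepted.append(prediction)
        else:
            needs_review.append(prediction)
    return accepted, needs_review


# Made-up sample predictions from a document classifier.
sample = [
    Prediction("doc-1", "invoice", 0.97),
    Prediction("doc-2", "receipt", 0.55),
    Prediction("doc-3", "invoice", 0.80),
]
accepted, needs_review = route_predictions(sample)
print([p.document_id for p in accepted])
print([p.document_id for p in needs_review])
```

Recording how often reviewers overturn the model in the review queue is one simple way to audit accuracy over time.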

Google Cloud

Google’s offerings include prebuilt NLP capabilities such as classification and entity extraction with its Natural Language AI suite, translation, and text-to-speech/speech-to-text services. AutoML steps above this to include similar capabilities plus vision processing (image and video), structured data understanding, and the ability to customize models in many of these domains. The Document AI suite is an excellent extension of its natural language and structured data processing. It goes beyond simple OCR to include models for extracting structured meaning from forms and specialized industry cases such as procurement, contract, identity, and medical documents. Vertex AI serves as Google’s unified MLOps suite, simplifying custom model creation and use of its other standalone capabilities.

Azure

Microsoft provides a variety of prebuilt computer vision, speech, and natural language processing capabilities through its Azure Cognitive Services. Highlights include customizable vision models and multilingual models in language services that can be trained in one language and used across others. Azure Machine Learning provides end-to-end MLOps and custom model creation workflows. Microsoft is investing in OpenAI, meaning GPT and DALL-E capabilities are becoming available as Azure services.

IBM

As one of the pioneers in the space, IBM provides decades of machine learning expertise via a wide range of prebuilt solutions within its Watson platform. Given its famous start, natural language processing remains a vital part of the suite with language and speech processing, plus information discovery and surfacing through conversational assistants. Industry-focussed solutions across health, risk and finance, supply chain, business operations, and advertising are also available. Watson Studio provides custom model creation and MLOps activities, and infrastructure flexibility across public and private clouds is possible through several Cloud Pak deployment options.

OCI

Oracle provides flexible open-source-based capabilities for running custom ML models on its OCI platform. The focus is on data adjacency – running models close to the data. Pre-built ML solutions are available as algorithms directly within Oracle Database, which is practical if your enterprise storage strategy revolves around Oracle DB. MLOps, including GPU-accelerated model training, is available through its Data Science Service.

Database

Problems that benefit from machine learning solutions can require large amounts of data, whether for training to improve accuracy, heavy processing, or both. Standalone cloud ML options may not be feasible when data transfer costs can outweigh the solution’s benefits. To give more flexibility in handling large data sets and to bring ML processing as close to the data as possible, several vendors are starting to offer specific machine learning functionality as part of their database product suites.

There is even an open-source ML extension for PostgreSQL that provides an attractive cost benefit.

Which ML topics are prevalent in 2023?

MLOps

For teams that have already embraced machine learning solutions, MLOps is becoming an increasingly important topic. Similar to how DevOps brought infrastructure management closer to software delivery practices, MLOps focuses on the entire lifecycle and automation of machine learning model management. Like software or infrastructure, machine learning models need maintenance, such as retraining on newly refined training data to improve solution accuracy. All the mechanics of deploying or rolling back a particular model version also need consideration. An effective MLOps strategy provides the tooling and automation to ease this ongoing maintenance as part of total solution management.
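
To illustrate the deploy and rollback mechanics, here is a deliberately tiny Python sketch of a model registry. It keeps track of registered model versions and a history of which version was live, so a bad release can be reverted. The `ModelRegistry` class, its method names, and the artifact paths are hypothetical and exist only for this example; managed MLOps platforms provide far richer equivalents with storage, approvals, and monitoring.

```python
class ModelRegistry:
    """Minimal in-memory registry (illustrative only, not a real product API)."""

    def __init__(self):
        self._versions = {}  # version name -> artifact reference
        self._history = []   # deployed versions, oldest first

    def register(self, version, artifact):
        """Record a newly trained model version."""
        if version in self._versions:
            raise ValueError(f"version {version!r} is already registered")
        self._versions[version] = artifact

    def deploy(self, version):
        """Make a registered version the live one."""
        if version not in self._versions:
            raise KeyError(f"unknown version {version!r}")
        self._history.append(version)

    def rollback(self):
        """Revert to the previously deployed version and return its name."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        self._history.pop()
        return self._history[-1]

    @property
    def live(self):
        """The currently deployed version, or None if nothing is deployed."""
        return self._history[-1] if self._history else None


registry = ModelRegistry()
registry.register("v1", "models/fraud-v1.bin")  # made-up artifact paths
registry.register("v2", "models/fraud-v2.bin")
registry.deploy("v1")
registry.deploy("v2")
print(registry.live)        # v2
print(registry.rollback())  # v1
```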

Acceleration

If your ML workload is complex enough, and data transfer costs or processing times are critical concerns, you might be able to justify running a GPU cluster for accelerated machine learning. NVIDIA continues to push boundaries in GPU-accelerated workloads and comprehensive software offerings in the space. Many cloud providers also offer GPU-based options that provide a scaling path from single-GPU proofs-of-concept on a laptop to a fully-managed GPU cluster for the most demanding ML workloads.

Edge ML

You’re likely carrying around a device in your pocket that already uses machine learning for its camera software, amongst other features. Like GPU-based acceleration, mobile chipsets have been evolving to feature more ML cores as a significant selling point for each generation. Running machine learning solutions on a mobile device benefits providers and users alike. Providers require less server infrastructure investment, and users retain privacy by keeping data and processing on their personal devices.

Apple provides the Core ML framework for model execution optimized for its silicon, plus APIs and pre-built models tailored to vision, natural language, speech, and other audio analysis. Similarly, Google provides its ML Kit framework for Android devices with a wealth of APIs for pre-built vision and natural language processing requirements.

Portability

Although prebuilt capabilities are convenient and often cost-effective, enterprises looking into AI and machine learning capabilities may want to avoid tying themselves to a single service provider, given significant investment requirements or other critical factors. These cases can justify the development of custom internal solutions based on industry-proven frameworks such as TensorFlow, PyTorch, or MXNet (among others).

Given the importance of these frameworks in the ML space, solutions developed with them are often supported on managed platforms alongside or even integrated with the vendor’s prebuilt components. Implementers gain deployment portability and a scaling path from single development machines to global data centers. IBM is the only real outlier in this regard, given the proprietary nature of its Watson platform (however, it does contribute to various open-source ML frameworks).

What’s next?

We’ve briefly covered some machine learning solutions available today. However, this just scratches the surface of an ever-growing ecosystem. While the latest developments in AI may raise some challenging concerns, machine learning as a subset will continue to provide valuable alternatives to rigid rule-based processing systems and classification problems on more challenging input data. As the space grows, more and more pre-built solutions become available for teams to easily integrate powerful application functionality without the overhead of training and maintaining custom ML models.

Let’s see what the rest of 2023 brings us in the AI space!