Data pipelines, delivery and scalability are at the heart of machine learning success.
Machine learning. From a concept Gartner considered a “buzzword” in its 2016 hype cycle report, it has progressed to something that nearly all of us in IT are pondering, exploring or executing. There’s no question that data-based analysis and prediction, where machines learn from informational assets and then inform and influence actions, business and otherwise, is among the hottest technology groundswells today. But the balancing act between the aspirational and the practical is still precarious for those moving to the ML world; as with every new and evolving undertaking, infrastructure can make or break the deal.
Gartner has identified three key best practices that infrastructure and operations (I&O) leaders can consider to prepare their organizations for the challenges of machine learning (ML) and artificial intelligence (AI). They are laid out in its review, Three Elements of a Scalable Enterprise Machine Learning Infrastructure Strategy. Some of Gartner’s recommendations:
- Modularize Access for Effective Data Pipelines: per Gartner, “End users state that data preparation and management take up nearly 75% to 85% of the machine learning pipeline in a typical project.” The recommendation is to perform more effective data cleanup, transformation and integration across the organization.
- Create Efficient ML Model Delivery Strategies: the review states, “I&O leaders can significantly accelerate their ML pipelines by providing access to model, feature and prediction repositories.” This can help bridge the resource gap between experimental and production-level systems.
- Deliver Scalable Compute Infrastructures: Gartner points out, “The second most time-intensive portion of the machine learning pipeline is usually the model engineering phase.” The suggested takeaway is to assemble the core players, combining the best skills of data scientists, business experts and software engineers, to enable collaboration and promote a “machine learning mindset across teams.”
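As a rough illustration of the first two recommendations, here is a minimal Python sketch of modular, reusable data preparation steps feeding a shared repository. It is not taken from the Gartner review: the function names, the `Repository` class, the `latency_ms` field and the sample records are all hypothetical, and a real deployment would use a proper feature store or model registry rather than an in-memory dictionary.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Step = Callable[[List[Record]], List[Record]]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Cleanup step: discard records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def scale_field(name: str, factor: float) -> Step:
    """Transformation step factory: rescale one numeric field."""
    def step(records: List[Record]) -> List[Record]:
        # Build new dicts so the caller's raw records are left untouched.
        return [{**r, name: r[name] * factor} for r in records]
    return step


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order; steps can be shared across projects."""
    for step in steps:
        records = step(records)
    return records


class Repository:
    """Hypothetical in-memory stand-in for a shared model, feature or
    prediction repository."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def publish(self, key: str, item: Any) -> None:
        self._items[key] = item

    def fetch(self, key: str) -> Any:
        return self._items[key]


# Assumed sample data: one record has a missing value.
raw = [{"latency_ms": 120.0}, {"latency_ms": None}, {"latency_ms": 80.0}]
features = run_pipeline(raw, [drop_incomplete, scale_field("latency_ms", 0.001)])

# Publish the prepared features so other teams can reuse them.
feature_repo = Repository()
feature_repo.publish("latency_features_v1", features)
```

Because each step is a plain function over a list of records, cleanup and transformation logic can be written once and composed differently per project, while the repository gives experimental and production systems one place to look for prepared features.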
The need to weigh time to production and accuracy, and to deliver an organization-wide ML strategy, often across siloed data sources (public, private, databases, big data ecosystems, legacy datastores), is a typical if difficult scenario. This Gartner review can help, and is well worth the few minutes it takes to read. It promises to address some important questions and considerations regarding the latest infrastructure options available for implementing an effective, cohesive ML/AI initiative in your organization.
Three Elements of a Scalable Enterprise Machine Learning Infrastructure Strategy, Chirag Dekate, 27 June 2017