August 30, 2024
Join us for Convergence ML Conference 2023, a free virtual event on March 7–8, 2023!
“With the rapid growth of data and increase in compute capacity, more and more companies are adopting data science and machine learning to solve complex real-world business problems,” says Sandhya Gopalan – Head of AI and MLOps at EY Global Delivery Services.
Indeed, 84% of business executives believe they need to implement AI solutions to achieve growth. As such, several organizations are building strategies to work on ML-based products and services to boost profitability.
However, strategies are just the beginning, as organizations have to overcome several challenges on a day-to-day basis before realizing the benefits of this new wave of technology.
Comet spoke with some of the expert speakers from Convergence ML Conference 2023 and asked them about their favorite ML projects, thoughts on emerging ML trends and challenges as well as their opinions regarding MLOps as a viable solution to current ML challenges.
Machine Learning Operations (MLOps) – a process for seamlessly taking ML projects from development to deployment at scale – is what most executives are focusing on. EY Global, for example, successfully implemented MLOps within the finance domain, migrating their legacy Software-as-a-Service (SaaS) infrastructure to a cloud platform that gives them a scalable, reusable framework for efficient model experimentation, deployment, monitoring, and governance.
“With new framework implementation, operationalization of models was made simple, and timeline was reduced by more than 60%,” says Sandhya Gopalan – AI and MLOps Practice Head at EY Global Delivery Services.
Similarly, Carvana – an eCommerce company that sells used cars – aims for a robust ML platform to reduce the gap between data scientists and ML engineers when building scalable ML systems. The goal is to streamline the ML development process and educate its data engineering team on the ML lifecycle. “We are working to solidify MLOps across all current machine learning initiatives. I am excited to see what we build across them and how this better enables our data scientists and ML engineers in their 2023 initiatives,” says Brittney Monroe – Carvana’s Senior Engineer.
Another initiative organizations are taking is migrating to cloud-native platforms. This allows data scientists to work in standard environments using minimal Application Programming Interfaces (APIs) for developing and deploying ML models within the cloud. Mikiko Bazeley – Head of MLOps at Featureform, worked on a project like this during her time at MailChimp. She recalls, “My experience showed me the value of developing the right abstractions focused on data scientists and how crucial virtualization is in enabling ML platform teams to build the MLOps toolchain that works for them.”
Exciting ML projects at Gong include Smart Trackers, the first user-trained AI systems for deep understanding of customer conversations. “We see an amazing reaction to this approach, that lets end users train their own ML model in an easy-to-use game-like interface. Behind the scenes, Smart Trackers involved technologies to semantically embed and index billions of sentences, smart selection of data points for labeling, utilization of LLMs, and efficient serving of thousands of different transformer models,” says Omri Allouche – Head of Research at Gong.
Heartbeat is a highly ranked Medium publication worth exploring if you want to stay informed about the newest trends in MLOps.
It’s clear that organizations are turning their attention toward ML deployment as models move from the research phase to production. ML deployment is a complex process involving several teams from different domains, making communication and collaboration a pressing issue. As such, the focus is on establishing MLOps practices that support cross-functional collaboration across these distinct domains and allow for more holistic strategies, letting organizations operationalize large ML projects quickly.
Sandhya Gopalan comments that if the current sentiments continue, “MLOps will lead to citizen MLOps as well where data scientists with limited engineering skills will be able to industrialize ML solutions by themselves with minimal dependency on the engineering team.”
Christina Taylor – Senior Staff Data Engineer at Catalyst Software echoes this opinion by saying, “I think there will be a lot more engineering tools that can empower DS/ML to self-service. Teams can spend less time on infrastructure/framework and more time on action/insights.”
Listen to this roundtable where enterprise ML leaders from Uber, The RealReal, and WorkFusion discuss how to develop ML at enterprise scale.
The general sentiment among all experts is that taking ML projects to production requires making the data scientist’s job as easy as possible. The rise of cloud-native environments results from such beliefs. The cloud-native systems let data scientists work on a single platform without hopping from one API to another to get their jobs done. With rising demand for low-latency systems to process streaming data, a robust cloud-native infrastructure becomes necessary.
As Mikiko Bazeley points out, the four major trends in ML are the growth of cloud-native development environments, virtualization tools that make deployment more manageable, better technologies for processing batch and streaming data, and the unification of MLOps and DataOps to kick off the ML flywheel – which happens when the teams involved in an ML project work more collaboratively by understanding each other’s domains.
“The demand for fast iteration & push of ML native products (including Gen AI) will put further pressure on data science & MLOps companies to become even easier for data scientists and even non-technical entrepreneurs to go-to-market with innovative ideas,” predicts Bazeley.
Further, strong data governance and validation procedures will help fuel the trend as data scientists have access to more and better quality data to produce high-performing models. “Teams having access to better and more reliable data will speed up the timeline to ML production and better solidify outputs of those models,” Monroe comments.
Reinforcing Monroe’s statement, Omri Allouche – Head of Research at Gong – says: “There will be efficient methods to make these models run faster and cheaper, through knowledge distillation, quantization, pruning, compression and more.”
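To make one of the techniques Allouche mentions concrete: knowledge distillation trains a small “student” model to match the softened output distribution of a larger “teacher,” so the student can be served faster and more cheaply. Below is a minimal, framework-free sketch of the softened cross-entropy objective from the distillation literature; the function names and the temperature value are illustrative, not drawn from the article:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher T softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's, scaled by T^2 so gradients keep a comparable magnitude."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    cross_entropy = -sum(t * math.log(s)
                         for t, s in zip(teacher_probs, student_probs))
    return cross_entropy * temperature ** 2
```

In practice this term is combined with the ordinary hard-label loss, and by Gibbs’ inequality it is minimized when the student reproduces the teacher’s softened distribution exactly.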
The increasing adoption of AI and ML is raising privacy concerns as more significant and better models use sensitive datasets to deliver high-quality performance. As Jigyasa Grover – Senior Machine Learning Engineer at Twitter, comments, “It is imperative to retain the confidentiality of data, maintain the privacy of proprietary design, and stay compliant with the latest regulations and policies.” Failing to do so can have “disastrous outcomes propelled by cyber-attacks, reverse engineering, and leakage of sensitive data like personal conversations, financial transactions, medical history, and so on,” she fears.
It’s evident from the trends discussed above that ML deployment is the most prominent issue that organizations will face when working on ML products. As Gopalan points out, “Organizations continue to struggle with ML adoption despite having a world-class data science pool. Not having an enterprise-wide ML operations strategy and implementation framework could be the top reasons why ML adoption is limited.”
Other reasons include the divide between several teams responsible for an ML project. Following the thought, Monroe highlights: “The biggest challenge in machine learning in the real world is bridging the gap between various teams.”
However, while Monroe worries over the gap, she remains optimistic as several tools, such as ML platforms, emerge to streamline the ML development process. At the same time, the sheer number of such tools creates yet another challenge: it becomes more important to carefully consider which tool suits a company’s goals. Taylor says: “There are myriads of tools, platforms, clouds, and solutions! It can be disorienting to choose the right approach to run ML in production.”
Ethical concerns are also significant as AI and ML become more pervasive. As data regulations become more stringent and users become more sensitive toward sharing their private data in return for better performance, the ML community must design systems that efficiently strike a balance between these aspects.
According to Omri Allouche, the largest barrier to embracing ML in the real world is gaining the user’s trust. “It’s a delicate process with psychology, user experience, communication and openness playing as important roles as model performance. The success of AI and its penetration into mass media lead people to trust ML models more. With that, their expectations from ML models increase, and they are less tolerant to ML “hiccups”. In the real-world, ML products require meticulous work to ensure they constantly deliver, even in edge cases, out of distribution or under drifts and shifts.”
All speakers agree that MLOps plays an integral role in ML projects and helps organizations to be more efficient and therefore more successful with their AI initiatives.
According to Grover: “Irrespective of a ML project nature whether it is a POC or only development or operationalization or end to end, the MLOps principles are adopted and implemented in every stage of ML lifecycle. MLOps paves the way for rapid and formalized experimentation in case of POC (proof-of-concept), 40-50% reduced timeline when POC turns into development, reusable and scalable deployment patterns, robust and novel monitoring and governance leading to Trusted AI.”
Similarly, Monroe believes “MLOps will lead to less time spent manually running pieces of the project; which in turn, also ensures that they are less prone to human error.”
As far as what constitutes MLOps, Bazely explains it eloquently: “MLOps isn’t about a specific toolset, programming languages, or model architectures. MLOps is about how to support the development of ML at scale, using a combination of tools and practices.”
MLOps provides massive returns when organizations develop a robust and efficient system. Learn how to create a stronger workflow with this ultimate guide.
The ML market has come a long way, from simple statistical models, algorithms, and tools, to sophisticated platforms that focus on developing and deploying ML at scale.
The above trends and challenges highlight the crucial role of MLOps best practices and tools, which help organizations break team silos and implement governance frameworks for faster deployments and seamless delivery to meet the ever-increasing user expectations.
If you are interested in learning more from the experts interviewed above and many more speakers, join us for Convergence ML Conference 2023 on March 7–8, 2023! The free virtual event features two days of talks, panels, and Q&A sessions covering state-of-the-art machine learning applications and tools, with a special focus on MLOps.