5 machine learning essentials nontechnical leaders need to understand

Snehal Kundalkar

Contributor

Snehal Kundalkar is the chief technology officer at Valence. She has been leading Silicon Valley firms for the last two decades, including work at Apple and Reddit.

We’re living in a phenomenal moment for machine learning (ML), what Sonali Sambhus, head of developer and ML platform at Square, describes as “the democratization of ML.” It’s become the foundation of business and growth acceleration because of the incredible pace of change and development in this space.

But for engineering and team leaders without an ML background, this can also feel overwhelming and intimidating. I regularly meet smart, successful, highly competent and normally very confident leaders who struggle to navigate a constructive or effective conversation on ML — even though some of them lead teams that engineer it.

I’ve spent more than two decades in the ML space, including work at Apple to build the world’s largest online app and music store. As the senior director of engineering, anti-evil, at Reddit, I used ML to understand and combat the dark side of the web.

For this piece, I interviewed a select group of successful ML leaders including Sambhus; Lior Gavish, co-founder at Monte Carlo; and Yotam Hadass, VP of engineering at Electric.ai, for their insights. I’ve distilled our best practices and must-know components into five practical and easily applicable lessons.

1. ML recruiting strategy

Recruiting for ML comes with several challenges.

The first is that it can be difficult to differentiate machine learning roles from more traditional job profiles (such as data analysts, data engineers and data scientists) because there’s a heavy overlap between descriptions.

Secondly, finding the level of experience required can be challenging. Few people in the industry have substantial experience delivering production-grade ML. For instance, you'll sometimes see resumes that claim experience with ML models, only to find that the models in question are rule-based engines rather than real ML models.

When it comes to recruiting for ML, hire experts when you can, but also look into how training can help you meet your talent needs. Consider upskilling your current team of software engineers into data/ML engineers or hire promising candidates and provide them with an ML education.

Another effective way to overcome these recruiting challenges is to define roles largely around three areas:

  • Product: Look for candidates with technical curiosity and a strong business/product sense. This mindset is often more important than the ability to apply the most sophisticated models.
  • Data: Look for candidates who can help select models, design features, handle data modeling/vectorization and analyze results.
  • Platform/Infrastructure: Look for people who can evaluate, integrate and build the platforms that significantly accelerate the productivity of data and engineering teams: extract, transform, load (ETL) pipelines, warehouse infrastructure and CI/CD frameworks for ML.

Again, consider the power of training — an engineer with the right curiosity and interest can become the ML expert you need.

Regularly engaging with industry advisors and academics is another way to provide the team with updates on the latest and greatest approaches to ML. Quality bootcamps can be a great way to upskill your teams.

2. Organizational structure

How to best structure the role of the ML team within the larger organization is a significant decision that impacts the efficiency and predictability of the business and should be guided by the stage and size of the company.

Early stage (<25 members): At this size, a shared central team is the safest and quickest way to develop infrastructure and organizational readiness. In the early stage, your ML team should constitute 10%-20% of the entire engineering team.

Midstage (25-500 members): By midstage, it’s best to focus on vertically integrated teams. Gavish is a huge fan of vertical ML teams “because they have a huge advantage in terms of gaining deep understanding of the problem being solved.”

A vertical integration also allows for sustained focus and prioritization, which is needed because midstage ML projects tend to be longer and more uncertain.

Mature (500+ members): At this stage, the business should create a separate ML platform/infra team. For example, Square is a 2,500-plus engineering org with over 100 data scientists and ML engineers and more than 15 ML platform/infra engineers. The ML teams are aligned with individual business units such as chatbots, risk/fraud detection, etc., rather than specific technology. And they have an ML platform/infra team shared across other teams in the company.

However, remember that the size of the team varies depending on how central ML is to the products and services being developed.

3. ML pipeline

Deploying and maintaining ML pipelines is not dramatically different from deploying and maintaining general software. What differs is the ML knowledge required to build, tune, test, verify and version the model, as well as to monitor it.

The key steps to successfully build, deploy and maintain an ML pipeline are:

  • Define a product problem and determine a fit for ML.
  • Refine datasets.
  • Know and isolate data issues versus model drawbacks.
  • Test, debug and version your models.

Using off-the-shelf software can be an incredibly effective way to reduce the cost and dependency on highly skilled and specialized ML engineers, but be careful of unintentionally creating a disorganized spaghetti solution suite that is difficult to maintain.

While the industry is nascent, tools like Databricks, AWS SageMaker, Tecton, Cortex and others will save time and resources. As for platforms and libraries, there are many solutions on the market: TensorFlow, PyTorch, Keras, Scikit-learn, Pandas, NLTK, etc.

4. Metrics and evaluation

The key challenge around ML is reliability. How can you be sure your model is performing adequately before it’s deployed? How do you monitor production performance and troubleshoot issues? The solution is pretty similar to software engineering: observability.

It’s critical to monitor and track application performance. Hadass recommends “Building Machine Learning Powered Applications” by Emmanuel Ameisen to understand how to do so.

A model that performs better than a baseline (where there is no ML) and is both stable and secure should be good enough to take to production. As a guiding principle, I would favor iteration over perfection.
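To make the baseline comparison concrete, here is a minimal sketch. It assumes a classification problem, uses "always predict the most common label" as the no-ML baseline, and treats a lift of five percentage points as the bar; all three choices, along with the function names, are assumptions for illustration.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline(train_labels, n):
    """No-ML baseline: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n


def beats_baseline(model_predictions, labels, train_labels, min_lift=0.05):
    """Return (passes, model accuracy, baseline accuracy).

    min_lift is a hypothetical threshold for how much better than the
    baseline the model must be before it is worth shipping.
    """
    base = accuracy(majority_baseline(train_labels, len(labels)), labels)
    model = accuracy(model_predictions, labels)
    return model - base >= min_lift, model, base


if __name__ == "__main__":
    train_labels = [0, 0, 0, 1]
    labels = [0, 0, 1, 1, 0]
    model_predictions = [0, 0, 1, 1, 1]
    print(beats_baseline(model_predictions, labels, train_labels))
```

In practice the baseline is often the existing rule or manual process the model would replace, and the success measure should be one the product team cares about.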

Rolling out models under a feature flag is a safe approach because it lets you turn a model off quickly before disaster hits. Running multiple versions of the model in production via A/B testing will drastically increase confidence in the new model and raise the overall level of reliability.
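A feature-flagged rollout with an A/B split can be sketched in a few lines. This example assumes users are identified by a string ID and assigned to one of 100 buckets by hashing; the names `bucket` and `choose_model` and the labels "current" and "candidate" are hypothetical, and a real system would usually rely on a feature-flag service rather than hand-rolled code.

```python
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket number in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def choose_model(user_id: str, flag_enabled: bool, rollout_percent: int) -> str:
    """Decide which model version serves this user.

    flag_enabled acts as the kill switch: when it is off, every user
    gets the current model regardless of the rollout percentage.
    """
    if not flag_enabled:
        return "current"
    # The same user always lands in the same bucket, so their
    # experience stays consistent for the duration of the test.
    if bucket(user_id) < rollout_percent:
        return "candidate"
    return "current"


if __name__ == "__main__":
    for uid in ["user-1", "user-2", "user-3"]:
        print(uid, choose_model(uid, flag_enabled=True, rollout_percent=10))
```

Hashing the user ID rather than picking at random keeps assignments stable across requests, which makes the comparison between the two versions easier to trust.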

A good dataset is a must. It should be meticulously created and should reflect production scenarios. Build a system that allows you to backtest against historical datasets and compare the results with predictions made by previous versions of the model.
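As a rough illustration of comparing model versions against historical data, the sketch below replays a small labeled history through two versions and reports the accuracy of each. The `backtest` function, the toy history and the two threshold-based "models" are all made up for this example.

```python
def backtest(history, models):
    """Score each model version against the same historical records.

    history: list of (features, actual_outcome) pairs from production.
    models: dict mapping a version name to a predict function.
    Returns a dict mapping each version name to its accuracy.
    """
    if not history:
        raise ValueError("history must not be empty")
    results = {}
    for name, predict in models.items():
        correct = sum(predict(x) == y for x, y in history)
        results[name] = correct / len(history)
    return results


if __name__ == "__main__":
    # Toy history: a single numeric feature and a binary outcome.
    history = [(1, 0), (3, 0), (6, 1), (8, 1), (5, 1)]
    models = {
        "v1": lambda x: int(x > 6),   # previous version
        "v2": lambda x: int(x >= 5),  # candidate version
    }
    print(backtest(history, models))  # {'v1': 0.6, 'v2': 1.0}
```

Keeping the historical dataset fixed while versions change is what makes the numbers comparable from one release to the next.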

You need metrics and evaluation to address concerns around good models versus bad, such as:

  • Usefulness to end user.
  • Data security.
  • Stability of the model.
  • Practicality of the predictions and recommendations.
  • Ability to explain why a model made the recommendation it did.

5. Common pitfalls

On first read, some of these pitfalls may seem like common sense, but they are worth both reiterating and reflecting on since they can help guide your team to making the best decision during a critical moment.

Don’t:

  • Apply ML to problems that aren’t a good fit for ML, like a straightforward sequence of steps.
  • Expect instant results. Impactful ML takes patience and iteration.
  • Focus on model success metrics without enough attention to product success metrics.
  • Underestimate tooling and infrastructure costs, which leads to slow engineering progress.

Within the last decade, ML has established itself as a technology accelerator. It’s critical in driving automation, bottom-line profitability and growth. That makes it essential for leaders to understand and embrace ML and to keep up with the lightning-speed advances in ML technology.

Integrating ML teams effectively into the business starts with an understanding of what makes the right candidate and how to structure the team for maximum velocity and focus.

Leaders should focus on guiding the team to build end-to-end models with integrated observability and monitoring before the models hit production. Evaluate models based on product success, not model success. Avoid common pitfalls under high-stress situations by being intentional about monitoring for them and proactively engaging industry experts and academics to help keep the team up to date on the latest developments.
