5th Mar 2020

Should tech VCs be investing in machine learning drug discovery startups?

Read this analysis to understand why it’s an atypical investment

Last year MMC invested £100m into technology companies (1% of all UK VC investment in 2019) with a significant focus on applied machine learning startups. Applying ML to the hard problem of drug discovery seems a natural fit, but a closer look reveals a more complicated and nuanced dynamic.

ML drug discovery startups have the potential to create huge financial and societal value yet they remain an atypical investment for tech and biotech VCs

ML drug discovery startups are an atypical investment because they do not fit the traditional investment frameworks of either tech investors or life science investors. They require most investors to move outside their comfort zone and invest in a business with unusual value inflection points, an unfamiliar business model and uncertain timelines for revenue generation.

Tech VCs invest in software startups that can achieve a series of important milestones in a short space of time i.e. build a product, prove product-market fit and scale rapidly. They invest at inflection points which, they believe, prove the startup can scale to $100m+ ARR in 5+ years.

On the other hand, life science VCs invest in biotech startups that have an asset with an associated data package supporting its efficacy and safety. Their investment, depending on stage, will be used to reach a specific milestone, for example funding a clinical trial or achieving FDA approval.

In the short term, ML drug discovery startups offer neither of these investor groups what they are used to. Firstly, true ‘product-market’ fit for the technology takes several years to prove because validation ultimately comes from successful clinical trials. Tech VCs must be comfortable investing in a technology without true insight into whether it works. Secondly, life science VCs must be comfortable backing a ‘platform+product’ play, i.e. the quality of the new discovery process, rather than just individual assets. These companies sit at the intersection of machine learning and medicinal chemistry and need this to be reflected in their investors as well as their teams.

Value accretion is slow, uncertain and often binary in nature

‘How do we know that this works?’ The intersection of machine learning and medicinal chemistry is a specialist area and validation of the technology is difficult. Early stage startups will show in silico (validated in software) drug assets they have discovered, but these are low value until they have been shown to be effective in vitro (validated in the lab) or, ultimately, in vivo (validated in animals or humans). In many respects these companies remain ‘seed-stage’ for several years because moving an asset from a computer model to a compound that works in animals or humans takes time.

The initial value inflection points include retrospective ‘discovery’ of assets, securing partnerships with respected academics / pharma and recruiting high quality drug discovery specialists. In the medium term value accrual builds as assets successfully progress from the pre-clinical to clinical stage. Early validation can be achieved if an asset is acquired. However, this value accrual is fragile until a drug has successfully completed a Phase II trial (demonstrating efficacy in humans) and truly validated the technology. Ultimate validation is achieved when multiple assets successfully complete Phase II.

A technology platform that can discover new drugs more rapidly has a clear value. Outsize value creation is derived from discovery of high value drugs. A technology that can make incremental improvements to existing drug classes is less valuable than one that can generate novel drug classes. Huge value accrual depends on the drug class, disease indication and novelty as much as the platform technology itself.

Hence, value accrual is slow, uncertain and often binary in nature. The value is split between the tech platform, i.e. its ability to repeatedly discover therapeutic assets, and the assets themselves. The potential value of platform and assets is huge but investors will need to fund startups, often without repeatable revenue or true insight into whether the technology works, through to the clinical stage in order to capture outsize returns.

The economics incentivise startups to move along the spectrum from pure software to integrated biotech

A typical SaaS license fee model is unlikely to achieve a venture scale outcome and startups need to either secure royalties on the assets they discover or become integrated biotechs and take full ownership. Investors must be comfortable with this business model.

The SaaS-only model will not achieve a venture scale outcome because the contribution of any individual software tool is difficult to quantify in a complex value chain. Value accrues to the asset, and individual tools are too far removed from it, so the value of SaaS license fees is capped. There are too few pharma / biotech companies to compensate for this through volume.
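The volume problem can be made concrete with some rough arithmetic. The sketch below takes the $100m ARR benchmark that tech VCs look for and the six to seven figure ACVs that tool providers can command, and asks how many paying customers a pure SaaS business would need. The specific ACV values and the function name are illustrative assumptions, as is treating ACVs in dollars; this is not a model of any real company.

```python
import math

# Benchmark quoted in this article for what tech VCs want to see.
TARGET_ARR = 100_000_000


def customers_needed(acv: float, target_arr: float = TARGET_ARR) -> int:
    """Paying customers required to reach target_arr at a given annual contract value."""
    if acv <= 0:
        raise ValueError("acv must be positive")
    return math.ceil(target_arr / acv)


# Illustrative ACVs spanning the six to seven figure range (assumed values).
for acv in (100_000, 1_000_000, 5_000_000):
    print(f"ACV ${acv:,}: {customers_needed(acv):,} customers needed")
```

Even at a seven figure ACV the business would need on the order of tens to a hundred large customers, each paying for a single tool in a long discovery process, which is why the license fee model alone is unlikely to reach venture scale.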

Asset ownership, or royalties on the asset, is the key mechanism for value capture. This requires either a partnership model with Pharma, with the startup receiving royalties in exchange for access to its platform, or the evolution of the tech startup into an integrated biotech company with the proprietary development of assets. The partnership model is attractive to tech investors because funding requirements are lower and clinical risk is mitigated. However, royalty agreements are notoriously difficult to extract from PharmaCos. Integrated biotechs offer the greatest upside but funding requirements are significant because wet-lab development is capital intensive.

Startups typically begin by providing a service to incumbent PharmaCos to help them solve a specific pain point in either target identification or medicinal chemistry. The best startups leverage these relationships to validate their approach, generate early revenue and attract investment. If the technology is valuable enough, incumbent pharma has demonstrated an appetite to enter strategic collaboration agreements with startups. Some startups use this as a launch pad to create proprietary in-house drug discovery programmes, capturing more value by moving along the spectrum to own the economics and the IP through to the clinical stage.

Stage 1: Service provider to Pharma

  • Proposition: Provide a tool or suite of tools to enable a specific point in the value chain e.g. bioactivity prediction.
  • Model: B2B SaaS +/- professional services
  • Strengths: can command six to seven figure ACVs. SaaS contracts with payment upfront
  • Challenges: Competing against open-source ML models and tools developed in house. Value creation is capped because multiple tools are used in a complex process. No control over economics or IP of assets

Stage 2: Strategic collaboration with Pharma

  • Proposition: Provide access to a proprietary technology platform and collaborate with pharma / biotech to solve a specific problem, e.g. target identification, in a specific disease.
  • Model: Typically an upfront payment to access the tech platform and collaborate + milestone payments and / or royalties for drugs discovered.
  • Strengths: The best companies command significant upfront payments and milestone & royalty agreements. Collaboration with top tier pharma provides early stage validation. Access to proprietary datasets and domain expertise. Minimal in-house wet lab development
  • Challenges: Do not control the economics and royalty agreements are difficult to secure.

Stage 2 example: Atomwise

  • Founded: 2012
  • Funding & Investors: $51m – Monsanto Growth Ventures, B Capital, Khosla, Y Combinator, Baidu Ventures, Tencent
  • Approach: using ML to assess bioactivity of small molecules to enable lead optimisation. Partnering with Pharma to solve medicinal chemistry problems.
  • Summary: A pure in silico approach which relies on the ability to command royalty payments on a portfolio of small molecules to accrue significant financial value.

Stage 3: Vertically integrated biotech company

  • Proposition: Develop assets from in silico hypothesis to pre-clinical / Phase I / Phase II (depending on strategy)
  • Model: A ‘next generation’ biotech centred around a novel technology platform
  • Strengths: Full control of IP and economics
  • Challenges: Capital requirements: significant investment in wet lab facilities (either in house or outsourced). Portfolio strategy: requires development of multiple assets to reduce risk
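The portfolio point can be illustrated with a small calculation. The sketch below assumes each asset succeeds or fails independently with the same probability, and asks how likely it is that at least one asset in the portfolio succeeds. Both the independence assumption and the 10% per-asset success rate are hypothetical choices made for illustration, not figures from this analysis; in practice, assets that come from the same platform may well be correlated.

```python
def p_at_least_one_success(p_single: float, n_assets: int) -> float:
    """Probability that at least one of n independent assets succeeds.

    Assumes every asset has the same success probability p_single and that
    outcomes are independent (a simplifying assumption).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_assets < 0:
        raise ValueError("n_assets must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_assets


# Hypothetical per-asset success rate, chosen only to show the shape of the curve.
P_SINGLE = 0.10

for n in (1, 3, 5, 10):
    print(f"{n:>2} assets: {p_at_least_one_success(P_SINGLE, n):.0%} chance of at least one success")
```

Under these assumptions a single asset is close to a binary bet, while a portfolio of ten raises the chance of at least one success to roughly two in three. Each additional asset also adds wet lab cost, which is the capital requirement noted above.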

Stage 3 examples: Benevolent AI

  • Founded: 2013
  • Funding & Investors: $292m — Temasek, Goldman Sachs, Woodford
  • Approach: knowledge graph to uncover insights from a diverse set of existing datasets. Initially focused on public data, now working in collaboration with Big Pharma to apply their approach to proprietary datasets. Use their own wet labs to develop assets.
  • Summary: applying knowledge graphs to curated, high quality data sets can yield high quality results. Using public data is more challenging because the quality is variable, many results are not reproducible, and it is difficult to control the variables or to generalise beyond the scope of the original experiment.

Stage 3 examples: Recursion Pharmaceuticals

  • Founded: 2013
  • Funding & Investors: $221m — Scottish Mortgage Investment Trust, Lux Capital, Data Collective, Bill & Melinda Gates Foundation
  • Approach: Computer vision enabled cellular imaging at scale to assess the impact of molecule libraries on disease processes. ML enabled analysis of results generated by robotic experiment platforms provides a tight feedback loop. Use their own wet labs to develop assets.
  • Summary: Cellular imaging of disease process is not itself novel but the use of ML to scale and industrialise the process could be transformative.

To date the most advanced startups (typically founded 2012–2015) have drugs in Phase II clinical trials.

Select tech VCs have taken the lead in backing these early companies

Tech investors have taken the lead so far (A16Z in Insitro, Lux and Data Collective in Recursion, B Capital and Khosla in Atomwise), reflecting their expertise in machine learning and the focus on the tech platform in these nascent startups. They are typically large funds willing to deploy significant capital into companies where the efficacy of the technology is not fully known. Most have a thesis on the intersection of machine learning and biology (A16Z) or are familiar with investing in hardware+software propositions (e.g. Lux). They are backing the expert team, the novelty of the approach and early proof points, including retrospective validation of the tech and partnerships with leading academics. Biotech investors have been less active in the space and are more likely to invest later in the life of the company or to fund specific assets that can be spun out.

Alignment between investor and founder is critical. Founders need to be clear about where on the spectrum they want their company to be active and what this means for the business model and future capital requirements. This space remains atypical for tech VCs, and the number of funds with both the expertise and the capital to support these startups remains relatively small. Investors need to go in with their eyes open about the journey they are supporting and whether it aligns with their fund strategy.
