Reference

How matches work

For every funding round we ingest, we try three match paths in precedence order. The first hit wins — you get at most one webhook per (your_user_id, funding_round_id) pair, even when multiple paths fire.

The three paths

Path | Precedence | Match key | When it fires
---- | ---------- | --------- | -------------
account_domain | 1 (highest) | crm_accounts.domain ↔ funding website | Your account's domain (e.g. vellum.ai) matches the website on Vellum's funding round.
email_domain | 2 | crm_contacts.email ↔ funding website | A contact's work email domain matches the funded company's domain. Free-mail domains (gmail, yahoo, etc.) are excluded.
account_name | 3 | crm_accounts.name ↔ funding company name | Fallback for accounts where you don't have a domain populated. Normalized exact match.
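
As a rough illustration of "first hit wins", the sketch below walks the three paths in precedence order and stops at the first one that returns a match. The path names come from the table above; the `first_matching_path` helper and the zero-argument matcher callables are hypothetical stand-ins, not the service's actual code.

```python
from typing import Callable, Dict, Optional, Tuple

# Path names and their order are taken from the table above.
PATHS_IN_PRECEDENCE_ORDER = ["account_domain", "email_domain", "account_name"]

# Hypothetical shape: each matcher returns the matched CRM external_id, or None.
Matcher = Callable[[], Optional[str]]


def first_matching_path(matchers: Dict[str, Matcher]) -> Optional[Tuple[str, str]]:
    """Return (path, matched_external_id) for the highest-precedence hit."""
    for path in PATHS_IN_PRECEDENCE_ORDER:
        matched_id = matchers[path]()
        if matched_id is not None:
            # Lower-precedence paths are not consulted once a path hits.
            return path, matched_id
    return None


# Both email_domain and account_name would hit; email_domain wins.
result = first_matching_path({
    "account_domain": lambda: None,
    "email_domain": lambda: "003C",
    "account_name": lambda: "0014x",
})
```

Because the loop returns on the first hit, at most one (path, id) pair is produced per funding round, which is what keeps the webhook count at one even when several paths would match.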

Normalization rules

Before comparing, both sides go through normalization:

Domain normalization

  • Strip https:// / http:// prefix
  • Strip path (/about), query (?utm=x), fragment (#section)
  • Strip leading www.
  • Lowercase everything
  • e.g. https://www.Vellum.AI/about → vellum.ai
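
The rules above can be sketched as a small function. This is a minimal illustration that assumes only the four listed steps; `normalize_domain` is a hypothetical name, and the sketch lowercases first (so a prefix like `HTTPS://` is also caught), which gives the same result as the listed order.

```python
def normalize_domain(value: str) -> str:
    """Apply the listed steps: scheme, path/query/fragment, leading www., case."""
    domain = value.lower()

    # Strip an https:// or http:// prefix.
    for prefix in ("https://", "http://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
            break

    # Strip path (/about), query (?utm=x), and fragment (#section).
    for separator in ("/", "?", "#"):
        domain = domain.split(separator, 1)[0]

    # Strip a leading "www." only; other subdomains are left untouched here.
    if domain.startswith("www."):
        domain = domain[len("www."):]

    return domain


normalized = normalize_domain("https://www.Vellum.AI/about")
```

With the example from the list, `normalized` is `"vellum.ai"`.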

Email domain extraction

  • Take everything after @
  • Apply the domain normalization rules above
  • e.g. Sarah@Vellum.AI → vellum.ai
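
A minimal sketch of the extraction step, assuming a well-formed address with a single @. The `email_domain` and `_normalize_domain` names are hypothetical; the inner helper is a compact restatement of the domain normalization rules so that the block runs on its own.

```python
from typing import Optional


def _normalize_domain(value: str) -> str:
    """Compact version of the domain rules: scheme, path/query/fragment, www., case."""
    domain = value.lower()
    for prefix in ("https://", "http://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
            break
    for separator in ("/", "?", "#"):
        domain = domain.split(separator, 1)[0]
    return domain[len("www."):] if domain.startswith("www.") else domain


def email_domain(email: str) -> Optional[str]:
    """Take everything after '@', then normalize it like any other domain."""
    _, at_sign, domain = email.rpartition("@")
    if not at_sign or not domain:
        # No '@' or nothing after it: treated here as "no domain" (an assumption).
        return None
    return _normalize_domain(domain)


extracted = email_domain("Sarah@Vellum.AI")
```

For the example above, `extracted` is `"vellum.ai"`. How malformed addresses are actually handled is not stated in this reference; returning `None` is just one reasonable choice.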

Name normalization (for account_name path)

  • Strip common legal suffixes: Inc, LLC, Corp, Ltd, Co, Holdings, Group, LP, etc.
  • Lowercase
  • Remove non-alphanumeric characters (except spaces)
  • Collapse whitespace
  • e.g. "Vellum AI, Inc." → "vellum ai"
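
One way to sketch these steps is shown below. Several details are assumptions: the suffix set contains only the suffixes named above (the real list ends in "etc."), suffixes are removed as trailing tokens after punctuation has been stripped, only ASCII letters and digits are kept, and a name that consists solely of a suffix word is left as-is. `normalize_name` is a hypothetical helper name.

```python
import re

# Only the suffixes named in the list above; the full production list is not given.
LEGAL_SUFFIXES = {"inc", "llc", "corp", "ltd", "co", "holdings", "group", "lp"}


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, strip trailing legal suffixes."""
    # Lowercase, then remove everything that is not a letter, digit, or whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", "", name.lower())

    # split() with no arguments collapses any run of whitespace.
    tokens = cleaned.split()

    # Drop trailing suffix tokens, but never reduce the name to nothing.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()

    return " ".join(tokens)


account_side = normalize_name("Vellum AI, Inc.")
round_side = normalize_name("Vellum AI")
```

Both `account_side` and `round_side` come out as `"vellum ai"`, so a normalized exact comparison between them succeeds.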

Free-mail domain exclusions

The email_domain match path skips these domains entirely — otherwise every funding round at "gmail.com" would notify every Gmail user in every CRM:

gmail.com, googlemail.com, yahoo.com, hotmail.com, outlook.com, live.com, aol.com, icloud.com, me.com, mac.com, protonmail.com, proton.me, pm.me, mail.com, gmx.com, zoho.com, fastmail.com

Contacts with personal email still match via their account_external_id linkage — see the Contacts page.
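
The exclusion can be sketched as a set lookup on the email's domain. The domain list is copied from above; the `eligible_for_email_domain_path` helper is a hypothetical name, and treating an address without an @ as ineligible is an assumption.

```python
# Free-mail domains listed in this section.
FREE_MAIL_DOMAINS = frozenset({
    "gmail.com", "googlemail.com", "yahoo.com", "hotmail.com", "outlook.com",
    "live.com", "aol.com", "icloud.com", "me.com", "mac.com", "protonmail.com",
    "proton.me", "pm.me", "mail.com", "gmx.com", "zoho.com", "fastmail.com",
})


def eligible_for_email_domain_path(email: str) -> bool:
    """Return False for free-mail addresses so they skip the email_domain path."""
    _, at_sign, domain = email.rpartition("@")
    if not at_sign or not domain:
        return False  # Assumption: malformed addresses never match by email.
    return domain.lower() not in FREE_MAIL_DOMAINS


work_ok = eligible_for_email_domain_path("sarah@vellum.ai")
personal_ok = eligible_for_email_domain_path("jane@gmail.com")
```

Here `work_ok` is `True` and `personal_ok` is `False`, so only Sarah's address would be considered for the email_domain path.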

Worked examples

Example 1: Domain match (cleanest case)

CRM account:

{ "external_id": "0014x", "name": "Vellum AI", "domain": "vellum.ai" }

Funding round arrives:

{ "company_name": "Vellum AI", "website": "https://vellum.ai" }

→ Match path: account_domain. Webhook fires with matched.account_external_id = "0014x".

Example 2: Contact email match

CRM contact:

{ "external_id": "003C", "email": "sarah@vellum.ai", "account_external_id": null }

Funding round arrives:

{ "company_name": "Vellum AI", "website": "https://vellum.ai" }

→ Match path: email_domain. Webhook fires with matched.contact_external_id = "003C".

Example 3: Personal-email contact on a named account

CRM account:

{ "external_id": "0014x", "name": "Vellum AI", "domain": "vellum.ai" }

CRM contact (personal email):

{ "external_id": "003D", "email": "jane@gmail.com", "account_external_id": "0014x" }

Funding round arrives:

{ "company_name": "Vellum AI", "website": "https://vellum.ai" }

→ Match path: account_domain (wins by precedence). Webhook fires with matched.account_external_id = "0014x". Jane is captured because her account_external_id ties her to Vellum — your downstream system looks up all contacts on this account.
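
On the receiving side, the account-to-contacts lookup for this example might look like the sketch below. The nested payload shape (a `matched` object with `account_external_id`) is inferred from the dotted field name used above and is an assumption, as are the in-memory contact list and the `contacts_for_account` helper; a real integration would query its own CRM instead.

```python
from typing import Dict, List, Optional

# Hypothetical local copy of CRM contacts, using the fields shown in the examples.
crm_contacts: List[Dict[str, Optional[str]]] = [
    {"external_id": "003D", "email": "jane@gmail.com", "account_external_id": "0014x"},
    {"external_id": "003C", "email": "sarah@vellum.ai", "account_external_id": None},
]

# Assumed webhook payload shape for an account_domain match.
webhook_payload = {"matched": {"account_external_id": "0014x"}}


def contacts_for_account(contacts, account_external_id):
    """Return every contact linked to the matched account, personal email or not."""
    return [c for c in contacts if c.get("account_external_id") == account_external_id]


matched_account = webhook_payload["matched"]["account_external_id"]
linked = contacts_for_account(crm_contacts, matched_account)
linked_ids = [c["external_id"] for c in linked]
```

In this setup `linked_ids` is `["003D"]`: Jane is picked up through her account linkage even though her gmail.com address is excluded from the email_domain path.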

Example 4: Domain missing, name match saves the day

CRM account (no domain populated):

{ "external_id": "0014x", "name": "Vellum AI, Inc." }

Funding round arrives:

{ "company_name": "Vellum AI", "website": "https://vellum.ai" }

→ Match path: account_name. Normalized account name ("vellum ai") matches the normalized round name.

What we do NOT match on

  • First/last name. Common names cause too many false positives.
  • Phone numbers. Rarely tied to a company identity for B2B.
  • Industry. Way too coarse — we'd match every "AI" company to every "AI" contact.
  • Subdomains. support.vellum.ai normalizes to vellum.ai for matching, but if the funding round's website is at vellum-ai-research.com, that's a different domain.

Performance notes

Each path is a single indexed Postgres query. Match latency is dominated by the webhook-firing step (single HTTP POST per match, 5-second timeout) — typically 100-500ms total per match. We process the last 24 hours of funding rounds on every dispatch tick (every minute), with unique(user_id, funding_round_id) dedup at the row level.
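
The row-level dedup can be illustrated with a unique constraint plus an insert that ignores conflicts. The sketch uses Python's built-in sqlite3 as a stand-in for Postgres so that it runs anywhere; the `webhook_deliveries` table, its columns, and the `claim_delivery` helper are hypothetical. In Postgres the comparable statement would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE webhook_deliveries (
        user_id          TEXT NOT NULL,
        funding_round_id TEXT NOT NULL,
        match_path       TEXT NOT NULL,
        UNIQUE (user_id, funding_round_id)
    )
    """
)


def claim_delivery(user_id: str, funding_round_id: str, match_path: str) -> bool:
    """Insert a delivery row; return True only if this call created it."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO webhook_deliveries VALUES (?, ?, ?)",
        (user_id, funding_round_id, match_path),
    )
    # rowcount is 1 for a new row and 0 when the unique constraint ignored the insert.
    return cursor.rowcount == 1


first_tick = claim_delivery("user_1", "round_42", "account_domain")
second_tick = claim_delivery("user_1", "round_42", "email_domain")
```

`first_tick` is `True` and `second_tick` is `False`. Since each tick re-scans the last 24 hours of rounds, the same round is seen many times, and a constraint like this is what would stop a repeat webhook for the same (user_id, funding_round_id) pair.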