Email Health KPIs, Mailchimp

Metrics type: Supporting Metrics • Category: Email Marketing

At a glance

A composite health score for the Mailchimp programme, weighted across six sub-metrics: delivery rate, open rate (MPP-adjusted), click-to-open rate, conversion rate, bounce rate (inverted), and unsubscribe rate (inverted). Designed for an executive read, so that “is the email programme healthy” can be answered with a single number rather than six. Scores below 60 indicate something needs attention; scores above 80 indicate a well-tuned programme; scores between 60 and 80 are the normal operating range. The composite is more robust than any single sub-metric to distortion from Apple Mail Privacy Protection and to short-term campaign-level noise.


What it counts	A weighted composite: `0.20 × delivery_rate_score + 0.15 × open_rate_score + 0.20 × ctor_score + 0.20 × conversion_rate_score + 0.15 × bounce_rate_score + 0.10 × unsubscribe_rate_score`. Each sub-metric is converted to a 0-100 scale where 100 is the industry top decile and 0 is the industry bottom decile; the composite is the weighted average.
Why composite, not single metric	Email programme health depends on multiple dimensions simultaneously. A high open rate with a low click-to-open rate is misleading (eyeballs without engagement); a high click rate with a high unsubscribe rate is also misleading (the message is getting through but burning the audience). The composite catches both failure modes in one number.
Sub-metric: delivery rate (20% weight)	`(emails_delivered ÷ emails_sent) × 100`. Emails accepted by the recipient mail server, post-bounce-handling. Healthy programmes run 97-99 percent. Below 95 percent indicates list-quality issues (high invalid-address rate) or sender-reputation issues.
Sub-metric: open rate (15% weight)	`(unique_opens ÷ delivered) × 100`. Adjusted for Apple Mail Privacy Protection inflation by subtracting an estimated 10-25 percentage points based on the audience’s Apple Mail share (configurable; default assumes 30 percent of the audience uses Apple Mail). The MPP adjustment makes this metric comparable to pre-MPP open rates. Healthy programmes run 22-32 percent post-MPP-adjustment.
Sub-metric: click-to-open rate (20% weight)	`(unique_clicks ÷ unique_opens) × 100`. The most MPP-resilient engagement metric because both numerator and denominator inflate together. Healthy programmes run 12-18 percent. CTOR isolates content effectiveness from open-rate confounds.
Sub-metric: conversion rate (20% weight)	`(orders_attributed_to_email ÷ unique_recipients_reached) × 100`. The commercial outcome dimension; the joint-highest-weight sub-metric (tied with delivery rate and CTOR) because revenue is the goal. Healthy programmes run 1.5-3.5 percent depending on category.
Sub-metric: bounce rate (inverted, 15% weight)	`((hard_bounces + soft_bounces) ÷ emails_sent) × 100`, scored inversely (lower is better). Healthy programmes run under 2 percent; over 5 percent triggers ESP-level reputation alerts.
Sub-metric: unsubscribe rate (inverted, 10% weight)	`(unsubscribes ÷ delivered) × 100`, scored inversely. Healthy programmes run under 0.5 percent per send; over 1 percent indicates list fatigue, content-audience mismatch, or send-frequency issues.
Industry benchmarks	The 0-100 sub-metric scoring uses Mailchimp’s published industry benchmarks (refreshed annually) by category: e-commerce, retail, services, B2B, etc. Vortex IQ uses the merchant’s industry vertical from the Mailchimp account configuration; if the vertical is unset or generic, the e-commerce benchmark is used as default.
Currency	n/a, this is a composite score. The currency-denominated impact surfaces in `mai_revenue_per_recipient`.
Time window	`30D vsP` (30-day rolling average vs prior 30-day period). The composite smooths short-term noise; the 30-day window is appropriate.
Alert trigger	`score < 60` or `drop > 10 points vsP`. Either condition fires the alert; both indicate degradation worth investigating.
Sentiment key	`mc_email_health_score`
Roles	owner, marketing
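
The 0-100 conversion described above can be sketched as follows. This is a minimal illustration, not the production scoring: it assumes linear interpolation between the bottom-decile and top-decile anchors (the source does not specify the curve), and the anchor values passed in the example are hypothetical rather than Mailchimp's published benchmarks.

```python
def rate(numerator: float, denominator: float) -> float:
    """Return a percentage, or 0.0 when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else 0.0


def sub_score(value: float, bottom_decile: float, top_decile: float) -> float:
    """Map a raw percentage onto 0-100 (linear interpolation is an assumption).

    For inverted metrics (bounce, unsubscribe) pass a bottom_decile that is
    numerically higher than top_decile; the same formula then rewards lower
    raw values with higher scores.
    """
    if bottom_decile == top_decile:
        raise ValueError("decile anchors must differ")
    scaled = 100.0 * (value - bottom_decile) / (top_decile - bottom_decile)
    return max(0.0, min(100.0, scaled))  # clamp to the 0-100 scale


# Bounce rate sums hard and soft bounces before dividing by emails sent.
bounce_rate = rate(numerator=150 + 250, denominator=20_000)

# Hypothetical anchors: 5% bounce = bottom decile, 0% bounce = top decile.
bounce_score = sub_score(bounce_rate, bottom_decile=5.0, top_decile=0.0)
print(bounce_rate, bounce_score)
```

Passing the anchors in as arguments keeps the benchmark band (which varies by industry vertical) separate from the scoring logic.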

Calculation

Calculated automatically from your Mailchimp data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
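
As a rough sketch of the calculation, the weighted average and the alert rule from the At a glance table could be written as below. The weights and thresholds come from that table; the function names and dictionary keys are illustrative assumptions, not the product's internal identifiers.

```python
# Weights from the "What it counts" formula; they sum to 1.0.
WEIGHTS = {
    "delivery_rate": 0.20,
    "open_rate": 0.15,
    "ctor": 0.20,
    "conversion_rate": 0.20,
    "bounce_rate": 0.15,       # sub-score already inverted
    "unsubscribe_rate": 0.10,  # sub-score already inverted
}


def composite_score(sub_scores: dict[str, float]) -> float:
    """Weighted average of the six 0-100 sub-scores."""
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise KeyError(f"missing sub-scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


def should_alert(current: float, prior: float) -> bool:
    """Alert when score < 60 or when it drops more than 10 points vs prior."""
    return current < 60 or (prior - current) > 10
```

Here `current` and `prior` stand for the 30-day composite and the composite for the prior 30-day period.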

Worked example

A UK-based beauty brand on Shopify running Mailchimp Standard with a 95,000-contact main Audience. Snapshot for the 30-day window ending 15 May 26.

Sub-metric	Raw value	Industry benchmark band	Sub-score (0-100)	Weight	Contribution
Delivery rate	98.4%	97-99% (in band)	78	0.20	15.6
Open rate (MPP-adjusted)	27.8%	22-32% (in band)	72	0.15	10.8
Click-to-open rate	14.2%	12-18% (in band)	65	0.20	13.0
Conversion rate	1.4%	1.5-3.5% (below band)	38	0.20	7.6
Bounce rate (inverted)	1.6%	<2% (in band)	82	0.15	12.3
Unsubscribe rate (inverted)	0.8%	<0.5% (above band)	22	0.10	2.2
Composite Email Health Score					61.5
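
The arithmetic in the table can be checked with a few lines of Python. The sub-scores and weights are copied from the rows above; the variable names are illustrative.

```python
# (sub-metric, sub-score, weight) copied from the worked-example table.
rows = [
    ("Delivery rate", 78, 0.20),
    ("Open rate (MPP-adjusted)", 72, 0.15),
    ("Click-to-open rate", 65, 0.20),
    ("Conversion rate", 38, 0.20),
    ("Bounce rate (inverted)", 82, 0.15),
    ("Unsubscribe rate (inverted)", 22, 0.10),
]

# Contribution = sub-score × weight; the composite is the sum of contributions.
contributions = {name: round(score * weight, 1) for name, score, weight in rows}
composite = round(sum(score * weight for _, score, weight in rows), 1)

for name, value in contributions.items():
    print(f"{name:<30}{value:>6.1f}")
print(f"{'Composite Email Health Score':<30}{composite:>6.1f}")  # 61.5
```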

What the per-sub-metric view is telling us:

The composite score of 61.5 sits at the low end of the normal operating range (60-80): not red, but not strong. The decomposition points to conversion rate and unsubscribe rate as the two drag factors. The headline figure tells the right story; the sub-metric decomposition points to the fix.
Conversion rate at 1.4 percent is the load-bearing problem. It is below the e-commerce benchmark band (1.5-3.5 percent) and carries a 0.20 weight, so a 0.5 percentage point lift would add roughly 6 points to the composite. Investigate: (a) are the right segments receiving the right campaigns? Use mc_segments_overview to check segment-to-campaign matching; (b) is the offer mix appropriate? Beauty brands often over-rely on percentage-discount offers when free-gift-with-purchase or first-purchase-bundle offers convert better; (c) is the cart-and-checkout flow working for email-driven traffic? Pair with Shopify’s email-attributed conversion rate for the commerce-side view.
Unsubscribe rate at 0.8 percent is above the healthy band and carries a 0.10 weight, so a reduction to 0.4 percent would add ~3-4 points. Investigate: (a) is send frequency too high? 8+ sends/month in beauty often triggers fatigue; consider a frequency cap survey or auto-pause for users showing fatigue signals; (b) are the unsubscribes concentrated in specific campaigns? The mc_top_campaigns_revenue card combined with unsubscribe attribution surfaces which sends are burning the list; (c) is the audience-acquisition channel mix shifting? Paid-social-acquired subscribers churn at 2-3x the rate of organic-search-acquired subscribers.
Open rate post-MPP at 27.8 percent is healthy. This is the MPP-adjusted figure (raw open rate is likely 38-42 percent before adjustment). The MPP adjustment is what makes this metric trustworthy for executive reporting; reporting raw open rate would tell a misleadingly positive story about engagement.
CTOR at 14.2 percent is in band. This is the most MPP-resilient engagement signal and the most reliable indicator of content effectiveness. A 14.2 percent CTOR means roughly 14 percent of openers clicked something; healthy beauty programmes typically run 12-18 percent. The current programme is doing decent content-engagement work.
Delivery rate at 98.4 percent and bounce rate at 1.6 percent are both healthy. Sender reputation is in good standing; deliverability infrastructure is working. The problems concentrate in conversion (commercial) and unsubscribe (audience-fatigue), not in deliverability.
Recommended action priorities, ordered by score impact: (1) lift conversion rate from 1.4% to 1.8% via offer-mix and segment-matching work (the biggest single improvement available, ~6 points to the composite); (2) reduce unsubscribe rate from 0.8% to 0.5% via frequency-cap testing and fatigue-aware send-pacing (the second-biggest improvement, ~3-4 points); (3) maintain delivery and bounce performance, since both are healthy and degradation here would compound the conversion issues.
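
The score-impact estimates above follow from the weighting: a sub-metric's contribution to the composite changes by its weight times the change in its sub-score. The sketch below makes that relationship explicit. How a change in the raw rate maps to a change in the sub-score depends on the benchmark scaling, which this example does not model; the function names are illustrative.

```python
def required_sub_score_gain(weight: float, composite_gain_target: float) -> float:
    """Sub-score points needed to add a given number of composite points."""
    return composite_gain_target / weight


def headroom(weight: float, sub_score: float) -> float:
    """Composite points still available if the sub-score rose to 100."""
    return weight * (100.0 - sub_score)


# ~6 composite points from conversion (weight 0.20) needs a 30-point sub-score lift.
print(round(required_sub_score_gain(0.20, 6.0), 1))  # 30.0

# Maximum composite upside per drag factor in the worked example.
print(round(headroom(0.20, 38), 1))  # conversion: 12.4
print(round(headroom(0.10, 22), 1))  # unsubscribe: 7.8
```

Conversion has the larger headroom because it combines a low sub-score with a 0.20 weight, which is why it ranks first in the priorities.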

The diagnostic flow when this card flags amber (composite < 60 or drop > 10 points):

Identify the dragging sub-metric. The composite always has a load-bearing weak point; surface it first.
Cross-reference with mai_engagement_funnel for the funnel decomposition: where in Sent → Delivered → Opened → Clicked → Converted is the funnel breaking.
Check mc_alert_deliverability_drop and mc_alert_sender_reputation. Deliverability incidents cascade through the entire composite; fixing them recovers multiple sub-scores at once.
Pair with mai_revenue_per_recipient for the commercial check. A degrading composite with stable RPR suggests the score is over-reacting to non-revenue-affecting changes; a degrading composite with degrading RPR confirms the programme is genuinely under-performing.
For multi-Audience accounts, check audience-level scores. Different Audiences may score differently; a single low-scoring Audience can drag the blended figure even when most Audiences are healthy.
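
The first and fourth steps of this flow can be sketched in code. Ranking by weighted shortfall, `weight × (100 − sub-score)`, is one reasonable way to define "dragging" and is an assumption here, as is the 5 percent tolerance used to call revenue per recipient (RPR) "stable".

```python
def rank_drag(sub_scores: dict[str, float], weights: dict[str, float]):
    """Rank sub-metrics by weighted shortfall, largest drag first."""
    shortfalls = {
        name: weights[name] * (100.0 - score) for name, score in sub_scores.items()
    }
    return sorted(shortfalls.items(), key=lambda item: item[1], reverse=True)


def commercial_check(composite_change: float, rpr_change_pct: float,
                     tolerance_pct: float = 5.0) -> str:
    """Cross-check a composite move against RPR (tolerance is hypothetical)."""
    if composite_change >= 0:
        return "composite not degrading"
    if rpr_change_pct < -tolerance_pct:
        return "genuine under-performance"
    return "possible over-reaction to non-revenue changes"


# Sub-scores and weights from the worked example.
weights = {"delivery_rate": 0.20, "open_rate": 0.15, "ctor": 0.20,
           "conversion_rate": 0.20, "bounce_rate": 0.15, "unsubscribe_rate": 0.10}
sub_scores = {"delivery_rate": 78, "open_rate": 72, "ctor": 65,
              "conversion_rate": 38, "bounce_rate": 82, "unsubscribe_rate": 22}

ranked = rank_drag(sub_scores, weights)
print([name for name, _ in ranked[:2]])
```

On the worked-example data the top two entries are conversion rate and unsubscribe rate, matching the drag factors identified in the decomposition.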

The rapid-response playbook for marketing leadership:

Time horizon	Action
First 1 hour after alert	Identify the dragging sub-metric. The composite without decomposition is not actionable.
First 4 hours	Pair with revenue-per-recipient to confirm the score reflects genuine programme degradation rather than benchmark drift.
First 24 hours	Build a 30-day improvement plan focused on the top 1-2 dragging sub-metrics. Avoid trying to fix everything; concentrate effort.
First week	Implement the highest-impact changes; measure 7-day-rolling composite score for early signal.

Sibling cards merchants should reference together

Card	Why merchants reach for it
`mai_engagement_funnel`	The Sent → Delivered → Opened → Clicked → Converted funnel decomposition. Where the composite is breaking shows up here first.
`mai_revenue_per_recipient`	The revenue-efficiency pair to this engagement-health composite. Pair to size whether engagement issues are translating to revenue issues.
`mai_delivery_rate`	Sub-metric 1 (delivery), 20% weight. Drill-down for delivery-specific investigation.
`mc_open_rate`	Sub-metric 2 (open rate), 15% weight. Drill-down for open-rate-specific investigation.
`mc_click_to_open_rate`	Sub-metric 3 (CTOR), 20% weight. The most MPP-resilient sub-metric.
`mc_conversion_rate`	Sub-metric 4 (conversion), 20% weight. The commercial outcome dimension.
`mai_bounce_rate`	Sub-metric 5 (bounce, inverted), 15% weight. Drill-down for sender-reputation investigation.
`mai_unsubscribe_rate`	Sub-metric 6 (unsubscribe, inverted), 10% weight. Drill-down for list-fatigue investigation.
`mc_alert_deliverability_drop`	Deliverability incident alert. Cascades through delivery, open, and CTOR sub-scores.
`mc_alert_sender_reputation`	Sender reputation degradation alert. Same cascade pattern.
`mc_alert_bounce_spike`	Bounce-volume spike alert. Often the first sign of list-quality or sender-reputation problems.
`mc_alert_abuse_spike`	Spam-complaint spike alert. Drives unsubscribe-rate and deliverability sub-scores down simultaneously.
`mc_audience_growth_rate`	The audience-growth pair to this composite. Programme can run healthy on a shrinking list, but revenue erodes; both metrics need to be monitored together.
Klaviyo `klv_email_health_kpis`	The Klaviyo parallel for brands running both or evaluating ESP migration.
Brevo `brv_email_health_kpis`	The Brevo (Sendinblue) parallel for brands running both.

Reconciling against the vendor’s own dashboard

Where to look in Mailchimp’s own dashboard:

Mailchimp → Reports → All campaigns for the per-campaign sub-metric values that aggregate into the composite. Each campaign report has Open rate, Click rate, CTOR, bounce rate, unsubscribe rate, and revenue figures; aggregating these manually approximates the composite calculation.
Mailchimp → Audience → All contacts → Engagement for the audience-level engagement breakdown.
Mailchimp → Account → Pricing for the contact-tier billing context that affects how the audience-size dimension feeds into programme economics.

Why the Vortex IQ composite score may legitimately differ from any single Mailchimp dashboard number: The Vortex IQ composite is a derived metric not surfaced directly by Mailchimp; the dashboard shows individual sub-metrics. Reconciliation is therefore not “Mailchimp UI shows X, Vortex IQ shows Y” but rather “do the sub-metric values that feed the composite match Mailchimp’s UI”. Common reconciliation gaps:

Reason	Direction	What to do
MPP adjustment. Vortex IQ subtracts an estimated MPP-inflation from open rate; Mailchimp’s UI shows the raw (MPP-inflated) figure.	Vortex IQ open rate lower	Use the raw open-rate field for direct UI comparison; the MPP-adjusted figure is for trend analysis.
Industry benchmark band. The 0-100 sub-scoring uses Mailchimp’s published e-commerce benchmarks; if the merchant’s Mailchimp account has a non-e-commerce vertical configured, the benchmark differs.	Either direction	Confirm the vertical setting in Mailchimp Account → Settings → Industry.
Window alignment. The composite uses 30-day rolling; per-campaign Mailchimp reports use the campaign’s send-and-attribution window.	Either direction	Aggregate Mailchimp per-campaign reports across the 30-day window to compare apples-to-apples.
Conversion attribution window. Mailchimp UI uses the configured account window (24 hours default); the composite reflects whatever the API returns.	Either direction	Confirm the account-level window setting.
Refresh lag. Composite recalculates every 6 hours; Mailchimp UI updates within 30 minutes of new send activity.	Vortex IQ lags Mailchimp during active campaign periods	Wait for the next refresh; check `last_synced_at`.
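
For the window-alignment row, a manual comparison needs per-campaign reports pooled across the 30-day window rather than a simple average of per-campaign rates, which over-weights small sends. The sketch below assumes the reports have already been filtered to the window; the `CampaignReport` fields are hypothetical names, not Mailchimp's API schema, and summing per-campaign unique opens only approximates a window-level figure.

```python
from dataclasses import dataclass


@dataclass
class CampaignReport:
    """Per-campaign counts (hypothetical field names)."""
    emails_sent: int
    delivered: int
    unique_opens: int
    unique_clicks: int
    unsubscribes: int


def pooled_rates(reports: list[CampaignReport]) -> dict[str, float]:
    """Sum counts across campaigns first, then compute each rate once."""
    def pct(numerator: int, denominator: int) -> float:
        return 100.0 * numerator / denominator if denominator else 0.0

    sent = sum(r.emails_sent for r in reports)
    delivered = sum(r.delivered for r in reports)
    opens = sum(r.unique_opens for r in reports)
    clicks = sum(r.unique_clicks for r in reports)
    unsubs = sum(r.unsubscribes for r in reports)
    return {
        "delivery_rate": pct(delivered, sent),
        "open_rate": pct(opens, delivered),  # raw, before any MPP adjustment
        "ctor": pct(clicks, opens),
        "unsubscribe_rate": pct(unsubs, delivered),
    }


reports = [
    CampaignReport(1_000, 1_000, 500, 50, 10),   # small send, 50% open rate
    CampaignReport(9_000, 9_000, 1_800, 360, 9), # large send, 20% open rate
]
rates = pooled_rates(reports)
print(rates["open_rate"])  # pooled 23.0, versus 35.0 for a simple mean of rates
```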

Cross-connector reconciliation:

Comparison	Expected relationship	When divergence is legitimate
`mai_email_health_kpis` ↔ Klaviyo `klv_email_health_kpis`	Comparable composite scores using same weighting and benchmark methodology	Different ESPs have different default attribution windows, different MPP adjustment defaults, and different industry benchmark sources. Brands running both should expect 5-15 point gaps that reflect ESP methodology rather than programme health differences.
`mai_email_health_kpis` ↔ Internal NPS or customer-satisfaction scores	Should correlate positively over time	Email programme health is a leading indicator of customer relationship health; declining composite often precedes NPS decline by 30-90 days. The relationship is correlation not causation.
`mai_email_health_kpis` ↔ Klaviyo `klv_email_health_kpis` ↔ Brevo `brv_email_health_kpis`	Multi-ESP brands can compare programme health across ESPs	The composite is consistently calculated; the ranking surfaces which ESP-and-programme combination is performing best for the merchant.

Quick rule for support tickets: the composite score is a derived metric and not directly comparable to a single Mailchimp UI number. When merchants ask “why does Mailchimp show 30 percent open rate but your card score is only 65”, explain that the score weights six sub-metrics together; healthy open rate alone cannot lift the composite when other sub-metrics are weak. Pointing the merchant at the sub-metric decomposition in the worked example surfaces what is actually dragging the score down.

Known limitations / merchant FAQs

My score dropped 15 points overnight. What happened?
A 15-point drop in a 30-day-rolling composite is large; it usually concentrates in 1-2 sub-metrics rather than degrading uniformly. Possibilities, in order of likelihood: (1) Deliverability incident: a sudden bounce-rate spike, sender-reputation drop, or blocklist appearance can cascade through delivery, open, CTOR, and conversion sub-scores simultaneously. Check mc_alert_deliverability_drop. (2) List import: a major list import added many low-engagement or invalid addresses, raising bounce rate and depressing engagement metrics for the next 30 days. (3) Send-frequency change: increasing send cadence often raises unsubscribe rate sharply, dragging the unsubscribe sub-score down. (4) Conversion tracking break: the Mailchimp-Shopify (or other) e-commerce integration disconnected, dropping conversion rate to zero. (5) Industry benchmark refresh: Mailchimp updates published benchmarks annually; a benchmark refresh can shift sub-scores even when the underlying performance is unchanged.

Why does the composite weight conversion rate at only 20 percent? Isn’t revenue the goal?
Conversion rate is the joint-highest-weighted sub-metric (tied with delivery and CTOR at 20 percent). Revenue is the goal, but the composite is designed to reflect programme health, not programme outcome. Conversion rate alone is a noisy single-period number that can move 30-50 percent week-over-week based on offer mix, holiday calendar, and product launches. The composite is more robust because the engagement sub-metrics (delivery, open, CTOR, bounce, unsubscribe) are slower-moving and reflect underlying programme quality. Brands wanting a pure-revenue view should pair this composite with mai_revenue_per_recipient for the commercial outcome dimension.

My score is 75 but my CEO says email contribution to revenue is flat. Are these consistent?
Possibly yes. A healthy 75 composite score reflects “the email programme is well-tuned for its current size and configuration”; revenue contribution depends on scale (audience size × per-recipient revenue) and share of total (email’s portion of total acquisition mix). A small audience can run an excellently tuned programme (high composite) but contribute small absolute revenue. The combination to look at is composite + audience growth + revenue-per-recipient: a healthy composite with stagnant audience and stable RPR predicts flat email contribution.

The MPP adjustment looks aggressive. Can I disable it?
Yes; the mpp_adjustment_pct configuration parameter sets the assumed inflation. The default assumes 30 percent of the audience uses Apple Mail, scaled to a 10-25 percentage-point open-rate adjustment. Brands that know their audience’s Apple Mail share precisely (via account-level analysis or third-party tools) can override the default. Setting mpp_adjustment_pct = 0 returns raw open rates, which makes year-over-year comparison difficult (pre-September-2021 data is genuinely from a non-MPP world; post-September-2021 data is MPP-inflated). The default adjustment is a pragmatic compromise.

Should I optimise for the composite score or for revenue?
Optimise for revenue; track the composite for health monitoring. The composite is a leading indicator of programme sustainability; revenue is the lagging indicator of programme effectiveness. Brands optimising for the composite alone often over-engineer engagement (cute subject lines, fewer emails, light offers) and under-deliver revenue. Brands optimising for revenue alone often burn the audience (heavy promotion, high frequency, manipulative subject lines) and degrade the long-term composite. Tracking the two together guards against either bias.

What’s a good target composite score for my industry?
Top decile across e-commerce verticals is typically 80+; healthy normal operating is 65-79; warning is 50-64; critical is below 50. The targets vary by industry: B2B brands typically run lower composites (50-70 healthy) because the funnel is longer and conversion rates are structurally lower; consumer impulse-buy verticals (fashion, beauty, food) often run higher (70-85 healthy). The Mailchimp account’s vertical setting drives the benchmark band the sub-scores are normalised against.

Why is unsubscribe rate weighted only 10 percent?
Unsubscribe is the most volatile sub-metric (a single bad send can spike unsubscribe rate dramatically), and the spike often does not reflect long-term programme health. The 10 percent weight prevents short-term noise from dominating the composite while still rewarding programmes that maintain healthy unsubscribe rates over time. Brands experiencing sustained unsubscribe spikes (3+ months) will see the composite move meaningfully because the sustained pattern flows through; brands experiencing single-send unsubscribe spikes will see only minor composite movement, which is the correct behaviour.

Can Vortex IQ change my Mailchimp send cadence or content?
No; Vortex IQ is read-only by design. It surfaces health composite trends, identifies dragging sub-metrics, and flags optimisation opportunities; the merchant’s marketing team executes inside Mailchimp. The Vortex Mind Customer Recovery Opportunity report generates merchant-side Actions when composite degradation patterns suggest specific fixes, but the configuration changes themselves sit with the merchant.

Is the composite score comparable across ESPs?
Approximately. The same 6-sub-metric weighted-composite calculation runs across Mailchimp, Klaviyo, Brevo, Omnisend, and other email connectors. The weights and methodology are consistent; the underlying ESPs have different default attribution windows, MPP-handling defaults, and benchmark sources, which create 5-15 point methodological gaps. For evaluating ESP migration, compare composites alongside the underlying sub-metrics; rely on the sub-metrics for direct apples-to-apples comparison and the composite for the executive-summary read.
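
The target bands quoted in the FAQ for e-commerce verticals can be expressed as a small lookup. This is an illustrative sketch only: the band labels are paraphrased from the FAQ answer, the function name is hypothetical, and other verticals (B2B, consumer impulse-buy) would need different cut-offs. These bands are separate from the alert rule, which uses its own threshold of 60.

```python
def health_band(score: float) -> str:
    """Classify a composite score using the FAQ's e-commerce target bands."""
    if not 0 <= score <= 100:
        raise ValueError("composite score must be between 0 and 100")
    if score >= 80:
        return "top decile"
    if score >= 65:
        return "healthy"
    if score >= 50:
        return "warning"
    return "critical"


for example in (85, 75, 61.5, 42):
    print(example, health_band(example))
```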

Tracked live in Vortex IQ Nerve Centre

Email Health KPIs is one of hundreds of KPI pulses Vortex IQ tracks across Mailchimp and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.
