At a glance
The composite Lighthouse Performance Score (0-100) for the merchant’s site, weighted across Largest Contentful Paint, Total Blocking Time, Cumulative Layout Shift, First Contentful Paint, and Speed Index. It is the single number that summarises whether the site is fast enough, designed for an executive read so that “is the site healthy?” can be answered at a glance rather than from five separate metric charts. Scores below 50 indicate the site is structurally slow; 50-89 is the warning band; 90+ is the green band that benchmark-leading sites operate in. Performance score correlates with conversion rate: industry data puts the effect of a 50 → 90 score improvement at a 2-7 percent conversion-rate uplift across ecommerce verticals.

| Field | Detail |
|---|---|
| What it counts | A weighted composite of five Lighthouse lab metrics: First Contentful Paint (10% weight), Speed Index (10%), Largest Contentful Paint (25%), Total Blocking Time (30%), Cumulative Layout Shift (25%). Each metric is converted to a 0-100 sub-score using Lighthouse’s published scoring curves; the composite is the weighted sum. |
| Sample type | Lab data (not field). Lighthouse runs a synthetic page load in a controlled environment (slow 4G throttling on the mobile profile, a much faster simulated connection with no CPU slowdown on the desktop profile) and measures the result. Field data from real-user CrUX measurement is surfaced separately in crux_lcp_p75 and siblings. |
| Device profile | Defaults to mobile (slow 4G, mid-tier hardware emulation). Desktop scores typically run 15-30 points higher than mobile for the same site. The psi_mobile_vs_desktop_score card surfaces both side-by-side. |
| Form factor / locale | The Lighthouse run uses desktop Chrome emulating a mobile device, with a US English locale by default. Brands operating multi-locale sites should run per-locale audits (configurable via the integration’s site URL list); the score may vary by 5-10 points across locales because translated content differs in size. |
| Sample size threshold | n/a for lab data: every Lighthouse run produces a score. Field data (CrUX-sourced) follows the 75th-percentile rule: a metric needs at least 75 percent of real users meeting the “good” threshold to score green. Sites with low traffic may not have enough data for a CrUX p75 calculation, in which case the field metrics return null and the lab score is the only signal available. |
| Score band interpretation | 0-49 (red): structurally slow, will visibly hurt user experience and conversion. 50-89 (orange): within range but not optimal; specific sub-metrics likely failing. 90-100 (green): benchmark-leading; further optimisation has diminishing returns. The 50/90 thresholds are Google-defined boundaries, not Vortex IQ thresholds. |
| Refresh cadence | Lab audits are scheduled per integration: hourly for high-priority sites (sandboxes and monitored production), daily for standard production sites. Each audit run takes 15-90 seconds depending on site complexity and is rate-limited at 4 runs/sec per Google API key. |
| Currency | n/a; this is a unitless 0-100 score. |
| Time window | T/7D/30D vsP (today’s value, 7-day rolling, 30-day rolling, vs prior period). |
| Alert trigger | score < 50 (red band), or drop > 10 points vsP (significant regression even within healthy band). |
| Sentiment key | psi_perf_score |
| Roles | owner, marketing, operations |
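
The weighting, banding, and alert rule in the table above can be sketched as follows. This is a minimal illustration, not Lighthouse’s or Vortex IQ’s implementation: it assumes the five sub-scores (0-100) have already been derived from Lighthouse’s scoring curves, and the function names and example sub-scores are hypothetical.

```python
# Weights in percent, as listed in the "What it counts" row.
WEIGHTS = {"fcp": 10, "si": 10, "lcp": 25, "tbt": 30, "cls": 25}


def composite_score(sub_scores: dict) -> float:
    """Weighted sum of 0-100 sub-scores; the weights total 100 percent."""
    return sum(WEIGHTS[m] * sub_scores[m] for m in WEIGHTS) / 100


def score_band(score: float) -> str:
    """Map a 0-100 score to the Google-defined colour bands."""
    if score < 50:
        return "red"
    if score < 90:
        return "orange"
    return "green"


def should_alert(current: float, prior: float) -> bool:
    """Alert on the red band, or on a drop of more than 10 points vs prior period."""
    return current < 50 or (prior - current) > 10


# Hypothetical sub-scores, for illustration only.
example = {"fcp": 80, "si": 70, "lcp": 40, "tbt": 30, "cls": 60}
score = composite_score(example)
print(score, score_band(score), should_alert(score, prior=55))
```

Because TBT carries the largest weight, a weak TBT sub-score pulls the composite down more than an equally weak FCP or Speed Index sub-score would.
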
Calculation
Calculated automatically from your Website Performance (PageSpeed + CrUX) data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
Worked example
A UK-based BigCommerce fashion store running a Stencil theme with a 78-product catalogue, 6 hero collections, and a Klaviyo email-capture popup configured to fire on second pageview. Snapshot for Wednesday 15 May 26.

| Page template | Mobile score | Desktop score | Composite drivers |
|---|---|---|---|
| Homepage | 42 | 76 | LCP 4.8s (hero video carousel), TBT 480ms (third-party widgets), CLS 0.18 (no aspect ratios on banner) |
| Product detail page | 58 | 82 | LCP 3.4s (high-res hero image), TBT 280ms, CLS 0.12 (review widget loading late) |
| Collection / category | 48 | 79 | LCP 3.9s (12-image grid), TBT 410ms (filter widget JS) |
| Cart | 62 | 88 | LCP 2.8s, TBT 220ms, CLS 0.04 |
| Checkout step 1 | 71 | 92 | Lighter, fewer third-party scripts loaded |
| Site weighted average | 51 | 81 | Mobile pulled down by homepage; desktop healthy |
- The weighted mobile score of 51 sits just above the red/orange boundary of 50, and the desktop score of 81 is comfortably in the orange band. The pattern is typical for Stencil-themed BigCommerce sites with rich hero content: the homepage is the single biggest drag because it loads the heaviest media (video carousels, large hero images, third-party widgets), while transactional pages further down the funnel are progressively lighter and faster.
- The homepage at 42 is the load-bearing problem. Three drivers: (a) the hero video carousel forces a 4.8 second LCP, (b) third-party widgets (Klaviyo popup, chat widget, analytics tags) generate 480ms of total blocking time on mobile, (c) the banner area lacks aspect-ratio declarations, causing 0.18 cumulative layout shift as images load. Each driver has a clean fix: defer the video to post-load, load the Klaviyo popup with requestIdleCallback, and add explicit aspect-ratio CSS to the banner. Estimated combined uplift: 25-30 points to the homepage score, lifting the weighted average from 51 to roughly 60.
- Product detail pages at 58 represent the biggest commercial opportunity. PDPs are where the conversion happens; a 10-point lift here directly translates to conversion-rate improvement. Industry data puts the conversion lift from a 58 → 75 score improvement at roughly 3-4 percent on PDP traffic. For a merchant doing £200K monthly revenue with 30 percent of revenue from direct-to-PDP traffic, that is roughly £1,800-£2,400/month in incremental revenue from PDP performance work alone.
- Collection pages at 48 underperform mobile-fold expectations. The 12-image grid loads all 12 images on initial load on mobile, when only the top 3-4 are above the fold. Lazy-loading the below-fold images is a small markup change (adding the loading="lazy" HTML attribute) that typically lifts the collection-page mobile score by 8-15 points. Combined with deferring the filter-widget JS, the collection-page score can reach 65-70 in 2-3 days of focused work.
- The cart and checkout pages are healthy. No optimisation work is needed there; they already sit in the orange band on mobile and at or near green on desktop. Resist the temptation to touch transactional pages first because they “feel important”: the highest-leverage work is on the public-facing homepage and PDPs.
- Pre-launch readiness implication. For a BC store about to go live, a weighted mobile score of 51 is borderline acceptable for non-revenue-critical launches but inadequate for high-traffic launches (BFCM, brand campaign, paid-acquisition push). Recommended pre-launch threshold: weighted mobile score ≥ 60 and no individual page below 40. The current site fails the first condition (weighted mobile score of 51); the homepage at 42 clears the second only narrowly.
- Decompose by template. psi_score_by_template surfaces which page types drag the average. Optimisation effort concentrates where the worst scores live, weighted by traffic share.
- Decompose by metric. A score of 50 with a TBT driver looks different from a score of 50 with an LCP driver. The Lighthouse audit JSON breaks the score into per-metric contributions.
- Cross-reference with field data. crux_lcp_p75, crux_inp_p75, and crux_cls_p75 show what real users experience. Lab and field can diverge: lab measures synthetic conditions, field measures actual user devices and networks. The psi_field_vs_lab card surfaces the gap.
- Identify specific opportunities. psi_top_opportunities_ms and psi_top_opportunities_bytes rank optimisation actions by potential impact. Tackle the top 3 first; diminishing returns set in fast after the obvious wins.
- For BC-specific optimisations, audit the Stencil theme’s third-party scripts. Many themes ship with broad analytics/marketing tag managers that load many vendor scripts; the psi_third_party_cost card surfaces total third-party cost. Vendor consolidation often yields the largest single uplift.
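
The PDP revenue estimate in the worked example is simple arithmetic, sketched here with the figures quoted in the text (£200K monthly revenue, 30 percent from direct-to-PDP traffic, a 3-4 percent conversion lift). It assumes revenue scales linearly with conversion rate, which the text implies but does not state; the function name is hypothetical.

```python
def incremental_revenue(monthly_revenue: float, pdp_share: float, lift: float) -> float:
    """Extra monthly revenue if PDP-attributed revenue grows by `lift` (a fraction)."""
    return monthly_revenue * pdp_share * lift


low = incremental_revenue(200_000, 0.30, 0.03)
high = incremental_revenue(200_000, 0.30, 0.04)
print(f"£{low:,.0f} to £{high:,.0f} per month")  # £1,800 to £2,400 per month
```

The same function can be reused with a merchant’s own revenue and traffic split to size the opportunity before committing engineering time.
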
| Time horizon | Action |
|---|---|
| First 1 hour after alert | Decompose by template; identify the worst-performing page type. |
| First 4 hours | Identify the top 3 optimisation opportunities (Lighthouse audit JSON). |
| First 24 hours | Implement the top 1-2 (typically image optimisation, render-blocking JS deferral). |
| First week | Re-run audit. Measure delta. Move to the next 2-3 opportunities if the headline number hasn’t moved enough. |
Sibling cards merchants should reference together
| Card | Why merchants reach for it |
|---|---|
| psi_cwv_pass_rate | The field-data complement to this lab-data score. Composite tells you “what synthetic measurement says”; CWV pass rate tells you “what real users experience”. Both should be read together. |
| crux_lcp_p75 | The most-recognised CWV field metric. The single biggest contributor to perceived load speed. |
| crux_inp_p75 | The newer CWV (replaced FID March 2024). Click-responsiveness signal. |
| crux_cls_p75 | Visual stability CWV. Composite-score impact via 25 percent weight. |
| psi_field_vs_lab | The lab-vs-field gap analysis. Critical for understanding whether the score reflects user reality. |
| psi_score_by_template | Per-template decomposition. Worst-performer identification. |
| psi_mobile_vs_desktop_score | Mobile/desktop split. Mobile is the priority for almost all ecommerce. |
| psi_top_opportunities_ms | Ranked optimisation actions. Where to concentrate work. |
| psi_top_opportunities_bytes | Same shape but byte-savings ranking. |
| psi_third_party_cost | Third-party script cost. Often the largest single drag on Stencil-themed BC sites. |
| psi_total_weight | Total page weight. Direct driver of LCP and TBT sub-scores. |
| psi_score_trend | Score over time. Drift detection. |
| crux_regression_timeline | Shows when field metrics got worse; useful for incident-correlation work. |
| psi_lab_lcp | Lab LCP, the joint second-largest contributor to the composite (25 percent, level with CLS). |
| psi_lab_tbt | Lab Total Blocking Time, the largest weighted contributor (30 percent). |
| psi_lab_cls | Lab Cumulative Layout Shift, a 25 percent contributor. |
| GSC gsc_mobile_usable_pages | Pairs with this card during pre-launch audit. Mobile usability and mobile performance are jointly required for go-live readiness. |
| GA4 ga_pageviews_per_session | Pages-per-session correlates with site speed; slow sites see lower depth-of-engagement. |
Reconciling against the vendor’s own dashboard
Where to look in PageSpeed Insights’ own dashboard:
- PageSpeed Insights: paste the merchant’s URL into the form. The headline mobile and desktop scores at the top of the report are the closest 1-to-1 comparison to this card. The “Performance” tab decomposes the composite into per-metric contributions.
- Lighthouse in Chrome DevTools: runs the same audit locally with the same scoring methodology. Useful when investigating a specific page in detail; gives access to the full audit JSON.
- Google Search Console → Core Web Vitals report: surfaces field-data CWV pass rates from the merchant’s verified property. Different from the lab score but directly related; brands seeing healthy lab scores but poor GSC CWV reports are typically running fast in lab conditions but failing under real-user device/network constraints.
| Reason | Direction | What to do |
|---|---|---|
| Run-to-run variance. Lighthouse scores fluctuate ±5-10 points between runs for the same URL on the same network conditions due to timing variations in JS execution, cache state, and CDN response. | Either direction | Vortex IQ surfaces a 7-day rolling average to smooth this; comparing a single Vortex IQ snapshot to a single fresh PSI run is unreliable. Use the trend view. |
| Throttling profile. Vortex IQ uses Google’s standard “Slow 4G + mid-tier mobile” profile by default; the PSI web UI uses the same profile. No reconciliation gap from this. The differences appear when merchants run Lighthouse locally with their own connection (typically faster), producing higher local scores. | n/a (default match) | Confirm the local run uses the “Mobile” emulated environment, not “Desktop” or unthrottled. |
| Refresh cadence. Vortex IQ runs scheduled audits (hourly for sandboxes, daily for production); PSI runs on-demand at the moment of request. A site that pushed a deploy 30 minutes ago shows the new score in PSI but the pre-deploy score in Vortex IQ until next refresh. | Vortex IQ lags for the most recent refresh window | Wait for next refresh; check last_synced_at. |
| URL set differences. Vortex IQ audits the configured URL list (typically homepage + top product/category pages from the manifest’s URL list); PSI on-demand runs whatever URL the user pastes. Comparing PSI on a single page to the Vortex IQ weighted average is not apples-to-apples. | Either direction | Use PSI on the same per-template URL the Vortex IQ audit ran on for direct comparison. |
| Plugin / extension noise. PSI runs in clean Chrome; Lighthouse runs in the user’s browser carry the user’s installed extensions, ad blockers, and dev tools. Local Lighthouse scores often run higher (extensions block third-party scripts) than the truthful “average user” environment. | Local higher | Run PSI in the web UI (matches Vortex IQ) rather than relying on local DevTools Lighthouse runs. |
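
To see why the table recommends the trend view over single-run comparisons, here is a minimal sketch of a trailing 7-day average. The daily scores are made up for illustration, and the function is not Vortex IQ’s actual smoothing code.

```python
def rolling_mean(scores: list, window: int = 7) -> list:
    """Trailing mean over up to `window` most recent values at each position."""
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1) : i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical daily lab scores that wobble with no underlying site change.
daily = [62, 55, 66, 58, 64, 57, 65]
smoothed = rolling_mean(daily)
print(max(daily) - min(daily))  # spread between single runs: 11
print(smoothed[-1])             # 7-day average on the last day: 61.0
```

A single fresh PSI run could land anywhere in that 11-point spread, while the averaged value moves only when the site itself changes.
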
| Comparison | Expected relationship | When divergence is legitimate |
|---|---|---|
| psi_perf_score_summary ↔ crux_lcp_p75 (and other CWV field metrics) | Lab and field directionally agree | Lab measures synthetic conditions; field measures real users on real devices on real networks. Sites with high lab scores but poor field CWV typically run on slower-than-emulated user devices (older Android handsets, rural mobile networks). The psi_field_vs_lab card surfaces the gap. |
| psi_perf_score_summary ↔ GSC Core Web Vitals report | GSC reflects field-data CWV pass rates | A lab score of 70 with GSC reporting “30 percent of pages failing CWV” is consistent (lab measures one synthetic run; GSC measures distribution across all real-user sessions for all indexed URLs). |
| psi_perf_score_summary ↔ GA4 conversion rate | Score and conversion rate correlate positively | Industry data: 50→90 score lift typically produces 2-7 percent conversion rate uplift. Brands measuring this directly via A/B testing during performance work confirm the relationship; the lift varies by vertical and traffic-source mix. |
Known limitations / merchant FAQs
My score dropped from 72 to 58 overnight. What happened?
Possibilities, in order of likelihood. (1) A new third-party script landed: a marketing tag, chat widget, A/B testing tool, or analytics tag was added (often via Google Tag Manager) and is generating significant TBT or CLS. Cross-reference psi_third_party_cost. (2) An image asset was uploaded unoptimised: a hero image saved at 4MB instead of the target 200KB destroys LCP. The psi_image_optimisation card surfaces oversized images. (3) A theme update changed the CSS or JS bundle: Stencil theme version bumps occasionally introduce render-blocking CSS or unused JavaScript. Check theme update history. (4) CDN configuration changed: cache miss rates went up, server response time degraded. Check psi_server_response. (5) Browser engine update: Chrome occasionally tweaks Lighthouse scoring; major version updates can shift scores by 3-7 points without underlying site change.
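
One way to narrow down the cause is to attribute the drop to individual metrics using the composite weights. The sketch below assumes you have the five 0-100 sub-scores from the audit before and after the drop; the sub-score values are invented so that the composites come to 72 and 58, and the function name is hypothetical.

```python
WEIGHTS = {"fcp": 10, "si": 10, "lcp": 25, "tbt": 30, "cls": 25}  # percent


def point_deltas(before: dict, after: dict) -> dict:
    """Composite points gained (+) or lost (-) per metric between two audits."""
    return {m: WEIGHTS[m] * (after[m] - before[m]) / 100 for m in WEIGHTS}


# Hypothetical sub-scores: composite 72 before, 58 after.
before = {"fcp": 90, "si": 80, "lcp": 70, "tbt": 60, "cls": 78}
after = {"fcp": 90, "si": 80, "lcp": 70, "tbt": 20, "cls": 70}

deltas = point_deltas(before, after)
worst = min(deltas, key=deltas.get)
print(worst, deltas[worst])  # tbt -12.0
```

In this made-up case 12 of the 14 lost points come from TBT, which would point towards cause (1), a newly added third-party script, rather than an image or CDN problem.
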
Should I aim for 100 or accept 70?
Depends on competitive context and traffic source mix. Lab scores above 90 require sustained engineering investment that often outweighs the marginal conversion benefit. Most ecommerce brands operate productively in the 60-80 band. Brands with heavy paid-search traffic (SEO competitors, ad-quality-score sensitive) should target 80+ to maintain ad rank advantages. Brands with primarily direct or branded traffic can operate comfortably at 60-70 with appropriate field-data CWV pass rates.
Why is mobile so much lower than desktop?
Two reasons. (1) Network throttling: Lighthouse’s mobile profile applies “Slow 4G” throttling (1.6 Mbps down, 750 Kbps up, 150ms RTT), which more than doubles load times compared with the desktop profile’s much faster connection. (2) CPU emulation: the mobile profile emulates a mid-tier Android handset (4x CPU slowdown), so JS execution and rendering work take roughly 4x longer. Mobile is the right priority for almost all ecommerce because most traffic is mobile, but the structural gap of 15-30 points vs desktop is normal and not a problem on its own.
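
A back-of-envelope calculation shows how much the 1.6 Mbps throttle matters for heavy assets. The sketch below is an idealised estimate only: it ignores latency, compression, and parallel requests, and uses the 4MB and 200KB hero-image sizes mentioned in the earlier FAQ answer. The function name is hypothetical.

```python
def transfer_seconds(size_bytes: int, throughput_mbps: float = 1.6) -> float:
    """Ideal download time: bits to send divided by link speed in bits per second."""
    return (size_bytes * 8) / (throughput_mbps * 1_000_000)


print(transfer_seconds(4_000_000))  # 4 MB unoptimised hero image: 20.0
print(transfer_seconds(200_000))    # 200 KB optimised target: 1.0
```

On the throttled mobile profile the unoptimised image alone costs around 20 seconds of transfer time, which is why a single oversized asset can dominate the mobile LCP.
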
Lab vs field: which one matters?
Both, for different decisions. Lab data is the right anchor for engineering work because it produces controlled, repeatable measurements you can re-run throughout development. Field data is the right anchor for ranking-impact decisions because it reflects what real users (and Google’s search algorithm) experience. Engineering teams optimise to lab; SEO and product leadership track field. The Vortex IQ pre-launch audit uses lab as the gate (controlled conditions) and post-launch monitoring uses field (real-world signal).
Why doesn’t PageSpeed give me a score for some pages?
Three causes: (1) The URL is non-public (behind auth, behind a CDN block list, or returning a non-200 status). PageSpeed needs an unauthenticated GET to render. (2) The URL redirects to a non-Lighthouse-eligible target (PDF download, login page, marketing landing tool). (3) The PageSpeed API rate limit was hit: 4 requests/sec per API key, 25,000/day. Bulk audits can saturate this; the engine queues the excess, and queued audits show as “pending” until they run.
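
The daily quota puts a ceiling on how many URLs one API key can cover. The sketch below assumes each audit of one URL on one device profile costs one API request, and that both mobile and desktop are audited; the function name is hypothetical.

```python
DAILY_QUOTA = 25_000  # requests per API key per day, as quoted above


def max_urls_per_key(runs_per_day: int, profiles: int = 2) -> int:
    """Largest URL list one key can cover when every URL is audited
    `runs_per_day` times on each device profile."""
    return DAILY_QUOTA // (runs_per_day * profiles)


print(max_urls_per_key(24))  # hourly cadence: 520
print(max_urls_per_key(1))   # daily cadence: 12500
```

Under these assumptions an hourly schedule exhausts a key at roughly 520 URLs, so large URL lists on hourly cadence are the likeliest source of “pending” audits.
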
My audit shows score 50 but my customers don’t complain. Should I still fix this?
Probably yes, but pace the work to commercial impact. A 50 score with no customer complaints typically means either (a) the customer base is on fast networks and good devices (paid-acquisition cohorts converting before bounce-friction triggers), or (b) the merchant doesn’t have systematic complaint capture. The right test is conversion-rate impact: A/B test a faster version of the homepage (or PDP) for 14-30 days; measure conversion lift. If lift is negligible, the audience is genuinely tolerant of current performance and effort can shift elsewhere. If lift is meaningful (1+ percent), the score gap is costing real revenue.
Can Vortex IQ change my theme code or asset configuration?
Read-only by design. Vortex IQ surfaces score patterns, identifies dragging metrics, and ranks optimisation opportunities by potential impact; the merchant’s developers or agency execute inside the BC theme editor and asset CDN. The Vortex Mind Pre-Launch Readiness report generates merchant-side Actions (specific image-optimisation tasks, JS-deferral changes, third-party script reviews) but the changes themselves sit with the merchant.
My score is fine on PageSpeed but Google Search Console shows my pages failing CWV. Why?
Lab vs field divergence. PageSpeed measures a single synthetic run on Google’s emulated mid-tier mobile profile; GSC measures the field-data distribution from real-user CrUX measurements across all your traffic. The divergence usually means real users are on slower devices or networks than the emulation profile assumes. Common causes: brand has high Android share with older devices; brand has international audience on weaker mobile networks; brand has third-party scripts that perform worse in real-user conditions than in synthetic runs (synthetic has fast cache, real users hit cold cache). The fix is the same as for low lab scores but the urgency is higher because GSC field-data is what affects search ranking.
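
The divergence is easy to reproduce on paper. The sketch below compares a single lab LCP with the 75th percentile of a set of real-user LCP samples against Google’s 2.5 second “good” threshold. All sample values are hypothetical, and the nearest-rank percentile here is a simplification, not CrUX’s own aggregation.

```python
import math

LCP_GOOD_MS = 2500  # Google's "good" LCP threshold (2.5 s)


def p75(values: list) -> float:
    """75th percentile by the nearest-rank method."""
    ordered = sorted(values)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


# Hypothetical real-user LCP samples (ms) with a slow tail of older devices.
field_lcp = [1800, 2100, 2300, 2400, 2600, 3900, 4200, 5100]
lab_lcp = 2300  # hypothetical single lab run

print(p75(field_lcp))                 # 3900
print(lab_lcp <= LCP_GOOD_MS)         # True: the lab run passes
print(p75(field_lcp) <= LCP_GOOD_MS)  # False: the field p75 fails
```

The lab run looks healthy because it matches the faster half of real users; the p75 is set by the slow tail, which the lab run never samples.
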
Is the Lighthouse score the same as Google’s PageSpeed Insights score?
Yes. PageSpeed Insights runs Lighthouse on Google’s servers and exposes the result. Same scoring methodology, same audit categories. Vortex IQ uses the PageSpeed Insights API (which runs Lighthouse on Google’s servers); local Chrome DevTools Lighthouse runs use the same Lighthouse binary with potentially different network/CPU conditions, which is why local scores can run higher than PSI scores.
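
For readers who want to reproduce a score outside the dashboard, here is a minimal sketch of building a PageSpeed Insights API v5 request and reading the performance score from the response. It is not Vortex IQ’s integration code; the helper names are hypothetical, the sample response is hand-written and heavily trimmed, and no network call is made.

```python
from typing import Optional
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url: str, strategy: str = "mobile", api_key: Optional[str] = None) -> str:
    """Build a PageSpeed Insights v5 request URL for the performance category."""
    params = {"url": page_url, "strategy": strategy, "category": "performance"}
    if api_key:
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def performance_score(psi_response: dict) -> int:
    """The API reports the category score as 0-1; scale it to 0-100."""
    raw = psi_response["lighthouseResult"]["categories"]["performance"]["score"]
    return round(raw * 100)


# Trimmed, hand-written stand-in for a real API response.
sample = {"lighthouseResult": {"categories": {"performance": {"score": 0.51}}}}
print(build_psi_url("https://example.com/"))
print(performance_score(sample))  # 51
```

Passing strategy="desktop" requests the desktop profile instead, which is the quickest way to check the mobile/desktop gap for a single URL.
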
Should I optimise for the score or for actual user experience?
For actual user experience, anchored on field-data CWV. The score is a useful heuristic and necessary for the Lighthouse audit to surface specific opportunities, but the goal is “real users have a fast experience”. Brands optimising purely for the lab score can hit 95+ while real users still suffer (typically by deferring critical content past first-paint, gaming the metric without improving the experience). The Vortex IQ approach pairs the score (composite indicator) with field-data CWV (truth source) and treats the field metrics as the canonical health signal.