<b>How to actually measure whether hreflang is working</b>
The hardest part isn't implementation — it's verification. "Did it work?" has no single metric, and the obvious ones mislead. Here's the measurement framework we use.
Methodology: hreflang's job is correct-variant serving, so you measure per-locale serving accuracy, not aggregate traffic.
Signals that genuinely indicate success:
— Search Console, segmented by country: for each target country, the impressions/clicks should concentrate on the matching locale URL. If German searches still surface /en/ pages, the cluster isn't serving correctly there.
— The Performance report filtered to a single query that people search in several of your target countries: check which page receives the impressions in each country. Correct hreflang produces clean per-country URL separation.
— Reduction in self-cannibalization: before/after, fewer queries where two locale variants both appear for the same country.
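A minimal sketch of the first and third signals, assuming you have exported Performance rows as dicts with `query`, `country`, `page`, and `impressions`, and that the locale is the first path segment (/de/, /en/). The `COUNTRY_TO_LOCALE` mapping, the row layout, and the function names are all hypothetical; adjust the country codes to whatever your export uses.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical mapping: country code -> locale path segment expected to serve it.
COUNTRY_TO_LOCALE = {"DE": "de", "US": "en", "FR": "fr"}

def locale_of(url):
    """Return the first path segment, e.g. 'de' for https://example.com/de/pricing."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[0] if parts else ""

def serving_accuracy(rows):
    """Per country: share of impressions that landed on the matching locale URL."""
    matched = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        country = row["country"]
        if country not in COUNTRY_TO_LOCALE:
            continue  # not a target country, skip it
        total[country] += row["impressions"]
        if locale_of(row["page"]) == COUNTRY_TO_LOCALE[country]:
            matched[country] += row["impressions"]
    return {c: matched[c] / total[c] for c in total if total[c] > 0}

def cannibalized_queries(rows):
    """(country, query) pairs where more than one locale variant got impressions."""
    seen = defaultdict(set)
    for row in rows:
        if row["impressions"] > 0:
            seen[(row["country"], row["query"])].add(locale_of(row["page"]))
    return sorted(key for key, locales in seen.items() if len(locales) > 1)
```

Run both functions on a before and an after export: accuracy per country should rise and the cannibalized list should shrink. A single low-volume country can swing a lot, so read the numbers across the whole set.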
What does NOT indicate success or failure:
— Total site clicks. Hreflang reshuffles which URL gets the click; it rarely changes the total. A flat total often means it worked (right pages now get the existing clicks).
— Average position. Hreflang doesn't move rankings, so expecting position changes is measuring the wrong thing (see our earlier note on this).
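To make the "flat total" point concrete, here is a small sketch with made-up numbers (clicks from one country, before and after). The URLs, click counts, and the `click_summary` helper are hypothetical, purely for illustration.

```python
def click_summary(clicks_by_url, expected_prefix):
    """Return (total clicks, share of clicks on URLs under the expected locale prefix)."""
    total = sum(clicks_by_url.values())
    on_target = sum(
        clicks for url, clicks in clicks_by_url.items()
        if url.startswith(expected_prefix)
    )
    return total, (on_target / total if total else 0.0)

# Hypothetical clicks from German searchers, before and after the hreflang fix.
before = {"/en/pricing": 70, "/de/pricing": 30}
after = {"/en/pricing": 10, "/de/pricing": 90}

print(click_summary(before, "/de/"))  # same total ...
print(click_summary(after, "/de/"))   # ... but a very different on-target share
```

The total is identical in both snapshots; only the share landing on the /de/ URL moves, which is the quantity worth tracking.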
The error-side check: the "no return tags" and invalid language/region code warnings should trend to zero. That is necessary but not sufficient: zero errors means the plumbing is valid, not that serving is optimal.
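A "no return tags" condition means page A lists page B as an alternate but B does not list A back. A rough sketch of that check, assuming you have already crawled your pages into a dict of `{page_url: {hreflang_code: target_url}}` (this input shape and the function name are hypothetical, not any tool's output format):

```python
def missing_return_tags(annotations):
    """Return sorted (source, target) pairs where target does not link back to source."""
    missing = []
    for source, alternates in annotations.items():
        for target in alternates.values():
            if target == source:
                continue  # self-reference, nothing to reciprocate
            back_links = annotations.get(target, {}).values()
            if source not in back_links:
                missing.append((source, target))
    return sorted(missing)
```

An empty result only confirms that the annotations are reciprocal; it says nothing about which URL is actually served in each country.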
Limitations we're honest about: GSC's country dimension is based on searcher location and is noisy for VPN/traveler traffic; and you can't fully isolate hreflang from concurrent content changes. Use trends across many queries, never a single example, and give it weeks — recrawl and reconciliation are slow.
Hreflang Lab
@HreflangLab
This post was published in the Hreflang Lab Telegram channel. You can subscribe at @HreflangLab.