<b>Escape these five characters or fail validation</b>
Raw ampersands and angle brackets in URLs break XML parsing. The entry, sometimes the whole file, is rejected.
These five characters must be entity-escaped inside <code><loc></code>:
— <code>&</code> → <code>&amp;</code>
— <code>'</code> → <code>&apos;</code>
— <code>"</code> → <code>&quot;</code>
— <code>></code> → <code>&gt;</code>
— <code><</code> → <code>&lt;</code>
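A minimal Python sketch of the five replacements above, assuming the sitemap is generated by a script. The helper name escape_loc and the example URL are hypothetical; xml.sax.saxutils.escape is from the standard library and handles the ampersand and angle brackets by default, so the two quote characters are passed in as extra entities.

```python
from xml.sax.saxutils import escape

# escape() covers &, < and > on its own; the two quote characters
# have to be supplied as additional replacements.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_loc(url: str) -> str:
    """Return url with all five characters entity-escaped (hypothetical helper)."""
    # The ampersand is replaced first inside escape(), so the entities
    # written for the other characters are not escaped a second time.
    return escape(url, EXTRA_ENTITIES)


print(escape_loc("https://example.com/view?id=1&name='a'"))
# https://example.com/view?id=1&amp;name=&apos;a&apos;
```

Run it once, on the raw URL. Passing an already escaped value through again would turn &amp; into &amp;amp;.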
Also required:
— ✅ Non-ASCII characters are encoded: punycode (IDN) for the hostname, percent-encoding (UTF-8) for the path and query.
— ✅ Absolute URLs only, including scheme and host.
— ✅ The <code><loc></code> value is shorter than 2,048 characters.
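The three requirements above can be sketched in Python with the standard library only. This is an illustration under assumptions, not a reference implementation: the function name normalize_loc, the restriction to http and https, and the choice to measure length before entity-escaping are all mine.

```python
from urllib.parse import quote, urlsplit, urlunsplit

MAX_LOC_LENGTH = 2048  # the value has to stay below this limit

# Reserved characters are left untouched; "%" is kept so that
# escapes already present in the URL are not encoded twice.
SAFE = "/%:@!$&'()*+,;=?"


def normalize_loc(url: str) -> str:
    """Return an ASCII-only absolute URL or raise ValueError (hypothetical helper)."""
    parts = urlsplit(url)
    # Assumption: only http and https URLs belong in the sitemap.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an absolute URL: {url!r}")

    # Hostname: IDN to punycode. Any user:password@ prefix is dropped here.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Path, query and fragment: percent-encode anything outside ASCII as UTF-8.
    result = urlunsplit((
        parts.scheme,
        host,
        quote(parts.path, safe=SAFE),
        quote(parts.query, safe=SAFE),
        quote(parts.fragment, safe=SAFE),
    ))

    # Assumption: length is measured on the encoded URL, before entity-escaping.
    if len(result) >= MAX_LOC_LENGTH:
        raise ValueError(f"URL is {len(result)} characters long")
    return result


print(normalize_loc("https://münchen.example/straße?q=é"))
# https://xn--mnchen-3ya.example/stra%C3%9Fe?q=%C3%A9
```

Entity-escaping still has to happen afterwards: this step leaves a raw & in the query string exactly as it found it.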
Checklist:
Step 1 — Run the file through an XML validator, not just a sitemap linter.
Step 2 — Grep for raw <code>&</code> in loc values — common in URLs with query strings.
Definition of done: the file parses as well-formed XML with zero entity errors.
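Both checklist steps can be approximated in one short Python script. This is a sketch under assumptions: the name check_sitemap is hypothetical, the well-formedness check uses the standard library parser rather than a dedicated validator, and the regex treats any & that does not start one of the five named entities or a numeric reference as raw.

```python
import re
import xml.etree.ElementTree as ET

# An "&" that does not begin a named or numeric entity reference.
RAW_AMP = re.compile(r"&(?!(?:amp|lt|gt|apos|quot);|#\d+;|#x[0-9A-Fa-f]+;)")


def check_sitemap(text: str) -> list[str]:
    """Return a list of problems; an empty list meets the definition of done."""
    problems = []

    # Step 1: the file must parse as well-formed XML.
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        problems.append(f"not well-formed: {exc}")

    # Step 2: report every line that still contains a raw ampersand.
    for lineno, line in enumerate(text.splitlines(), start=1):
        if RAW_AMP.search(line):
            problems.append(f"line {lineno}: raw '&'")

    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/?a=1&b=2</loc></url>"
    "</urlset>"
)
for problem in check_sitemap(sample):
    print(problem)
```

The parser stops at the first error, while the line scan lists every offending line, so the two checks complement each other.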
The Sitemap SOP
@SitemapSOP
This post was published in the Telegram channel The Sitemap SOP. Subscribe via the link: @SitemapSOP.