Has Outgoing Links with Malformed Href: How to Fix Them
- October 14, 2024
- Links, Link Structure
A malformed href is an href value that is syntactically broken: a double protocol, raw spaces, unencoded characters, or a bad relative path. The fix is to correct the source markup or template so that it outputs a clean, properly encoded address that browsers and crawlers can actually follow.
What "malformed href" means
The href attribute holds the destination URL of an anchor (<a>) element. Google has been explicit that it can only crawl a link when it is an <a> element with an href attribute, and the value of that href must resolve into a real, requestable web address. A malformed href is one that does not. The string sits inside the markup looking like a link, but it does not parse into a valid URL, so the browser cannot reliably resolve it and a crawler cannot send a clean request to it.
URLs follow a strict grammar defined in RFC 3986. Only the unreserved characters can appear anywhere without encoding: the letters A to Z and a to z, the digits 0 to 9, and the marks hyphen, underscore, period, and tilde. Reserved characters such as the colon, slash, question mark, ampersand, and square brackets act as delimiters, and must be percent-encoded when they are meant as data rather than structure. The percent sign is legal only as the start of a percent-escape, and spaces, curly quotes, and non-ASCII characters are not legal in raw form at all. When an href breaks these rules, it is malformed.
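As a rough illustration of the character rules above, the sketch below uses Python's standard `urllib.parse.quote` to percent-encode a single path segment. The helper names `encode_segment` and `needs_encoding` are hypothetical, and the sketch assumes the input is one segment of data, not a whole URL.

```python
from urllib.parse import quote

# RFC 3986 unreserved characters: legal anywhere in a URL without encoding.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def needs_encoding(segment: str) -> list[str]:
    """Return the characters in a segment that fall outside the unreserved set."""
    return [ch for ch in segment if ch not in UNRESERVED]


def encode_segment(segment: str) -> str:
    """Percent-encode one path segment (hypothetical helper).

    safe="" stops quote() from leaving "/" untouched, which is its default.
    Non-ASCII characters are encoded as UTF-8 bytes.
    """
    return quote(segment, safe="")


print(needs_encoding("our team"))   # [' ']
print(encode_segment("our team"))   # our%20team
print(encode_segment("[2026]"))     # %5B2026%5D
```

Passing a full URL to `encode_segment` would also encode its colon and slashes, so the helper is only suitable for the data inside a segment.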
Common malformations
Double protocol
Two schemes get stitched together, usually because a template prepends "https://" to a value that already includes a scheme. In the result, such as http://https://example.com, a parser reads "https" as the host and the real domain as part of the path, so the link goes nowhere.
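One way to guard against this in template logic is to add a scheme only when the value lacks one. The sketch below is a minimal example under stated assumptions: `has_double_scheme` and `with_scheme` are hypothetical names, and `with_scheme` assumes its input is a bare host or an absolute URL, not a relative path such as /about.

```python
import re

# Two or more http(s) schemes stacked at the start of the value.
DOUBLE_SCHEME = re.compile(r"^(?:https?://){2,}", re.IGNORECASE)

# Any scheme followed by "://", per the RFC 3986 scheme grammar.
HAS_SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def has_double_scheme(href: str) -> bool:
    """Detect values like http://https://example.com."""
    return DOUBLE_SCHEME.match(href.strip()) is not None


def with_scheme(value: str, default_scheme: str = "https") -> str:
    """Prepend a scheme only when the value does not already carry one.

    Assumes `value` is a host or absolute URL; relative paths should not
    be passed through this helper.
    """
    value = value.strip()
    if HAS_SCHEME.match(value):
        return value
    return f"{default_scheme}://{value}"


print(has_double_scheme("http://https://example.com"))  # True
print(with_scheme("example.com"))                       # https://example.com
print(with_scheme("https://example.com"))               # https://example.com
```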
Raw spaces
A space inside an href is not a legal URL character. Browsers may guess and silently convert it, but crawlers and other tools can truncate the URL at the space or treat the whole value as invalid.
Unencoded characters
Characters that are dropped into the href without percent-encoding break parsing when they are not legal in that position. A square bracket in a path, a curly quote, an accented letter, or an ampersand meant as data rather than as a query separator all need encoding before they belong in a URL.
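A quick way to spot raw spaces and other illegal characters is to scan the href against the characters RFC 3986 permits. The sketch below is a simplified check, not a full URL validator: `illegal_characters` is a hypothetical helper, it treats square brackets as illegal everywhere (so it would wrongly flag an IPv6 literal host), and it cannot tell whether an ampersand is a separator or data.

```python
import re

# Unreserved characters, reserved delimiters (minus square brackets), and "%".
LEGAL = re.compile(r"[A-Za-z0-9\-._~:/?#@!$&'()*+,;=%]")

# A "%" that is not followed by two hex digits is not a valid percent-escape.
BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def illegal_characters(href: str) -> list[str]:
    """List characters in an href that should have been percent-encoded."""
    bad = [ch for ch in href if not LEGAL.fullmatch(ch)]
    if BAD_PERCENT.search(href):
        bad.append("%")
    return bad


print(illegal_characters("https://example.com/our team/[2026]"))
# [' ', '[', ']']
print(illegal_characters("https://example.com/our%20team/%5B2026%5D"))
# []
```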
Broken relative paths
Relative URLs are valid and accepted by Google, but malformed ones are not. Stray patterns like an extra colon, a missing slash, leftover template tokens, or doubled segments resolve against the current page into an address that does not exist.
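To see where a relative href actually lands, it can help to resolve it against the page URL. The sketch below uses Python's standard `urljoin`, which approximates (but does not exactly match) browser resolution. The page URL and the `resolve` helper are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical URL of the page the links appear on.
PAGE = "https://example.com/blog/post/"


def resolve(href: str, page_url: str = PAGE) -> str:
    """Resolve an href against the page URL, roughly as a crawler would."""
    return urljoin(page_url, href)


# Clean root-relative path.
print(resolve("/about"))       # https://example.com/about

# Missing leading slash: resolves under the current directory instead.
print(resolve("about"))        # https://example.com/blog/post/about

# Doubled segments: the path repeats what the page URL already contains.
print(resolve("blog/post/x"))  # https://example.com/blog/post/blog/post/x

# Leftover template token: treated as a literal path segment.
print(resolve("{{url}}"))      # https://example.com/blog/post/{{url}}
```

Each of the last three values resolves without raising an error, which is part of why these mistakes go unnoticed: the address is well-formed enough to request, but the page it names does not exist.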
Why it breaks crawling and UX
A crawler treats the href value as the address to fetch. If that value is not a resolvable URL, the link is effectively a dead end: the target page may never be discovered, or the request lands on an error. That wastes crawl budget and weakens the internal link signals that pass authority through your site. For external links, a malformed href means the reference you intended to make simply does not connect.
The experience for people is just as poor. A visitor clicks expecting to reach a page and instead gets a browser error, a blank tab, or a silent no-op. Because some browsers paper over certain mistakes while others do not, the same malformed link can appear to work in one place and fail in another, which makes the problem hard to spot in casual testing.
How to diagnose
A crawler is the fastest way to surface these at scale. Screaming Frog can parse, crawl, and report syntactically invalid URLs, which surface under Response Codes as no-response or malformed links, and it has a dedicated check for non-ASCII characters in URLs since standards require URLs to be sent using the ASCII set. Sitebulb and similar tools flag the same class of issue. Once you have a list, group the bad links by the template or content area that produced them, because malformed hrefs almost always come from a repeated source rather than one-off typos. View the page source to confirm the exact characters in the raw href, since the rendered DOM may already show a browser-corrected version.
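For a spot check on a single page, the raw hrefs can also be pulled from the page source and grouped by problem type. The sketch below uses Python's standard `html.parser` and is only a rough illustration: the labels, the `classify` and `audit` helpers, and the sample markup are all hypothetical, and it does not reproduce how Screaming Frog or Sitebulb classify links.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser
from typing import Optional


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags in raw page source."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def classify(href: str) -> Optional[str]:
    """Return a label for the first problem found, or None if none is found."""
    if re.match(r"^(?:https?://){2,}", href, re.IGNORECASE):
        return "double protocol"
    if "{{" in href or "}}" in href:
        return "template token"
    if " " in href.strip():
        return "raw space"
    if not href.isascii():
        return "non-ASCII character"
    return None


def audit(html: str) -> dict[str, list[str]]:
    """Group the malformed hrefs in a page's source by problem type."""
    collector = HrefCollector()
    collector.feed(html)
    report = defaultdict(list)
    for href in collector.hrefs:
        label = classify(href)
        if label:
            report[label].append(href)
    return dict(report)


SAMPLE = """
<a href="http://https://example.com/">Home</a>
<a href="/our team">Team</a>
<a href="/about">About</a>
"""

print(audit(SAMPLE))
# {'double protocol': ['http://https://example.com/'], 'raw space': ['/our team']}
```

Feeding the function the fetched source, not a serialized DOM from the browser, keeps the check aligned with the advice to inspect the raw href.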
How to fix
Fix the source, not the symptom. If a CMS field, theme template, or script concatenates URLs, correct the logic so it stops prepending a scheme to values that already have one, stops leaving template tokens unresolved, and always percent-encodes user or data-driven segments. Validate every URL before it is written into an href, and encode reserved and non-ASCII characters: a space becomes %20, and other reserved characters take their own percent codes.
<!-- Bad: double protocol, raw space, unencoded brackets, stray colon in a relative path -->
<a href="http://https://example.com/our team/[2026]">Team</a>
<a href="page :about">About</a>
<!-- Good: single scheme, encoded space and bracket, clean relative path -->
<a href="https://example.com/our%20team/%5B2026%5D">Team</a>
<a href="/about">About</a>
After patching the template, recrawl to confirm the malformed hrefs are gone and that the corrected links return a 200 response.
Common mistakes
The biggest mistake is fixing one visible bad link by hand while the template keeps generating dozens more. Another is trusting the browser: because browsers often auto-correct a stray space or a missing slash, the link looks fine on screen even though the raw href is invalid and crawlers will not be as forgiving. Teams also over-encode by accident, encoding the slashes or the colon that structure the URL and breaking a link that was previously fine. Encode the data inside a segment, not the delimiters that define the URL itself.
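The difference between encoding data and encoding delimiters can be sketched as follows. This is a minimal example, assuming the URL is assembled from an origin, a list of path segments, and optional query parameters. `build_url` is a hypothetical helper, not part of any CMS.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_url(
    origin: str,
    segments: list[str],
    params: Optional[dict[str, str]] = None,
) -> str:
    """Encode each piece of data, then join with unencoded delimiters."""
    path = "/".join(quote(segment, safe="") for segment in segments)
    url = f"{origin.rstrip('/')}/{path}"
    if params:
        # quote_via=quote encodes spaces as %20 instead of "+".
        url += "?" + urlencode(params, quote_via=quote)
    return url


# Encoding the data inside each segment keeps the structure intact.
print(build_url("https://example.com", ["our team", "[2026]"]))
# https://example.com/our%20team/%5B2026%5D

print(build_url("https://example.com", ["search"], {"q": "a&b c"}))
# https://example.com/search?q=a%26b%20c

# Over-encoding: running the whole URL through quote() breaks the delimiters.
print(quote("https://example.com/our team", safe=""))
# https%3A%2F%2Fexample.com%2Four%20team
```

The last call shows the over-encoding mistake: the colon and slashes that define the URL are escaped along with the space, so the value no longer parses as an absolute URL.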
FAQ
Q: Can Google crawl a link with a malformed href?
A: No. Google only crawls href values that resolve into an actual web address. If the value is not a valid URL, the crawler cannot reliably request it, so the link may go undiscovered.
Q: Are relative links a problem for Google?
A: No. Google accepts relative links for internal navigation. A relative href is only a problem when its syntax is broken, such as a stray colon, a missing slash, or an unresolved template token.
Q: How should a space in an href be handled?
A: Replace it with %20. A raw space is not a legal URL character, so any space in an href should be percent-encoded before the link is rendered.
Claude Vincent is a technical SEO consultant focused on crawlability, rendering, and AI-search visibility. He writes the field guides and case studies at SEO ProCheck, with a bias toward the durable, unglamorous work that decides whether search engines and AI answer engines can actually read and cite a site.