Friday, 14 November 2014

Link checking WordPress websites using Scrutiny

This article is for you if you want to scan your WordPress site for broken links or SEO issues, generate a sitemap, or perform spelling / grammar checks. It also applies to other CMSs which generate 'SEO-friendly' URLs.

If you want to start your crawl at the root, www.my-wordpress-site.com, then just go ahead; it should be fine. If you need to limit your crawl, you can do so by including or excluding partial URLs (whitelisting or blacklisting).
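
As a rough illustration of what including or excluding partial URLs means, here is a minimal Python sketch. It is not Scrutiny's own code: the function name should_crawl, the simple substring matching, and the rule that an exclude beats an include are all assumptions made for the example, and Scrutiny's actual rules may differ.

```python
def should_crawl(url, include=(), exclude=()):
    """Decide whether a URL passes partial-URL filters (hypothetical helper).

    include / exclude are collections of substrings to look for in the URL.
    """
    # Assumption: any matching 'exclude' fragment rules the URL out.
    if any(part in url for part in exclude):
        return False
    # Assumption: if 'include' fragments are given, at least one must match.
    if include and not any(part in url for part in include):
        return False
    return True


# Example: skip tag archives, crawl everything else.
print(should_crawl("http://www.my-wordpress-site.com/tag/news/", exclude=["/tag/"]))
print(should_crawl("http://www.my-wordpress-site.com/publications/", exclude=["/tag/"]))
```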

But if you start deep within a site, e.g. at www.my-wordpress-site.com/publications/all-publications/, Scrutiny will limit its crawl to pages within and below that directory.

You may know that all-publications is a page. But to Scrutiny it looks like a directory (not because of the trailing slash, but because there is no file extension). It will dutifully check all links on that page, but will only follow (crawl) links which are in the same 'directory' or below. Therefore your pages won't be crawled as fully as you expect.

Since v5.5, Scrutiny has had an option on the Settings screen which allows you to tell it that your URLs are in this form.


It means 'the last part of this URL is a filename, not a directory'. So in our example, a crawl starting at www.my-wordpress-site.com/publications/all-publications/ would be limited to the /publications/ directory, which is probably what you would expect.
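
The difference between the two behaviours can be sketched in a few lines of Python. This only illustrates the rule as described in this post and is not how Scrutiny is actually implemented: the names crawl_scope, in_scope and last_part_is_page are made up for the example, and the http:// scheme is added so the standard library can parse the URL.

```python
from urllib.parse import urlsplit


def crawl_scope(start_url, last_part_is_page=False):
    """Return the directory path a crawl would be limited to (hypothetical)."""
    path = urlsplit(start_url).path or "/"
    trimmed = path.rstrip("/")            # the trailing slash makes no difference
    last = trimmed.rsplit("/", 1)[-1]     # final part of the path
    looks_like_file = "." in last         # crude 'has a file extension' test
    if trimmed and (looks_like_file or last_part_is_page):
        # Last part is a page, so the scope is its parent directory.
        return trimmed.rsplit("/", 1)[0] + "/"
    # Otherwise the last part is treated as a directory.
    return trimmed + "/"


def in_scope(url, scope):
    """True if the URL's path lies within or below the scope directory."""
    return urlsplit(url).path.startswith(scope)


start = "http://www.my-wordpress-site.com/publications/all-publications/"
sibling = "http://www.my-wordpress-site.com/publications/annual-report/"

default_scope = crawl_scope(start)                          # option off
option_scope = crawl_scope(start, last_part_is_page=True)   # option on

print(default_scope, in_scope(sibling, default_scope))
print(option_scope, in_scope(sibling, option_scope))
```

With the option off, the sibling page under /publications/ falls outside the scope and would not be followed; with it on, the scope widens to /publications/ and the sibling is crawled.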
