Thursday, 21 March 2019

Hidden gems: Scrutiny's site search

Yes, you can use Google to search a particular site. But you can't search the entire source, or show pages which *don't* contain something. For example, you may want to search your website for pages that don't have your Google Analytics code.

Scrutiny's site search has just had some improvements and another option added (version 8.3.3) so it's a good time to show off this easily-overlooked feature.
The usual options that you've set up in your site config apply to the crawl: blacklisting / whitelisting of pages, link and level limits, and lots of other options, including authentication.

Here's where you enter your search term (or search terms: you can search for multiple terms at once, and the results will have a column showing which term(s) were found on each page).

There's a 'reverse' button so that you can see pages that *don't* contain your search term. Your search term can be a regular expression (regex) if you want to match a pattern. You can search the source or the visible text.

Today this panel has gained a case sensitivity button. It's set to 'insensitive' by default because that was the behaviour before.
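
To make the way these options combine a little more concrete, here's a minimal sketch in Python. It is not Scrutiny's code; the function names, the `pages` dictionary and the placeholder tracking ID are all made up for illustration. It shows multiple terms, the regex option, the 'reverse' option and case sensitivity (insensitive by default, as in the app).

```python
import re

def find_terms(page_text, terms, use_regex=False, case_sensitive=False):
    """Return the terms (in the order given) that occur in page_text."""
    flags = 0 if case_sensitive else re.IGNORECASE
    found = []
    for term in terms:
        # Treat the term as a literal string unless regex matching is requested.
        pattern = term if use_regex else re.escape(term)
        if re.search(pattern, page_text, flags):
            found.append(term)
    return found

def search_pages(pages, terms, reverse=False, **options):
    """pages maps URL -> text. Returns URL -> list of terms found on that page.

    With reverse=True, only pages where none of the terms were found are
    returned (each with an empty list).
    """
    results = {}
    for url, text in pages.items():
        found = find_terms(text, terms, **options)
        if reverse and not found:
            results[url] = []
        elif not reverse and found:
            results[url] = found
    return results

# Hypothetical crawl results: URL -> page source.
pages = {
    "/index.html": "<script>ga('create', 'UA-0000000-0')</script><p>Home</p>",
    "/about.html": "<p>About us</p>",
}

# Which pages are missing the (placeholder) analytics ID?
missing = search_pages(pages, ["ua-0000000-0"], reverse=True)
print(missing)
```

In this sketch the lower-case term still matches the upper-case ID because the search is case-insensitive by default, so only /about.html is reported as missing it.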

It's worth noting a couple of 'beware's here.

1. If you're searching the entire source, make sure that your search term matches the way it appears in the source. Today I was confused when I searched for a sentence on Integrity's French page: "Le vérificateur de liens pour vos sites internet". The accented character appears in the source correctly encoded as an HTML entity (for example &eacute;), not as a literal é, so a search term typed with the literal character doesn't match the source. (If you search the body text rather than the source, any such entities will be 'unencoded' before checking.)

2. There's a global setting for what should be included / excluded from the visible text of a page. You may want to ignore the contents of navs and footers, for example. One of these options is 'only look at the contents of paragraph and heading tags'. If this is switched on, then quite a lot is excluded. Visible text may be in a span or a div rather than in <p> tags. Anything in an unordered list, for example, may not (perhaps should not) be within paragraph tags.
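
To illustrate the first caveat, here's a small Python sketch. The source line is a made-up example that assumes the page encodes the accent as the named entity &eacute;, and it uses Python's standard `html.unescape` as a stand-in for whatever 'unencoding' Scrutiny does before searching body text.

```python
import html

# Assumed example source: the accented character is stored as an HTML entity.
source = "<h1>Le v&eacute;rificateur de liens pour vos sites internet</h1>"

# The term as you'd naturally type it, with a literal accented character.
term = "Le vérificateur de liens"

# Searching the raw source fails because the source holds '&eacute;', not 'é'.
in_source = term in source

# Decoding entities first (roughly what a body-text search does) lets it match.
in_decoded = term in html.unescape(source)

print(in_source, in_decoded)
```

When searching the source, the fix is to type the term the way the source spells it, entity and all.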

The search dialog above now has a friendly warning which informs you if you're searching body text and if you've got some of these exclusions switched on.
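
As a rough illustration of why the 'only paragraph and heading tags' option excludes so much, here's a sketch using Python's standard `html.parser`. It's only an approximation of the idea, not how Scrutiny extracts visible text, and the class name and sample page are invented for the example.

```python
from html.parser import HTMLParser

# Tags whose contents count as visible text in this sketch.
KEPT_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}

class ParagraphHeadingText(HTMLParser):
    """Collect text that sits inside paragraph or heading tags only."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of kept tags we are currently inside
        self.chunks = []  # pieces of text collected so far

    def handle_starttag(self, tag, attrs):
        if tag in KEPT_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in KEPT_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep non-empty text only while inside a paragraph or heading.
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())

def visible_text(source):
    parser = ParagraphHeadingText()
    parser.feed(source)
    parser.close()
    return " ".join(parser.chunks)

# Made-up page: a heading, a list item and a paragraph.
page = "<h2>Features</h2><ul><li>Link checking</li></ul><p>Free trial</p>"
text = visible_text(page)
print(text)
```

The list item never makes it into the extracted text, so a body-text search for "Link checking" would come up empty on this page even though a visitor can plainly see it.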

The download for Scrutiny is here and includes a free 30-day trial.
