Thursday, 21 March 2019

Hidden gems: Scrutiny's site search

Yes, you can use Google to search a particular site. But you can't search the entire source, or show pages which *don't* contain something. For example, you may want to search your website for pages that don't have your Google Analytics code.

The usual options that you've set up in your site config apply to the crawl: blacklisting / whitelisting of pages, link and level limits, and lots of other options, including authentication.

The search dialog is where you enter your search term (or search terms: you can search for multiple terms at once, and the results will have a column showing which term(s) were found on each page).

There's a 'reverse' button so that you can see pages that *don't* contain your term. Your search term can be a regular expression (Regex) if you want to match a pattern. You can search the source or the visible text.
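To make the idea of a multi-term, optionally reversed, optionally regex search concrete, here is a minimal Python sketch. It is not Scrutiny's implementation: the function name search_pages, the sample URLs and the UA-style tracking ID pattern are all made up for illustration, and I'm assuming that 'reverse' means "none of the terms were found".

```python
import re


def search_pages(pages, terms, use_regex=False, reverse=False):
    """Return {url: [terms found]} for the pages that match.

    With reverse=True, return the pages on which none of the terms
    were found (each mapped to an empty list). This reading of
    'reverse' is an assumption, not Scrutiny's documented behaviour.
    """
    # Plain terms are escaped so they match literally; regex terms are used as-is.
    patterns = {t: re.compile(t if use_regex else re.escape(t)) for t in terms}
    results = {}
    for url, text in pages.items():
        found = [t for t, p in patterns.items() if p.search(text)]
        if reverse and not found:
            results[url] = []
        elif not reverse and found:
            results[url] = found
    return results


# Hypothetical page sources, keyed by URL.
pages = {
    "https://example.com/": "<script>ga('create', 'UA-12345-1');</script><p>Home</p>",
    "https://example.com/about": "<p>About us</p>",
}

# Which pages are missing a UA-style analytics ID in their source?
missing_analytics = search_pages(pages, [r"UA-\d+-\d+"], use_regex=True, reverse=True)
print(missing_analytics)
```

The list attached to each URL plays the role of the "which term(s) were found" column in the results.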

Method

1. Add your site to Scrutiny if it's not there already and check the basic settings. See https://peacockmedia.software/mac/scrutiny/manual/v9/en.lproj/getting-started.html

2. Instead of 'Scan now', open 'More tasks' and choose 'Search Site'.

3. Enter your search term (or multiple search terms) and review the other options in the dialog.

Note

1. If you're searching the entire source, make sure that your search term matches the way it appears in the source. Today I was confused when I searched for a sentence on Integrity's French page: "Le vérificateur de liens pour vos sites internet". The accented character é appears in the source (correctly encoded) as an HTML entity such as &eacute; rather than as the literal character, so the sentence as typed isn't found there. If you search the body text rather than the source, any such entities will be 'unencoded' before checking.

2. There's a global setting for what should be included / excluded from the visible text of a page (Preferences > General > Content). You may want to ignore the contents of navs and footers, for example. One of these options is 'only look at the contents of paragraph and heading tags'. If this is switched on, quite a lot is excluded, because visible text isn't necessarily inside <p> tags.

The search dialog above now has a friendly warning which informs you if you're searching body text and if you've got some of these exclusions switched on.
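Both notes can be illustrated with a small Python sketch using only the standard library. It is a rough approximation for illustration and not how Scrutiny extracts text: the VisibleText class, the visible_text helper, the only_p_and_headings flag and the sample markup are all hypothetical, and I'm assuming the entity in the source is &eacute;.

```python
import html
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect page text, optionally only from <p> and heading tags."""

    SKIP = {"script", "style", "nav", "footer"}  # assumed exclusions
    KEEP = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, only_p_and_headings=False):
        # convert_charrefs=True decodes entities such as &eacute; in text.
        super().__init__(convert_charrefs=True)
        self.only = only_p_and_headings
        self.skip_depth = 0
        self.keep_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.KEEP:
            self.keep_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.KEEP and self.keep_depth:
            self.keep_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return  # inside an excluded element
        if self.only and not self.keep_depth:
            return  # outside <p> / heading tags
        self.chunks.append(data)


def visible_text(source, only_p_and_headings=False):
    parser = VisibleText(only_p_and_headings)
    parser.feed(source)
    parser.close()
    # Join the chunks and normalise whitespace.
    return " ".join(" ".join(parser.chunks).split())


# Made-up markup: a heading with an entity, text in a <div>, text in a <p>.
source = (
    "<nav>Menu</nav><h1>Le v&eacute;rificateur de liens</h1>"
    "<div>Essai gratuit</div><p>pour vos sites internet</p>"
)
phrase = "Le vérificateur de liens"

print(phrase in source)                 # searching the raw source
print(phrase in html.unescape(source))  # after decoding entities
print(visible_text(source))
print(visible_text(source, only_p_and_headings=True))
```

Searching the raw source misses the phrase because of the entity, while the decoded text contains it. With the paragraph-and-heading restriction switched on, the text inside the <div> drops out of the visible text, which is the situation the warning in the dialog is there to flag.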

The download for Scrutiny is here and includes a free 30-day trial.
