A couple of recent Scrutiny support calls have been along the lines of: "Why is your tool reporting a number of http:// links on my site? All internal links are https://. Is this a bug?"
In both cases, an internal link did exist on the site with the http scheme. Scrutiny treats such a link as internal (as long as it has the same domain), follows it, and from then on all relative links will of course resolve with the http scheme as well.
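A minimal sketch of why one http:// link "infects" the rest of the crawl: once the crawler lands on an http page, every relative link on that page resolves to an http URL too. (The URLs here are illustrative, not from a real site.)

```python
from urllib.parse import urljoin

# Page reached via the rogue http link:
base = "http://www.example.com/about/"

# Relative links on that page inherit the base page's scheme.
print(urljoin(base, "contact.html"))  # http://www.example.com/about/contact.html
print(urljoin(base, "/products/"))    # http://www.example.com/products/
```

So a single absolute http:// link anywhere on the site is enough to produce many http:// results.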
1. The 'Locate' function is ideal for tracing the rogue link that shunts Scrutiny (and a real user, of course) over to the http site. In the shot below we can see where that happened (ringed), so it's easy to see the offending link url, the link text and the page it appears on. Does this useful feature need to be easier to find?
2. Does a user expect that when they start at an https:// url, an http:// link would be considered internal (and followed) or external (and not followed)? Should this be a preference? (Possibly not needed, as it's simple to add a rule that says 'do not check urls containing http://www.mysite.com'.)
3. Should Scrutiny alert the user if they start at an https:// url and an http:// version is found while scanning? After all, this is at the heart of the problem described above; the users assumed that all their links were https:// and it wasn't obvious why http:// links were appearing in their results.
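The alert suggested in point 3 boils down to a simple check: did we start at https:// and find a same-host url with the http scheme? A rough sketch of that check (a hypothetical helper, not Scrutiny's actual implementation):

```python
from urllib.parse import urlsplit

def scheme_downgrades(start_url, found_urls):
    """Return the URLs whose host matches the start URL but whose
    scheme has dropped from https to http."""
    start = urlsplit(start_url)
    if start.scheme != "https":
        return []  # nothing to warn about if we didn't start on https
    return [u for u in found_urls
            if urlsplit(u).scheme == "http"
            and urlsplit(u).hostname == start.hostname]

suspects = scheme_downgrades(
    "https://www.mysite.com/",
    ["http://www.mysite.com/old-page",
     "https://www.mysite.com/contact",
     "http://other.com/"])
print(suspects)  # ['http://www.mysite.com/old-page']
```

Anything in that list is a candidate for the kind of alert discussed above; external http:// links are left out because they're expected.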
Any thoughts welcome; email me or use the comments below.
Scanning a website as an authenticated user is a common reason for people turning to Scrutiny.
The process necessarily involves some trial and error to get things set up properly, because different websites use different methods of authorisation and sometimes have unusual security features.
Scrutiny now has an important new feature: support for login forms that use a 'security token'. I'm not going to go into details (I wouldn't want to deprive my competitors of the exasperations that I've just been through!)
There's a simple checkbox to switch this feature on (available since Scrutiny v6.4), and this may enable Scrutiny to crawl websites that have been uncooperative so far. (This may well apply to websites that have been built using Expression Engine).
Version 6.4 is in beta as I write this, if you're interested in trying it, please just ask.
Small print: Note that some care and precautions (and a good backup) are required because scanning a website as an authenticated user can affect your website. Yes, really! Use the credentials of a user with read access, not author, editor or administrator.
As flagged up in a recent mailing to those on the email list, a 50% offer will be running on Scrutiny for Mac for the rest of the month. The app is used by larger organisations and individuals alike, and it seems fair to give smaller businesses the opportunity to buy at a more affordable price.
Recent enhancements include 'live view' (shown below) and improved 'site sucking' page archiving while scanning.
So here it is - for a 50% discount for the rest of February, please use this coupon.
This isn't a key; click 'buy' when the licensing window appears, look for the link that says 'check out with coupon' and use the code above for a 50% discount.
I hadn't appreciated what a complex job sitesucker-type applications do.
In the very early days of the web (when you paid for the time connected) I'd use SiteSucker to download an entire website and then go offline to browse it.
But there are still reasons why you might want to archive a site; for backup or for potential evidence of a site's contents at a particular time, for two examples.
Integrity and Scrutiny have always had the option to 'archive pages while crawling'. That's very easy to do - they're pulling in the source for each page in order to scan it, so why not just save that file to a directory as they go?
Although the file then exists as a record of that page, viewing in a browser often isn't successful; links to stylesheets and images may be relative, and if you click a link it'll either be relative and not work at all, or absolute and whisk you off to the live site.
Processing that file and fixing all of these issues, plus reproducing the site's directory structure, is no mean feat, but Scrutiny now offers it as an option. As from Scrutiny 6.3, the option to process (convert) archived pages is in the Save dialogue that appears when the scan finishes, along with another option (a requested enhancement) to go ahead and always save in the same place without showing the dialogue each time. These options are also available via an 'options' button beside the 'Archive' checkbox.
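To give a flavour of what 'converting' a link involves (this is a simplified sketch, not Scrutiny's actual code; the function and its parameters are hypothetical): same-site absolute links need to become relative paths into the archive directory, while external links should be left pointing at the live web.

```python
import posixpath
from urllib.parse import urlsplit

def localise_link(href, page_url, site_host):
    """Rewrite one link for offline viewing of an archived page."""
    target = urlsplit(href)
    # External link: leave it alone so it still points at the live site.
    if target.hostname and target.hostname != site_host:
        return href
    # Already a relative link: keep as-is (it works within the archive).
    if not target.hostname and not href.startswith("/"):
        return href
    # Map a directory-style path onto a saved index file.
    path = target.path or "/"
    if path.endswith("/"):
        path += "index.html"
    # Make it relative to the directory of the page it appears on.
    page_dir = posixpath.dirname(urlsplit(page_url).path) or "/"
    return posixpath.relpath(path, start=page_dir)

print(localise_link("http://www.mysite.com/css/main.css",
                    "https://www.mysite.com/about/index.html",
                    "www.mysite.com"))  # ../css/main.css
```

Even this toy version ignores query strings, fragments, stylesheets referenced from within CSS, and much more, which hints at why the real job is as complex as described above.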
When some further enhancements are made, it'll be available in Integrity and Integrity Plus too.