Thursday, 21 June 2018

Small but important enhancement to Scrutiny's 'insecure content' reporting

An example came to light this week of a link from an https:// to an insecure page on the same domain that Scrutiny had been missing.

If you're scanning a secure site, Scrutiny can help you migration to a secure site by reporting links to insecure pages, and also pages which use mixed / insecure content (js, css, image files which are http://).

In this case, the target of a link was secure, but that link was redirecting to an insecure page. This particular situation had gone unreported in the insecure content table simply because of the order that Scrutiny was doing things.

Scrutiny 8.1.2 contains this and some other fixes / enhancements

Wednesday, 6 June 2018

WWDC18. Who is Number 1?!

For the record, I think Craig is a superstar, and I'm genuinely into the dark mode and dynamic desktops.

But Monday's WWDC left many questions unanswered, like why were the guys on stage all wearing those shoes with the white soles? Was it a tribute to Patrick McGoohan's The Prisoner, where the purpose of 'The Village' was to break Number 6's will to be an individual? Surely not.

Seriously though, MacOs Mojave (which sounds close enough to Mojito to raise my pulse)  excited me more than any new OS since Mavericks. The cartoon's more about the fact that the latest, most breathtaking technology is carried around in people's pockets for the most trivial purposes (singing poop emoji, anyone?) More of that another time.

How did that ever work?

Don't judge the code - this is a tool that was written many many years ago, and it's only a simple thing for personal use. So it was only ever developed to the "it just about works" standard and the project has been copied from computer to computer ever since, receiving the minimum updates to keep it running.

With the beta of 10.14 'Mojave' installed on a mac (which sounds enough like 'Mojito' to raise my pulse) unsurprisingly I started to notice a few things not working as before.

I love finding and fixing problems, so the regular round of fixes with each release of OSX / MacOS  is no hardship. It's particularly fun when you have a "how did that ever work?" moment.

NSArray *pages = [fileManager directoryContentsAtPath:pageLocation];   // NB directoryContentsAtPath: was apparently deprecated in 10.5

if([pages count]==0){return;}
for (c=0;c<[pages count];c++){
// foreach page in the pages directory
thisPage = (NSString *)[pages objectAtIndex:c];
if([[thisPage substringToIndex:1] isEqualToString:@"."]==NO){
// do stuff, ignoring hidden files
[collection addObject:thisPage];


The resulting list is displayed in a tableview and has always appeared in alphabetical order.

So not only is directoryContentsAtPath: still apparently working after being deprecated such a long time ago, apparently it used to return the directory listing sorted in alphabetical order, and no longer does.

It was easy to add [collection sortUsingSelector:@selector(caseInsensitiveCompare:)];
to restore the list to alphabetical order (collection being an NSMutableArray containing NSStrings) but I'm just surprised  that it wasn't necessary before.

The documentation for directoryContentsAtPath: doesn't mention that the return array is sorted, so it should never have been taken for granted. But hey, if something works the way you want it, you don't always think any further.

To bring this up to scratch, the suggested alternative to directoryContentsAtPath: is 
contentsOfDirectoryAtPath:Error:  so getting rid of that warning is really easy, just declare an NSError object and pass it in. And then report the NSError's 'localizeddescription' if it contains a non-null value. Or simply pass nil as the error: parameter if you feel lazy or don't care about the success of the operation.

Monday, 4 June 2018

Test HTML validation and accessibility checkpoints of whole website

Didn't Scrutiny used to do this?

Yes, but when the w3c validator's 'nu' engine came online, it broke Scrutiny's ability to test every page. The 'nu' engine no longer returned the number of errors and warnings in the response header, which Scrutiny had used as a fast way to get basic stats for each page. It also stopped responding after a limited number of requests (some Scrutiny users have large websites).

Alternative solutions

After exploring some other options (notably html tidy, which is installed on every mac) it appears that the W3C service now offers a web service which is responding well and we haven't seen it clam up after a large number of fast requests (even when using a large number of threads).

The work in progress is called Tidiness (obviously a reference to tidy, which we've been experimenting with).

It contains a newer version of tidy than the one installed on your Mac. However, the html validation results are useful but not as definitive as the ones from the W3C service.

So Tidiness as it stands is a bit of a hybrid. It crawls your website, passing each page to the W3C service (as a web service). If you like you can switch to tidy for the validation, which makes things much quicker as everything is then running locally. If you like, you can simultaneously make accessibility checks at level 1,2 or 3, with all of the results presented together.

This app is now available for you to try as a beta. It's early days, so please be nice. Here are some shots.

Saturday, 2 June 2018

Longevity of OSX

Seen here looking more like a desktop icon*, is the first release of OSX. It's now older than the classic Mac OS (up to 9).

I remember the sense of awe at the non-jagged icons, transparency, more realistic-looking shadows and the new traffic-light 'sweeties' in the top-right corner of windows.

It really doesn't look *that* dated. That Mail icon has hardly changed, just a bit more kiddie-coloured, and the magnification effect is still there.  Glassy-looking buttons were cool at the turn of the century, but by 10.6, the aqua look was looking a bit unnecessarily clumsy. The stripey background of windows and sheets didn't last so long. We had a weird dual-look with the windows. OS9 was already experimenting with the brushed aluminium look. Very different from the more plasticky look of the regular window borders and backgrounds. From memory, I think the human interface guidelines said that the aluminium look was appropriate where the window was to minic a control panel.

I really lament the passing of the 3D hyper-real-looking buttons and controls. I regularly use a couple of Macs on Snow Leopard and Mavericks.  I get the very 'clean' concept but when you can't immediately see whether some text on a plain white background is a button, input field or just some text, that's just plain unhelpful, however beautiful it looks.

Thanks to Jason Snell for the facts and figures:

* My clamshell iBook, the first Mac I had that ran OSX, had a screen 800 x 600 pixels. That's less than the highest resolution that we now make application icons (1024x1024)

Thursday, 31 May 2018

HTMLtoMD receives a well-overdue update

HTMLtoMD was a side project, put together using various elements developed for other apps, the website crawling engine and the HTML to Markdown converter.

Markdown is a very simple and transportable format - it's efficient for storage, and perhaps a great format to use when migrating one website to another.

HTMLtoMD has had a small but enthusiastic following, but over the years has become in need of an update. (MacOS / OSX has never been great for backward compatibility. Things stop working with each new version of the OS)

It's had the time needed to bring it up to speed. Version 2 has the most up-to-date version of the 'Integrity v8 engine', a bunch of things are fixed or improved, and a bunch of extra options have been added. I think it's looking good and working well. For the time being, it remains free and unrestricted.

Download and try it:

Options for archiving a website

Integrity / Scrutiny

Integrity (and Integrity Plus, Pro and Scrutiny) has long had an 'archive' option. It can save the html as it scans, originally with no frills at all. Recently I+, Pro and Scrutiny have received enhancement which mean that they can process the information a little to create a browsable archive.

It stops short of being a full 'Sitesucker' - it doesn't save images, for example, or download the style sheets etc. (It makes sure that all links and references are absolute, so that the site still appears as it should.) It was always intended as a snapshot of the site, automatically collected as you link-check, for the purposes of reference or evidence.


WebScraper for mac has loads of options and therefore it's not just 'enter a homepage url and press Go' like the other apps mentioned here. So it does allow you to do much more. You have much more control over what information you want in your output file, what format you want that in, and whether you want the content converted to plain text / markdown / html.


HTMLtoMD was a side project built using various functionality we'd developed in other apps. It scans a whole site and archives the content as Markdown. Once working, we released it for free and put it on the back burner.

Recently it's received more development. It's now up-to-date with the Integrity v8 engine, and has received some improvements to the markdown conversion via WebScraper. It can now save images and has more options for saving the information.

Again, it's not a Sitesucker. If you need to download a website for saving or browsing offline then use SiteSucker ($4.99), it's pointless us trying to reinvent that wheel.

But markdown has its advantages. It's a much more efficient way to store your content. It's just text with a little bit of markup (headings, lists etc). That also means that it's very transportable.

You may also find it a very readable format. See the shots below.