Thursday, 21 March 2019

Hidden gems: Scrutiny's site search

Yes, you can use Google to search a particular site. But you can't search the entire source, or show pages which *don't* contain something. For example, you may want to search your website for pages that don't have your Google Analytics code.

The usual options that you've set up in your site config apply to the crawl: Blacklisting / whitelisting of pages, link and level limits and lots of other options, including authentication.

Here's where you enter your search term (or search terms: you can search for multiple terms at once, and the results will have a column showing which term(s) were found on each page).

There's a 'reverse' button so that you can see pages that *don't* contain your term. Your search term can be a regular expression (Regex) if you want to match a pattern. You can search the source or the visible text.
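To make the idea concrete, here is a minimal Python sketch of a multi-term search with a 'reverse' option. It is not Scrutiny's code, only an illustration of the logic. The function name `search_pages`, the `pages` dictionary and the example URLs are all hypothetical, and I'm assuming that 'reverse' means "pages where none of the terms were found".

```python
import re

def search_pages(pages, terms, use_regex=False, reverse=False):
    """Return {url: [terms found]} for matching pages.

    With reverse=True, return only the pages where no term was found
    (an assumption about how a 'reverse' search would behave).
    """
    results = {}
    for url, source in pages.items():
        found = []
        for term in terms:
            # Plain terms are escaped so that characters like '(' are literal
            pattern = term if use_regex else re.escape(term)
            if re.search(pattern, source):
                found.append(term)
        if reverse:
            if not found:
                results[url] = []
        elif found:
            results[url] = found
    return results

# Hypothetical crawl results: URL -> page source
pages = {
    "https://example.com/": "<html><head><script>ga('create', 'UA-12345-1');</script></head></html>",
    "https://example.com/about": "<html><head></head><body>About us</body></html>",
}

# Which pages are missing an Analytics-style ID?
missing = search_pages(pages, [r"UA-\d+-\d+"], use_regex=True, reverse=True)
print(sorted(missing))
```

The 'found' list per page plays the role of the results column that shows which term(s) matched.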

Method

1. Add your site to Scrutiny if it's not there already and check the basic settings. See https://peacockmedia.software/mac/scrutiny/manual/v9/en.lproj/getting-started.html

2. Instead of 'Scan now', open 'More tasks' and choose 'Search Site'.

3. Enter your search term (or multiple search terms) and review the other options in the dialog.





Note

1. If you're searching the entire source, make sure that your search term matches the way it appears in the source. Today I was confused when I searched for a sentence on Integrity's French page: "Le vérificateur de liens pour vos sites internet". The accented character appears in the source (correctly encoded) as an HTML entity (for example &eacute;) rather than as a literal é. If you search the body text rather than the source, any such entities will be decoded before checking.

2. There's a global setting for what should be included / excluded from the visible text of a page (Preferences > General > Content). You may want to ignore the contents of navs and footers, for example. One of these options is 'only look at the contents of paragraph and heading tags'. If this is switched on, quite a lot is excluded, because visible text isn't necessarily inside <p> tags.

The search dialog above now has a friendly warning which informs you if you're searching body text and if you've got some of these exclusions switched on.
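Both notes can be illustrated with a short Python sketch. This is not how Scrutiny works internally; it just shows why the same search term can match or miss depending on what you search. The sample `source` string is made up, I'm assuming the é is encoded as `&eacute;`, and `TextExtractor` and `visible_text` are hypothetical names.

```python
import html
from html.parser import HTMLParser

# Made-up page source; the accented character is stored as an entity
source = ("<nav>Accueil</nav>"
          "<p>Le v&eacute;rificateur de liens pour vos sites internet</p>"
          "<div>Pied de page</div>")
term = "Le vérificateur de liens"

# Note 1: the raw source contains the entity, not the literal character
print(term in source)                 # False
print(term in html.unescape(source))  # True

class TextExtractor(HTMLParser):
    """Collects text, optionally only from paragraph and heading tags."""
    TEXT_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, only_text_tags=False):
        super().__init__(convert_charrefs=True)  # entities are decoded for us
        self.only_text_tags = only_text_tags
        self.depth = 0      # how many paragraph/heading tags we are inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TEXT_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.TEXT_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.only_text_tags or self.depth:
            self.chunks.append(data)

def visible_text(page_source, only_text_tags=False):
    parser = TextExtractor(only_text_tags)
    parser.feed(page_source)
    parser.close()
    return " ".join(c.strip() for c in parser.chunks if c.strip())

# Note 2: restricting to <p> and heading tags drops the nav and the div
print(visible_text(source))
print(visible_text(source, only_text_tags=True))
```

With the restriction on, a search for "Pied de page" would find nothing, even though a visitor can see that text on the page. (The sketch makes no attempt to skip script or style content.)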

The download for Scrutiny is here and includes a free 30-day trial.

Wednesday, 13 March 2019

Raspberry Pi Zero W - baby steps

I don't know why I've waited so long to do this. I love messing around with home automation, and here's a fully-functional computer with wireless and 20-odd GPIO (input-output) pins.

I've also been meaning to begin using Linux (technically I already do, I use macOS's Terminal a lot, and have set up and run a couple of Ubuntu servers with Python and MySQL). The Pi has a desktop, USBs for keyboard and mouse, HDMI output. This one has a 1GHz processor, 512MB of RAM and, for a hard disk, whatever free space is on the card that you put in. All of this on a board which is half the size of a credit card and costs less than a tenner (or less than a fiver for the non-wireless version).

When we finally invent time travel, or at least find a way to communicate across time (as per Gibson's The Peripheral, which I heartily recommend), my teenage self will be astounded by this information. I remember the day when I first heard the word 'megabyte'. It wasn't far off the day that I felt very powerful after plugging an 8k expansion into my computer.

Anyhow, back to the plot. What I've learned so far is that the 'less than a tenner' Raspberry Pi Zero W is 'bare bones'. I've bought a few bits and pieces that have cost much more than the computer(!) including a header for the GPIO pins, a breadboard & components kit, a pre-loaded micro SD card (effectively the HD and OS) and a mini HDMI to proper-size HDMI adaptor.

Monday, 11 March 2019

Website archiving - Watchman's commercial release

[NB since version 2.1.0, we have had to make a slight change to the name; its full title is now Website Watchman.]

It has been a (deliberately) long road but Watchman for Mac now has its first commercial release.
This product does such a cool job that I've long believed that it could be as important to us as Integrity and Scrutiny. So I've been afraid to rush things. Version zero was officially beta, and a useful time for discovering shortcomings and improving the functionality. Version one was free. Downloads were healthy and feedback slim, which I take as a good sign. Finally it's now released with a trial period and reasonable introductory price tag. Users of version one are welcome to continue to use it, but it obviously won't get updates.

So what does it do? In a few words: "Monitor and archive a website".

There are apps that monitor a url and alert you to changes. There are apps that scan an entire website and archive it.

Watchman can scan a single page, part of a website or a whole website. It can do this on schedule - hourly, daily, weekly, monthly. It can alert you to changes. It builds a web archive which you can view (using Watchman itself or the free 'WebArchive Viewer' which is included in the dmg). You can browse the urls that it has scanned, and for each, view how that page looked on a particular day.

We're not talking about screenshots but a 'living' copy of the page. Watchman looks for and archives changes in every file: HTML, CSS, JS and other linked files such as PDFs. You can also export that page as a screenshot or as a collection of the files making up that page, as they stood on that date.
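For the curious, the general idea of "archive a file only when it changes, then look up how it stood on a given date" can be sketched in a few lines of Python. This is purely an illustration and not Watchman's implementation; the `Archive` class, its methods and the example URL are hypothetical, and it assumes snapshots are recorded in date order.

```python
import hashlib
from datetime import date

class Archive:
    """Keeps dated snapshots per URL, storing a new one only on change."""

    def __init__(self):
        # url -> list of (day, sha256 digest, content), oldest first
        self.snapshots = {}

    def record(self, url, content, day):
        """Store content if it differs from the last snapshot.

        Returns True if a new snapshot was stored (i.e. a change was seen).
        """
        digest = hashlib.sha256(content).hexdigest()
        history = self.snapshots.setdefault(url, [])
        if history and history[-1][1] == digest:
            return False
        history.append((day, digest, content))
        return True

    def as_of(self, url, day):
        """Return the content as it stood on the given day, or None."""
        latest = None
        for snap_day, _, content in self.snapshots.get(url, []):
            if snap_day <= day:
                latest = content
        return latest

archive = Archive()
url = "https://example.com/style.css"
first = archive.record(url, b"body { color: black }", date(2019, 3, 1))
unchanged = archive.record(url, b"body { color: black }", date(2019, 3, 2))
changed = archive.record(url, b"body { color: navy }", date(2019, 3, 3))
print(first, unchanged, changed)  # True False True
```

A return value of True from `record` is the point at which a real tool would raise its "something changed" alert, and `as_of` is the lookup behind "view how that page looked on a particular day".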

A 'must have' for every website owner?

Try Watchman for free / buy at the introductory price.

Friday, 8 March 2019

SID tune for C64 homebrew game - part 1

My enthusiasm for this project has surprised even me, and to avoid this blog filling up with my ramblings about making this 8-bit game, and to keep all of those posts in one suitable place, I've moved this post and the others to their own blog.

This post has moved to:

https://newstuffforoldstuff.blogspot.com/2019/03/homebrew-game-for-c64-part-1-sid-tune.html






Monday, 4 March 2019

Website archiving utility, version 2

Watchman for Mac is a utility that can scan a single page or a whole site on schedule. It'll archive the page(s) and alert the user to any changes in the code or visible text.

As it builds its archive it's possible to browse the historical versions of the pages, as they stood on particular dates. It displays a 'living' version of the historical pages, with javascript and stylesheets also archived.

We've just made a version 2 beta available. It features a 'non-UI' or headless mode, which means that it can remain in the background and not interrupt the user when a scheduled scan starts. Its windows can still be accessed from a status bar menu.

https://peacockmedia.software/mac/watchman/

Version 1 is still available. It's free and will remain free. The new beta is also available and free at present, but version 2 will eventually require a licence.