Saturday 5 December 2015

Swinsian - zen-like alternative to iTunes

I'm very grateful to the friend who recommended Swinsian as a replacement for iTunes (thankfully its pretty yellow icon is more memorable than the name).

When it comes to "it just works" (which used to be the Mac way) then the team have it sussed. It was incredibly easy to import my music, with no appreciable disk space used. It just built its own library, leaving the files where they were, which is exactly what I wanted. (I noticed later a Preference which allows copying of music into Swinsian's music folder.) Plus it watches the music folder and just adds anything I buy via or add to iTunes.

Using Swinsian is such a zen-like experience. It's so nice to have a music player that only handles music. It does podcasts too (which I've switched off). But I hope they're not tempted to start adding more stuff and allowing it to become the nightmare that is the current iTunes.

I keep discovering nice little touches, like the 'animated dock icon'. It's no more than a dot that moves as a progress indicator, but that's really neat.

It supports many music types, which is nice to know, but I'm not sure whether anything in my collection isn't aiff, aac or mp3.

There's no cover-flow which I know people still miss, but there are a number of ways to customise the display, including my favourite, the good old list with whichever columns you like. The font size in there is a little smaller than usual, which is great. (Spotify annoyed me greatly by taking the opposite approach - big spacious text so that you can't see much at a time and have to scroll lots.)

Buttons look like buttons! No symbols just printed on the textured surface of the window, or within the LCD display. Or words printed in an area of nothing that could equally be a field, a label or a button. The search box is nice and simple, it just filters your music in the display according to the word you type.

It's like having iTunes 10 back again!

There are one or two glitches - I have given up on the desktop widget, which I couldn't get to work properly. It would also be nice if, when you're using shuffle, the display scrolled automatically to the track that's playing. But I'm sure these will get ironed out with time.

Expect Screensleeves support very shortly!

Sunday 29 November 2015

HTML Validation Runner

Earlier this year, the w3c validator switched to a new engine and although this returns results for html5, the service was no longer returning simple statistics in the http header. In short, it broke Scrutiny's html validation feature. (Those using a local instance of the validator may have found that it kept working.)

My options were to screen-scrape from the new w3c validator, find an alternative service (either web or something that could be included within Scrutiny) or to write my own.

In the absence of a 'quick fix', I replaced the full website validation feature with a 'single page validation' (context menu item in SEO table).

Feedback has been generally good, some users agreeing with my own feeling that because websites tend to be based on template(s), validating all pages isn't necessary. For this reason, it's unlikely that the new tool mentioned below will be part of Scrutiny, but I may include it in the dmg as a separate free app.

I have now found an alternative, that works well and doesn't rely on a web service or need anything extra installed. It does give slightly different results to the w3c validator, but I guess any two tools will give different results.

My first prototype is now working. If you are interested in being able to scan your site, validating the html for all pages, and would like to give this tool a whirl, please just ask.

All thoughts and comments are very welcome, email me or use the comments.

Friday 20 November 2015

Philips Hue bulbs, see your Hue-topia schedules as a gantt chart

The gantt chart mentioned in the previous post (as a new feature in LIFXstyle) is now built into Hue-topia.
Hue-topia gives you a number of ways to schedule your lamps: groups, presets, dawn/dusk. With a number of schedules set up, it may be difficult to see what's going on. The chart shows you any day or a week (bearing in mind that schedules can be set to happen on particular days).

It will be ready for beta testing soon. If you're interested in running it, please let me know.

Wednesday 18 November 2015

LIFX - see your schedules as a gantt chart

LIFXstyle gives you some very powerful tools for scheduling your lamps. Schedule individual lamps, groups to switch on, off or change colour. Set up presets and trigger those on schedule.

With a number of schedules set up (possibly overlapping) it's difficult to see exactly what's going to happen. Or if a lamp doesn't behave as expected, the reason may not be obvious.
Viewing your schedules as a gantt chart should help. View a day (24hrs) or a whole week. Click a block (either to the left or the right) to see the particular schedule which is responsible at that point.

This chart is now in alpha testing. If you'd like a copy for beta testing ahead of the main release, please get in touch.

Tuesday 17 November 2015

'Build your own bundle' at bundlehunt - includes Scrutiny

Website scrutinizer Scrutiny is one of 30 quality apps currently on offer at

Build your own bundle of 10 for $19.99

NB: Scrutiny v6 is currently in beta. Don't let this put you off - If you buy Scrutiny 5 now, Scrutiny v6 will be a free upgrade.

Wednesday 4 November 2015

Tracing http redirects

A new feature in the v6 beta of Scrutiny (from v6.0.2) is the ability to trace all the redirects of the http request / response.

If a url is redirected, it is usually redirected once but it can be two or more times. In these cases, a button will be available beside the 'redirect url' field of the link inspector. The button will display the number of redirects.

Pressing the button sends a new request and displays a trace of all the redirects as they happen, with their status codes.
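As a rough illustration of what such a trace involves (this is a sketch, not Scrutiny's own code), the idea is to request the url without following redirects automatically, note the status code, then move on to the url given in the Location header. The `fetch` callable and the example.com urls below are assumptions made for the example; `fetch` stands in for whatever makes the real http request.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace_redirects(url, fetch, max_hops=10):
    """Follow redirects one hop at a time, recording (url, status) pairs.

    `fetch` is a hypothetical callable that returns
    (status_code, location_or_None) for a url WITHOUT following
    redirects itself.
    """
    trace = []
    seen = set()
    while len(trace) < max_hops:
        status, location = fetch(url)
        trace.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break  # reached the final response
        seen.add(url)
        next_url = urljoin(url, location)  # Location may be relative
        if next_url in seen:
            break  # redirect loop, stop here
        url = next_url
    return trace


# Canned responses standing in for a real server (made-up urls)
responses = {
    "http://example.com/old": (301, "/newer"),
    "http://example.com/newer": (302, "http://example.com/final"),
    "http://example.com/final": (200, None),
}

trace = trace_redirects("http://example.com/old", responses.get)
redirect_count = sum(1 for _, status in trace if status in REDIRECT_CODES)
```

Here `redirect_count` corresponds to the number that would be shown on the button, and `trace` to the list of hops and status codes.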

Download the latest Scrutiny beta. (not recommended for existing production users - please wait until v6 moves to stable.)

Tuesday 3 November 2015

Which web pages does Scrutiny include in its SEO table and Sitemap table?

When asked that question recently I had to look through code to find definitive answers (not ideal!) and realised that the manual should contain that information.

It does now, and here's the list for anyone who's interested:

The SEO table will include pages that are:
  • html*
  • internal (urls with a subdomain may or may not be treated as internal depending on whether the preference is checked in Preferences > General)
  • status must be good (ie urls with status 0, 4xx or 5xx will not be included)
  • not excluded by your blacklist / whitelist rules on the Settings screen
  • your robots.txt file will be observed (ie a page will be excluded if it is disallowed by robots.txt) if that preference is checked on the Settings screen (below Blacklist and Whitelist rules)
The sitemap table will include a subset of those pages - so in addition to the above, the following rules apply:
  • will include pdfs (if that preference is checked in Preferences > Sitemap)
  • not excluded by your robots.txt file (if that box is checked in Preferences > Sitemap)
  • not excluded by a 'robots noindex' meta tag (if that box is checked in Preferences > Sitemap)
  • does not have a canonical meta tag that points to a different url
As always, I'm very happy to look into any particular example that you can't make sense of.
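The rules above can be sketched as a pair of filter functions. This is only an illustration of the logic as listed, not the app's actual code: the `Page` and `Prefs` classes, their field names and the example urls are all made up for the purpose, and pdfs are treated as an extra type allowed into the sitemap when the preference is checked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    mime_type: str = "text/html"      # as served, not the file extension
    internal: bool = True
    status: int = 200
    excluded_by_rules: bool = False   # blacklist / whitelist rules
    robots_disallowed: bool = False   # disallowed by robots.txt
    noindex: bool = False             # 'robots noindex' meta tag present
    canonical: Optional[str] = None   # canonical url, if the page declares one


@dataclass
class Prefs:
    observe_robots_txt: bool = True          # Settings screen
    sitemap_include_pdfs: bool = False       # Preferences > Sitemap
    sitemap_observe_robots_txt: bool = True  # Preferences > Sitemap
    sitemap_observe_noindex: bool = True     # Preferences > Sitemap


def status_is_good(status):
    # Status 0, 4xx and 5xx count as bad
    return status != 0 and not 400 <= status < 600


def passes_general_rules(page, prefs):
    if not page.internal or not status_is_good(page.status):
        return False
    if page.excluded_by_rules:
        return False
    if prefs.observe_robots_txt and page.robots_disallowed:
        return False
    return True


def in_seo_table(page, prefs):
    return page.mime_type == "text/html" and passes_general_rules(page, prefs)


def in_sitemap_table(page, prefs):
    is_pdf = page.mime_type == "application/pdf"
    if is_pdf and not prefs.sitemap_include_pdfs:
        return False
    if not is_pdf and page.mime_type != "text/html":
        return False
    if not passes_general_rules(page, prefs):
        return False
    if prefs.sitemap_observe_robots_txt and page.robots_disallowed:
        return False
    if prefs.sitemap_observe_noindex and page.noindex:
        return False
    # A canonical pointing at a different url keeps the page out of the sitemap
    if page.canonical is not None and page.canonical != page.url:
        return False
    return True


prefs = Prefs()
home = Page("http://example.com/")
missing = Page("http://example.com/gone", status=404)
duplicate = Page("http://example.com/a?x=1", canonical="http://example.com/a")
brochure = Page("http://example.com/brochure.pdf", mime_type="application/pdf")
```

With these made-up pages, `duplicate` appears in the SEO table but not the sitemap table, and `brochure` only reaches the sitemap table when `sitemap_include_pdfs` is switched on.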

Scrutiny's SEO results table

* this doesn't mean a .html file extension, but the mime type of the page as it is served up. Most web pages will be html. Images are shown in the SEO results but in a separate table which shows the url, page it appears on, and the alt text.

** Although Integrity Plus doesn't display SEO results (at present) it does display the same sitemap table as Scrutiny and all of the rules above apply.

Friday 30 October 2015

Scrutiny v6 beta

The biggest change isn't a visible one. Scrutiny 6 is fitted with the 'v6 engine' which is faster and more efficient than before.

Robotize has been an experimental app which allows you to see your web pages as a robot does (including search engine bots such as Googlebot). Text-only, linearised, headings as an outline, image alt text etc.

There's a context menu item for robotize within the SEO results table.
Or you can open a robotize window any time from Tools > Robotize (cmd-3) or from the Tasks screen.

Once open, a robotize window (you can open multiple) functions as a browser, with address bar, forward / back and refresh buttons.
It helps with various WCAG Accessibility Guidelines:
 - See your images listed with alternative text (Guideline 1.1.1)
 - Check your content's information and structure 'linearised' (Guideline 1.3)
 - See your links listed by link text and url (Guideline 2.4.4 & 2.4.9)
 - See headings listed separately and in context (Guideline 2.4.10)

The last main addition is SiteViz, another previously-experimental app, now more mature and built into Scrutiny.

Integrity and Scrutiny have long exported a .dot (visualisation) file which can be opened using a graphing app. SiteViz is a free (still beta) app which opens such files. Only missing from Scrutiny is SiteViz's 3D theme. (This relies on technology which is 10.7+ and Scrutiny is still supported on 10.6 (Snow Leopard).) For the 3D theme you can still export the site as .dot and open using the free SiteViz.

Otherwise, simply run a scan for Sitemap, then switch to the 'Visualisation' tab.
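For anyone unfamiliar with .dot files, they are plain text in the Graphviz DOT language, describing a graph as a list of 'this node links to that node' lines. The sketch below shows the general shape of such a file for a tiny made-up site. It is not the exact output of Integrity or Scrutiny; the `to_dot` function and the page names are assumptions for illustration.

```python
def to_dot(links, name="sitemap"):
    """Return a Graphviz DOT description of a {page: [linked pages]} mapping."""

    def quote(label):
        # DOT string labels are double-quoted; escape any embedded quotes
        return '"' + label.replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{"]
    for source, targets in links.items():
        for target in targets:
            lines.append(f"  {quote(source)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)


# A made-up three-page site
site = {
    "/": ["/about", "/blog"],
    "/blog": ["/"],
}
dot_text = to_dot(site)
```

Saving `dot_text` to a file with a .dot extension gives something a graphing app can open.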

Both the current stable version and the new beta are available at Scrutiny's home page.

The beta is currently free. When the licensing is implemented, v6 will be a free upgrade for existing licence holders.

Saturday 17 October 2015

Scrutiny 60% discount until 28 Oct

Website scrutinizer Scrutiny is currently on offer over at Macupdate - 60% discount until 28 Oct.

Scrutiny offers the link-checking functionality of Integrity and Integrity Plus, XML sitemap generation, spelling and grammar checking, website search and much more.

Read more, watch a video and take advantage of the offer

Sunday 27 September 2015

Breakthrough in Wikipedia spidering project - 3 million links checked

Today a scan of 3 million links finished. That in itself isn't a breakthrough because I've previously made such a scan, but this time, thanks to a very small change at the very heart of the new v6 crawling engine, the app was still working within expected resources at the end of the scan, and both the app and the Mac were still responsive.
That's a very large number of broken links? But it does seem that this is a true result:
So we're now 'game on' for a 5 million link crawl. The aim of the game is to find out whether the 'six degrees' theory is true (whether you can really reach any of the 5 million English language pages within six clicks).

Friday 18 September 2015

Generating an XML sitemap for your website

This new video explains XML sitemaps and demonstrates how to generate one for any site using Integrity Plus. It also looks at some of the options.

Wednesday 16 September 2015

Using Scrutiny to make important checks over your EverWeb website

In this new video we take a look at how to use Scrutiny to make some important UX and SEO checks over your EverWeb website.

The checks themselves apply to any website, but EverWeb makes it easy to correct the issues, as we demonstrate here.

This tutorial uses Scrutiny for Mac and the EverWeb 'drag and drop' content management system.

Links become broken over time (link rot) so a regular link check is important. EverWeb helps with this issue because it manages the links in your navigator, but links in your content are still vulnerable to your own or external pages naturally being moved, changed or deleted. Fortunately, finding them and fixing them is easy, as demonstrated.

The title tag and meta description are very important (and a good opportunity) for SEO. Scrutiny will highlight any that are missing, too long or too short. EverWeb makes it a breeze to update these where necessary.

Alt text for your images is also important (depending on the image). Once again, Scrutiny can highlight any potential issues / keyword opportunities and the video shows you how to update your site.

On the subject of keywords, it's very important to do your keyword research and ensure that your pages contain a reasonable amount of good quality content. Once Scrutiny has scanned your site, you can see any pages with thin content, keyword stuffed pages, check occurrences of your target keywords and even see a full keyword analysis:

Wednesday 9 September 2015

Integrity, Integrity Plus and Scrutiny - updates to fix recent mysterious crashes

Since earlier in September there have been a number of support requests with the following pattern:

  • Integrity or Scrutiny being run on Yosemite (seems fine on Mavericks or El Capitan)
  • The app quits at a consistent point in the scan
  • The site contains links to "" or ""
  • The crash report shows that control is with the system rather than Integrity or Scrutiny at the point of the crash
  • Integrity and Scrutiny have been very stable for a long time; this problem will have started recently and without you changing anything

The google links work in a browser without the browser crashing, even with cookies and js switched off (which is how Integrity and Scrutiny generally send their requests).

Having traced this to the Google links, I then narrowed it to the User-Agent string. The problem seems to be with sending an http request to these urls with the default user-agent string for Integrity or Scrutiny. (By default my apps are honest about their identity.)

I have two workarounds. The first is easy - go to Preferences and switch the User Agent string to one of the browsers.

The second is to blacklist these links by setting up two rules saying:

Do not check urls containing
Do not check urls containing

I've just released new versions of Integrity and Integrity Plus (v5.4.2) and Scrutiny (v5.9.12) which contain a little hack meaning that you can continue using the default UA string for Integrity or Scrutiny and the scan will complete with these links being tested.

Tuesday 8 September 2015

The cosmic watchmaker and the genetic algorithm

Reading about artificial neural networks has been a life-changer. That led on to the unexpected topic of the genetic algorithm, which is very effective by itself (i.e. without using a neural network) at solving tricky problems.

After working through my first example I was really astonished to find that by applying rules like crossover and mutation (mimicking our own reproduction) to a population of initially random data, you arrive at a very fit population (i.e. one that gives you lots of good answers very close to the best answer) in remarkably few generations.

This is really profound stuff and I'm more excited than I have been since I first came to conventional computing in the mid-80s. If you watch a trace of your data 'evolving', it's perfectly obvious why we reproduce sexually - anything which reproduces this way can become fit for a new environment or solve a problem in remarkably few generations. 

It's also clear that you can produce something very distinct (all depending on your test for fitness) from random data very quickly. Thus the intricate watch found on a beach (as long as it could be the product of rules such as genetic crossover and mutation and meets a need perfectly) isn't so remarkable.

To demonstrate all this, and just as a fun exercise in this stuff, here is a face (a very special one) evolving from random noise.

Without going into too much detail (that's all here) this exercise starts with a 'population' of chromosomes made of random numbers, which represent pixels in the images. For each new generation, the rules of crossover and mutation* and a test for fitness are applied. 'Survival of the fittest' isn't a good description of what really goes on in the natural world or in our algorithm here. Instead, individuals are randomly selected for reproduction with a bias towards the fittest.

In the human population, 'fittest' means thriving and feeding yourself until you can reproduce. Here, it means how closely each chromosome resembles the target picture. (Think females choosing males with a highly decorative tail.) (Target picture is shown on the right for reference.)

The infinite monkeys concept isn't helpful here either. Sometimes a solution can appear out of random data (the bigger the population, the more likely), but this isn't generating random data until the right answer is found; it's starting with random data and applying some rules to work towards, and very quickly arrive at, the right answer.
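For anyone who'd like to see the shape of this in code, here is a minimal sketch of the same idea in Python. It is not the code behind the video. The short list of numbers standing in for the target picture, the population size, the mutation rate and the particular way of weighting the selection are all assumptions chosen to keep the example small.

```python
import random


def evolve(target, pop_size=60, generations=40, mutation_rate=0.02, seed=1):
    """Evolve random chromosomes towards `target` (a list of 0-255 values).

    Returns the final population and the mean fitness of each generation.
    """
    rng = random.Random(seed)
    n = len(target)

    def fitness(chromosome):
        # Higher is fitter: minus the total difference from the target
        return -sum(abs(a - b) for a, b in zip(chromosome, target))

    # Start with a population of purely random data
    population = [[rng.randrange(256) for _ in range(n)]
                  for _ in range(pop_size)]
    history = []

    for _ in range(generations):
        scores = [fitness(c) for c in population]
        history.append(sum(scores) / pop_size)

        # Random selection with a bias towards the fittest: every
        # individual has some chance, fitter ones have a better chance
        worst = min(scores)
        weights = [s - worst + 1 for s in scores]

        next_generation = []
        for _ in range(pop_size):
            mum, dad = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, n)        # single-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally replace a gene with a random value
            child = [rng.randrange(256) if rng.random() < mutation_rate else g
                     for g in child]
            next_generation.append(child)
        population = next_generation

    return population, history


# A tiny 16 'pixel' greyscale gradient standing in for the target picture
target = [i * 16 for i in range(16)]
population, history = evolve(target)
```

Comparing `history[0]` with `history[-1]` shows the mean fitness of the whole population rising, which is the point made above about the population as a whole becoming fit rather than a few outstanding individuals.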

For a smoother animation, the picture in this video isn't the best picture from each generation, but an average of all of them. It shows that a whole population becomes very fit in a short time, rather than just a few outstanding individuals getting very fit and then passing on their genes as the terms 'survival of the fittest' or 'natural selection' imply.

* We tend to associate genetic mutation with disease, but it's an important part of this process. 
** My title is a response to the 'cosmic watchmaker' argument. It's a terrible analogy for many reasons, not least of which is that a timepiece is obviously a manufactured tool (like a flint axe) and not a living, reproducing being. But I'm really not interested in the religious argument, only in the uses for these amazing, almost magical techniques.

Thursday 20 August 2015

Performing a content audit using Scrutiny

I've just seen this post by Sean at SEO Hacker about conducting a content audit.

There's lots of useful advice - checking for keywords in page titles, length of meta description, length of page title, thin content, grammar check, broken links, avoiding keyword stuffing, images with no alt text.

Sean's article expands on all of these things and is well worth a read.

But he begins by making a spreadsheet listing all of your pages and copy-and-pasting each page title and other information manually.

Scrutiny can do this for you. Furthermore, at the touch of a button it can show you pages that may need attention with regard to many of the problems above. (It has a free unrestricted 30-day trial.)

Here's the 'getting started' video once more, which focuses mainly on making a scan but does visit the SEO results from where you can use the filter button and keyword search box to perform the checks above. Or if you like you can export to CSV and open in a spreadsheet to do more of a visual check as Sean suggests, saving you the copying and pasting.

Tuesday 18 August 2015

html validation with Scrutiny

Problem: Because of some changes with the w3c validator, Scrutiny is no longer receiving the expected response back from the public instance of the validator.

At present, it seems a little random which w3c server will deal with your request and therefore whether Scrutiny shows results. If you have this problem then the validation results will have empty warnings and errors columns:

Workaround: The Validator S.A.C. application is free to use, and installs a version of the w3c validator locally for you. It seems that the app contains a version of the validator that is compatible with Scrutiny. With a little setting up, Scrutiny can use this local instance of the validator and will show results once more.

This has always been better than using the public instance because there are no limits (the public instance had a limit of 100 pages in one batch) and you can go to Preferences > Validation and set your delay to zero, with impunity, thus validating all of your pages much faster:

The Validator S.A.C. application can be found here:

After downloading and running, follow the instructions below 'Advanced Topics' on that page.

To switch Scrutiny over to use your local instance of the validator, go to Preferences > Validation and click the text 'http://localhost/w3c-validator' and the address will be pasted into the location box.

(NB after making that switch, it may be necessary to quit and re-start Scrutiny)

Sunday 9 August 2015

Wow, it's amazing when stuff like this happens. I don't know who bowlerboy is and this certainly wasn't solicited or engineered in any way. From MacUpdate's Scrutiny page.

Wednesday 5 August 2015

The Integrity v6 engine - a website crawling / spidering engine for developers

A major change for Scrutiny v6 will be 'under the hood' (or bonnet as we say here in the UK). It'll separate the engine from higher functionality and the UI.

This will mean that I can make it available to other developers to use in their own tools.

More hundreds of hours than I care to calculate have gone into developing the code which scans a website accurately and fully. It works well and I'm proud of that - parsing arbitrary (and sometimes non-compliant) html code is full of pitfalls.

For Scrutiny users, v6 will be a step forward, and may not be a paid upgrade. It will incorporate tools such as the .dot visualiser and the text-only engine (currently both available as SiteViz and Robotize).

For developers, I'd love to make the engine available. I feel that my strengths lie in building and maintaining that engine, more so than building a UI or producing and marketing a consumer product.

Work is underway. I'm not sure whether it'll be a cocoa framework (with friendly API), a CLI tool (which of course can be used in a Mac app) or both.

It won't be open source. I want to provide a free version and license a faster, paid-for version.

If you're interested then please let me know, it's shiela[at] Work is now underway and it would be really useful to have an indication of interest and thoughts on how best to make the engine available.

[Update] The engine is now available in a demo Xcode project which should give you the info you need to use it.

Tuesday 28 July 2015

403 'forbidden' server response when crawling website using Scrutiny

The problem: Scrutiny fails to retrieve the first page of your website and therefore gets no further. The result looks like this (above).

The reason: By default Scrutiny uses its own user-agent string (thus being honest with servers about its identity). This particular website (and the first I've seen for a long time to do this) is refusing to serve the website without the request being made from a recognised browser.

The solution: Scrutiny > Preferences > General

The first box on the first Preferences tab is 'User agent string'. A button beside this box allows you to choose from a selection of browsers (this is called 'spoofing'). If you'd like Scrutiny to identify itself as a browser or a version not in the list, just find the appropriate string and paste it in (if you can run your chosen browser, you can use this tool to find the UA string).

With the User agent string changed to that of a recognised browser, this problem may be solved.
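To show what the user-agent string actually is at the http level, here's a small Python sketch using the standard library's `urllib.request`. It only builds the request (nothing is sent) and shows that the string is simply a header on that request. The `build_request` helper and both user-agent values are placeholders invented for the example; they are not Scrutiny's real string or any real browser's.

```python
import urllib.request


def build_request(url, user_agent):
    """Build (but don't send) an http request carrying the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder strings - not the real strings used by any app or browser
HONEST_UA = "ExampleCrawler/1.0"
BROWSER_UA = "ExampleBrowser/1.0"

honest = build_request("http://example.com/", HONEST_UA)
spoofed = build_request("http://example.com/", BROWSER_UA)

# Passing either request to urllib.request.urlopen() would send it.
# A server that refuses unrecognised user agents would answer the first
# with 403, which urlopen raises as urllib.error.HTTPError.
sent_header = spoofed.get_header("User-agent")
```

The server sees nothing of the application itself, only this header, which is why changing the string is enough to change how the server responds.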

Tuesday 21 July 2015

Getting started with Scrutiny - first video

A milestone! I can't tell you how pleased I am with our first instructional video.

It's a quick tour of Scrutiny for Mac, performing a basic link check, reading the results, discussing a few settings and some troubleshooting. Much of this will be relevant to Integrity and Integrity Plus.

... top marks to tacomusic who has just become the voice of PeacockMedia!

Apple Music not playing nicely

[update 16/8/15] This issue now seems to be resolved in iTunes

21 Jul 2015

Screensleeves, the album art screensaver for Mac, is currently having trouble displaying the artwork and track details when the new streaming service (Apple Music) is being used.

AppleScript (the sensible way for applications to talk to each other) is often overlooked - there are ongoing problems with Spotify, although it's been possible to work around most of these. But it's particularly disappointing when Apple themselves neglect their own scripting interface.

iTunes' 'current track' seems broken when it comes to the new music service, as developers of other apps have reported.

I hope that this is 'teething trouble' with the new service and can only suggest installing updates when they're available.

Wednesday 15 July 2015

Scrutiny included in July Mac Bundle

Scrutiny 5 - Improve your website's quality, SEO and user experience. Find out more

A brief message because PeacockMedia's flagship application, Scrutiny, has been included in this bundle at short notice and the clock is already running. 

For 14 more days, Scrutiny is one of eight quality Mac apps available at a ridiculous discount in BundleCult's July bundle.

Details of how to take advantage of this offer

Thursday 9 July 2015

Using Integrity to scan Blogger sites for broken links - some specifics

I've recently been helping someone with a few issues experienced when testing a Blogger blog with Integrity.

Some of these things are of general interest, some will be useful to anyone else who's link-checking a Blogger site. These tips apply equally to Integrity Plus and Scrutiny.

1. Share links being reported as bad

You may have these share links at the bottom of each post.
As you'd expect, they redirect to a login page, so no danger of Integrity actually sharing any of your posts. The problem comes when you're testing a larger site with more threads. These links may eventually begin to return an error code. I don't know whether this is because of the heavy bombardment on the share functionality, or whether Blogger is detecting the abnormal use. Either way, you may begin to get lots of red in your results.

One solution is to turn down the number of threads to a minimum. This isn't desirable because the crawl will then take hours. A better solution is to ask Integrity not to check those links (it's pretty certain that they'll be ok).

(Note: Even though these links use a querystring with parameters, checking 'ignore querystrings' won't work because these links have a different domain to the blog address, thus they look like external links and the 'ignore querystrings' setting only applies to internal links.)

Add a 'blacklist rule' using the little [+] button (screenshot below). Make a rule that says 'do not check urls containing share-post'
While here, add similar rules for 'delete-comment' and 'post-edit'. It was a concern to see these urls appearing in my link-check results. They do indeed appear in the pages' html code, although they're hidden by the browser if you're browsing as a guest. But no need to worry - as you'd expect, they also redirect to a login screen and Integrity isn't capable of logging in. *
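A 'do not check urls containing' rule is just a substring test on the url. A minimal sketch, using the three fragments mentioned above; the `should_check` function and the example urls are made up for illustration and aren't taken from Integrity.

```python
# The three 'do not check urls containing ...' rules suggested above
BLACKLIST = ["share-post", "delete-comment", "post-edit"]


def should_check(url, blacklist=BLACKLIST):
    """Return False if the url contains any blacklisted fragment."""
    return not any(fragment in url for fragment in blacklist)


# Made-up urls for illustration
urls = [
    "http://blog.example.com/2015/07/some-post.html",
    "http://www.example.com/share-post.g?blogID=1&postID=2",
    "http://www.example.com/delete-comment.g?blogID=1&postID=3",
]
to_check = [u for u in urls if should_check(u)]
```

Only the first of these urls would be checked; the other two match a rule and are skipped.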

2. A large amount of yellow

Integrity highlights redirected urls in yellow. Not an error but a 'FYI'. Some webmasters like to find and deal with redirects, but the Blogger server uses redirects extensively and it's just part of the way it works. When testing a Blogger site, you will see a lot of these but it's not usually something you need to worry about.

If you like, you can change the colour that Integrity uses to highlight such links - you can change it to white, or better still, transparent. See Preferences > Views and then click the yellow colour-well to see the standard OS X colour picker with an 'opacity' slider.

3. Pageviews on your website

Given that Google Analytics uses client-side javascript to make it work (meaning that crawling apps like Integrity don't trigger page views **) I was surprised to find Integrity triggering page views with a Blogger site. I guess it counts the views server-side.

It seems that changing the user-agent string to that of Googlebot stopped these hits from registering.

The user-agent string is how any browser or web crawler identifies itself. It's useful for a web server to know who's hitting on it.

Posing as Googlebot by using the Googlebot user-agent string:
Mozilla/5.0 (compatible; Googlebot/2.1; +
... seems to work - it prevents hits from triggering page views in Blogger's dashboard

Deliberately using another string (known as 'spoofing') is technically mis-use of the user-agent string, but until Google recognises Scrutiny and Integrity as web crawlers, then I think this is forgivable. If you'd like to be a little more transparent then I've found that this alternative also works:
Integrity/5.4 (posing as: Googlebot/2.1; +

I will shortly be building this Googlebot string into the drop-down picker in Preferences. In the meantime just go to Preferences > Global and paste one of those strings into the 'User-agent string' box.

* Neither Integrity nor Integrity Plus is capable of authenticating itself; in effect they're viewing websites as an anonymous guest. Scrutiny is capable of authentication, it's a feature that's much in demand (if you want to test a website which requires you to log in before you see the content) but the feature must be used with care - it's not possible to switch it on without seeing warnings and advice.

** I guess that Scrutiny could trigger page views when its 'run js' feature is switched on, though I haven't tested that.

Saturday 4 July 2015

New view in Integrity / Scrutiny groups links by status

I don't know how it's taken so long to do this (Integrity has been around since 2007).
Some people are interested in the redirect code and like to sort those out. Other people just care about the final status of the page after redirection. No problem, choose initial status, final status or the combination which you're used to seeing.
The 'bad links only' button will work on this view just the way you're used to. This view can be exported; perhaps you want to expand to show just one particular status, or just 5xx codes for example, before exporting.

Finally, one more thing is worth mentioning here as we've touched on redirects. Some people are tasked (or task themselves) with making a list of all redirected urls (3xx). Integrity, Integrity Plus and Scrutiny will achieve the task using the techniques above. But the most effective way to achieve this is to use Scrutiny's filter button for 'redirects'. Shown below is the flat view sorted by status but the new 'by status' view will do just as well. When exported to csv, any filter or search will be respected.
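As a rough sketch of the idea (not the apps' own code), grouping by status amounts to choosing a key for each link - its initial status, its final status, or a combination - and collecting urls under that key. The tuple layout, the function names and the way the combined key is written are assumptions for this example.

```python
from collections import defaultdict


def status_key(initial, final, mode):
    if mode == "initial":
        return str(initial)
    if mode == "final":
        return str(final)
    # "both": show the redirect code and where the link ended up
    return str(initial) if initial == final else f"{initial} > {final}"


def group_by_status(links, mode="both"):
    """Group (url, initial_status, final_status) tuples by the chosen key."""
    groups = defaultdict(list)
    for url, initial, final in links:
        groups[status_key(initial, final, mode)].append(url)
    return dict(groups)


def redirects_only(links):
    """The equivalent of a 'redirects' filter: links whose first response was 3xx."""
    return [url for url, initial, _ in links if 300 <= initial < 400]


# Made-up results for illustration
links = [
    ("http://example.com/", 200, 200),
    ("http://example.com/old", 301, 200),
    ("http://example.com/moved", 301, 404),
    ("http://example.com/gone", 404, 404),
]
by_final = group_by_status(links, mode="final")
by_both = group_by_status(links)
redirected = redirects_only(links)
```

Grouping by final status puts `/old` with the good links, while the combined key keeps it visible as a redirect, which is the difference between the two kinds of user described above.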

Wednesday 1 July 2015

SiteViz updated with 3D theme

The video in the last post isn't terribly clear, so here's a static screenshot of the new 3D theme in SiteViz.
I've released a new version of SiteViz today - still beta - it needs work - but you're more than welcome to try it and feed back. It opens the sitemap visualisation file generated by Integrity Plus and Scrutiny and displays it in a number of ways.

Sunday 21 June 2015

Surprising SEO results with a personal blog

A small but perhaps very useful enhancement to Scrutiny (in progress right now) is to remove the limited summary here on the SEO results screen (previously it just gave numbers for pages without title / meta description) in favour of a more comprehensive summary:
Screenshot showing Scrutiny's SEO results table plus the new summary text

Bit of a surprise with this one (a personal blog).

Previously with Scrutiny you've had to use the filter button to visit the results for each test in turn. Now the list is just there at a glance and I guess I haven't been very vigilant here - I wasn't aware that Blogger doesn't automatically stick in a description and I guess I've always been too excited about each new blog post to worry about image alt text....

Wednesday 10 June 2015

Visualising a website in 3D

This is very much a work-in-progress but here's a sneaky peeky at the 3D functionality I'm experimenting with for SiteViz. This is the PeacockMedia site exported by Scrutiny and visualised in 3D by SiteViz:

Friday 5 June 2015

Spidering Wikipedia

I've reached a milestone in my 'crawling the English Wikipedia' project. (I'm hoping to find out whether the 'six degrees' principle is true*.) Scrutiny has now managed a scan taking in 3 million links which includes 1.279 million pages in its sitemap results. This is the largest single scan I've ever seen any of my applications run.

My instance of Scrutiny must have been feeling very enlightened after parsing this eclectic raft of articles including Blue tit, Conway Twitty, Wolverine (character) [yes, there are a surprising number of other Wolverines!], Benjamin Anderson (adventurer) and Personal Jesus.

The most fascinating thing about this crawl is that out of the pages scanned here, the article with the most links (excluding a few unusual page types) is alcohol. It has over 6,000 hyperlinks on its page.** This suggests that we have more to say about nature's gift of fermentation than about World War Two, which has two thirds the number of links.

* The uncertainty here is that if you imagine a node structure with each node linking to, say, 100 pages, then you can reach a million pages in three clicks. But those aren't a million unique pages. The number of previously-undiscovered pages diminishes with each page parsed.

** this does include 'edit' links and citation anchor links. For future crawls I'll blacklist these for efficiency.
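To put some rough numbers on the first footnote, here's a toy simulation. It assumes every page links to a fixed number of pages chosen uniformly at random, which real Wikipedia certainly doesn't do, so treat it as an illustration of the 'diminishing new pages' effect rather than a model of the actual crawl. The function names and parameters are mine.

```python
import random


def naive_reach(links_per_page, clicks):
    """Upper bound on pages reachable if every link led somewhere new."""
    return sum(links_per_page ** k for k in range(clicks + 1))


def simulate_crawl(total_pages, links_per_page, clicks, seed=0):
    """Breadth-first crawl of a made-up site with random links.

    Returns the running total of unique pages seen after each click level.
    Page 0 is the starting page.
    """
    rng = random.Random(seed)
    seen = {0}
    frontier = [0]
    totals = []
    for _ in range(clicks):
        next_frontier = []
        for _page in frontier:
            # Assumption: each page's links are a uniform random sample.
            for target in rng.sample(range(total_pages), links_per_page):
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
        totals.append(len(seen))
    return totals


if __name__ == "__main__":
    totals = simulate_crawl(total_pages=10_000, links_per_page=10, clicks=4)
    for level, unique in enumerate(totals, start=1):
        print(level, "clicks:", unique, "unique pages,",
              naive_reach(10, level), "if every link were new")
```

The gap between the two columns widens with each level, because more and more links point at pages that have already been found.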

Thursday 4 June 2015

Internal backlinking

The graphs for this website's sitemap are unusual and very attractive.

Here's the 'Daisy' themed graph: perhaps more attractive, but maybe less obvious what's going on.

So what *is* going on here? Upon investigation (aka switching on labels by clicking a button in the toolbar)...

... the 2015 pages are all linked from a page, two clicks from home, called 2015 (and only from that page). On that page is a link to a page called 2014 and on that page are links to all 2014 pages plus a link to 2013 pages and so on.

Visiting any of these pages makes it clearer. This is an unusual kind of pagination, a little like scrolling to the bottom of some content and clicking 'more' to load older content. From a user point of view it does work very well. Everything's really obvious, no-one's going to struggle to find the older content, it'll just take more clicks.

So is this a problem? The pages are all discoverable, so no problem there. But some might say that this site isn't making the best use of internal backlinking. In this particular case I don't think it matters: these are reports going back in time, and it's unlikely that a visitor is as interested in older reports as in the newer ones.

Any other thoughts on the analysis of these graphs or thin internal backlinking - please comment.

(graphs generated by SiteViz, using sitemap files generated by Scrutiny)

Yesterday's post in this series: analysing the structure of larger sites

Wednesday 3 June 2015

What does Amazon's website structure look like?

While discussing the visual analysis of website structure with a Scrutiny user, he mused that it would be useful to see what a successful website such as Amazon would 'look like'. Well here it is:

The eyeball shape is completely unintended and unexpected, and I think really funny. (And slightly ironic.)

In fact this isn't the real picture at all. It only shows pages within 2 clicks from home, not because all pages on the site lie therein but because traversing 100,000 links and including 1,000 pages in this chart barely scratches the surface of the website. (There are ~120 links on the homepage; if every page has an average of 100 links (it does) then, given Scrutiny's top-down approach, it would need to include around 10,000 pages in this chart just to reach the 'escape velocity' of the second level.) This project is on the back-burner for another day in favour of some smaller commercial sites.

NB the placement of each page in this chart is based on 'clicks from home', not necessarily 'navigation by navbar' or directories implied by the urls.
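For anyone curious, 'clicks from home' is what you get from a breadth-first search starting at the home page: each page is assigned the level at which it's first discovered. The sketch below shows the idea on a tiny made-up site. It isn't Scrutiny's or SiteViz's actual code, and the page names are invented.

```python
from collections import deque


def clicks_from_home(links, home):
    """Map each reachable page to its fewest clicks from 'home'.

    'links' maps a page to the list of pages it links to.
    """
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:  # first discovery is the shortest route
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


# Invented example site.
SITE = {
    "/": ["/shop", "/about"],
    "/shop": ["/shop/shoes", "/about"],
    "/shop/shoes": ["/", "/shop/shoes/boots"],
    "/about": ["/shop/shoes/boots"],
}

if __name__ == "__main__":
    for page, level in sorted(clicks_from_home(SITE, "/").items()):
        print(level, page)
```

Note that '/shop/shoes/boots' sits three directories deep in its url but lands at level 2, because '/about' links straight to it. That's the difference between 'clicks from home' and the structure implied by the urls.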

Other sites

Here are a couple of sites, crawled to completion, to see how successful commercial sites appear.

The first is my favourite clothes retailer
There are a relatively small number of pages 4 clicks from home, but the vast majority of the product pages can be reached within 3 clicks. Based only on this blogger's history of using this site, it *is* more usual to browse than to search with this type of site.

Next up is my favourite shoe site. Again crawled in its entirety.
Very similar, especially if we take into account that it has fewer pages than the clothes retailer.

9 circles

Finally in this tour, for comparison, here's the site of a local authority (middle-tier local government). These are not commercial organisations and not generally renowned for the user-friendliness of their websites.

This '9 circles of hell' does extend outwards and outwards beyond this screenshot. Though to be fair, all of the actual website content is 6 clicks from home or fewer*. After that we're into pages of planning documents etc.

These graphs are analogous to browsing the site. (I have some experience in local authority websites and it is more common than you'd think for users to browse rather than search.) If, in the real world, the search box is used, then the user is 2 clicks from home. If the user starts with Google, then the user potentially lands on the page they need (assuming the page is indexed). But the object of this exercise is to see how successful websites are organised in terms of their link structure and see what we can learn. These three sites have a similar number of pages.**

I'm working on some other ideas, so please keep an eye on this blog: besides the Amazon project I'd love to crawl the entire English content of Wikipedia to see whether the 'six degrees' game holds true. I believe this is feasible, I've now successfully made a crawl up to a million links (which included half a million pages in the sitemap) so I don't think the 4-point-something million articles is out of the question.

These websites were crawled by Scrutiny and the graphs generated by SiteViz, a tool I've been working on for a long time to view the .dot files generated by Scrutiny and Integrity Plus. SiteViz is very new and in beta. Other graphing applications can also open Scrutiny's .dot files.
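If you haven't met the .dot format before, it's the plain-text graph description language used by Graphviz. The sketch below writes a minimal directed graph from a dictionary of links. I'm not claiming this matches the exact output of Scrutiny or Integrity Plus (their files may carry extra attributes); the function names and sample pages are made up.

```python
def quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes for DOT."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(links, name="sitemap"):
    """Return a DOT digraph with one edge per link.

    'links' maps a page to the list of pages it links to.
    """
    lines = [f"digraph {name} {{"]
    for source in sorted(links):
        for target in sorted(links[source]):
            lines.append(f"    {quote(source)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot({"/": ["/shop", "/about"], "/shop": ["/shop/shoes"]}))
```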

If you have any other thoughts on what we can learn from these charts and figures, please leave them in the comments.

* to be clear, pages are shown here at the fewest number of clicks possible from the starting page (as far as Scrutiny was able to discover)

** in the same ballpark; ~3,000 for the shoe site, ~6,500 for the clothes and ~5,000 for the local authority

Monday 25 May 2015

Website structure visualisation clearly shows up problems

I've been working on a tool to display website visualisations from .dot files such as those generated by Integrity+ and Scrutiny.

Viewing the website structure in this way can really help to spot problems. For example, while working on the new app I spotted a couple of pages on one of my sites that seemed to have more internal links than others in their level. Turns out those two have out-of-date navigation menus.
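I spotted those two pages by eye, but the same check could be roughed out in code: group pages by level, then flag any page with noticeably more internal links than is typical for its level. This is just a sketch under my own assumptions (comparing against the median, with an arbitrary threshold factor); it isn't a feature of any of the apps, and the page names are invented.

```python
from statistics import median


def unusual_pages(link_counts, levels, factor=1.5):
    """Return pages whose internal link count exceeds factor x the median
    count for pages at the same level.

    'link_counts' maps page -> number of internal links on it.
    'levels' maps page -> clicks from home.
    'factor' is an arbitrary threshold chosen for illustration.
    """
    by_level = {}
    for page, level in levels.items():
        by_level.setdefault(level, []).append(link_counts[page])
    flagged = []
    for page, level in levels.items():
        if link_counts[page] > factor * median(by_level[level]):
            flagged.append(page)
    return sorted(flagged)


if __name__ == "__main__":
    counts = {"/a": 10, "/b": 10, "/c": 11, "/old-menu": 25}
    levels = {"/a": 1, "/b": 1, "/c": 1, "/old-menu": 1}
    print(unusual_pages(counts, levels))
```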

Look out for SiteViz, work is progressing well. If you have any thoughts about website visualisations (or displaying .dot files generally) then I'd love to hear from you - either in the comments or any other way that suits you.

Sunday 17 May 2015

Me and the Mac App Store...

.. aren't getting on. I think it's time for the "we need to talk...." conversation.

I was surprised this morning to find that none of my apps are showing on the Store, something to do with contracts. (I think they may want an annual fee from me, and it may be my fault that it's lapsed.)
So I duly clicked the 'agreements...' link, expecting to pay them more money and get my stuff back online, but see this...

It just doesn't feel like Apple are the friendly company they were when we met. We've grown apart.

It also feels like there are three of us in this relationship now. I really think Apple cares more about *her* (that plain and two-dimensional cow, iOS)

I need some space. I'm going to take some time out.

To be frank, getting an app available on the MAS is a real ball-ache (and I haven't even got any).

Submitting an app takes time. There are so many hoops to jump through and potential problems right there. Once submitted you sit back for maybe a couple of weeks waiting for Apple to get around to inspecting it and you may well get a rejection.

Once you've had a few rejections in a row, sometimes your own fault, sometimes not, and then run into a stupid submission problem that you can't get over, it simply becomes not worth the time and effort. This is what happened with Hue-topia, my Philips Hue app for OSX. Apologies to those who originally obtained it from the Store, but I've been more than happy to give those people a licence for the web version when they've asked.

(That's not to mention sandboxing, a security measure which is compulsory for apps available on the Store. It also makes certain features impossible).

If it weren't for the fact that I really want to make an iOS app to support my business software Organise (and have already spent a lot of development time) then I'd happily walk away right now and worry solely about my web sales (thank you Paddle for being easy to use and incredibly supportive). The whole monopoly thing, making the app store the only way to obtain iOS apps was a cold and calculating move. (Oh sure, 'it's all about security'. And you taking 30% of sales.) With each release of OSX I'm half expecting the same thing there.

Let me be absolutely clear to my wonderful users, lovely people all. My commitment to my apps is 110%. I will continue to develop and support, the only thing that's in question is my future with the Store. Irreconcilable differences and all that.

Wednesday 6 May 2015

Webmaster tools suite 50% offer

Our flagship application, Scrutiny, a suite of webmaster tools, is used by larger organisations and individuals alike. (Read more about Scrutiny here) It seems fair to give the smaller businesses the opportunity to buy at a more affordable price.

For two weeks from 8 May, Scrutiny was on offer at 50% off. The offer was run by MacUpdate and is now over.

Why not download now anyway and take advantage of the free and unrestricted trial?

If you're unable to run Scrutiny or don't want to buy, install, understand and run the software, a one-off standard or custom website report will cost less.

Tuesday 21 April 2015

Further problems with Spotify and Screensleeves

Recently people have been reporting that the Screensleeves album art screensaver is displaying some information (track name, artist, album) but not all (album art is missing, time elapsed/remaining) when using the Spotify player.

Before that, there was a problem where Screensleeves was unable to detect that Spotify was playing (and displayed the 'Paused' message instead). If this is your problem then the problem is at Spotify's end and here is the workaround.

The newer problem involves Screensleeves recognising that Spotify is playing and some information being displayed, but not the album artwork and certain other information. This latest problem is also at Spotify's end, they are aware and say that they're working on it.

Screensleeves is capable of fetching album artwork from the web if none is supplied by the player, though this may or may not be kicking in. I suggest checking Screensleeves' options in System Prefs and making sure that 'If no artwork is available' is set to 'Attempt to find cover image from the web'.

Once again, I hope that Spotify will be fixing the problem soon and things will be working as they should once more. Through all of this, other music players have been working fine as far as I'm aware.

Thursday 16 April 2015

New domain

Luckily, moving domains on the web doesn't involve cardboard boxes and heavy lifting but it can mean things being lost or broken. (Though I've been lucky in being able to use Scrutiny to clean up the links on the new site. Its new feature - checking the canonical href of each page - is a result of a subsequent need on my part).

The website is moving from the domain (which I've used since about 2000) to a new home. This is because Google is treating the domain as targeting the UK and there seems to be no way to change that.

I'm not sure how I feel about the vast number of new top-level domains. A couple of years ago I failed in a sniping match to secure as it became available for registration (and was then offered it for hundreds of pounds, which I wouldn't pay out of principle - there's something really wrong in all of this). I feel that it'll be more difficult for consumers to tell a legitimate website from a fraudulent one from a url. I also rather cynically feel that it's a money-making exercise - businesses (legitimate businesses as well as speculators) will feel obliged to snap up various suffixes.

But I'm hoping that the more international new domain will get better recognition by Google in countries other than the UK. Time will tell.

Thursday 19 March 2015

Panic button for switching on connected bulbs

Problem: There have been occasions when I've been woken up in the middle of the night by a sound, maybe something innocent, but it hasn't always been. In a half-asleep state I want to be able to reach out and hit a button to switch on all of my lights.

I have many connected bulbs, some LIFX, some Hue, so this shouldn't be a big deal.

Here are the options I can think of:

iPhone: This is the most obvious solution, but is actually the most clumsy. There are two apps I have to get into to switch on all of my lights. (Though I could make a single app to do this.) From a locked state that's a lot of pressing and tapping, and doesn't touch-ID always fail to work when you most need it to? It's certainly a long way away from simply reaching out and fumbling for a big 'on' button.

A great solution would be to make an app that would respond to the phone being picked up and shaken. Can an iOS app respond to a shake from a locked state? I don't think so. This solution might work but would probably need the phone to be 'armed' at night by bringing the app in question to the front and turning auto-lock off.

Hue Tap: This is exactly the kind of solution I'm looking for, but a/ it would only operate my Hue bulbs and b/ they're fifty quid. Seems a bit much for four programmable buttons purely dedicated to the Hue bulbs.

Laptop: I usually have one within reach when I'm asleep. In theory this is pretty slick - reach out for it, lift the lid and hit one of the F keys that I've programmed to mean 'all on'. In practice it takes more seconds than you'd think to fumble for it, lift the lid, wait for it to connect to the wireless network before hitting the key. (When you wake up in the early hours, it's almost painful to squint at a screen.) It would be possible to leave the laptop close to the bed with the lid open and prevented from going to sleep but again this requires some bedtime preparation.

Some kind of bluetooth button: Here we're getting closer. I believe it'll be feasible to buy a cheap bluetooth device, pair it with the computer (I didn't mention that I have an 'always on' mac looking after various household things) and write an app to receive that button press and toggle the lights on / off.

That last option reminded me of this:

It was under my desk covered in dust. (I love the old white extended keyboards with big clunky keys). It's essentially a bluetooth controller with loads of programmable buttons. Even better is that LIFXstyle and Hue-topia can already respond to F-key presses, with or without a modifier key.

Here's how to get things set up:

1. pair the keyboard with the 'always on' mac

2. set the f-keys to work as f-keys without pressing the fn key (System Preferences > Keyboard)

3. in Hue-topia and/or LIFXstyle, go into Preferences>Hotkeys and choose which presets you want to trigger with each F key.

That's it - I can reach out and bash one of the first few f-keys. A single keypress triggers a preset in both apps. (I've assigned the first couple of F keys to the same 'all on bright' preset). Other f-keys trigger other presets including 'all off'. It works beautifully, immediately and from anywhere in the house, and even outside.
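For the curious, here's roughly what 'a keypress triggers a preset' could boil down to if you were scripting it yourself for the Hue bulbs. To be clear, this is not how Hue-topia or LIFXstyle work internally; it's a sketch that builds a request for the Hue bridge's local HTTP API, where group 0 means 'all lights'. The bridge address, API username and the presets themselves are placeholders I've made up, and capturing the actual F-key press is left out entirely.

```python
import json
import urllib.request

# Placeholders: substitute your own bridge address and API username.
BRIDGE_IP = "192.168.1.2"
API_USERNAME = "your-api-username"

# Hypothetical presets keyed by F-key name. 'bri' runs from 1 to 254.
PRESETS = {
    "F1": {"on": True, "bri": 254},   # all on, bright
    "F2": {"on": True, "bri": 254},   # same preset, easier to hit in the dark
    "F12": {"on": False},             # all off
}


def build_hue_request(key):
    """Build a PUT request applying the preset for 'key' to all Hue lights."""
    url = f"http://{BRIDGE_IP}/api/{API_USERNAME}/groups/0/action"
    body = json.dumps(PRESETS[key]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def trigger(key, timeout=2.0):
    """Send the preset for 'key' to the bridge and return the HTTP status."""
    with urllib.request.urlopen(build_hue_request(key), timeout=timeout) as resp:
        return resp.status
```

Assigning the same preset to a couple of neighbouring keys, as I've done with the first F keys, means a half-asleep swipe is more likely to land on something useful.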

Next steps:

One of these:
It could be fastened to the wall within reach of the bed. These can be had for much less than a Hue Tap and are much more versatile - it has plenty of other keys for controlling other things in the future. (Any future app could listen for any of the keys on this keypad.) LIFXstyle and Hue-topia would need an enhancement for assigning those high-number f-keys (this improvement is already on the enhancement list).

Are there other wireless ways of triggering lights / sending a message to a mac that I've not considered? As always, I'll be grateful for thoughts in the comments.