Wednesday, 13 March 2019

Raspberry Pi Zero W - baby steps

I don't know why I've waited so long to do this. I love messing around with home automation, and here's a fully-functional computer with wireless and 20-odd GPIO (input-output) pins.

I've also been meaning to begin using Linux (technically I already do: I use macOS's Terminal a lot, and have set up and run a couple of Ubuntu servers with python and mysql). The Pi has a desktop, USBs for keyboard and mouse, and HDMI output. This one has a 1GHz processor, 512MB of RAM and, for a HD, whatever free space is on the card that you put in. All of this on a board which is half the size of a credit card and costs less than a tenner (or less than a fiver for the non-wireless version).

When we finally invent time travel, or at least find a way to communicate across time (as per Gibson's The Peripheral, which I heartily recommend), my teenage self will be astounded by this information. I remember the day when I first heard the word 'megabyte'. It wasn't far off the day that I felt very powerful after plugging an 8k expansion into my computer.

Anyhow, back to the plot. What I've learned so far is that the 'less than a tenner' Raspberry Pi Zero W is 'bare bones'. I've bought a few bits and pieces that have cost much more than the computer(!) including a header for the GPIO pins, a breadboard & components kit, a pre-loaded micro SD (effectively the HD and OS) and a mini HDMI to proper-size HDMI adaptor.

Monday, 11 March 2019

Website archiving - Watchman's commercial release

[NB since version 2.1.0, we have had to make a slight change to the name; its full title is now Website Watchman.]

It has been a (deliberately) long road but Watchman for Mac now has its first commercial release.
This product does such a cool job that I've long believed that it could be as important to us as Integrity and Scrutiny. So I've been afraid to rush things. Version zero was officially beta, and a useful time for discovering shortcomings and improving the functionality. Version one was free. Downloads were healthy and feedback slim, which I take as a good sign. Finally it's now released with a trial period and a reasonable introductory price tag. Users of version one are welcome to continue to use it, but it obviously won't get updates.

So what does it do? In a few words: "Monitor and archive a website".

There are apps that monitor a url and alert you to changes. There are apps that scan an entire website and archive it.

Watchman can scan a single page, part of a website or a whole website. It can do this on schedule - hourly, daily, weekly, monthly. It can alert you to changes. It builds a web archive which you can view (using Watchman itself or the free 'WebArchive Viewer' which is included in the dmg). You can browse the urls that it has scanned, and for each, view how that page looked on a particular day.

We're not talking about screenshots but a 'living' copy of the page. Watchman looks for and archives changes in every file: html, css, js and other linked files such as pdfs. You can of course export that page as a screenshot or as a collection of the files making up that page, as they stood on that date.

A 'must have' for every website owner?

Try Watchman for free / buy at the introductory price.

Friday, 8 March 2019

SID tune for C64 homebrew game - part 1

My enthusiasm for this project has surprised even me, and to avoid this blog filling up with my ramblings about making this 8-bit game, and to keep all of those posts in one suitable place, I've moved this post and the others to their own blog.

This post has moved to:

https://newstuffforoldstuff.blogspot.com/2019/03/homebrew-game-for-c64-part-1-sid-tune.html

Monday, 4 March 2019

Website archiving utility, version 2

Watchman for Mac is a utility that can scan a single page or a whole site on schedule. It'll archive the page(s) and alert the user to any changes in the code or visible text.

As it builds its archive it's possible to browse the historical versions of the pages, as they stood on particular dates. It displays a 'living' version of the historical pages, with javascript and stylesheets also archived.

We've just made a version 2 beta available. It features a 'non-UI' or headless mode, which means that it can remain in the background and not interrupt the user when a scheduled scan starts. Its windows can still be accessed from a status bar menu.

https://peacockmedia.software/mac/watchman/

Version 1 is still available. It's free and will remain free. The new beta is also available and free at present, but version 2 will eventually require a licence.

Wednesday, 27 February 2019

Vic20 Programmer's Reference Guide

Having become more and more interested in the computers that started it all for me, I've been poking (geddit?) around in my attic among my collection of 8-bit computers, running emulators and buying hardware (leads and adaptors) and starting to use some of these machines again.

I have quite a collection - there was a time when you could easily find them at car boot sales for a fiver, and I think that one of the Vic20s I have is the one I was given as a teenager.

I've always thought that I'd return to these computers, maybe in retirement, but it's happened a bit sooner.
There's something very exciting about these early home 8-bit machines. You're presented with a flashing cursor and in order to do anything, you need to type a command or two in BASIC (LOAD and RUN if you wanted to play a game). I believe this fact is why so many of my generation became so interested in software development, and it's something that was lost when a/ many turned to consoles because they only wanted to play games and b/ computers gained more graphical interfaces, separating the user from the workings.

Back to the point. I've begun writing a game for C64. I couldn't help myself. After much searching of the attic, I couldn't find a copy of the C64 Programmer's Reference Guide. I easily found it as a scanned pdf online, which is fine, but it's harder to use that than simply flicking through the pages of a book. (I will buy a copy, they come up on eBay. No doubt I will then find a copy I already owned.)

One of the things I really wanted was a reference guide for the instruction set. Although the C64 has a 6510 processor and the Vic20 a 6502, the two are identical but for a small difference (the 6510 adds a built-in I/O port), so they share the same instruction set. I did find a Vic20 Prog Ref Guide, and its instruction set reference is identical.
It'll be handy when I make the Vic20 version of my new game.

It's very well-thumbed - I was as obsessed with programming then as I am now. This is clearly my own original copy. It has some lined A4 with notes in my handwriting. It feels weird handling the book, a little bit like I've gone back 38 years in time, grabbed the book and brought it back with me.
(This page has my own instructions for using the monitor programme that I wrote in machine code, as a tool for writing more machine code. I used to write my programmes on paper and assemble them by hand before typing in the hex. I'm sure the monitor is on a tape somewhere in the attic.)

End note:
As part of my foray into this new-old world, I've discovered that there are some amazing people producing hardware and games (on cartridge/disc/tape) for these old computers. To support and encourage that, I've begun this project.

Friday, 18 January 2019

Scraping Yelp for phone numbers of all plumbers in California (or whatever in wherever)

I've written similar tutorials to this one before, but I've made the screenshots and written the tutorial this morning to help someone and wanted to preserve the information here.

We're using the WebScraper app. The procedure below will work while you're in demo mode, but the number of results will be limited.

We enter our starting url. Perform your search on Yelp and then click through to the next page of results. Note how the url changes as you click through. In this case it goes:
https://www.yelp.com/search?find_desc=Plumbers&find_loc=California&ns=1&start=0
https://www.yelp.com/search?find_desc=Plumbers&find_loc=California&ns=1&start=20
https://www.yelp.com/search?find_desc=Plumbers&find_loc=California&ns=1&start=40
etc.

(I added &start=0 to the first one. That part isn't there when you first go to the results, but the default page and &start=0 are the same page, so adding it avoids some duplication.)

So our starting url should be:
https://www.yelp.com/search?find_desc=Plumbers&find_loc=California&ns=1&start=0
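If you want to see the pattern spelled out, here is a minimal sketch in Python that builds the same results-page urls. WebScraper doesn't need this (it discovers the pages by crawling). The function name and the PAGE_SIZE constant are my own, and the step of 20 is simply what was observed in the urls above.

```python
from urllib.parse import urlencode

BASE = "https://www.yelp.com/search"
PAGE_SIZE = 20  # assumption: the step between pages observed in the urls above


def results_page_url(page_index, desc="Plumbers", loc="California"):
    """Build the url of the Nth results page (page_index starts at 0)."""
    params = {
        "find_desc": desc,
        "find_loc": loc,
        "ns": 1,
        "start": page_index * PAGE_SIZE,
    }
    return f"{BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the first three results pages
    for page in range(3):
        print(results_page_url(page))
```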

In order to crawl through the results pages, we limit our crawl to urls that match the pattern we've observed above. We can do that by asking the crawl to ignore any urls that don't contain
?find_desc=Plumbers&find_loc=California&ns=1&start=

In this case we're going to additionally click the links in the results, so that we can scrape the information we want from the businesses' pages. This is done on the 'Output filter' tab. Check 'Filter output' and enter these rules:
URL contains /biz/
and URL contains ?osq=Plumbers
(The phone numbers and business names are right there on the results pages, we could grab them from there, but for this exercise we're clicking through to the business page to grab the info from there. It has advantages.)
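The two sets of rules above are plain 'contains' tests on the url. As a rough illustration only (this is not WebScraper's code, and the function names and the example business url are made up), they could be written like this:

```python
# The substring that results pages must contain (from the crawl setting above)
CRAWL_PATTERN = "?find_desc=Plumbers&find_loc=California&ns=1&start="

# Both of these must be present for a page to go to the output
OUTPUT_RULES = ("/biz/", "?osq=Plumbers")


def is_results_page(url):
    """True if the url matches the pagination pattern we observed."""
    return CRAWL_PATTERN in url


def is_output_page(url):
    """True if the url passes both 'URL contains' output rules."""
    return all(rule in url for rule in OUTPUT_RULES)


if __name__ == "__main__":
    # Hypothetical urls, for illustration only
    print(is_results_page("https://www.yelp.com/search?find_desc=Plumbers&find_loc=California&ns=1&start=20"))
    print(is_output_page("https://www.yelp.com/biz/example-plumbing?osq=Plumbers"))
```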

Finally we need to set up the columns in our output file and specify what information we want to grab. On the business page, the name of the business is in the h1 tags, so we can simply select that. The phone number is helpfully in a div called 'biz-phone' so that's easy to set up too.
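For anyone curious what that selection amounts to, here is a small sketch using only Python's standard library. It assumes that 'biz-phone' is a class on the element and uses a made-up snippet of html, so treat it as an illustration of the idea rather than a working Yelp scraper. It also assumes tidy, properly closed tags.

```python
from html.parser import HTMLParser


class BizPageParser(HTMLParser):
    """Collects the text of the first h1 and of the first 'biz-phone' element."""

    def __init__(self):
        super().__init__()
        self._target = None  # field currently being captured, if any
        self._depth = 0      # nesting depth inside the captured element
        self.fields = {"name": "", "phone": ""}

    def handle_starttag(self, tag, attrs):
        if self._target:
            self._depth += 1  # a nested tag inside the element we're capturing
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h1" and not self.fields["name"]:
            self._target, self._depth = "name", 1
        elif "biz-phone" in classes and not self.fields["phone"]:
            self._target, self._depth = "phone", 1

    def handle_endtag(self, tag):
        if self._target:
            self._depth -= 1
            if self._depth == 0:
                self._target = None

    def handle_data(self, data):
        if self._target:
            self.fields[self._target] += data


def scrape_business(html):
    """Return the name and phone columns with whitespace tidied up."""
    parser = BizPageParser()
    parser.feed(html)
    return {key: " ".join(value.split()) for key, value in parser.fields.items()}


if __name__ == "__main__":
    # Made-up html, standing in for a business page
    sample = '<h1> Acme Plumbing </h1><div class="biz-phone">\n (555) 010-0000 \n</div>'
    print(scrape_business(sample))
```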

Then we run by pressing the Go button. In an unlicensed copy of the WebScraper app, you should see 10 results. Once licensed, the app should crawl through the pagination and collect all (in this case) 200+ results.

Limitations

I was able to get all of the results (matching those available while using the browser) for this particular category. For some others I noticed that Yelp didn't seem to want to serve more than 25 pages of results, even when the page said that there were more pages. Skipping straight to the 25th page and then clicking 'Next' resulted in a page with hints about searching.

This isn't the same as becoming blacklisted, which will happen when you have made too many requests in a given time. You'll know when that has happened because you then can't access Yelp in your browser without changing your IP address. One way to avoid this problem is to use ProxyCrawl. Get yourself an account (free initially), switch on 'Use ProxyCrawl' in WebScraper and enter your token in Preferences.

Wednesday, 9 January 2019

New website monitor / archive utility for Mac arrives at full stable release and is still free

Watchman is an easy-to-use website monitoring / archiving utility.

You can use it to watch a single page, a part of a website or a whole website. It can run on schedule (hourly, daily, weekly, monthly) and alert you to the changes you're interested in, which could be visible text, source, resources appearing on the page or its status. Or you can simply leave it to archive all changes to all files. You can set up multiple website configurations, each with their own schedule. It uses the system's launchd, meaning that Watchman doesn't have to be left running; it'll just start as needed.

Watchman uses the same fast, efficient crawling engine as Scrutiny and Integrity, which has been developed over 12 years and offers a huge amount of configuration and tuning. This is coupled with a new web archive format.



Its web archive format can store changes like a Time Machine backup. You can view any page as it appeared on a certain date. When you do so, you're viewing a 'living' version of the page, with its css and javascript running as it was at the time, not a simple screenshot. You can of course export a version of a page as an image, or as a collection of all the files under their original filenames, as they were on that date. You can switch between versions of a page to compare them.
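To give a feel for the idea of storing only changes and then looking up a page 'as of' a date, here is a toy sketch in Python. It is emphatically not Watchman's archive format. The PageHistory class and its methods are invented for illustration, and it assumes scans are recorded in date order.

```python
import hashlib
from bisect import bisect_right
from datetime import date


class PageHistory:
    """Toy version store for one url: keeps a copy only when the content changes."""

    def __init__(self):
        self._dates = []     # dates on which a new version was stored, ascending
        self._digests = []   # hash of each stored version
        self._versions = []  # the stored content itself

    def record(self, scan_date, content):
        """Store content if it differs from the latest version. Returns True if stored."""
        digest = hashlib.sha256(content).hexdigest()
        if self._digests and self._digests[-1] == digest:
            return False  # unchanged since the last stored version
        self._dates.append(scan_date)
        self._digests.append(digest)
        self._versions.append(content)
        return True

    def as_of(self, day):
        """Return the page as it stood on the given day, or None if not yet archived."""
        index = bisect_right(self._dates, day) - 1
        return self._versions[index] if index >= 0 else None


if __name__ == "__main__":
    history = PageHistory()
    history.record(date(2019, 1, 1), b"<h1>Hello</h1>")
    history.record(date(2019, 1, 2), b"<h1>Hello</h1>")  # no change, nothing stored
    history.record(date(2019, 1, 3), b"<h1>Hello again</h1>")
    print(history.as_of(date(2019, 1, 2)))
```

Hashing the content makes the 'has it changed?' test cheap, and keeping the dates sorted means a lookup for any day is a simple binary search.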

It's a desktop app running on your own Mac, so you own your own data.

It's early days, there are many more features in the pipeline, but for now it's stable and doing invaluable work. And version 1.x is free to download and use. (The next major version may not be free, or may be 'freemium' but the current version will continue to work and remain free.)

I've been flagging the release of Watchman for a while. It's been a long time since I've been so excited about a new project and I believe it'll become a more important title for us than Scrutiny.

https://peacockmedia.software/mac/watchman