Wednesday 5 August 2015

The Integrity v6 engine - a website crawling / spidering engine for developers

A major change for Scrutiny v6 will be 'under the hood' (or bonnet, as we say here in the UK). It'll separate the crawling engine from the higher-level functionality and the UI.

This will mean that I can make it available to other developers to use in their own tools.

More hundreds of hours than I care to calculate have gone into developing the code which scans a website accurately and fully. It works well and I'm proud of that - parsing arbitrary (and sometimes non-compliant) HTML is full of pitfalls.
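
To give a flavour of those pitfalls, here is a minimal sketch in Python. It is not the engine's own code, just an illustration using the standard library's html.parser. The names LinkCollector and extract_links, the sample markup and the example.com URL are all made up for the example. It copes with a few common problems: upper-case tags, unquoted or padded attribute values, unclosed elements, duplicate targets that differ only by fragment, and non-web schemes such as mailto.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects absolute http(s) link targets from possibly malformed HTML."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url  # may be overridden by a <base href> tag
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href"):
            self.base_url = urljoin(self.base_url, attrs["href"])
        elif tag == "a" and attrs.get("href"):
            href = attrs["href"].strip()  # tolerate stray whitespace
            # Resolve relative links, then drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            is_web = absolute.startswith(("http://", "https://"))
            if is_web and absolute not in self.links:
                self.links.append(absolute)


def extract_links(html, page_url):
    """Return the unique link targets found in html, in document order."""
    parser = LinkCollector(page_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Deliberately sloppy markup: nothing is closed properly.
sample = """
<html><body>
<p>Unclosed paragraph
<A HREF=about.html>About</a>
<a href=' /contact '>Contact
<a href="about.html#team">Team</a>
<a href="mailto:someone@example.com">Mail</a>
"""
links = extract_links(sample, "http://example.com/docs/index.html")
print(links)
```

A real crawler has far more to deal with (redirects, encodings, links in scripts and stylesheets, robots.txt and so on), so treat this only as a small taste of the problem.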

For Scrutiny users, v6 will be a step forward, and may not be a paid upgrade. It will incorporate tools such as the .dot visualiser and the text-only engine (both currently available separately, as SiteViz and Robotize).

For developers, I'd love to make the engine available. I feel that my strengths lie in building and maintaining that engine, more so than building a UI or producing and marketing a consumer product.

Work is underway. I'm not sure whether it'll be a Cocoa framework (with a friendly API), a CLI tool (which of course can be used in a Mac app) or both.

It won't be open source. I want to provide a free version and license a faster, paid-for version.

If you're interested then please let me know at shiela[at]peacockmedia.software. It would be really useful to have an indication of interest and thoughts on how best to make the engine available.

[Update] The engine is now available in a demo Xcode project which should give you the info you need to use it.

http://peacockmedia.software/mac/integrity/framework.html
