Monday, 31 October 2016

WebScraper from PeacockMedia - usage

I've had one or two questions about using WebScraper. There's a short demo video here, but if, like me, you prefer to cast your eye over some text and images rather than sit through a video, then here you go:

1. Type your website address (or the starting URL for the scan). Like Integrity / Scrutiny (WebScraper uses the same engine), the crawl will be limited to any 'directory' implied in the URL.

2. Hit Go. The way this works (currently) is that the app crawls your site first; once the crawl is complete, you choose what data to export and how.

3. When the scan is complete, the export options will open. Choose the export format (currently CSV or JSON) and which information to include. This can be various metadata, or content extracted from the pages by span or div, identified by class or id.

4. If the output file isn't as you expected, then you can tinker with the output options without needing to crawl again. Just use the Export button on the Main (crawl) window.
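
If the 'directory' rule in step 1 is unclear, here's a rough sketch in Python of how a crawler might decide whether a link falls inside the directory implied by the starting URL. This is only an illustration, not WebScraper's actual code: the function names are made up, and I'm assuming that a last path segment without a trailing slash is treated as a page name rather than a directory.

```python
from urllib.parse import urlsplit
import posixpath


def implied_directory(start_url: str) -> str:
    """Return the directory part of the starting URL's path, ending in '/'.

    Assumption (hypothetical rule): a path that does not end in '/'
    names a page, so its parent directory is used.
    """
    path = urlsplit(start_url).path or "/"
    if path.endswith("/"):
        return path
    parent = posixpath.dirname(path)
    return parent if parent.endswith("/") else parent + "/"


def in_scope(start_url: str, candidate_url: str) -> bool:
    """True if candidate_url is on the same host and under the implied directory."""
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)
    if candidate.netloc.lower() != start.netloc.lower():
        return False
    return (candidate.path or "/").startswith(implied_directory(start_url))
```

So with a starting URL of https://example.com/blog/index.html, this sketch would follow links under /blog/ and skip anything under, say, /shop/. To scan a whole site, you'd start from the root address.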
