Thursday, 23 May 2013

Find missing meta tags

[Updated 9 May in line with Scrutiny v5]

Meta keywords may not be as important now as they used to be, but your meta title is one of the most important SEO factors and your meta description will appear on search result pages and net you click-throughs.

Here's how to check your site to see whether these things are in place using Scrutiny.

1. First you need to scan your site. At the Sites screen, press 'New' and type your starting url. Press 'Next' to see the default settings, press 'Next' again to accept those settings. Press 'Go' beside 'SEO'.

2. When the crawl has finished, the SEO screen will open.

3. Make sure that the 'Highlight' box is switched to 'Missing metadata' (this is the default position).

4. It's easy to see which pages are missing a title, description or keywords. These will be highlighted in orange or red.
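
If you'd like to spot-check a single page by hand, the same idea can be sketched in a few lines of Python. This is only an illustration of the check, not how Scrutiny does it; the class and function names (MetaTagAudit, missing_metadata) and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class MetaTagAudit(HTMLParser):
    """Records the title, meta description and meta keywords found on a page."""

    def __init__(self):
        super().__init__()
        self.found = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.found[name] = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.found["title"] += data


def missing_metadata(html):
    """Return the names of the metadata items that are absent or empty."""
    audit = MetaTagAudit()
    audit.feed(html)
    return [name for name, value in audit.found.items() if not value.strip()]


# Hypothetical page with a title but no description or keywords.
sample = "<html><head><title>Home</title></head><body></body></html>"
print(missing_metadata(sample))  # ['description', 'keywords']
```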

How to test a website which requires authentication

[last updated 3 Jan 17]
This process will involve a little experimentation because of the different ways that authentication can work. It may be as simple as checking a box.

It also involves a risk of changing or deleting your pages - yes really. With some content management systems, the buttons that perform these actions can look like ordinary links to Scrutiny, and it will dutifully try to click them. Here are some precautions:
  • Visit the site using Safari and check whether you're logged in using an account with admin access. If so, log out.
  • If possible, use an account which has access to view the site but no higher.
  • If you know the url(s) of links which could perform changes, use Scrutiny's blacklist feature to make sure that they're 'not checked'.
  • Make sure that your site is backed up and that you're prepared to restore if necessary

So, we're ready to go. Work through these steps until Scrutiny successfully crawls the site as an authenticated user.

1. Go to Advanced Settings and check 'Attempt to authenticate'. You will see a warning - read it, heed it and OK it.

[Update: recent versions of Scrutiny now have an additional 'handle cookies' button. Check that too if cookies may be important for tracking authenticated users on your site.]

2. Step 2 used to be:
Log into your site using Safari, using an account which has read access but no higher. If your site has an option for 'keep me logged in' then check that. (Step 9 really belongs up here - do that now.) Then try to crawl the site.
However, that method won't work if you're on Yosemite (10.10) or newer, because tighter security in macOS makes cookies browser-specific rather than systemwide. If you are on 10.9 or earlier, use the method just described.

If you're on 10.10 or higher, here's the new step 2. Scrutiny 6 and 7 now have these buttons: 'Handle Cookies' and 'Log in using a browser window'. (If you don't see the 'cookies' button, don't worry, just use the 'attempt to authenticate' button.)

It's a simple workaround - if your website allows you to log in and then tracks you using a cookie (rather than a session id) then the 'Log in using a browser window' button opens a simple browser window that you can use to log in. If you check the 'handle cookies' button, then when you start Scrutiny's crawl, it should retain and use the cookie you just collected.

3. If that doesn't work, enter the username and password into the top two boxes in the Advanced Settings window and try again.

4. If your site uses a web form to send the authentication details (e.g. WordPress) then find out the names of the username and password fields. Here's a snippet of the source for the site above, and you'll see that the names of the fields in this case (a WordPress site) are 'log' and 'pwd'. Enter these in the second pair of boxes and try the crawl again.

<input type="text" name="log" id="user_login" class="input" value="" size="20" />
<input type="password" name="pwd" id="user_pass" class="input" value="" size="20" />

5. You may need to experiment with your starting url too. If you're using the web form fields described in step 4 then Scrutiny will send these by POST request but it'll only send them to your starting url. (The site should use a cookie or some kind of session id after that.) Again, check the source code of your login form, find the form action and use that url as your starting url.
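
To make step 5 a bit more concrete, here's a rough Python sketch of finding a form's action, resolving it against the page url, and preparing the kind of POST request described above. It isn't Scrutiny's code; the helper names, the example.com urls and the demo credentials are all made up, and the 'log' / 'pwd' field names are just the WordPress ones from step 4. Nothing is actually sent over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormActionFinder(HTMLParser):
    """Captures the action attribute of the first <form> on the page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.seen_form = False

    def handle_starttag(self, tag, attrs):
        if tag == "form" and not self.seen_form:
            self.seen_form = True
            self.action = dict(attrs).get("action")


def login_post_url(page_url, html):
    """Resolve the form action against the page it was served from."""
    finder = FormActionFinder()
    finder.feed(html)
    # A missing or empty action means the form posts back to the page itself.
    return urljoin(page_url, finder.action or "")


def build_login_request(post_url, username, password):
    """Prepare (but do not send) a POST carrying the two login fields."""
    body = urlencode({"log": username, "pwd": password}).encode("ascii")
    return Request(post_url, data=body, method="POST")


# Hypothetical login page and form markup.
page = "https://example.com/account/"
form_html = '<form action="login.php" method="post"></form>'

target = login_post_url(page, form_html)
request = build_login_request(target, "demo", "secret")
print(target)        # https://example.com/account/login.php
print(request.data)  # b'log=demo&pwd=secret'
```

The resolved url is the one you'd try as your starting url.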

6. If authentication is still not working, check your html login form for hidden fields (or visible ones) which may be necessary for the login to work. [See step 7 - this step may now be superseded by a new checkbox.] Since v5.5 you can enter any field names and values to be included in the POST request.
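
If reading through the page source for hidden fields is tedious, something like the following Python sketch will list them for you. The names here (HiddenFieldCollector, hidden_fields) and the sample markup are hypothetical, purely for illustration; your own form may have different fields or none at all.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of hidden field names and values found in the markup."""
    collector = HiddenFieldCollector()
    collector.feed(html)
    return collector.fields


# Hypothetical login form, loosely modelled on the snippet in step 4.
form_html = """
<form action="login.php" method="post">
  <input type="text" name="log" />
  <input type="password" name="pwd" />
  <input type="hidden" name="redirect_to" value="/members/" />
  <input type="hidden" name="testcookie" value="1" />
</form>
"""
print(hidden_fields(form_html))  # {'redirect_to': '/members/', 'testcookie': '1'}
```

Each name and value it prints is a candidate for the extra fields mentioned in this step.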

7. If your login form uses a security token, check the box and try again. (This may take care of step 6, so it may not then be necessary to add the names / values of hidden fields.) This feature is available in Scrutiny v6.4 onwards.

8. If no joy, this may work: Using a custom header field in the Advanced settings panel, set Field to 'Authorization' and Value to 'Basic [base64 encoded credentials]'. Credentials should be in the form username:password and an encoded version can be obtained here
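
If you'd rather not paste credentials into a web page, the encoded value is easy to produce yourself. Here's a minimal Python sketch; the function name is made up and the username / password are the well-known example pair from the HTTP Basic authentication spec, not real credentials.

```python
import base64


def basic_auth_value(username, password):
    """Build the value for an 'Authorization' header using HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Example credentials only - substitute your own read-only test account.
print(basic_auth_value("Aladdin", "open sesame"))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

The whole printed string, including the word 'Basic', goes in the Value box.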

9. Once logged in, there may be a 'logout' link on your pages. Obviously you don't want Scrutiny to log itself out on the first pageful of links, so you may have to blacklist such links (see screenshot above).
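
The idea behind blacklisting can be pictured with a small Python sketch. This assumes a simple 'url contains this text' rule, which is an assumption made for the example rather than a description of Scrutiny's exact matching; the function name and urls are hypothetical.

```python
def links_to_follow(links, blacklist):
    """Drop any link whose url contains one of the blacklisted substrings."""
    return [url for url in links if not any(term in url for term in blacklist)]


# Hypothetical links found on the first page after logging in.
links = [
    "https://example.com/account/",
    "https://example.com/wp-login.php?action=logout",
    "https://example.com/about/",
]
print(links_to_follow(links, ["action=logout"]))
# ['https://example.com/account/', 'https://example.com/about/']
```

The same kind of rule is what you'd use for the links that could change or delete pages, mentioned in the precautions at the top.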

10. If you've tried all of this and are still unable to log in, please contact Scrutiny support. Be ready to let us have the details for a test user account with read access but no higher.