Thursday, 23 May 2013

How to test a website which requires authentication

[last updated 23 Mar 2021]
This process will involve a little experimentation, because authentication can work in different ways. It may be as simple as checking a box.

It also involves a risk of changing or deleting your pages - yes, really. With some content management systems, the buttons that perform these actions can look like ordinary links to Scrutiny, and it will dutifully try to follow them. Here are some precautions:
  • Visit the site using Safari and check whether you're logged in using an account with admin access. If so, log out.
  • If possible, use an account which has access to view the site but no higher.
  • If you know the URL(s) of links which could perform changes, use Scrutiny's blacklist feature to make sure that they're 'not checked'.
  • Make sure that your site is backed up and that you're prepared to restore if necessary.

So, we're ready to go. Work through these steps until Scrutiny successfully crawls the site as an authenticated user. 

1. Go to Advanced Settings and check 'Attempt to authenticate'. You will see a warning - read it, heed it and OK it.

[Update: recent versions of Scrutiny now have an additional 'handle cookies' button. Check that too if cookies may be important for tracking authenticated users on your site.]

2. Step 2 used to be:
Log into your site using Safari, using an account which has read access but no higher. If your site has an option for 'keep me logged in' then check that. (step 9 should be up here - do that now) Then try to crawl the site.
However, that method won't work on Yosemite (10.10) or newer, because tighter security in macOS makes cookies browser-specific rather than systemwide. If you are on 10.9 or earlier, use the method above.

If you're on 10.10 or higher, here's the new step 2. Since v6, Scrutiny has these buttons: 'Handle Cookies' and 'Log in'. (If you don't see the 'cookies' button, don't worry, just use the 'attempt to authenticate' button.)

The 'Log in' button is a simple workaround: if your website tracks you using a cookie (rather than a session id), you can use the simple browser window it opens to log in. If you check the 'handle cookies' button, then when you start Scrutiny's crawl, it should retain and use the cookie you just collected.

(Version 10.3.1 updated the webview used in the Log in window. This helped the functionality to work with some sites and broke it for others. Version 10.3.3 offers a choice: try the legacy version first and, if that doesn't work, try the other.)




3. This is worth a try; I have seen it work. It may or may not work in your browser (currently it works in some browsers but not others), but it may still work in Scrutiny.

Add the username and password to your starting url, in the form:
http://user:password@example.com

4. If that doesn't work, enter the username and password into the top two boxes in the Advanced Settings window and try again.
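
One thing that can trip up the URL form in step 3: if the username or password contains characters such as '@', ':' or '/', they need to be percent-encoded before they go into the URL. Here is a minimal Python sketch of building such a URL. The function name, the example path and the credentials are made up for illustration.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def url_with_credentials(url, username, password):
    """Return url with a percent-encoded 'username:password@' placed before the host."""
    parts = urlsplit(url)
    # safe="" makes quote() encode every reserved character, including '@', ':' and '/'
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    # Assumption: the original url does not already contain credentials
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

# Hypothetical credentials, chosen to show the encoding of '@' and ':'
start_url = url_with_credentials("http://example.com/members/", "user", "p@ss:word")
print(start_url)  # http://user:p%40ss%3Aword@example.com/members/
```

The printed URL is what you would paste in as the starting URL.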

5. If your site uses a web form to send the authentication details (e.g. WordPress) then find out the names of the username and password fields. Here's a snippet of the source for the site above, and you'll see that the names of the fields in this case (a WordPress site) are 'log' and 'pwd'. Enter these in the second pair of boxes and try the crawl again.

<input type="text" name="log" id="user_login" class="input" value="" size="20" />
<input type="password" name="pwd" id="user_pass" class="input" value="" size="20" />
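
If the login page's source is long, a few lines of Python can pick out the field names for you. This is only a sketch using the standard library's HTML parser. The class name is made up, and the HTML is the snippet above pasted in as a string.

```python
from html.parser import HTMLParser

class InputFieldFinder(HTMLParser):
    """Collect (type, name) for every <input> tag that has a name attribute."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> are routed here as well
        if tag == "input":
            attributes = dict(attrs)
            if "name" in attributes:
                self.fields.append((attributes.get("type", "text"), attributes["name"]))

login_form_html = """
<input type="text" name="log" id="user_login" class="input" value="" size="20" />
<input type="password" name="pwd" id="user_pass" class="input" value="" size="20" />
"""

finder = InputFieldFinder()
finder.feed(login_form_html)
print(finder.fields)  # [('text', 'log'), ('password', 'pwd')]
```

The name paired with 'password' is the password field; the text field next to it is normally the username.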

6. You may need to experiment with your starting URL too. If you're using the web form fields described in step 5 then Scrutiny will send these by POST request, but it'll only send them to your starting URL. (The site should use a cookie or some kind of session id after that.) Again, check the source code of your login form, find the form action and use that URL as your starting URL.
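
The form action is often a relative path, so it has to be resolved against the address of the login page to get a full starting URL. The sketch below does that, and also shows what an ordinary form-encoded POST body looks like for the 'log' and 'pwd' fields. The page address, the form HTML and the credentials are hypothetical, and this illustrates a standard form submission rather than Scrutiny's internals.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

class FormActionFinder(HTMLParser):
    """Remember the action attribute of the first <form> on the page."""

    def __init__(self):
        super().__init__()
        self.action = None

    def handle_starttag(self, tag, attrs):
        if tag == "form" and self.action is None:
            # A form without an action submits to the page's own URL
            self.action = dict(attrs).get("action", "")

# Hypothetical login page address and form markup
page_url = "http://example.com/wp-login.php?redirect_to=%2F"
page_html = '<form name="loginform" action="wp-login.php" method="post"></form>'

finder = FormActionFinder()
finder.feed(page_html)

# Resolve the (possibly relative) action against the page it came from
starting_url = urljoin(page_url, finder.action)
print(starting_url)  # http://example.com/wp-login.php

# What a standard form POST body looks like for these two fields
post_body = urlencode({"log": "viewer", "pwd": "s3cret pass"})
print(post_body)  # log=viewer&pwd=s3cret+pass
```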

7. If authentication is still not working, check your HTML login form for hidden fields (or visible ones) which may be necessary for the login to work. [Note: this step may now be superseded by the security token checkbox in step 8.] Since v5.5 you can enter any field names and values to be included in the POST request.
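
To see which extra fields a login form would submit, something like the following sketch can list them with their values. The form markup here is illustrative (loosely modelled on a WordPress login form) and the class name is made up; your own form's fields will differ.

```python
from html.parser import HTMLParser

class ExtraFieldFinder(HTMLParser):
    """Collect name/value pairs for inputs other than the username and password boxes."""

    def __init__(self, skip_names):
        super().__init__()
        self.skip_names = set(skip_names)
        self.extra_fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name and name not in self.skip_names:
            # An input with no value attribute is submitted as an empty string
            self.extra_fields[name] = attributes.get("value") or ""

# Illustrative markup only
form_html = """
<input type="text" name="log" value="" />
<input type="password" name="pwd" value="" />
<input type="submit" name="wp-submit" value="Log In" />
<input type="hidden" name="redirect_to" value="http://example.com/wp-admin/" />
<input type="hidden" name="testcookie" value="1" />
"""

finder = ExtraFieldFinder(skip_names=["log", "pwd"])
finder.feed(form_html)
for name, value in finder.extra_fields.items():
    print(f"{name} = {value}")
```

Each name and value it prints is a candidate to enter as an extra field for the POST request.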



8. If your login form uses a security token, check the box and try again. (This may take care of step 7, so it may not then be necessary to add the names / values of hidden fields.) This feature is available in Scrutiny v6.4 onwards.




9. If no joy, this may work: using a custom header field in the Advanced Settings panel, set Field to 'Authorization' and Value to 'Basic [base64 encoded credentials]'. Credentials should be in the form username:password and an encoded version can be obtained here
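
If you would rather not paste real credentials into a website, the encoded value can be produced locally. A minimal Python sketch follows; the function name is made up and 'user' / 'password' are placeholders.

```python
import base64

def basic_authorization_value(username, password):
    """Return the value for an 'Authorization' header using HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header_value = basic_authorization_value("user", "password")
print(header_value)  # Basic dXNlcjpwYXNzd29yZA==
```

The whole printed string, including the word 'Basic', goes in the Value box.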

10. Once logged in, there may be a 'logout' link on your pages. Obviously you don't want Scrutiny to log itself out on the first pageful of links, so you may have to blacklist such links (see screenshot above).
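
Before crawling, it can help to look through a list of your site's links for ones that are worth blacklisting. The sketch below flags URLs containing a few suspicious fragments. The fragment list and the example URLs are assumptions chosen for illustration, and the plain substring test is not a description of how Scrutiny's own blacklist matching works.

```python
# Hypothetical fragments that often appear in logout or destructive links
RISKY_FRAGMENTS = ("logout", "log-out", "signout", "action=delete")

def links_to_blacklist(urls, fragments=RISKY_FRAGMENTS):
    """Return the urls that contain any risky fragment (case-insensitive)."""
    return [
        url for url in urls
        if any(fragment in url.lower() for fragment in fragments)
    ]

# Made-up example links
site_links = [
    "http://example.com/about/",
    "http://example.com/wp-login.php?action=logout&_wpnonce=abc123",
    "http://example.com/post.php?post=7&action=delete",
]

for url in links_to_blacklist(site_links):
    print(url)
```

This prints the second and third links, which are the ones you would consider adding to the blacklist.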


11. If you've tried all of this and are still unable to log in, please contact Scrutiny support. Be ready to let us have the details for a test user account with read access but no higher.
