A Guide To Heuristic Website Reviews


In this article, we’ll explore a scoring system for rating and comparing websites, we’ll visualize those ratings using infographics, and we’ll see what data and structure this method provides for reviewing websites.

How To Tell Whether A Website Is Junk

We are all reviewers. We review many websites every day without even realizing it. In fact, many of us are experts at it. We don’t realize it because the whole process occurs in moments.

That’s how it is. We use websites; we judge websites. Even if we don’t know we’re doing it, we make judgments about trustworthiness, credibility, competency, reliability, design and style within seconds of arriving on a Web page. After looking around, we also get a pretty good feel for the user experience and usability.

Consultancy Reviews

For many years, the agency I work for1 has conducted detailed reviews of its clients’ websites. As part of the consultancy process, we offer recommendations for any redesign or redevelopment work that is necessary.

Snap judgments may be useful and unavoidable, but when it comes to reviewing websites professionally, we need to be more organized and thorough, and we do this by using a review methodology. It also pays in both time and effort to be formulaic and consistent in our approach, because there are so many things to look at when considering a website.

To make this easier, we use a set of heuristics to score websites, along with a simple method to quickly visualize any weaknesses. The set I use is based on original work created by User Focus2, which I have edited and updated over time to suit the types of projects I work on.

Heuristics


Artistic Merit in ice dance.3 (Image: Rick S.4)

A heuristic is just a fancy word for a measurement of something that can’t readily be quantified (i.e. when there are no actual numbers to judge whether item A is better or worse than item B). In a 100-meter sprint, the winner is easily identified by concrete data. In ice dancing, the contestants are judged against a set of technical and artistic criteria, which yields a set of scores.

All That Glitters Is Not Gold

We might be swayed by something that looks good, but we all know that beauty is only skin deep. As with everything that glitters, the job of the reviewer is to poke about and see whether they have really struck gold.

Conversely, some websites that are judged harshly for their graphic design are successful beyond measure — I’m looking at you, Amazon, eBay, Craigslist and even Google. These websites aren’t much to look at, but functionally speaking, they do their job well and have evolved over the years to precisely meet their customers’ needs.

As designers, we’re asked to redesign websites that, on the whole, look better and better; it’s getting to the point that we find ourselves questioning the need for a redesign at all. But the problems are usually not immediately obvious in the visuals, layout or code. Sometimes a website is just wrong for the client’s brand, or the experience of performing tasks on it is unpleasant. Sometimes, a website just doesn’t work.

You can’t tell by looking. You need to dig deeper by really using the website, setting yourself tasks and trying things out. Only then will you experience what is really going on. Realizing just how much rethinking, redesigning and redeveloping a website needs often takes a while.

Metrics For Success

The success of most websites can be measured by some metric, be it the number of sales, uploads, downloads, clicks, comments or sign-ups. But a website can be successful by such a measure and still have problems; for example, it might sell well because of excellent marketing, because of its offline reputation (as in the case of high-street brands) or from having cornered the market. Many more websites have no quantifiable metrics at all by which we can determine how good or bad they actually are. Judging these websites is more difficult and requires a bit more legwork.

A Many-Layered Cake


A many-layered cake.5 (Image: Scheinwerfermann6)

When reviewing a website in detail, we have to explore many layers, both on the surface and below, including the following:

  • Task orientation and website functionality,
  • Navigation and information architecture,
  • Forms and data entry,
  • Trust and credibility,
  • Quality of writing and content,
  • Search,
  • Help, feedback and error tolerance,
  • Page layout and visual/aesthetic design,
  • Accessibility and technical design.

Taking these broad categories, we can devise a list of questions to explore each and get to the heart of the website. This formalizes the process and ensures that the same thought process can be repeated the next time. It also serves as a checklist, ensuring that nothing is forgotten. For example, when looking at the layout and visual design of a website, our questions could include these:

  • Are standard elements (such as page titles, website navigation, page navigation and privacy policy) easy to locate?
  • Is there a good balance between information density and white space?
  • Does the website have a consistent and clearly recognizable look and feel that will engage users?

For accessibility, we could formulate questions such as these:

  • Is the color contrast across the website enough to make all of the content accessible?
  • Does the website work comfortably at lower resolutions (e.g. 1024 × 768 pixels)?
  • Does the CSS validate with the W3C’s validation services?

Regarding the written copy, our questions could include:

  • Are the pages simple to scan on screen? Are they broken up by headings and subheadings? Are the paragraphs short?
  • Are acronyms and abbreviations defined when first used?
  • Does the website favor maps, diagrams, graphs, flow charts and other visuals over long blocks of text?

Depth

Although relatively easy to conduct, a heuristic review is not quick to perform. However, we can decide just how much depth to go into and how many questions to ask in order to get a feel for the website. The more heuristic measures we use, the longer the process will take; the fewer we use, the less informative the results will be. It’s a matter of striking a balance between the time available and the quality of returns. Selecting heuristics that get to the heart of each category can significantly reduce the amount of effort you need to put in.

Devising A Scoring System

To get a yardstick for each heuristic, a simple score can be given: 0 points if the website falls short of the heuristic, 1 point if it’s halfway there, and 2 points if it does the job. So, if acronyms and abbreviations are defined in some sections but not in others, then that heuristic would score only 1 point. If the website worked comfortably at 1024 × 768 pixels, then it would receive 2 points.

These points can be totaled across each category to give a quantifiable sense of what’s going on across the website, as shown here:

Table showing heuristic totals
Totals of heuristic data across categories.
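As a rough sketch of the arithmetic, a small Python snippet could tally the 0/1/2 points per category and express each total as a percentage of the maximum. The category names follow the article, but the individual scores here are invented purely for illustration:

```python
def category_score(points):
    """Sum 0/1/2 heuristic points and express them as a percentage of the maximum."""
    total = sum(points)
    maximum = 2 * len(points)  # each heuristic can earn at most 2 points
    return total, round(100 * total / maximum)

# Invented example scores for a handful of heuristics per category.
scores = {
    "Task orientation": [2, 1, 2, 0, 1],
    "Navigation and IA": [1, 1, 2, 2, 2],
    "Forms and data entry": [0, 1, 1, 2, 1],
    "Trust and credibility": [2, 2, 1, 1, 2],
}

for category, points in scores.items():
    total, pct = category_score(points)
    print(f"{category:22} {total}/{2 * len(points)} ({pct}%)")
```

Expressing each category as a percentage keeps categories with different numbers of heuristics comparable with one another.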

Visualization

Representing this data visually helps us quickly identify problem areas and makes it easier to compare websites.

Radar diagrams are perfect for this kind of analysis, because they give a recognizable shape based on the score. The more circular the radar, the more balanced the score; the spikier the radar, the more variation in the score. The size of the radar plot on the axes indicates the score percentage itself, showing good and bad areas, as seen in the examples below:

A radar plot showing heuristic data
A radar plot showing a website that performs well across all heuristic categories.

A radar plot showing heuristic data - poor
A radar plot showing poor performance across all heuristic categories.

A radar plot showing heuristic data - perform well
A radar plot showing a website that performs well in all areas but one.

A radar plot showing heuristic data - poor
A radar plot showing a website that performs poorly in all areas but one.
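A minimal sketch of how such a radar plot might be produced, assuming Python with matplotlib for the drawing; the labels and percentages below are invented, not taken from a real review:

```python
import math

def radar_coords(percentages):
    """Spread the scores evenly around a circle and close the loop."""
    n = len(percentages)
    angles = [2 * math.pi * i / n for i in range(n)]
    # Repeat the first point so the plotted outline joins back up.
    return angles + angles[:1], list(percentages) + [percentages[0]]

labels = ["Tasks", "Navigation", "Forms", "Trust", "Writing",
          "Search", "Help", "Layout", "Accessibility"]
site = [80, 70, 65, 75, 90, 60, 55, 70, 65]  # invented percentages

angles, values = radar_coords(site)

# Drawing (assumes matplotlib is installed):
try:
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    ax.plot(angles, values)
    ax.fill(angles, values, alpha=0.25)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(labels)
    ax.set_ylim(0, 100)
    fig.savefig("radar.png")
except ImportError:
    pass  # the coordinate sketch works without the plot
```

Because every axis runs from 0 to 100%, a balanced website traces a near-circle, while a weak category pulls the outline inward into the spiky shapes shown above.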

Competitor Reviews

By combining the heuristic results of different websites, we can create a visual comparison of competing websites in a market segment. This is particularly good for getting a feel for which websites fail and which succeed in certain respects. Analyzing multiple websites can, of course, take a lot of work, so stripping your heuristics down to the essentials is a good idea.

A Direct Comparison

As a real-world example, below is a comparison of two similar websites: Smashing Magazine and Webdesigner Depot. We can see that both lack a little in most of the categories, apart from quality of writing and content, which is what we would expect from content-rich blogs. (Please note that I work for neither website and stand as an impartial bystander!)

Both websites score a little higher in page layout and visual design, but they have rather weak home pages, being in the format of a traditional, basic blog. Their calls to action score quite poorly (other than the advertising!). Smashing Magazine scores marginally better in navigation because it has the tabs on top to distinguish major content areas, whereas Webdesigner Depot almost loses the navigation below the advertising in the right-hand column. Smashing Magazine scores slightly higher in accessibility for a number of minor heuristics, such as the clarity of the text, spacing and contrast.

Webdesigner Depot falls behind a little on trust and credibility because of details such as the basic link to an email address in the footer (compared to the well-considered contact form on Smashing Magazine), and also for the very brief copy in the “About us” section. However, Webdesigner Depot picks up slightly more points in visual design for its colorful style. Of course, like the presentation scores in ice dancing, any process used to score aesthetics or design will always be subjective, so having a wide range of criteria for various aspects of design is a good idea.

A radar plot showing heuristic data - smashing
A heuristic analysis of Smashing Magazine.

A radar plot showing heuristic data - WDD
A heuristic analysis of Webdesigner Depot. Note that Webdesigner Depot does not really have or require form inputs, so it scores 0 by default in the “Forms and data entry” category; this score can be either ignored or removed altogether if so wished.

To emphasize the differences in the heuristic measurements, we can overlay one radar plot on the other:

A radar plot showing heuristic data - Overlay
Overlaying one radar diagram on the other to enhance visualization.
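The same comparison can be made numerically: a small sketch could compute the category-by-category gap between two sites’ percentage scores, which is what the overlaid plots show visually. The figures here are invented and do not correspond to the reviews above:

```python
def differences(a, b):
    """Category-by-category score difference; positive means site `a` leads."""
    return {cat: a[cat] - b[cat] for cat in a}

# Invented percentages for two hypothetical sites.
site_a = {"Tasks": 80, "Navigation": 75, "Trust": 70, "Layout": 65}
site_b = {"Tasks": 70, "Navigation": 60, "Trust": 75, "Layout": 72}

for cat, delta in differences(site_a, site_b).items():
    leader = "A" if delta > 0 else ("B" if delta < 0 else "tie")
    print(f"{cat:10} {delta:+d} ({leader})")
```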

Conclusion

When reviewing a website, subjective snap judgments are unwise. We can do justice to a website only with a detailed test drive: performing tasks and looking in detail at various components on and below the surface. Heuristic scoring is a useful process for quantifying and visualizing a website’s quality when other measures are not appropriate or available. This formal process reveals problem areas while focusing the discussion at the start of a redevelopment phase.

Resources

Based on work done by Userfocus7. Discover more8 and download a free template to get started in creating your own heuristic reviews.

(al)

Footnotes

  1. http://www.headscape.co.uk
  2. http://www.userfocus.co.uk/
  3. http://commons.wikimedia.org/wiki/File:Fusar-Poli_and_Margaglio.jpg
  4. http://commons.wikimedia.org/wiki/File:Fusar-Poli_and_Margaglio.jpg
  5. http://en.wikipedia.org/wiki/File:Pound_layer_cake.jpg
  6. http://en.wikipedia.org/wiki/File:Pound_layer_cake.jpg
  7. http://www.userfocus.co.uk/
  8. http://www.userfocus.co.uk/resources/guidelines.html


Leigh is a designer with 15 years’ experience, now working in user experience. He’s been helping websites look better, be more organized and work better since grey backgrounds were the norm. He’s a jack of all trades, from video to music, and is still trying to master at least one of them. He survives on coffee and custard creams and blogs occasionally from his own planet at leighhowells.com.

Comments

  1.

    Excellent article! It was a very easy to read breakdown of what a heuristic actually is.

  2.

    Good article, but the radar graphs as designed are actively deceptive, an example of “how to lie with graphical presentations”.

    The graphs are ostensibly intended to convey and compare information on the lines or spokes of the graph, but what the viewer sees and bases hir reactions on is the area of the slices.

    For instance, the webdesignerdepot graph is supposed to indicate that one of the 10 categories “does not apply” — but the visual message is that one-fifth of the graph is blank, missing. It makes WDD look much worse than it should — almost as though the presentation was aimed to flatter people at smashingmagazine, hmm?

    Even without that, the sizes and shapes of solid areas — which the viewer will assume reflect the solid, important data — are epiphenomena. They will be larger or smaller depending on which categories are next to each other — you could make them look thin and spiky or reassuringly broad by manipulating the order of the categories around the circle.

    It only takes a small change to get rid of these problems and (mostly) stop lying with statistics. All you have to do is to think of the display as showing slices of a pie, not points around a circle, with each category filled up to the line for its score.

This will still be a trifle deceptive, because our brains will compare the area of each slice, not how long it is. Because area grows with πr², a slice representing a score of 50% will look 4 times as large as one for a score of 25%. This may be useful if you’re trying to persuade a client how much they need you, but I still don’t think Edward Tufte would approve.

  3.

    It may be nice to see the items grouped so each quarter of the diagram represented a broader category. E.g. Visual, Functional, UX, Content or similar?

  4.

This way of measuring usability is useful… but what does it mean when a site scores quite low for usability in most areas while having a very high conversion rate?

    Kind of makes the detailed review arbitrary for some sites.
    (Not knocking the article though. Great stuff to consider when designing)

    •

      Thanks!

      Well, I guess if a site is not very usable but still has high conversion rates it could leave you wondering just how much better the conversion rates may be if it were more usable.

  5.

    Good article. Layering the radar graphs is a nice touch.

    But I’m curious how this

    “Does the CSS validate with the W3C’s validation services?”

    would mean anything in terms of site quality. I can understand the value of quality coding, especially in terms of performance, but I don’t see a validated style sheet making any kind of impact on user experience.

    Also, scores of 0-2 might not be meaningful except for extreme comparisons.

    •

      That’s just an example of the kind of criteria one could look at and score the site on out of many questions – I have 20 or 30 per category. Looking for real or serious validation errors in the code gives a clue about the quality of the site, not necessarily just user experience.

      As for scoring, that’s up to you! Could easily be out of a 10 or 100.

  6.

    Thanks for the article. It would be great to see a full list of the questions you answer/judge on within the categories.

  7.

This was a great article. We take a similar approach in our business, albeit usually a more subjective one than what you presented. I would be interested to see more in-depth information on how you rate each area and whether every component of an area is given equal weight.

  8.

    Interesting article Leigh, but you might want to point out that the tool you’ve used to structure your article, create the screenshots and do the review is our free Excel expert review template available for download from http://www.userfocus.co.uk/resources/guidelines.html. I’m not sure why one of the images has Headscape branding since the template was designed by Userfocus.

    •

      @ David Travis
      Thanks for submitting that link to your list, without which the whole article is kinda pointless.

    •

      @David Travis:

      I see that graphic presentation elements I criticized in my comment @9 are from your Excel spreadsheet. I think you should *really* consider shifting to a “pie-slice” rubric for the radar display, to be more useful and less deceptive.

      •

        @Doctor Science

        Of course, you’re absolutely correct about the spider chart representation. It’s made worse by the edge case that Leigh has presented (when one of the categories doesn’t apply to a site) but I’m persuaded by your argument that even if it did apply people will base their judgement on the area rather than the size of the lines. My lame excuse is that spider charts are easy to program in Excel. The more accurate graphic representation you’re suggesting isn’t as easy to do. I’ll happily change it if there’s an Excel whiz who’s reading this thread and would like a challenge.

        But there’s a deeper issue here that I think makes the issue moot. The score itself is at best only a rough indication of a site’s usability. For example, let’s take two guidelines from the ‘Task orientation’ category: “Users can complete common tasks quickly” and “The site allows users to rename objects and actions in the interface (e.g. naming delivery addresses or accounts)”. I think most people would agree that complying with the first guideline is much more important than the second. Yet the kind of scoring scheme emphasised in this article assumes that all of the criteria have equal weight.

        I’m beginning to wish that I’d never included a score sheet in the template because people tend to get fixated on the score when it’s not the score that matters — it’s about intelligently applying the guidelines to find and fix usability problems. Although I think that a site that complies with more guidelines will be more usable than a site that complies with few, the score sheet is just 1 of 10 sheets in the template and I’d always intended it to be an indicator, not an end in itself.

        •

Yes, I completely agree that the radar chart can give a deceptive visual impression, especially when a section is not directly applicable! I’ve never had to include a blank when reviewing a client site, and when it cropped up in this simple example I felt a little uneasy because of the visual impact it had.

I think it’s a little strong to call it a lie, but maybe other forms of chart – even a simple bar chart – would be a more appropriate way of visualising the data, even if the radar charts are much cooler :)

But you are right, David. Even though the scores and visualisation give tangible information, and a pretty visual, they’re not the important part of the exercise. The important part for me is creating a repeatable procedure that gets into all the important areas of the site and ensures you cover everything that needs to be examined before writing an expert review. I often find that making notes alongside each question becomes the more important part of writing the review.

    •

      Hi David. I’m really glad you got in touch, and yes you deserve full credit for this excellent approach from what I can see on your site (I will add). I never claim to have invented any of this and to be perfectly honest I had lost all traces of where the original spreadsheet came from. I guess it’s evolved a lot as I’ve used it and revamped questions and added sections over time. I think that’s one of the advantages of the system, heuristics and categories can be added or removed over time to create a more tailored scoring system.

As for the logo, this of course is part of the documentation that is sent to clients, as the process takes a lot of work, but I will add a credit link to your site in my documentation. It’s an excellent system that really gets into the nitty-gritty of a site for writing an expert review, and I’m glad more people can read about it here – and now on your site too.

  9.

Nice reading, Leigh. Can we expect a follow-up instalment to this article? You’ve suggested several (actually more than several – 242, exactly) questions. Is it a secret, or can you reveal the particular questions?

  10.

Yes, before we could take this great idea seriously we would have to see the full set… otherwise it’s just a nice talking point (and a great idea, btw).

  11.

Very interesting indeed.
This takes the guesswork out of “judging” a website, which, more often than not, concentrates on just one aspect (“I don’t really like that color!”) while neglecting other, possibly more important things, like: Is our message clear? Is there a clear call to action? Can older people read the copy? Etc.

    This can also be very helpful when reviewing a prototype with a client, giving more “measurable” criteria to judge the site.

    Thanks

  12.

I love data visualization. Management seems to love it as well, because it boils a meeting down into something you can use to quickly ascertain where you stand.

    Sadly, if I had to run that heuristic on a project I was recently tasked with (whose client was adamant that quality not be an issue), it would be a small circular dot, all the way around.

When a higher-up is more concerned with quantity and quick-to-production, the site suffers miserably and its potential is lost, since the development team’s ability to bring it to that potential is essentially usurped from underneath their own feet.

    Good article.

  13.

Qualitative appraisals of visual design styles and formats are largely subjective. That being so, I prefer valuing a site subjectively for its functionality and “what I get out of it” as far as resources/assets or useful points of view. For instance, kernel.org is pretty spartan, but it’s the website for the main parts of the Linux kernel, in current and previous versions. Then there’s Smashing Magazine – it could be wrapped in paper towels, for all I care; it’s a unique range of design points of view represented at this site.

    Enough of my hair-splitting, though. The concept of having a heuristic methodology for web-site design analysis – that sounds, to me, like a new, kind of deep concept. I suspect that the concept could be investigated and expanded upon, in a format broader than a single article.

    I imagine that the heuristics that any one agency would prefer to use, in web site analysis, could be a somewhat agency-specific thing, but as far as general principles, I think it could apply broadly. So, I think I’ll wait for the [e]book ;}

  14.

Having seen the 247 web usability Excel sheet and experimented with it, I am feeling concerned about the article. I almost feel deceived: I read it and thought the author had in fact developed this, yet I find it is all someone else’s work. I am not seeing a clear departure from, nor improvement upon, the 247’s body of work.

    •

      It was not my intention to deceive – I’m simply talking about the method and advertising it as a great system I have been using for the last year or so and have built upon in the process.

  15.

I think that the resulting diagram should weigh the items being measured comparably. “Content quality” and “Accessibility” should have more prominence than all the other measures combined. Surely things like navigation, search and layout/design are subclasses of accessibility, or is accessibility now reserved for “non-standard users”? Credibility is a subclass of content quality.

I think the resulting diagram should not be a linear pie-chart style; it should reflect the hierarchy of each measure.

Apart from that (the visualization), I really like the methodical way you suggested for analyzing sites. Thanks.

  16.

I have recently conducted research on a sample of the UK population into the web design factors which impact a visitor’s trust, purchase intention and loyalty, so this is an interesting read.

    Where would branding and up-to-date content fit within your categories?

    For readers looking for other areas of interest on this topic, check out the measurements of WebQual and SiteQual.

    Thanks for sharing.

  17.

    Thanks for the article. I am working on my bachelor thesis in which I do research for a website of a Dutch healthcare organisation and I will use this method as one of my research methods. You’re helping me a lot.

    •

Hi Robert,

If the results of the thesis are to hand, I would love to review it as part of my desk research into best practice for these types of sites…

  18.

    I must say that this article is brilliant especially in terms of connecting the heuristics to web design.

I don’t have enough budget to hire a third party to conduct a deep review, so thanks again for this article that opens my mind a bit more.

The only thing that bothers me is the way you evaluate the weight of each category. The scoring system is clear; however, the way you evaluate each aspect is a matter of subjectivity. When you work in a small business you usually do it on your own, without any customer research, and that could be misleading.

  19.

    Hi there…great article, thanks for sharing! A couple of questions for you:

    Did you use Excel to generate the radar charts?

    And as a follow-up, do you know of any other radar chart generators that might be worth playing with?

  20.

Hi, I’m a non-technical person who is often in a text-editing position vis-à-vis websites (e.g. for my local business association). I have long wondered what “best practice” is when it comes to visualising from scratch how a website should look and hang together. How do people do it? Post-it notes on a whiteboard? Any and all tips appreciated, thanks.

  21.

Thanks for the post.

  22.

Excellent approach! I am new to the quantified research process for digital design, and this article and its resource are a great primer!

  23.

    This is an interesting article, full of good stuff. It seems to veer away from heuristics and into contextual inquiry though, does it not? Heuristics are all about snap decisions, yet the charts are getting into quality of the content as opposed to things like font selection, line height, and contrast ratio.

    Heuristics are, by definition, about the instant of trust, not the eventual value judgement.

  24.

    Thanks so much for sharing your method. I found it very refreshing to see that I’m not completely off my rocker in some of my thoughts on heuristics. Recently I conducted a heuristic analysis of a register system. While I went through it, I documented my method so that new folks would have a starting point. I’d be interested in hearing your thoughts.

    Creating a UX Assessment
    http://uxfindings.blogspot.com/2014/06/creating-ux-assessment.html

