Adding Quality to Your Design Research with an SSQS Checklist

Quantitative research is a methodology researchers use to test theories about people’s attitudes and behaviors based on numerical and statistical evidence. Researchers sample large numbers of users (e.g., through surveys) to indirectly obtain measurable, bias-free data about users in relevant situations.
“Quantification clarifies issues which qualitative analysis leaves fuzzy. It is more readily contestable and likely to be contested. It sharpens scholarly discussion, sparks off rival hypotheses, and contributes to the dynamics of the research process.”
— Angus Maddison, Notable scholar of quantitative macro-economic history
In the following video, Professor Alan Dix walks through how to choose evaluation methods and how quantitative and qualitative approaches fit together:

When we want to say whether something's good or not, it's not so obvious. And this unit is all about evaluation. Ah, well, it's a lovely day here in Tiree. I'm looking out the window again. But how do we know it's a lovely day? Well, I won't turn the camera around to show you, because I'll probably never get it pointing back again. But I can tell you the Sun's shining; there's a blue sky.
I could go and measure the temperature. It's probably not that warm because it's not early in the year. But there's a number of metrics or measures I could use. Or perhaps I should go out and talk to people and see if there's people sitting out and saying how lovely it is or they're all huddled inside. Now, for me, this sunny day seems like a good day. But last week it was the Tiree Wave Classic, and there were people windsurfing. The best day for them was not a sunny day.
It was actually quite a dull day, quite a cold day. But it was the day with the best wind. They didn't care about the Sun; they cared about the wind. So, if I'd asked them, I might have got a very different answer than if I'd asked a different visitor to the island or if you'd asked me about it. Evaluation is absolutely crucial to knowing whether something is right. But, you know, the methods of it are important – they are important to do. But they tend to be a bit boring to talk about, to be honest, because you end up with long lists of things to check.
When you're looking at an actual system, though, it becomes more interesting again. But it's not so interesting to talk about. What I want to do is talk more about the broader issues about *how* you choose *what kind* of evaluation to do and some of the issues that surround it. And it *can* be almost a conflict between people within HCI. It's between those who are more quantitative. So, when I was talking about the sunny day, I could go and measure the temperature. I could measure the wind speed if I was a surfer
– a whole lot of numbers about it – as opposed to those who want to take a more qualitative approach. So, instead of measuring the temperature, those are the people who'd want to talk to people to find out more about what it *means* to be a good day. And we could do the same for an interface. I can look at a phone and say, "How long did it take me to make a phone call?" Or I could ask somebody whether they're happy with it: what does the phone make them feel about? – different kinds of questions to ask. Also, you might ask those questions, and you can ask this in both a qualitative and quantitative way in a sealed setting.
You might take somebody into a room, give them perhaps a new interface to play with. So, you might take the computer, give them a set of tasks to do and see how long they take to do it. *Or* you might go out and watch people in their real lives using some piece of – it might be existing software; it might be new software, or just actually observing how they do things. There's a bit of overlap here – I should have mentioned at the beginning – between evaluation techniques and empirical studies. And you might do empirical studies very, very early on,
and they share a lot of features with evaluation. They're much more likely to be wild studies. And there are advantages to each. In a laboratory situation when you've brought people in, you can control what they're doing; you can guide them in particular ways. However, that tends to make it both more – shall we say – robust that you know what's going on, but less about the real situation. In the real world, it's what people often call "ecologically valid"; it's about what they *really* are up to.
But, as I said, it's much less controlled, harder to measure – all sorts of things. Very often – I mean, it's rare, or rarer, to find more quantitative studies in the wild. But you can find both. You can both go out and perhaps do a measure of people outside. You might go out on a sunny day and see how many people are smiling. Count the number of smiling people each day and use that as your measure – a very quantitative measure that's in the wild. More often, you might in the wild just go and ask people – it's a more qualitative thing.
Similarly, in the lab, you might do a quantitative thing – some sort of measurement – or you might ask something more qualitative – more open-ended. Also, you might do away with the users entirely. So, you might have users there doing it, or you might actually use what's called an *expert evaluation* method or an analytic method of evaluation. By having a structured set of questions, somebody who's got a bit of expertise, a bit of knowledge,
can often have a very good estimate of whether something is really likely to work or not. So, you can have that sort of expert-based or analytic-based evaluation method, or you can have something where you get real users in. Most people I think would say that in the end you do want to see some real users there; you can't do it all by expert methods. But often the expert methods are cheaper and quicker to do early on in the design process. So, usually both are needed, and in fact that's the general message I think I'd like to give you about this.
That, in general, it's the *combination* of different kinds of methods which tend to be most powerful. So, sometimes at different stages: you might do expert evaluation or analytic evaluation early, more with real users later. Although probably you'll want to see some users at all stages. Particularly, quantitative and qualitative methods, which are often seen as very, very different, and people will tend to focus on one or the other.
Personally, I find they fit together. Quantitative methods tend to tell me whether something happens and how common it is to happen – whether it's something I expect to see in practice commonly. Qualitative methods – the ones which are more about asking people open-ended questions – either to both tell me new things I didn't think about before, but also give me the "Why?" answers, if I'm trying to understand *why* it is I'm seeing a phenomenon. So, the quantitative things – the measurements – say, "Yeah, there's something happening. People are finding this feature difficult."
The qualitative thing helps me understand *what it is about it that is difficult* and helps me to solve it. So, I find they give you *complementary* things – they work together. The other thing you have to think about when choosing methods is what's appropriate for the particular situation. And these things don't always work. Sometimes, you can't do an in-the-wild experiment. If it's about, for instance, systems for people in outer space, you're going to have to do it in a laboratory.
You're not going to go up there and experiment while people are flying around the planet. So, sometimes you can't do one thing or the other – it doesn't make sense. Similarly, with users – if you're designing something for chief executives of Fortune 100 companies, you're not going to get 20 of them in a room and do a user study with them. That's not practical. So, you have to understand what's practical, what's reasonable and choose your methods accordingly. Key to all of this is understanding the purpose of your experimentation.
Why are you doing the evaluation in the first place? What do you want to get out of it? And there's usually said to be two main kinds of user evaluation. The first of them is what's called *formative evaluation*. And that's about "How can I make something better?". So, you've designed an interface and you're partway through. This is in the iterative process. You're in that iterative process, and you're thinking: "Okay, how do I make it better? How do I find out what's wrong with it?"
In fact, people often focus on what's wrong, though thinking of it as making it better is sometimes a more helpful framing. But very often people look for usability faults or flaws. Maybe you should be looking for *usability opportunities*. But whichever way, your aim is about making this thing better that you have in front of you. So, that's about improving the design. The other kind of evaluation you might do is towards the end of that process – often called *summative evaluation* – which is "Is it good enough? Does it meet some criteria?". Perhaps somebody's giving you something that says: "I've got to put this into the company, and everybody has got to be able to use this
within ten minutes; otherwise, it's no good." So, you have some sort of criteria you're trying to reach. So, that's more about contractual or sales obligations, and it's an endpoint thing. The two of these will often use very similar methods. You might measure people's performances, do a whole range of things. But in the first of them – the formative one – your aim is about improving things. It's about unpacking what's wrong to make it better.
In the second, your aim is about finding out whether you've done it well enough. Sometimes, people use this to try and *prove* that they've done it well enough. So, there's an interesting tension that goes on there. However, those two are important. But there's a third, which is often missed, which is: In practice people *are* doing things, but often forget and don't realize what they're doing. There isn't a good name for this one. I sometimes call it "explorative", "investigative", "exploratory".
And this is about when you want to understand something. So, I might be giving somebody a new mobile interface to use because that's the interface I'm going to deliver and I want to make it better. But I might give them the interface to use because I want to understand how they would use something like it. So, say it's a life-logging application – it's about health monitoring. You know, "How well are you feeling today?" and stuff like that. I might be more interested in finding out how that would go into their lives,
how it would fit with their lives, how it would make sense to them – the kinds of things they would want to log. Later on, then, I might throw away completely what I've designed. So, it wasn't an early design; it was more an exploratory thing – a thing to find out. Now, you'll certainly do that from an academic research point of view if you're doing a Ph.D. or if you're doing research in HCI. But it's also true early in a design process. Your aim is more to understand the situation than it is to make something that's going to get better
or to say it's good enough. It's very easy to confuse these goals. That's why I'm telling you about them – because your goal, what you're really after, might be investigative but you might address your experiment as if it's summative – a "good enough" answer: "Yes, it was good enough." That doesn't tell you anything. So, if I had this health application and I found that people enjoyed using it,
what does that tell me? What have I learned? So, if you know *what* you're trying to address, you can then tune your evaluation for that. And when does this process end? Evaluation could go on forever, especially if you think about these iterative processes. That's not true of the summative one – that's when you *do* get to the end. But in these formative evaluations, when do you get to the end of that? Now, there would have been a time when you'd say, "Well, it's when we deliver the product;
when it goes into shrink-wrap and we put it on shelves." Nowadays, you may have heard the term "perpetual beta": the idea that with web applications you're constantly putting them up there, tweaking them, making them better, experimenting effectively, often with real users. So, in some sense, real use is the ultimate evaluation. Because of that, actually as you design one of the things you might want to think about is how you are going to get that information from use
in order to help you design. In fact, last week at the Wave Classic – the surfer event – I've been involved in designing a local history application for the island that I'm on. And we were able, just in time, to get a version of this out for the Wave Classic. I know, because of the number of downloads and access to feeds from logs, that some people were using the application. But I don't know whether they used any of the history things or they just used some of the other facilities on it,
because I was a bit last-minute; I didn't get a chance to get the logging in. So, I'm getting real use, but I wasn't getting information to help improve the future one. So, certainly for future prototypes, we will actually have this in. But when you design, you can actually think about how you're going to gather information from real use to help you improve things.
See how quantitative research helps reveal cold, hard facts about users which you can interpret and use to improve your designs.
Quantitative research is a subset of user experience (UX) research. Unlike its softer, more individual-oriented “counterpart”, qualitative research, quantitative research means you collect statistical/numerical data to draw generalized conclusions about users’ attitudes and behaviors. Compare and contrast quantitative with qualitative research, below:
| | Quantitative Research | Qualitative Research |
|---|---|---|
| You aim to determine | The “what”, “where” & “when” of the users’ needs & problems – to help keep your project’s focus on track during development | The “why” – to get behind how users approach their problems in their world |
| Methods | Highly structured (e.g., surveys) – to gather data about what users do & find patterns in large user groups | Loosely structured (e.g., contextual inquiries) – to learn why users behave as they do & explore their opinions |
| Number of representative users | Ideally 30+ | Often around 5 |
| Level of contact with users | Less direct & more remote (e.g., analytics) | More direct & less remote (e.g., usability testing to examine users’ stress levels when they use your design) |
| Statistically | Reliable – if you have enough test users | Less reliable – non-numerical data (e.g., opinions) needs great care in handling, as your own opinions might influence findings |
Quantitative research is often best done from early on in projects since it helps teams to optimally direct product development and avoid costly design mistakes later. As you typically get user data from a distance—i.e., without close physical contact with users—also applying qualitative research will help you investigate why users think and feel the ways they do. Indeed, in an iterative design process quantitative research helps you test the assumptions you and your design team develop from your qualitative research. Regardless of the method you use, with proper care you can gather objective and unbiased data – information which you can complement with qualitative approaches to build a fuller understanding of your target users. From there, you can work towards firmer conclusions and drive your design process towards a more realistic picture of how target users will ultimately receive your product.
Quantitative analysis helps you test your assumptions and establish clearer views of your users in their various contexts. (Image: Teo Yu Siang and the Interaction Design Foundation; CC BY-NC-SA 3.0.)
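The “Ideally 30+” guideline in the table above is a rule of thumb. If you want to put a number on “enough test users” for a survey, one standard approach is the margin-of-error formula for a proportion, n = z²·p(1−p)/e². The sketch below is a minimal illustration of that general formula in Python, with assumed example values (a 95% confidence level and a ±5% margin of error); the figures are illustrative, not taken from this article.

```python
import math

def required_sample_size(confidence_z: float = 1.96,
                         expected_proportion: float = 0.5,
                         margin_of_error: float = 0.05) -> int:
    """Sample size for estimating a proportion: n = z^2 * p * (1 - p) / e^2.

    Using p = 0.5 gives the most conservative (largest) estimate.
    """
    n = (confidence_z ** 2) * expected_proportion * (1 - expected_proportion) \
        / margin_of_error ** 2
    return math.ceil(n)

# 95% confidence, +/-5% margin of error -> roughly 385 respondents
print(required_sample_size())                       # 385
# Relaxing the margin to +/-10% shrinks the requirement considerably
print(required_sample_size(margin_of_error=0.10))   # 97
```

Note how quickly the required number falls as you relax the margin of error – which is why smaller samples can still be acceptable when you only need a rough signal rather than a precise estimate.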
There are many quantitative research methods, and they help uncover different types of information on users. Some methods, such as A/B testing, are typically done on finished products, while others such as surveys could be done throughout a project’s design process. Here are some of the most helpful methods:
- A/B testing – You test two or more versions of your design on users to find the most effective one. Each variation differs by just one feature and may or may not affect how users respond. A/B testing is especially valuable for testing assumptions you’ve drawn from qualitative research (see the significance-check sketch after this list). The only potential concerns here are scale—in that you’ll typically need to conduct it on thousands of users—and arguably more complexity in terms of considering the statistical significance involved.
- Analytics – With tools such as Google Analytics, you measure metrics (e.g., page views, click-through rates) to build a picture (e.g., “How many users take how long to complete a task?”).
- Desirability Studies – You measure an aspect of your product (e.g., aesthetic appeal) by typically showing it to participants and asking them to select from a menu of descriptive words. Their responses can reveal powerful insights (e.g., 78% associate the product/brand with “fashionable”).
- Surveys and Questionnaires – When you ask for many users’ opinions, you will gain massive amounts of information. Keep in mind that you’ll have data about what users say they do, as opposed to insights into what they actually do. You can get more reliable results if you incentivize your participants well and use the right format.
- Tree Testing – You remove the user interface so users must navigate the site and complete tasks using links alone. This helps you see if an issue is related to the user interface or information architecture.
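To make the A/B-testing point above concrete, here is a minimal sketch of how you might check whether a difference in conversions between two design variants is statistically significant. The visitor and conversion counts are invented for illustration, and the chi-square test on a 2×2 contingency table is one common choice rather than the only valid test.

```python
from scipy.stats import chi2_contingency

# Hypothetical results: variant A converted 120 of 2,400 visitors,
# variant B converted 160 of 2,450 visitors.
conversions = {"A": (120, 2400), "B": (160, 2450)}

# Build the 2x2 contingency table: [converted, did not convert] per variant.
table = [[conv, total - conv] for conv, total in conversions.values()]

chi2, p_value, dof, expected = chi2_contingency(table)

for name, (conv, total) in conversions.items():
    print(f"Variant {name}: {conv / total:.2%} conversion")
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")

# Using the conventional 5% threshold discussed later in this article:
if p_value < 0.05:
    print("Difference is statistically significant.")
else:
    print("Difference could plausibly be chance - keep testing.")
```

The same pattern works for any two-variant comparison in which each visitor either did or did not do the thing you care about.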
Another powerful benefit of conducting quantitative research is that you can keep your stakeholders’ support with hard facts and statistics about your design’s performance—which can show what works well and what needs improvement—and prove a good return on investment. You can also produce reports to check statistics against different versions of your product and your competitors’ products.
Most quantitative research methods are relatively cheap. Since no single research method can help you answer all your questions, it’s vital to judge which method suits your project at the time/stage. Remember, it’s best to spend appropriately on a combination of quantitative and qualitative research from early on in development. Design improvements can be costly, and so you can estimate the value of implementing changes when you get the statistics to suggest that these changes will improve usability. Overall, you want to gather measurements objectively, where your personality, presence and theories won’t create bias.
Take our User Research course to see how to get the most from quantitative research.
When developing a product or service, it is *essential* to know what problem we are solving for our users. But as designers, we all too easily shift far away from their perspective. Simply put, we forget that *we are not our users*. User research is how we understand what our users *want*, and it helps us design products and services that are *relevant* to people. User research can help you inspire your design,
evaluate your solutions and measure your impact by placing people at the center of your design process. And this is why user research should be a *pillar* of any design strategy. This course will teach you *why* you should conduct user research and *how* it can fit into different work processes. You'll learn to understand your target audience's needs and involve your stakeholders.
We'll look at the most common research techniques, such as semi-structured interviews and contextual inquiry. And we'll learn how to conduct observational studies to *really understand what your target users need*. This course will be helpful for you whether you're just starting out in UX or looking to advance your UX career with additional research techniques. By the end of the course, you'll have an industry-recognized certificate – trusted by leading companies worldwide. More importantly, you'll master *in-demand research skills* that you can start applying to your projects straight away
and confidently present your research to clients and employers alike. Are you ready? Let's get started!
See how quantitative research methods fit into your design research landscape.
This insightful piece shows the value of pairing quantitative with qualitative research.
Find helpful tips on combining quantitative research methods in mixed methods research.
Qualitative and quantitative research differ primarily in the data they produce. Quantitative research yields numerical data to test hypotheses and quantify patterns. It's precise and generalizable. Qualitative research, on the other hand, generates non-numerical data and explores meanings, interpretations, and deeper insights. Watch our video featuring Professor Alan Dix on different types of research methods.
This video elucidates the nuances and applications of both research types in the design field.
In quantitative research, determining a good sample size is crucial for the reliability of the results. William Hudson, CEO of Syntagm, emphasizes the importance of statistical significance with an example in our video.
One of the main issues that we do need to focus on is this question of the term *statistical significance*. Basically, all that means is that if we have two different groups and we've shown them different designs or they're in two different categories – these people converted; these people didn't convert, or these people saw design A; these people saw design B – and we get *different counts* as a result, are those counts statistically significant?
If we were to do this all over again with a different set of participants, would we see similar results? Now, the example of people who converted and people who didn't convert – that doesn't usually need statistical significance or separate testing because you'd be lucky to get more than about 5% of visitors converting. But take design A versus design B. We might see two different figures for conversion from *that* lot.
And we'd want to know, well, is that meaningful? Was that a really successful design or are we just barking up the wrong tree in statistical terms? So, the whole question is whether we would get these kinds of results again or whether these results are the product of chance. So, we have to understand a bit about statistics in order to be able to know how to test this and to know what the results actually mean in terms of significance. Here's an example.
So, this actually is taken straight from an example on Optimal Workshop's Chalkmark. So, Chalkmark is Optimal Workshop's version of first-click testing. And that little wireframey thing right in the middle of the screen is the thing being tested. You don't need much in terms of visual design in order to be able to do first-click testing, which this is an example of. And in this particular case – this is using their own figures – they had 60 people who clicked in what we thought was the right place.
So, in these early-design tests, you're often allowed to state what is considered to be success and what isn't – which is great because it means you can actually get some overall validation of what it is you're trying to do. We had, I think, a little bit more than 100 participants. We had 60 people click in the right place, and we had 42 people click in the wrong place. So, those are just shown on the slide as success and failure.
And we had one person skip, who we're going to ignore. Now, 60 is bigger than 42, but is it very much bigger? They're both numbers that are quite close to the middle. And if you look at the pie chart, it's clearly in favor of the successful side. There's more green than red, but it isn't exactly an overwhelming result. So, we need to run a test of statistical significance. Now, this is what's called *categorical data*. We have two counts: one of 60 and one of 42.
And, for that, we have a very well-understood and popular test called *chi square*. And we can do this very simple test even with just Excel, or lots and lots of online websites that will offer to do chi-square tests for you. And all we have to do is put in our counts what we were expecting and what we got. Now, in terms of what we were expecting if it was *random* selection, if we simply flipped a coin for every participant taking part in this,
we would expect a 50/50 ratio. So, that would be 102 total. So, 51 get it right and 51 get it wrong. But we've got the actual figures of 60 and 42. And it turns out that there is a 7% chance of this result occurring *randomly*. And that really isn't good enough. We tend to use in user experience, as in a lot of social research, a figure of 95%, leaving a 5% chance of random occurrence.
Here, we're only 93% certain that this is not random. And so, we would actually say this is not statistically significant. And that's the kind of thing that we need to do with a very large amount of the results that we're looking at in all kinds of tools, if not all of them. In some cases – you know – if you've got "90% of users did this", then you probably can get away without running a separate significance test. But that's the kind of thing that we need to do, and we'll be talking about how to do these things.
So, what conclusions can we reach with this particular result? Is the wireframe a poor design? Well, probably – people really aren't doing terribly well; we're not doing really significantly better than random; just flipping a coin. So, there isn't a large enough difference between the people who got it right and the people who got it wrong. And one of the things that we need to be concerned about is *how engaged users were*. Were they actually following the instructions?
Or were they just clicking blindly in order to be able to get through to the end of the study so they could claim compensation? — Some kind of remuneration for having taken part in the study, which I'm afraid is something that happens – not all that frequently, but it's certainly a known hazard. And you could expect in any particular study that maybe 5–10% of your participants will not actually be paying attention to what they're actually doing. So, we need to make sure that we're looking at clean data and that we don't have other sources of noise.
And, of course, one of the sources of noise is just going to be our choice of terminology. And if we're using words that users *aren't* quite certain about, then, yes, we might expect half of them to get it wrong.
He illustrates that even with varying results between design choices, we need to discern whether the differences are statistically significant or products of chance. This ensures the validity of the results, allowing for more accurate interpretations. Statistical tools like chi-square tests can aid in analyzing the results effectively. To delve deeper into these concepts, take William Hudson’s Data-Driven Design: Quantitative UX Research Course.
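As a quick check on the worked example in the transcript (60 successful first clicks versus 42 unsuccessful ones, against the 51/51 split a coin flip would predict), here is a minimal goodness-of-fit sketch using SciPy. The numbers come from the video; the test mirrors the chi-square approach William Hudson describes.

```python
from scipy.stats import chisquare

observed = [60, 42]   # successful vs. unsuccessful first clicks
expected = [51, 51]   # what a 50/50 coin flip would predict for 102 people

result = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {result.statistic:.2f}")   # ~3.18
print(f"p-value    = {result.pvalue:.3f}")      # ~0.075, i.e. roughly a 7% chance of
                                                # this split arising randomly

# At the conventional 5% threshold (95% confidence), p ~ 0.075 is not
# statistically significant - matching the conclusion in the video.
print("significant" if result.pvalue < 0.05 else "not significant")
```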
Quantitative research is crucial as it provides precise, numerical data that allows for high levels of statistical inference. Our video from William Hudson, CEO of Syntagm, highlights the importance of analytics in examining existing solutions.
*When and Why to use Analytics* Primarily, we're going to need to be using analytics on existing solutions. So, if you're talking about *green field* – which is a brand-new solution, hasn't been built and delivered yet – versus *brown field* – which is something that's already running but perhaps we want to improve it – then we're decidedly on the brown field side.
So, we're looking at existing solutions because it's only existing solutions that can provide us with the analytics. If you haven't got an existing solution, you're going to have to use another technique. And there are obviously many other techniques, but they're not going to provide you with much in the way of *quantitative data*. We do have early-research methods, which we'll be talking about very briefly as an alternative, but predominantly analytics for existing deployed solutions.
Having said that, then if you're looking at a rework of an existing site or app, then looking at current analytics can tell you a lot about what you might like to address; what questions you might like to raise with your team members, stakeholders, users. So, those are important considerations. A good starting point in organizations or teams with low UX maturity is analytics because analytics are easier to sell – to be honest – than qualitative methods.
If you're new to an organization, or if they're only just getting into user experience, it can be hard to persuade colleagues that they should be making important decisions on the basis of six to eight qualitative sessions, which is typically what we do in the usability lab; by comparison, you should find web analytics a much easier thing to persuade people with. And the other issue particularly relevant to qualitative methods
is that quantitative methods tend to be very, very much cheaper – certainly on the scale of data, you are often having to talk in terms of hundreds of dollars or pounds per participant in a *qualitative* study, for various expenses; whereas a hundred dollars or pounds will get you potentially hundreds or thousands of users. And, in fact, if you're talking about platforms like Google Analytics which are free, there is no cost other than the cost of understanding and using
the statistics that you get out; so, obviously it is very attractive from a cost perspective. Some of the things that we'll be needing to talk about as alternatives to analytics or indeed *in addition* to analytics: Analytics can often *highlight* areas that we might need to investigate, and we would then have to go and consider what alternatives we might use to get to the bottom of that particular problem.
Obviously, *usability testing* because you'll need to establish *why* users are doing what they're doing. You can't know from analytics what users' motivations are. All you can know is that they went to *this* page and then they went to *that* page. So, the way to find out if it isn't obvious when you look at the pages – like there's something wrong or broken or the text makes no sense – is to bring users in and watch them actually doing it, or even use remote sessions – watching users doing the thing that has
come up as a big surprise in your analytics data. A/B testing is another relatively low-cost approach. It's – again – a *quantitative* one, so we're talking about numbers here. And A/B testing, sometimes called *multivariate testing*, is also performed using Google Tools often, but many, many other tools are available as well; and you show users different designs;
and you get statistics on how people behaved and how many converted, for example. And you can then decide "Well, yes, putting that text there with this picture over here is better than the other way around." People do get carried away with this, though; you can do this ad nauseam, to the point where you're starting to change the background color by minute shades to work out which gets you the best result. These kinds of results tend to be fairly temporary. You get a glitch and then things just settle down afterwards.
So, mostly in user experience we're interested in things which actually really change the user experience rather than getting you temporary blips in the analytics results. And then, finally, *contextual inquiry* and *early-design testing*: Contextual inquiry is going out and doing research in the field – so, with real users doing real things to try to find out how they operate in this particular problem domain; what's important to them; what frustrations they have;
how they expect a solution to be able to help them. And early-design testing – mostly in the web field these days but can also be done with software and mobile apps; approaches like *tree testing* which simulate a menu hierarchy. And you don't actually have to do anything other than put your menu hierarchy into a spreadsheet and upload it – it's as simple as that; and then give users tasks and see how they get on.
And you can get some very interesting and useful results from tree testing. And another early-design testing approach is *first-click testing*. So, you ask users to do something and you show them a screenshot – it doesn't have to be of an existing site; it can be just a design that you're considering – and find out where they click, and is where they click helpful to them? Or to you? So, these are examples of early-design testing – things that you can do *before* you start building
a product to work out what the product should look like or what the general shape or terminology or concepts in the product should be. And both of these can be used to find out whether you're on the right track. I have actually tested solutions for customers where users had no idea what the proposition was: "What does this site do?"; "What are they actually trying to sell me?" or "What is the purpose of it?" – and it's a bit late to be finding that out in usability testing towards the end of a project, I have to say. And that was indeed exactly what happened in this particular example
I'm thinking of. So, doing some of these things really early on is very important and, of course, is totally the opposite of trying to use web analytics, which can only be done when you finish. So, do bear in mind that you do need some of these approaches to be sure that you're heading in the right direction *long before* you start building web pages or mobile app screens. Understand your organization's *goals* for the interactive solution that you're building.
Make sure that you know what they're trying to get out of it. Speak to stakeholders – stakeholders are people typically within your organization who have a vested interest in your projects. So, find out what it's supposed to be doing; find out why they're rebuilding this site or why this mobile app is being substantially rewritten. You need to know that; so, don't just jump in and start looking for interesting numbers.
It's not necessarily going to be that useful. Do know the solutions; become familiar with them. Find out how easy it is to use them for the kinds of things which your stakeholders or others have told you are important. Understand how important journeys through the app or website work. And get familiar with the URLs – that's, I'm afraid, something that you're going to be seeing a lot of in analytics reports – the references for the individual pages or screens,
and so that you'll understand, when you actually start looking at reports of user journeys, what that actually means – "What do all these URLs mean in my actual product?" So, you're going to have to do some homework on that front. You're also going to have to know the users – you need to speak to the users; find out what they think is good and bad about your solutions; find out how they think about this problem domain and how it differs from others and what kind of solutions they know work and what kind of problems they have with typical solutions.
Also ask stakeholders and colleagues about known issues and aspirations for current solutions. So, you know, if you're in the process of rebuilding a site or an app, *why* – is it just slow-ish? Is it just the wrong technology? Maybe. Or are there things which were causing real problems in the previous or current version that you're hoping to address in the rebuild?
Quantitative methods, like analytics and A/B testing, are pivotal for identifying areas for improvement, understanding user behaviors, and optimizing user experiences based on solid, empirical evidence. This empirical nature ensures that the insights derived are reliable, allowing for practical improvements and innovations. Perhaps most importantly, numerical data is useful to secure stakeholder buy-in and defend design decisions and proposals. Explore this approach in our Data-Driven Design: Quantitative Research for UX course and learn from William Hudson’s detailed explanations of when and why to use analytics in the research process.
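To illustrate the kind of journey-level question analytics can answer, here is a minimal, hypothetical sketch: the event log, page names and column names are invented for illustration and are not tied to Google Analytics or any specific tool. It counts how many distinct users reach each step of a simple funnel and flags the biggest drop-off – exactly the sort of finding you would then take to usability testing to understand *why*.

```python
import pandas as pd

# Hypothetical page-view events exported from an analytics tool.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5],
    "page":    ["home", "product", "checkout",
                "home", "product",
                "home", "product", "checkout",
                "home",
                "home", "product"],
})

funnel_steps = ["home", "product", "checkout"]

# Distinct users who reached each step, in funnel order.
reached = [events.loc[events["page"] == step, "user_id"].nunique()
           for step in funnel_steps]

for step, count in zip(funnel_steps, reached):
    share = count / reached[0]
    print(f"{step:<9} {count} users ({share:.0%} of those who started)")

# Where do we lose the most users?  (A question analytics can raise,
# but only qualitative research can explain.)
drop_offs = [a - b for a, b in zip(reached, reached[1:])]
worst = funnel_steps[drop_offs.index(max(drop_offs)) + 1]
print(f"Largest drop-off happens before the '{worst}' step.")
```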
After you’ve established initial requirements, quantitative research provides the statistical data you need to make informed decisions. William Hudson, CEO of Syntagm, sheds light on the role of quantitative research throughout a typical project lifecycle in this video:
This is a very typical project lifecycle in high-level terms. Generally start off with *requirements* – finding out what's needed, and we go off and talk to stakeholders. And one of the problems we have with *user requirements*, in particular, is that often analysts and requirements researchers in the IT world tend to go off and want to ask *users* what they want.
They don't really understand that users don't quite know what they want, that you actually need to do user research, and that is one of the biggest issues that we face in user experience: the lack of understanding of user research and the whole field of user experience. From requirements, we might expect to be doing surveys to find out – particularly if we have an existing offering of some kind – we might find out what's good about it, what's not so good about it,
what people would like to do with it. And surveys might be helpful in those particular areas. Now, bear in mind that generally when we're talking about surveys, we already need to have some idea of the questions and the kinds of answers people are going to give us. It is really a very bad plan to launch a large survey without doing some early research on that, doing some qualitative research on how people think about these questions and these topics
and trying to understand it a little bit better before we launch a major initiative in terms of survey research. We can also use surveys in *analysis and design* perhaps to ask people which kinds of things might work better for their particular needs and behaviors. We also can start to employ *early-design testing*, even in the analysis and design phase so that we've got perhaps some wireframes that we're thinking about on the *design* side,
and we can start to *test* them – start to try to find out: "Will people understand this? Will they be able to perform the most important tasks from their perspective?" I have been involved in user testing of new product ideas where users had *no idea* what the service being offered was about because it was just presented *so confusingly*; there was no clear message; there was no clear understanding of the concepts behind the message because it wasn't very clear to start with, and so on.
So, early-design testing really has an important role to play there. *Implementation* and *testing* – that's when we can start doing a lot more in terms of evaluating what's going on with our products. There we would employ *usability evaluations*. And the things that I've called "early-design testing", by the way, can be done later on too. It's just they don't really involve the finished product. So, they're perhaps not quite as relevant. But if we've got questions about how the navigation might be changed,
then we might fall back to the tree testing where we're just showing people the navigation hierarchy rather than the whole site and asking them to perform tasks and just tweak the navigation as required to improve that. And one of my big complaints with our whole industry – still, after all these decades! – is that we do tend only to be allowed to do usability evaluations, and we do tend to wait until implementation has taken place
and the product is being tested before we start to try to involve real users, which really is far too late in the whole process. If you want to be able to be confident in the concepts and the terminology that your interactive solution is providing to your users and customers, then that needs to start way back at the beginning of the project cycle. And then, finally, once we've got live solutions available,
we can use *analytics* for websites and apps and we can also use A/B and multivariate testing to make sure that our designs are optimal. If we find problems, we might set up an A/B experiment to see whether this particular alternative would be a better solution or we could go down the multivariate route where we provide permutations of a *number* of different design elements on a particular page and see which of those elements proved to be the most effective.
The fact is, if you're doing project development – software development in an iterative environment like agile, for example – then you might be doing a little bit of this in every single iteration; so, there might be a little bit of work on the requirements at the front and there might be a little bit of design and analysis. Having said that, there is usually some upfront requirements and analysis and design that has to go on so that you know what *shape* your project is
– what *shape and size* I think is perhaps a better or more complete description – because in order for you to be able to even guess at how long this is going to take you, you need to have *scoped* it. And to scope it means to set the boundaries, and to set the boundaries means to understand the requirements and to understand what kind of solutions would be acceptable; so, there will be some of this done always up front. Anybody who sets on a major project *without* doing upfront requirements analysis and design of some sort
is – I'm afraid – probably asking for trouble.
During the analysis and design phases, quantitative research helps validate user requirements and understand user behaviors. Surveys and analytics are standard tools, offering insights into user preferences and design efficacy. Quantitative research can also be used in early design testing, allowing for optimal design modifications based on user interactions and feedback, and it’s fundamental for A/B and multivariate testing once live solutions are available.
To write a compelling quantitative research question:
- Create clear, concise, and unambiguous questions that address one aspect at a time.
- Use common, short terms and provide explanations for unusual words.
- Avoid leading, compound, and overlapping queries and ensure that questions are not vague or broad.
According to our video by William Hudson, CEO of Syntagm, quality and respondent understanding are vital in forming good questions.
We're going to be talking about writing good questions. The quality of the questions, along with the quality of the respondents, is really key. So, you have to have questions which people understand, that they really can address *unambiguously*; they don't have to sit wondering what it is you meant by that, and to do it pretty quickly, too – they need to be *short*.
So, we have a long list of various points you should take into account. And it starts with the issue of *shortness and ambiguity*. So, we want things to be as *short as possible*. We want to use short questions, short words. We want to be *very specific*, and we want to be *very unambiguous*. So, you don't ask vague questions. If you want to know about whether somebody has done something, you need to specify over what period. A lot of questionnaires these days, I've noticed, ask,
'What did you do yesterday?' or 'Did you watch this yesterday? Did you do that yesterday?' And you could of course change that period as required, but it's very unrealistic to expect people to either mind-read if you're not being at all clear about which period you mean, or to remember very far back. And that changes with the topic, of course. Somebody might remember when they were married from a long time ago, but the last time they visited a coffee shop – you might not get away with more than a couple to four weeks, for example.
Do use *common terms* and provide *explanations* where you're using anything that's a little bit unusual. And common terms in English are generally fairly short. So, use short words. Short words appear more frequently, are used more frequently. The number of syllables is actually a good indicator; the more syllables, the less frequent the word is used in English. And with a lot of interactive survey tools, you are able to provide explanations.
If nothing else works, then just put something parenthetically after the question to explain what it is you mean by that particular term. Start the questionnaire or the survey with *easy, uncontroversial questions*. If you're going to go on to be asking sensitive questions or things that people might get slightly aggravated or upset about, then try to leave that as late as possible. You want people to get engaged; you want them to feel comfortable, and to an extent trust you.
And starting with controversial questions or in some way irritating questions is not a good way to do that. Do *use words to label all points on a scale*. I've got a terrible question that I've made up about Brexit. But I've labeled every single point. Now, this used to be for me a debatable issue, that I suggested that perhaps we should just label the two ends and number the points in between, but I've seen good evidence and firm advice that you should label all of the points in between the two.
Certainly on seven – this is a seven-point response – five or seven is fairly typical. And in those cases, you would expect to see words being used there. Address only *one aspect of a question at a time*. *Avoid compounds* like: 'I found the website quick and easy to use,' because there is the possibility that somebody found it easy to use but not quick, or quick but not easy. So, stop that from being a problem by asking those questions separately. Just like in navigation design, *do not use overlapping categories or ranges*.
People need to be very clear about where they should be clicking; so, you want '20-49', '50-64,' '65 and over', rather than '20-50', '50-65', '65 and over' because you can see in that latter example that there are two places where people, if they were exactly 50 or exactly 65 would not be clear about where you actually want them to answer. And *do not ask leading questions* if you want honest responses,
or any responses at all, for that matter. Certainly, during various elections that have taken place over the last couple of years, I have seen numerous alleged surveys come round where clearly it is not a questionnaire or a survey at all; it is a party political statement, and I just stop as soon as I realize that that is the case. And I believe a lot of other people do as well. So, this example on 'How bad an idea is Brexit?' is an example of a leading or loaded question, because of course some people think that Brexit is a perfectly good idea
and we should not be talking about it as a bad idea in their eyes. So, this is something that you would not do. In fact, there are quite a few things in that particular example you should not do, so just treat that as a bad example... *Avoid question grids*; I know that they are extremely popular, but if you can avoid them, they are worth avoiding because they are intimidating, certainly with the large question grids, people turn over to them online typically, and they immediately get put off. It looks very complicated. It is quite complicated.
If they were on the brink of not completing your questionnaire, that probably has pushed them over the edge. And if you're unlucky and are working with a service provider for your questionnaire, your survey tool, that doesn't support converting these into individual questions, then you'll find that they don't actually work on mobile phones. And that certainly does happen from time to time, and of course we can expect more people to be
using mobile phones as their primary internet tool; so, that would be a bad plan. Really the best way of approaching these is to ask the questions separately. And this is actually quite a bit more likely in the problem domain that we're talking about, which is user experience, rather than, say for example, market research where you might have a list of 10 or 20 coffee shops down the left-hand side and the frequency of use across the top. That's still not easy for respondents, but it's perhaps a little bit easier than asking them
deep, meaningful questions like this particular one about Barack Obama. *Ranking questions* are those where you're asked to re-order the responses. And the *problem* with them is twofold. One is that it's actually not very easy as a participant to do this, that you have to think about, 'Well, which of these is my preference? Which is my second preferred, third preferred?' etc. And it's off-putting to them; it's time-consuming; it's hard to do in some cases,
particularly on a mobile platform, but it might be quite technically challenging, just the dexterity required, and it's also off-putting; you're making people think really hard about something where the detailed answers are not all that important. What does it matter to you in a list of five whether somebody lists something fourth or fifth? It's not first; it's not second; so, do you really care that much about it?
It turns out to be very easy and almost equivalent. I have not seen a paper saying that they're equivalent, by the way, but certainly when I've done it, I've not been disappointed. Just ask people to choose their *favorite*. Or, if you've got a long list, maybe their favorite top 'n', where 'n' might be 2, 3, 4 ... And you're not talking about order, then; you're just talking about which is the most favorite or which are the two most favorite. And when you come to analyze those, the analysis is very much simpler because it is just the items with the biggest numbers are the most popular.
And that was the second point with the whole ranking thing – is that you have to do a *weighted analysis* in order to get sensible results from the ranking, which means taking into account where in the list these things appeared; whereas if you're just asking people to choose their favorite, it is very much easier and it's just slightly more complicated for their favorite two or their favorite three. And most survey tools will let you set a *maximum and minimum number of responses* for this kind of question.
So, if you really insist on having two choices from a long list, then the tool can complain to participants that they've not selected two; or, preferably, setting a maximum is the better way of doing it, so people cannot choose the full number if they really don't care that much. So, that is something I would just strongly recommend you avoid altogether. It's really of no particular benefit. It looks fun when you're looking at it in the tool and maybe on the screen yourself.
But when you're talking about lots of respondents getting to it and dealing with it, it's not fun for them. Open-ended questions do allow some flexibility, and of course they're not in any way going to replace an interviewer who can dig a lot more deeply and try to interpret or understand what participants are saying. They have got their use, though, open-ended questions; the set of possible answers is unknown
or very large, is the main reason for doing that. So, if you've got a list of items and there are other possibilities, you will have an 'other' response, and under the 'other' response you will have a text box for people to fill in. Perhaps you want to know the underlying cause of a response – 'Why did you rate us like that?'; 'What was the main thing?' And that's certainly very commonly done and a perfectly good use of open-ended responses. Or you need to allow participants to express *unanticipated concerns*.
So, 'Is there anything that we could have done to have prevented this or to make you happier?' – that kind of question. And these two examples from SurveyMonkey: the top one is the final question in most SurveyMonkey templates. It's open-ended. And that's recommended for almost all surveys, that you give people a chance to comment about either your organization or the questionnaire or just general comments that they might have. 'Is there anything else you'd like to add?' is the kind of question you would ask there.
The bottom one is actually from their market research template, and it has a lot of open-ended questions in it, but it's one of the very few SurveyMonkey templates that actually has more than one or two open-ended questions; so, you really can largely – and perhaps should try largely – to stick to multiple choice questions. An alternative – and you may see this yourself as a respondent to various questionnaires – an alternative to extensive open-ended questions is to invite survey participants to take part
in an *online or telephone interview*. So, you might ask them a few questions, and they might express some reservations about certain aspects of your product or service. And if they did that, you might offer them the chance to be interviewed at depth, either by telephone or through some online collaboration tool. And of course that becomes, then, a primarily qualitative approach, and you would only do this with a relatively small number of participants,
mostly because it will be moderately time-consuming; you could expect it to take at least half an hour, and you will need a qualified interviewer to do that. Do make sure, if you're using this approach, that the interviewer has access to the respondent's initial questionnaire, so that they're not repeating questions or completely unaware of the participant's background or complaint. *Semi-structured interviewing* is the most appropriate technique in most cases. So, this would be a follow-up set of questions and then just the interviewer exploring
some of the responses that the participant has given already.
He emphasizes the importance of addressing specific aspects and avoiding intimidating and confusing elements, such as extensive question grids or ranking questions, to ensure participant engagement and accurate responses. For more insights, see the article Writing Good Questions for Surveys.
Survey research is typically quantitative, collecting numerical data and using statistical analysis to draw generalizable conclusions. However, it can also have qualitative elements, mainly when it includes open-ended questions that allow for expressive responses. Our video featuring the CEO of Syntagm, William Hudson, provides in-depth insights into when and how to effectively utilize surveys in the product or service lifecycle, focusing on user satisfaction and potential improvements.
Looking at when to use surveys relative to the product or service lifecycle? Well, you might have an existing solution and for that you may well want to consider a survey after every *major release*; or perhaps on a *calendar basis*, every quarter you might run a customer satisfaction or a user satisfaction survey. And that allows you to keep a pulse on things to know how your offerings are faring
in your users' or customers' eyes. If you have specific ideas for improvements, you can also ask users about that, ask customers to tell you the sorts of things your competitors are doing that they like or perhaps ask them about *sticking points* in your current solutions: what is it that they find problematic, where would they like to see improvements? And, of course, in those kinds of areas, you are talking more about open-ended. But certainly, if you've got a list of things that your competitors are doing, it's very easy to ask people
whether they would be interested or would find those particular features useful. For new solutions, you often end up using quantitative research, which is what a survey is, to what we call "triangulate" – get extra data about – back up – qualitative research that you've done. So, you might have gone out and done some contextual inquiry and you might have some really exciting ideas about new product directions, and you want to make sure that that makes sense for the majority of your customer base.
You wouldn't just go out on the strength of a dozen interviews and launch a new product or major revisions to a product or service. So, the use of a survey is really almost essential in those kinds of cases. *Alternatives* – I mentioned contextual inquiry. The great thing about contextual inquiry is that it's grounded. We go out and speak to real people about real situations in a fairly – what's called – *ethnographic way*.
So, we're trying to do it in their own settings, where they would be using this product or service. And a contextual inquiry is *extremely exploratory*. So, if you start hearing about certain ideas on a regular basis, you can start asking *more* about that and try to expand the scope of your inquiries to cover these new concepts and find out a lot more about the product or service that you should be providing, as opposed to the one that you perhaps currently are or were planning to.
*Semi-structured interviews* – well, these are a really important part of most qualitative research and, in fact, are used in contextual inquiry as well, but they aren't necessarily as well grounded. We don't necessarily go out into the user's environment to do those, but one of the attractions there is that we can start off in both of these examples – contextual inquiry and semi-structured interviews – start off with a collection of initial questions and then explore from those.
So, we might have only a short list of topics that we definitely wanted to cover and we'll let the conversation ramble into interesting connected areas – not just ramble in general, by the way; "interesting connected areas" is an important part of that. You want to make sure that you're still within the focus of your inquiry, of your research. *Card sorting* – it's really good for early research for finding the relationships between concepts. We've got concepts on cards, and we ask people to sort those cards
into groups, either of their own creation, so they're allowed to make the groups up themselves – that's called an *open sort* – or a *closed sort*, where we provide the groups and we want to see if people agree with where they're putting things; and in between, of course, those two is something that I call a *hybrid sort*. It has different names. And there are other early-testing tools, which we do talk about elsewhere. Those are, I should say, *tree sorting* or *tree testing* and *first-click testing*, where we're trying out very specific things; we give users a goal, and we try to see how they
address that goal with the solutions that we're thinking of providing. So, in the case of tree sorting, it's actually *menu testing*. So, the tree is the menu, and we say, "Where would you find this?" / "How would you do this on this site?" and you show them the menus a step at a time. And there is no site yet. There's just a listing of the menu items in a step-by-step progression. So, they're shown the top-level menus, they're shown the second-level menus, etc., as they navigate through.
So, it's really easy to do and you get some really good hard data out of that. And, similarly, with first-click testing, you might have just wireframes or really early prototypes; it can even be sketches, and you ask people to try to achieve a goal with these designs. You record where they click and how they try to achieve that. The first click is actually the most interesting part of that: Where do they focus their attention initially when trying to achieve those goals? So, these are all alternatives to asking people about things.
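To give a flavor of the kind of hard data a first-click test produces, here is a minimal sketch of tallying success rates; the tasks, regions and click records are invented, and in practice the testing tool reports this for you.

```python
# Hypothetical first-click results: one record per participant per task,
# noting which region of the wireframe they clicked first.
clicks = [
    {"task": "find the returns policy", "region": "footer"},
    {"task": "find the returns policy", "region": "main nav"},
    {"task": "find the returns policy", "region": "footer"},
    {"task": "change delivery address", "region": "account menu"},
    {"task": "change delivery address", "region": "footer"},
]

# The region we *think* participants ought to click first for each task.
expected = {
    "find the returns policy": "footer",
    "change delivery address": "account menu",
}

# Success rate per task: the share of first clicks landing on the expected region.
for task, target in expected.items():
    attempts = [c for c in clicks if c["task"] == task]
    hits = sum(1 for c in attempts if c["region"] == target)
    print(f"{task}: {hits}/{len(attempts)} first clicks on target")
```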
And, of course, in these latter cases we're talking about seeing people do things rather than asking them their opinions, which is a much more reliable way of getting data – not that surveys are entirely unreliable; that's not the case – but first-hand information about what people do rather than what they talk about doing is much safer. And this is a pretty typical fieldwork experience. The guy on the right has a PDA or phone.
Hopefully, it's a multiple-choice questionnaire he's asking because it's really very hard to make notes on a device like that, but this is the kind of situation where you can direct the questioning according to how the participant is answering. So, this is an alternative to surveys.
He emphasizes the importance of surveys in triangulating data to back up qualitative research findings, ensuring we have a complete understanding of the user's requirements and preferences.
Descriptive research focuses on describing the subject being studied and getting answers to questions like the what, where, when, and who of the research question. However, it doesn’t include the answers to the underlying reasons, or the “why”, behind the answers obtained from the research. We can use both qualitative and quantitative methods to conduct descriptive research. Descriptive research does not describe the methods, but rather the data gathered through the research (regardless of the methods used).
When we use quantitative research and gather numerical data, we can use statistical analysis to understand relationships between different variables. Here’s William Hudson, CEO of Syntagm, with more on correlation and how we can apply tests such as Pearson’s r and the Spearman rank coefficient to our data.
Correlation is actually quite an interesting topic, perhaps not directly relevant to user experience in many cases, but it certainly does have its uses. And it's where we're trying to decide whether there is any kind of relationship between two different variables. So, for example, in user experience we might be tempted to find out whether the longer that somebody spends on a particular page the more they spend at checkout or the more they donate to our cause
or whether there's an inverse relationship, which is where the opposite would happen, or if there's no relationship at all; so, by way of a real-world example, if we look at the connection between people's heights and weights, they *are* correlated, but there's no real direct relationship. People who are taller also tend to be heavier; it's just the way the world works. But that doesn't determine the weight, by any means – you can't predict somebody's weight from their height.
But the scattergram, as it's called – it's literally every point plotted on both dimensions. On the right-hand side here, we have heights and weights, and every dot represents a person. It tends to show that as people get taller, the weights go up. So, that is by way of a correlation. It's also called a *scatter plot*, as I've named it in the slides. But the word "scatter" is the important one. And it's actually a very good way of looking at your data and trying to find out if there's anything interesting going on there.
We're going to talk initially about a very common – or perhaps *the* most common – correlation test. And it's called Pearson's Correlation Coefficient, and sometimes it's just referred to as "Pearson's r"; "r" is the name of the variable that's always used for the correlation coefficient. And the way that it works is that r expresses the strength of the relationship from -1, which would be perfectly inversely correlated – that *would* mean that as the height went up the weight would go down in a very direct way
– which, as you can see, is not really at all the case – or up to one, +1, which is where, as the weight goes up, the height follows directly and vice versa. And, of course, in the middle of all that is zero, where there is no correlation. And if there is no correlation, you'd expect to see that scatter plot or scattergram having points all over the white space, not just towards the middle. The nature of the relationships is as follows.
Now – certainly with Pearson's r, we're talking about *linear* relationships. So, if it's not linear, then Pearson's won't think it's correlated. But we can see the sorts of things that we might get as results from Pearson's. So, top-left-hand corner is a strong positive correlation. A bit to the right is a weak positive correlation where the points aren't quite following the straight line that we've drawn. The third on the right is a strong negative correlation, where the line is going down. So, as one variable increased up the y-axis, the other decreased across the x-axis.
Similarly, weak negative correlation – the points aren't quite as tightly packed around the line. And then, moderate, and finally no correlation, which is what I mentioned as being r equals or approximately equal to zero. Now, there is a kind of rule of thumb with correlation coefficients. And it's shown here; this is taken from "Statistics For Dummies". It is actually, in spite of its title, quite a good book on statistics.
But this particular page is taken from their website. And you can see it runs from -1 to +1, as I said. And around about plus or minus 0.7 is a strong relationship; 0.5 is moderate, and 0.3 is weak. So, anything between 0 and 0.3 is more or less not very correlated. And anything between 0.7 and 1, whether it's positive or negative, is very strongly correlated. Let's have a look at correlation in user experience. The charts are from the Google Merchandise Store.
The top one is to do with the length of sessions, and the bottom one is to do with conversion value – the amount of money that people have spent at checkout. And while we don't need these to be normally distributed for the correlation tests to work, it's a little bit unlikely that if you've got two extremely different-looking distributions – maybe one's normal and one's not, or maybe they've got glitches in different places –
the fact that they're different distributions means that they're not nearly as likely to be correlated. So, looking at a particular instance, the conversion value – the very first column – is labeled zero. That's somebody who's checked out with presumably very little; actually, it's not quite zero, is it? It's a little bit above zero: a few pounds or dollars. There are not many of those, but if we look at a similar location in the session lengths, the zero- to ten-second session length is the most common by a long way.
And, of course, we know the reason for that, and the reason for that is that lots of people come to a website and decide it isn't really where they wanted to be; so, they've left quite quickly. And it needn't necessarily be a problem. We'd actually have to try this. But the point is that we're interested in the correlation between session length and conversion value. And there is no conversion value for people who have not gone through the checkout. So, we almost certainly are going to be throwing out that 0- to 10-second column, anyhow. So, we might still find some correlation there.
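As a rough sketch of that kind of analysis, the following Python code filters out the short, zero-value sessions and then computes Pearson's r on what remains; the numbers and column names are invented for illustration, not taken from the Google Merchandise Store data.

```python
import pandas as pd
from scipy.stats import pearsonr

# Invented analytics export: one row per session, with the session length
# in seconds and the value of any checkout made during that session.
sessions = pd.DataFrame({
    "session_seconds":  [4, 8, 35, 120, 300, 640, 900, 1500],
    "conversion_value": [0, 0, 0, 12.5, 0, 48.0, 30.0, 95.0],
})

# Throw out the 0-10 second "bounce" sessions and sessions with no checkout,
# since they carry no conversion value to correlate against.
converted = sessions[(sessions["session_seconds"] > 10) &
                     (sessions["conversion_value"] > 0)]

r, p = pearsonr(converted["session_seconds"], converted["conversion_value"])

# Rough rule of thumb from above: |r| around 0.7 strong, 0.5 moderate, 0.3 weak.
print(f"Pearson's r = {r:.2f} (p = {p:.3f}, n = {len(converted)})")
```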
Note that there are many more than just one correlation test. Pearson's r is just the most common, most used. Pearson's r is very much a linear correlation test, as I've mentioned; so, if the relationship between your two variables is not linear, then it's not going to find much in the way of a correlation. Another very common and popular test is called the Spearman Rank Coefficient, and that is *not* sensitive to linearity.
It doesn't have to be a linear relationship. So, this scatter plot shows a very strong correlation result: 0.92, which is getting very close to 1. But it's certainly not a linear relationship. You can see it's curved – curvilinear is the proper term – and so, that might actually be more relevant if you're not sure exactly what kind of relationship you're going to have, or if you look at a scattergram and you can see that it's certainly not a linear relationship although there is some kind of correlation going on there.
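Here is a minimal sketch of that difference, using fabricated data: the relationship below is perfectly monotonic but strongly curved, so Spearman's rank coefficient reports a much stronger correlation than Pearson's r does.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(7)

# A curvilinear but monotonic relationship: y grows exponentially with x.
x = rng.uniform(0, 10, size=200)
y = np.exp(x)

r_pearson, _ = pearsonr(x, y)         # assumes a straight-line relationship
rho_spearman, _ = spearmanr(x, y)     # works on ranks, so the curve doesn't matter

print(f"Pearson's r:    {r_pearson:.2f}")    # noticeably below 1
print(f"Spearman's rho: {rho_spearman:.2f}") # essentially 1
```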
This helps interpret phenomena such as user experience by analyzing session lengths and conversion values, revealing whether variables like time spent on a page affect checkout values, for example.
Random Sampling:
Each individual in the population has an equal chance of being chosen, which minimizes bias and simplifies analysis.
Systematic Sampling:
Selecting every k-th item from a list after a random start. It's simpler and faster than random sampling when dealing with large populations.
Stratified Sampling:
Divide the population into subgroups, or strata, according to shared characteristics. Then, samples are taken randomly from each stratum.
Cluster Sampling:
Divide the population into clusters, randomly select a number of clusters, and sample the members of those clusters.
Multistage Sampling:
Various sampling techniques are used at different stages to collect detailed information from diverse populations.
Convenience Sampling:
The researcher selects the sample based on availability and willingness to participate, which may only represent part of the population.
Quota Sampling:
Segment the population into subgroups, and samples are non-randomly selected to fulfill a predetermined quota from each subset.
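As a rough illustration of how a few of these schemes differ in practice, here is a minimal Python sketch of simple random, systematic and stratified sampling over an invented population; the segments and sample sizes are assumptions for illustration only, not a recipe for a real study.

```python
import random

# A made-up sampling frame: 1,000 user IDs, each tagged with a segment.
population = [{"id": i, "segment": "mobile" if i % 3 else "desktop"}
              for i in range(1000)]

# Simple random sampling: every individual has the same chance of selection.
simple = random.sample(population, 50)

# Systematic sampling: every k-th individual after a random start.
k = len(population) // 50
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: split into strata, then sample each stratum randomly.
stratified = []
for segment in ("mobile", "desktop"):
    stratum = [p for p in population if p["segment"] == segment]
    stratified.extend(random.sample(stratum, 25))

print(len(simple), len(systematic), len(stratified))  # 50 50 50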
These are just a few techniques, and choosing the right one depends on your research question, discipline, resource availability, and the level of accuracy required. In quantitative research, there isn't a one-size-fits-all sampling technique; choosing a method that aligns with your research goals and population is critical. Whichever you choose, a well-planned strategy is essential to avoid wasting resources and time, as highlighted in our video featuring William Hudson, CEO of Syntagm.
I wanted to say a bit more about this important issue of recruiting participants. The quality of the results hinges entirely on the quality of the participants. If you're asking participants to do things and they're not paying attention or they're simply skipping through as quickly as they can – which does happen – then you're going to be very disappointed with the results
and possibly simply have to write off the whole thing as an expensive waste of time. So, recruiting participants is a very important topic, but it's surprisingly difficult. Or, certainly, it can be. You have the idea that these people might want to help you improve your interactive solution – whatever it is; a website, an app, what have you – and lots of people *are* very motivated to do that. And you simply pay them a simple reward and everyone goes away quite happy.
But it's certainly true with *online research* that there are people who will simply take part in order to get the reward and do very little for it. And it comes as quite a shock, I'm afraid, if you're a trusting person, that this kind of thing happens. I was involved in a fairly good-sized study in the U.S. – at a university which I won't name – and we had as participants, in a series of studies, students, their parents and the staff of the university.
And, believe it or not, the students were the best behaved of the lot in terms of actually being conscientious in answering the questions or performing the tasks as required or as requested. Staff were possibly even the worst. And I think their attitude was "Well, you're already paying me, so why won't you just give me this extra money without me having to do much for it?" I really don't understand the background to that particular issue.
And the parents, I'm afraid, were not a great deal better. So, we had to throw away a fair amount of data. Now, when I say "a fair amount", throwing away 10% of your data is probably pretty extreme. Certainly, 5% you might want to plan for. But the kinds of things that these participants get up to – particularly if you're talking about online panels, and you'll often come across panels if you go to the tool provider, if you're using, say for example, a card-sorting tool
or a first-click test tool and they offer you respondents for a price each, then be aware that those respondents have signed up for this purpose, for the purpose of doing studies and getting some kind of reward. And some of them are a little bit what you might call on the cynical side. They do as little as possible. We've even on card sort studies had people log in, do nothing for half an hour and then log out and claim that they had done the study.
So, it can be as vexing as that, I'm afraid. So, the kinds of things that people get up to: They do the minimum necessary; that was the scenario I was just describing. They can answer questions in a survey without reading them. So, they would do what's called *straightlining*. Straightlining is where they are effectively just answering every question the same, in a straight line down the page or down the screen. And they also could attempt to perform tasks without understanding them.
So, if you're doing a first-click test and you ask them, "Go and find this particular piece of apparel; where would you click first?", they'd just click. They're not reading it; they didn't really read the question. They're not looking at the design mockup being offered; they're just clicking, so as to get credit for doing this. Like I say, I don't want to paint all respondents with this rather black brush, but *some* people do this. And we just have to work out how to keep those people from polluting our results. So, the reward is sometimes the issue: if you are too generous in the reward
that you're offering, you will attract the wrong kind of participant. Certainly I've seen that happen within organizations doing studies on intranets, where somebody decided to give away a rather expensive piece of equipment at the time: a DVD reader, which was – when this happened – quite a valuable thing to have. And the quality of the results plummeted. Happily, it was something where we could actually look at the quality of the results and
simply filter out those people who really hadn't been paying much attention to what they were supposed to be doing. So, like I say, you can expect for online studies to discard between 5 and 10% of your participants' results. If you're doing face-to-face research and you're trying to get quantitative sorts of numbers – say, 20 or 30 participants – you probably won't have a figure quite as bad as that, but I have still seen cases, even in face-to-face card sorts, for example, where
people literally didn't *understand* what they were supposed to be doing, and consequently their results were not terribly useful. So, you're not going to get away with 100% valuable participation, I'm afraid. I'm going to call these people *failing participants* – some of them aren't doing it properly because they don't understand, but the vast majority because they don't want to spend the time or the effort. And the thing is, we actually need to be able to *find* them in the data and take them out.
You have to be careful how you select participants, how you filter them and how you actually measure the quality of their output, as it were. And one of the big sources of useful information are the actual tools that you are using. In an online survey, you can see how long people have spent, you can see how many questions they have answered. And, similarly, with first-click testing, you can see how many of the tasks they completed; you can see how long they spent doing it.
And with some of these, we actually can also see how successful they were. In both of the early-design testing methods – card sorting and first-click testing – we are allowed to nominate "correct" answers. I keep putting the term in double quotes here because there are no actually correct answers in surveys, for example; so, I'm using "correct" in a particular way: "correct" is what we think they should be doing when they're doing a card sort, *approximately*, or, in particular, when they're doing a *first-click test*,
that we think they ought to be clicking around about here. Surveys as a group are a completely different kettle of fish, as it were. There are really no correct answers when you start. You've got your list of research questions – things that you want to *know* – but what you need to do is to incorporate questions and answers in such a way that you can check that people are indeed *paying attention* and *answering consistently*. So, you might for example change the wording of a question and reintroduce it later on
to see if you get the same answer. The idea is to be able to get a score for each participant. And the score is your own score, basically about how much you trust them – or maybe the *inverse* of how much you trust them, so that as the score goes up, your trust goes down. So, if these people keep doing inconsistent or confusing things, like replying to questions with answers that aren't actually real answers – you've made them up – or not giving the same answer to two questions which are effectively the same, etc.,
then you would get to a point where you'd say, "Well, I just don't trust this participant," and you would yank their data from your results. Happily, most of these tools do make it easy for you to remove individual results. So, we have to design the studies to *find* these failing participants. And, as I say, for some of these tools – the online tools we'll be using – that is relatively straightforward, but tedious. But with surveys, in particular, you are going to have to put quite a bit of effort into that kind of research.
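Here is a minimal sketch of that sort of scoring, using made-up 1-5 ratings and a hypothetical repeated question; the questions, thresholds and checks are assumptions you would tailor to your own survey.

```python
# Made-up 1-5 ratings per participant; "q3_reworded" repeats q3 in different words,
# so a conscientious participant should answer the two roughly the same way.
responses = {
    "p1": {"q1": 4, "q2": 2, "q3": 5, "q4": 3, "q3_reworded": 5},
    "p2": {"q1": 3, "q2": 3, "q3": 3, "q4": 3, "q3_reworded": 3},  # straightlining?
    "p3": {"q1": 5, "q2": 1, "q3": 4, "q4": 2, "q3_reworded": 1},  # inconsistent
}

def distrust_score(answers):
    """Higher score means less trust in this participant's data."""
    score = 0
    # Straightlining check: no variation at all across the main questions.
    main = [answers[q] for q in ("q1", "q2", "q3", "q4")]
    if len(set(main)) == 1:
        score += 1
    # Consistency check: the repeated question should get roughly the same answer.
    if abs(answers["q3"] - answers["q3_reworded"]) > 1:
        score += 1
    return score

for participant, answers in responses.items():
    print(participant, "distrust score:", distrust_score(answers))
```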
Steps we can take in particular: Provide consistency checks between tasks or questions. Check for "straightlined" results – where people are always answering in the same place on each and every question down the page – by asking the same question again in slightly different wording or with the answers in a different order. Now, I wouldn't go around changing the order of answers on a regular basis. You might have one part of the questionnaire where "good" is on the right and "bad" is on the left;
and you might decide to change it in a completely different part of the questionnaire and make it really obvious that you've changed it to those who are paying attention. But whatever it is that you do, what you're *trying* to do is to find people who really aren't paying much attention to the directions on the survey or whatever the research tool is, and catch them out and pull them out of your results. And one of the issues you should be aware of if you're paying for participants from somewhere
like your research tool *supplier* is that you can go back to them and say, "These people did not do a very good job of completing this survey, this study." And ask them to refund you for the cost of those. You tell them that you're having to pull their data out of your results. Also, it helps to tidy up their respondent pool. Perhaps it's not your particular concern, but if you do end up using them again, it would be nice to know that some of these people who are simply gaming the system have been removed from the respondent pool.
So, reporting them – getting them removed from the pool – is a sensible thing to be doing. And, finally, devising a scoring system to check the consistency and also checking for fake responses and people who are just not basically doing the research as you need them to do it.
He emphasizes the importance of recruiting participants meticulously, ensuring their engagement and the quality of their responses. Accurate and thoughtful participant responses are crucial for obtaining reliable results. William also sheds light on dealing with failing participants and scrutinizing response quality to refine the outcomes.
The 4 types of quantitative research are Descriptive, Correlational, Causal-Comparative/Quasi-Experimental, and Experimental Research. Descriptive research aims to depict ‘what exists’ clearly and precisely. Correlational research examines relationships between variables. Causal-comparative research investigates the cause-effect relationship between variables. Experimental research explores causal relationships by manipulating independent variables. To gain deeper insights into quantitative research methods in UX, consider enrolling in our Data-Driven Design: Quantitative Research for UX course.
The strength of quantitative research is its ability to provide precise numerical data for analyzing target variables. This allows for generalized conclusions and predictions about future occurrences, proving invaluable in various fields, including user experience. William Hudson, CEO of Syntagm, discusses the role of surveys, analytics, and testing in providing objective insights in our video on quantitative research methods, highlighting the significance of structured methodologies in eliciting reliable results.
This is a very typical project lifecycle in high-level terms. Generally start off with *requirements* – finding out what's needed, and we go off and talk to stakeholders. And one of the problems we have with *user requirements*, in particular, is that often analysts and requirements researchers in the IT world tend to go off and want to ask *users* what they want.
They don't really understand that users don't quite know what they want, and that you actually need to do user research; and that is one of the biggest issues that we face in user experience: the lack of understanding of user research and the whole field of user experience. From requirements, we might expect to be doing surveys to find out – particularly if we have an existing offering of some kind – what's good about it, what's not so good about it,
what people would like to do with it. And surveys might be helpful in those particular areas. Now, bear in mind that generally when we're talking about surveys, we already need to have some idea of the questions and the kinds of answers people are going to give us. It is really a very bad plan to launch a large survey without doing some early research on that, doing some qualitative research on how people think about these questions and these topics
and trying to understand it a little bit better before we launch a major initiative in terms of survey research. We can also use surveys in *analysis and design* perhaps to ask people which kinds of things might work better for their particular needs and behaviors. We also can start to employ *early-design testing*, even in the analysis and design phase so that we've got perhaps some wireframes that we're thinking about on the *design* side,
and we can start to *test* them – start to try to find out: "Will people understand this? Will they be able to perform the most important tasks from their perspective?" I have been involved in user testing of new product ideas where users had *no idea* what the service being offered was about because it was just presented *so confusingly*; there was no clear message; there was no clear understanding of the concepts behind the message because it wasn't very clear to start with, and so on.
So, early-design testing really has an important role to play there. *Implementation* and *testing* – that's when we can start doing a lot more in terms of evaluating what's going on with our products. There we would employ *usability evaluations*. And the things that I've called "early-design testing", by the way, can be done later on too. It's just they don't really involve the finished product. So, they're perhaps not quite as relevant. But if we've got questions about how the navigation might be changed,
then we might fall back to the tree testing where we're just showing people the navigation hierarchy rather than the whole site and asking them to perform tasks and just tweak the navigation as required to improve that. And one of my big complaints with our whole industry – still, after all these decades! – is that we do tend only to be allowed to do usability evaluations, and we do tend to wait until implementation has taken place
and the product is being tested before we start to try to involve real users, which really is far too late in the whole process. If you want to be able to be confident in the concepts and the terminology that your interactive solution is providing to your users and customers, then that needs to start way back at the beginning of the project cycle. And then, finally, once we've got live solutions available,
we can use *analytics* for websites and apps and we can also use A/B and multivariate testing to make sure that our designs are optimal. If we find problems, we might set up an A/B experiment to see whether this particular alternative would be a better solution or we could go down the multivariate route where we provide permutations of a *number* of different design elements on a particular page and see which of those elements proved to be the most effective.
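As an illustration of how you might check whether an A/B result is more than chance, here is a minimal sketch using a chi-squared test on invented conversion counts; this is one common approach under assumed numbers, not the only way to analyze such an experiment.

```python
from scipy.stats import chi2_contingency

# Invented A/B results: [conversions, non-conversions] for each variant.
variant_a = [120, 3880]   # 3.0% of 4,000 sessions converted
variant_b = [155, 3845]   # ~3.9% of 4,000 sessions converted

chi2, p_value, dof, expected = chi2_contingency([variant_a, variant_b])

print(f"A: {120/4000:.1%}  B: {155/4000:.1%}  p = {p_value:.3f}")
# A small p-value suggests the difference between variants is unlikely to be chance.
```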
The fact is that if you're doing project development – software development in an iterative environment like agile, for example – then you might be doing a little bit of this in every single iteration; so, there might be a little bit of work on the requirements at the front and there might be a little bit of design and analysis. Having said that, there is usually some upfront requirements and analysis and design work that has to go on so that you know what *shape* your project is
– what *shape and size* I think is perhaps a better or more complete description – because in order for you to be able to even guess at how long this is going to take you, you need to have *scoped* it. And to scope it means to set the boundaries, and to set the boundaries means to understand the requirements and to understand what kind of solutions would be acceptable; so, there will be some of this done always up front. Anybody who sets on a major project *without* doing upfront requirements analysis and design of some sort
is – I'm afraid – probably asking for trouble.
To master quantitative research methods, enroll in our comprehensive course, Data-Driven Design: Quantitative Research for UX.
The big question – *why design with data?* There are a number of benefits, though, to quantitative methods. We can get a better understanding of our design issues because it's a different way of looking at the issues. So, different perspectives often lead to better understanding. If you're working in project teams or within organizations who really don't have
a good understanding of *qualitative methods*, being able to supplement those with quantitative research is very important. You might be in a big organization that's very technology-focused. You might just be in a little team that's technology-focused, or you might just be working with a developer who just doesn't get qualitative research. So, in all of these cases, big, small and in between, having different tools in your bag is going to be really, really important. We can get greater confidence in our design decisions.
Overall, that means that we are making much more *persuasive justifications* for design choices.
This course empowers you to leverage quantitative data to make informed design decisions, providing a deep dive into methods like surveys and analytics. Whether you’re a novice or a seasoned professional, this course at Interaction Design Foundation offers valuable insights and practical knowledge, ensuring you acquire the skills necessary to excel in user experience research. Explore our diverse topics to elevate your understanding of quantitative research methods.
Remember, the more you learn about design, the more you make yourself valuable.
Improve your UX / UI Design skills and grow your career! Join IxDF now!
Here's the entire UX literature on Quantitative Research by the Interaction Design Foundation, collated in one place:
Take a deep dive into Quantitative Research with our course User Research – Methods and Best Practices.
How do you plan to design a product or service that your users will love, if you don't know what they want in the first place? As a user experience designer, you shouldn't leave it to chance to design something outstanding; you should make the effort to understand your users and build on that knowledge from the outset. User research is the way to do this, and it can therefore be thought of as the largest part of user experience design.
In fact, user research is often the first step of a UX design process—after all, you cannot begin to design a product or service without first understanding what your users want! As you gain the skills required, and learn about the best practices in user research, you’ll get first-hand knowledge of your users and be able to design the optimal product—one that’s truly relevant for your users and, subsequently, outperforms your competitors’.
This course will give you insights into the most essential qualitative research methods around and will teach you how to put them into practice in your design work. You’ll also have the opportunity to embark on three practical projects where you can apply what you’ve learned to carry out user research in the real world. You’ll learn details about how to plan user research projects and fit them into your own work processes in a way that maximizes the impact your research can have on your designs. On top of that, you’ll gain practice with different methods that will help you analyze the results of your research and communicate your findings to your clients and stakeholders—workshops, user journeys and personas, just to name a few!
By the end of the course, you’ll have not only a Course Certificate but also three case studies to add to your portfolio. And remember, a portfolio with engaging case studies is invaluable if you are looking to break into a career in UX design or user research!
We believe you should learn from the best, so we’ve gathered a team of experts to help teach this course alongside our own course instructors. That means you’ll meet a new instructor in each of the lessons on research methods who is an expert in their field—we hope you enjoy what they have in store for you!
We believe in Open Access and the democratization of knowledge. Unfortunately, world-class educational materials such as this page are normally hidden behind paywalls or in expensive textbooks.
If you want this to change, link to us, or join us to help us democratize design knowledge!