To evaluate the usability of a medical device as it relates to safety, great importance is given to usability tests, in both formative and summative evaluations, because they make it possible to detect usability defects, i.e. instances of a use error being committed. From this perspective, it is important that a usability test has an appropriate probability of observing a use error caused by a design defect, and this probability depends on the number of test participants. But the question is: how many participants are needed?
A methodology proposed by AAMI HE75:2009, Human factors engineering – Design of medical devices, and contained in ISO 62366-2 provides recommendations for sample size selection in usability tests: it explains the relationship between the number of participants in the test, called the sample size, and the probability of observing a usability defect. According to this methodology, the probability is cumulative, that is, it combines the probabilities of the individual users committing a use error, according to an exponential relationship. Consequently, confidence in the test findings on the adequacy of a user interface increases with the sample size, because the cumulative probability of detecting a usability defect increases too. The methodology also suggests that large sample sizes of 100 are not always needed for usability tests, but that there are trade-offs to consider when using various sample sizes.
Moreover, to determine the appropriate sample size, the manufacturer should consider the potential consequences of use error, the complexity of the design and its degree of similarity to existing medical devices, as well as the expected heterogeneity of each user group. This means that the user group taking part in the usability test should be representative of all users and should therefore include users who differ in occupational background, expected knowledge and skill levels, and medical device use patterns.
This methodology is presented in ISO 62366-2 in both graphical and tabular form.
Figure K.1 of ISO 62366-2 shows graphically how the cumulative probability of detecting problems rises with the number of participants, and how the additional probability gained from each further participant decreases exponentially, which is most visible when the sample size is greater than 10. It is generated from the following equation (Equation K.1 of ISO 62366-2):
R = 1 - (1 - P)^N
where:
R is the cumulative probability of detecting a usability problem,
P is the probability of a single test participant having a usability problem (or the underlying usability defect probability),
N is the number of test participants in the evaluation.
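As a minimal sketch, the equation can be evaluated directly in Python. The function name cumulative_detection_probability and its argument checks are illustrative assumptions and do not come from the standard. The formula itself treats every participant as having the same probability P and as behaving independently of the others.

```python
def cumulative_detection_probability(p: float, n: int) -> float:
    """Return R = 1 - (1 - P)^N for a per-participant probability p and n participants.

    Hypothetical helper: assumes the same p for every participant and
    independent behaviour between participants.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n < 0:
        raise ValueError("n must be a non-negative number of participants")
    # (1 - p) ** n is the probability that none of the n participants
    # shows the problem; the complement is the detection probability.
    return 1.0 - (1.0 - p) ** n


# Example: P = 0.25 and N = 6 participants.
print(round(cumulative_detection_probability(0.25, 6), 2))  # prints 0.82
```

With no participants the function returns 0, and it approaches 1 as N grows, which matches the shape of the curves described for Figure K.1.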
Figure K.1 of ISO 62366-2: Number of test participants needed in a USABILITY TEST
Table K.1 of ISO 62366-2 illustrates the probability of observing at least one instance of a use error as a function of sample size and underlying population use error rate. It is important to underline that the underlying population use error rate can never be known and has to be estimated.
Table K.1 of ISO 62366-2: Cumulative probability of detecting a USABILITY problem
Table K.1 illustrates how small sample sizes can be used satisfactorily to identify usability defects. It shows the cumulative probability of a usability defect being detected in a usability test, given the underlying probability that a single test participant would show a particular problem for a given task. This table applies to all kinds of populations and types of usability tests. According to this table, for example, if the underlying (assumed true population value) probability of a single test participant having a usability problem is 0,25, then the cumulative probability of detection is 0,82 with six test participants.
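The same relationship can be read in the other direction, to answer the opening question of how many participants are needed. The sketch below is an illustration only: the function name min_participants and the target values of 0.80 and 0.90 are assumptions chosen for the example, not thresholds recommended by the standard. It searches for the smallest N at which R = 1 - (1 - P)^N reaches a chosen target, given an estimated P.

```python
def min_participants(p: float, target_r: float) -> int:
    """Smallest N such that 1 - (1 - p)^N >= target_r.

    Hypothetical helper: p is an estimate of the underlying probability,
    which in practice is never known exactly.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < target_r < 1.0:
        raise ValueError("target_r must be strictly between 0 and 1")
    miss_one = 1.0 - p      # one participant does not show the problem
    miss_all = miss_one     # none of the first n participants shows it
    n = 1
    while 1.0 - miss_all < target_r:
        miss_all *= miss_one
        n += 1
    return n


# Example with P = 0.25 and two illustrative targets.
print(min_participants(0.25, 0.80))  # prints 6
print(min_participants(0.25, 0.90))  # prints 9
```

The first result agrees with the example above: six participants give a cumulative probability of about 0,82 when the underlying probability is 0,25. A simple loop is used instead of a logarithm so that the result does not depend on floating-point rounding at exact boundaries.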