Human Reliability Analysis

The term "human reliability" is usually defined as the probability that a person will correctly performs some system-required activity during a given time period (if time is a limiting factor) without performing any extraneous activity that can degrade the system. The historical background for the development of the set of methods that are commonly referred to as Human Reliability Analysis (HRA) was the need to describe incorrect human actions in the context of Probabilistic Risk Assessment (PRA) or Probabilistic Safety Analysis (PSA). The premises for HRA were, and are, therefore that it must function within the constraints defined by PRA/PSA, and specifically that it can produce the human action probabilities that are needed by the PRA/PSA.

 

The accident sequence that is analysed by a PRA/PSA is typically represented as an event tree (see Figure 1). A node in the sequence of events that may lead to the accident represents a specific function, task, or activity that can have two different outcomes, usually denoted success and failure. A node can represent either the function of a technical system or component, or the interaction between an operator and the process. For example, if the analysis considers the sequence of events that are part of landing an aircraft, the event "timely extension of flaps", which is an action that must be taken by the pilot, is represented by a node. From the perspective of the PRA/PSA there is a need to know whether an event is likely to succeed or fail, and further to determine the probability of failure in order to calculate the combined probability that a specific outcome or end state will occur. If the node represents the function of a mechanical or electronic component, the failure probability can, in principle, be calculated from engineering knowledge alone. If the node represents the interaction between an operator and the process, engineering knowledge must be supplemented by a way of calculating the probability that the human, as a "component", will fail. Historically, the role of HRA has been to provide the foundation for calculating this probability. The sought-for value has traditionally been called a human error probability (HEP), but, as the following will show, this is both a misunderstood and misleading term.

 

Figure 1: A simplified event tree representation.
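
To make the role of the human failure probability in such a tree concrete, the following sketch shows, in Python, how the probability of one end state is obtained as the product of the branch probabilities along a path through the tree; all node names and probability values are invented for illustration.

```python
# Illustrative only: a three-node event tree with invented failure
# probabilities. "flap_extension" stands for a human action node; the
# other nodes stand for technical functions.
p_fail = {
    "approach_guidance": 0.001,   # technical function
    "flap_extension": 0.003,      # human action: the HEP that HRA must supply
    "ground_spoilers": 0.0005,    # technical function
}

def path_probability(path):
    """Probability of one end state: the product of the branch
    probabilities along the path through the tree."""
    prob = 1.0
    for node, outcome in path:
        prob *= p_fail[node] if outcome == "failure" else 1.0 - p_fail[node]
    return prob

# End state in which only the human action fails:
path = [("approach_guidance", "success"),
        ("flap_extension", "failure"),
        ("ground_spoilers", "success")]
print(path_probability(path))   # ~0.003
```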

 

Human Error Probabilities and Performance Shaping Factors

The practice of HRA goes back to the early 1960s, but the majority of HRA methods were developed in the middle of the 1980s – mainly as a consequence of the concern caused by the accident in 1979 at the nuclear power plant at Three Mile Island. Partly due to the conditions under which it was developed, HRA from the beginning used procedures similar to those employed in conventional reliability analysis. The main difference was that human task activities were substituted for equipment failures, and that modifications were made to account for the greater variability and interdependence of human performance as compared with that of equipment. The traditional approach is first to determine the HEP for a node, by using established tables, human reliability models, or expert judgement. The characterisation of human failure modes is usually very simple, for instance in terms of "errors of omission" and "errors of commission". Since human actions clearly do not take place in a vacuum, a second step is to account for the influence of possible Performance Shaping Factors (PSF) such as task characteristics, aspects of the physical environment, work time characteristics, etc. This influence is expressed as a numerical factor that is used to modify the basic HEP. The resulting formula for calculating the probability of a specific erroneous action (PEA) is shown below:

PEA = HEP × (w1·PSF1 + w2·PSF2 + … + wn·PSFn)

where wi is the weight that expresses the relative influence of the i-th performance shaping factor.
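A minimal sketch, again with invented values, shows what this calculation amounts to in practice; the PSF names, values, and weights are placeholders, not taken from any published method:

```python
# Illustrative only: invented basic HEP, PSF values, and weights.
basic_hep = 0.001   # nominal failure probability for this type of action

psf    = {"interface_quality": 1.2, "stress": 2.0, "training": 0.8}
weight = {"interface_quality": 0.3, "stress": 0.5, "training": 0.2}

# The additive combination treats each factor as if it did not influence
# the others - the assumption examined below.
modifier = sum(weight[k] * psf[k] for k in psf)

pea = basic_hep * modifier
print(pea)   # 0.001 * (0.36 + 1.00 + 0.16) = 0.00152
```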
From the standpoint of the behavioural sciences, this formula makes two fundamental assumptions. Firstly, that the probability of failure can be determined for specific types of action independently of any context. Secondly, that the effects of the context are additive, which is the same as saying that the various performance conditions (such as interface quality, stress, level of training, task complexity, etc.) do not influence one another. Neither of these assumptions is reasonable, and each alone constitutes a grave deficiency of the approach. Quite apart from these principal objections, HRA methods in practice turned out not to be sufficiently effective, and the need for substantial improvements was gradually realised. In 1990, this was expressed as an explicit criticism of the state of HRA (Dougherty, 1990).

Considering the historical basis for HRA methods, it is reasonable to suggest that one improvement should be a better integration of psychological theory with HRA models and methods. In particular, it is strongly felt by many with a behavioural science background that quantification should await an improved theoretical foundation.

 

Context and Cognition in HRA

The above-mentioned criticism caused an intensive debate within the community of HRA theoreticians and practitioners, and pointed to two problems that HRA methods need to address: the problem of context and the problem of cognition. Technical systems that depend on human-machine interaction to accomplish their function, such as nuclear power plants, are typically tightly coupled and have complex interactions. Plant operators and plant components should therefore be seen as interacting parts of an overall system that responds to upset conditions. The actions of operators are not simply responses to external events, but are governed by their beliefs about the current state of the plant. Since operators make use of their knowledge and experience, their beliefs at any given point in time are influenced by the past sequence of events and by their earlier trains of thought. In addition, operators rarely work alone but are part of a team – especially during abnormal conditions. Altogether this means that human performance takes place in a context which consists both of the actual working conditions and of the operator's perception or understanding of them. It also means that the operator's actions are a result of cognition and beliefs, rather than simple responses to events in the environment, and that the beliefs may be shaped - and shared - by the group.

 

The problem of context is a noticeable feature of other current theories, such as the so-called multi-threaded failure models that describe how accidents in systems may evolve (Reason, 1990). Events are described as determined by a combination of psychological, technological, and organisational or environmental factors. The issue of organisational reliability has recently received specific attention from the PRA/PSA community, and proposals for an organisational reliability analysis have been made. On the level of human performance, which is the focus of HRA, context is important because human action is always embedded in a context. Given a little thought this is obvious, but the preferred mode of representation that the analyses use - the event tree or operator action tree - is prone to be misleading, since it represents actions without a context. One unfortunate consequence of this has been the preoccupation with the HEP, and the oversimplified concept of "human error". The consequence of acknowledging the importance of the context is that HRA should not attempt to analyse actions separately, but instead treat them as parts of a whole. Similarly, "human error" should be seen as the way in which erroneous actions manifest themselves in a specific context, rather than as a distinct and well-defined category.

 

In the search for a way of describing and understanding the failure of human actions, several classes of models have been used, and most HRA methods embody an operator model of one type or another – typically either a probabilistic model or a simplified information processing model.

The problem of cognition can be illustrated by the popular notion of "cognitive error". The problem facing HRA analysts was that human cognition undoubtedly affected human action, but that it did not fit easily into any of the established classification schemes or the information processing models. One solution was simply to declare that any failure of a human activity represented in an event tree, such as diagnosis, was a "cognitive error". However, that did not solve the problem of where to put the category in the existing schemes. A genuine solution is, of course, to realise that human cognition is a cause rather than an effect, and that "cognitive error" therefore is not a new category or error mode. Instead, all actions – whether erroneous or not - are determined by cognition, and the trusted categories of "error of omission" and "error of commission" are therefore as cognitive as anything else. Furthermore, the cognitive viewpoint implies that unwanted consequences are due to a mismatch between cognition and context, rather than to specific "cognitive error mechanisms".

 

Principles of a Contemporary HRA

The current dilemma of HRA stems from its uneasy position between PRA/PSA and information processing psychology. As argued above, HRA has inherited most of its concepts and methods from the practice of PRA/PSA. This means that HRA uses a fixed event representation based on a pre-defined sequence of steps, that the methods are a mixture of qualitative and quantitative techniques, and that the underlying models of operator actions and behaviour are either probabilistic or simplified information processing models. Yet information processing psychology is characterised by almost the opposite approach. The representation is typically information flow diagrams of internal functions, rather than binary trees of external events; the methods are generally qualitative and descriptive; and the models are deterministic (because an information processing model is basically a deterministic device). HRA practitioners have therefore had great problems in using the concepts from information processing psychology as a basis for generating action failure probabilities; a proper adaptation would in practice require a complete renewal of HRA. Even worse, information processing models are almost exclusively directed at retrospective analysis, and excel at explaining, after the fact, what happened in terms of internal mental mechanisms. HRA, on the other hand, must by necessity look forward and try to make predictions.

 

Since neither classical HRA nor information processing psychology is easily reconciled with the need to account for the interaction between context and cognition, it is difficult to find a solution to the current predicament. The main problem is that the conceptual foundation for HRA methods is either misleading or partly missing. The solution can therefore not be found in piecemeal improvements of the HRA methods. In particular, attempts to rely on information processing psychology as a way out are doomed to fail, because such models are retrospective rather than predictive.

 

The solution to the current problems in HRA is not just to develop a new approach to quantification, but to develop a new approach to performance prediction that clearly distinguishes between the qualitative and quantitative parts. The purpose of qualitative performance prediction is to find out which events are likely to occur, in particular what the possible outcomes are. The purpose of quantitative performance prediction is to find out how probable it is that a specific event will occur, using the standard expression of probability as a number between 0 and 1. Qualitative performance prediction thus generates a set of outcomes that represent the result of various event developments. The validity of the set depends on the assumptions on which the analysis is based, in particular the detailed descriptions of the process, the operator(s), and the interaction. If the assumptions are accepted as reasonable, the set of outcomes will by itself provide a good indication of the reliability of the system, and whether unwanted outcomes can occur at all. That may, in the first instance, be sufficient. It may only be necessary to proceed to a quantitative performance prediction if significant unwanted consequences are part of the set of possible outcomes.

 

Qualitative Performance Prediction

The qualitative performance prediction must be based on a consistent classification scheme that describes human actions in terms of functions, causes, dependencies, etc., as well as a systematic method for using the classification scheme. A consistent classification scheme is essential if assignments of erroneous actions are to be justified on psychological grounds. A clear method is necessary because the analysis will otherwise be weakened by inconsistencies when the classification scheme is applied by different investigators, or even by the same investigator working on different occasions. It can further be argued that a classification scheme must refer to a set of concepts for the domain in question, i.e., a set of supporting principles, specifically a viable model of cognition at work. The model will guide the definition of specific system failures from a consideration of the characteristics of human cognition relative to the context in which the behaviour occurs. The overall approach of a contemporary HRA should be something along the following lines:

  1. Application analysis. It is necessary first to analyse the application and the context. This may in particular involve a task analysis, where the tasks to be considered can be derived from the PRA/PSA as well as from other sources. The application analysis must furthermore consider the organisation and the technical system, rather than just the operator and the control tasks.

  2. Context description. The context must be systematically described in terms of aspects that are common across situations. If insufficient information is available it may be necessary to make assumptions based on general experience, particularly about aspects of the organisation, in order to complete the characterisation.

  3. Specification of target events. The target events for the human actions / performance can be specified in several ways. One obvious source is the PRA/PSA, since the event trees define the minimum set of events that must be considered. Another is the outcome of the application and task analyses. A task analysis will, in particular, go into more detail than the PRA/PSA event tree, and may thereby suggest events or conditions that should be analysed further.

  4. Qualitative performance analysis. The qualitative performance analysis uses the classification scheme, as modified by the context, to describe the possible effects or outcomes for a specific initiating event. The qualitative performance analysis, properly moderated by the context, may in itself provide useful results, for instance by showing whether there will be many or few unwanted outcomes (a much-simplified sketch of this step is given after the list).

  5. Quantitative performance prediction. The last step is the quantitative performance prediction, which is discussed in detail below.
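
As a much-simplified sketch of step 4, the following Python fragment shows how a classification scheme and a context description can combine to yield a set of qualitative outcomes. The error modes, antecedents, and context conditions are invented placeholders, far cruder than any real scheme (e.g. the one in Hollnagel, 1998):

```python
# Illustrative only: a toy classification scheme linking error modes to
# plausible antecedents, and a context description listing the conditions
# judged to hold in the situation being analysed.
scheme = {
    "action_omitted":  ["memory_failure", "interruption", "procedure_gap"],
    "wrong_object":    ["ambiguous_labelling", "habit_intrusion"],
    "action_too_late": ["high_workload", "poor_indication"],
}

context = {"high_workload", "ambiguous_labelling", "interruption"}

def qualitative_analysis(scheme, context):
    """Return the error modes whose antecedents the context supports."""
    return {mode: [a for a in antecedents if a in context]
            for mode, antecedents in scheme.items()
            if any(a in context for a in antecedents)}

print(qualitative_analysis(scheme, context))
# {'action_omitted': ['interruption'],
#  'wrong_object': ['ambiguous_labelling'],
#  'action_too_late': ['high_workload']}
```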

Quantitative Performance Prediction

The quantification of the probabilities is, of course, the sine qua non for the use of HRA in PRA/PSA. The quantification has always been a thorny issue, and will likely remain so for many years to come. Some behavioural scientists have argued - at times very forcefully - that quantification in principle is impossible. More to the point, however, is whether quantification is really necessary, and how it can be done when required. From a historical perspective, the need for quantification may, indeed, be seen as an artefact of how PRA/PSA is carried out.

 

At present, the consensus among experts in the field is that quantification should not be attempted unless a solid qualitative basis or description has first been established. If the qualitative performance analysis identifies potentially critical tasks or actions, and if the failure modes can be identified, then it is perhaps not necessary to quantify beyond a conservative estimate. In other words, the search for specific HEPs for specific actions may not always be necessary. To the extent that quantification is required, the qualitative analysis may at least be useful in identifying possible dependencies between actions. The description of the context may also serve as a basis for defining ways of preventing or reducing specific types of erroneous actions through barriers or recovery.
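
Read as a screening rule, the point can be sketched as follows; the conservative bound and the sets of failure modes are invented for illustration:

```python
# Illustrative only: assign an invented, deliberately pessimistic bound to
# the critical failure modes found by the qualitative analysis, and defer
# detailed quantification unless that bound proves unacceptable.
CONSERVATIVE_HEP = 0.01

def screening_estimate(possible_modes, critical_modes):
    """Conservative estimate per critical mode; no action-specific HEPs."""
    return {mode: CONSERVATIVE_HEP
            for mode in possible_modes if mode in critical_modes}

print(screening_estimate(
    {"action_omitted", "action_too_late"},   # from the qualitative analysis
    {"action_omitted"}))                     # judged safety-critical
# {'action_omitted': 0.01}
```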

 

Traditionally, quantification involves finding the probability that a specific action may go wrong and then modifying that by the aggregated effect of a more or less systematic set of PSFs. An alternative approach is to begin the prediction by identifying the context and the common performance conditions (Hollnagel, 1998). Following that, each target event is analysed in the given context. This means that the analysis is shaped by the defined context, and that the list of possible causes of a failure contains those that are likely given the available information. Once the set of possible causes has been identified, the need will arise to quantify the probability that the target event fails. Since, however, the target event is associated with a set of context-specific error modes, it is reasonable to estimate the probability on that basis - rather than treating the target event in isolation.
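
The following sketch illustrates the direction of such an approach; the condition names, adjustment factors, and nominal interval are invented for illustration and are not taken from Hollnagel (1998) or any other published method:

```python
# Illustrative only: rate the common performance conditions for the task,
# and let their combined effect adjust a nominal probability interval for
# the target event, instead of estimating each action in isolation.
conditions = {
    "adequacy_of_procedures":  "adequate",
    "available_time":          "temporarily_inadequate",
    "training_and_experience": "adequate",
}

# Invented adjustment factor per rating:
effect = {"adequate": 0.8, "temporarily_inadequate": 2.0, "inadequate": 5.0}

nominal_interval = (1e-4, 1e-2)   # invented nominal interval for the task

factor = 1.0
for rating in conditions.values():
    factor *= effect[rating]

adjusted = tuple(min(p * factor, 1.0) for p in nominal_interval)
print(adjusted)   # (0.000128, 0.0128) with the ratings above
```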

 

Initially, it may be necessary to use expert judgements as a source of the probability estimates. But keeping the criticisms of HRA in mind, it is crucial that the expert judgements are calibrated as well as possible. Meanwhile, every effort should be made to collect sufficient empirical data. Note, however, that empirical data should not be sought for separate actions but for types of actions. When the observations are made, in real life or in simulators, both actions and context should be recorded, and a statistical analysis should be used to disentangle them. This may, eventually, provide a reliable set of empirical data to supplement or replace expert judgements. The principles of data collection are thus fundamentally different from those of a traditional HRA. Furthermore, data collection can be guided by the classification scheme, since this provides a consistent and comprehensive principle according to which the data can be organised and interpreted. That by itself will make the whole exercise more manageable.
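
A sketch of what such data collection might look like; the records, action types, and the single context variable are invented for illustration:

```python
# Illustrative only: invented simulator observations. Each record notes the
# type of action, the recorded context condition, and whether the action
# failed, so that failure rates can be estimated for types of actions under
# types of conditions rather than for single actions.
from collections import defaultdict

observations = [
    {"action_type": "valve_lineup", "workload": "high", "failed": True},
    {"action_type": "valve_lineup", "workload": "high", "failed": False},
    {"action_type": "valve_lineup", "workload": "low",  "failed": False},
    {"action_type": "diagnosis",    "workload": "high", "failed": True},
    {"action_type": "diagnosis",    "workload": "low",  "failed": False},
]

counts = defaultdict(lambda: [0, 0])   # (failures, trials) per cell
for obs in observations:
    cell = (obs["action_type"], obs["workload"])
    counts[cell][0] += obs["failed"]   # bool counts as 0 or 1
    counts[cell][1] += 1

for (action, workload), (failures, trials) in sorted(counts.items()):
    print(f"{action:12s} workload={workload:4s}: {failures}/{trials} failed")
```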

 

Literature

Dougherty, E. M. Jr. (1990). Human reliability analysis - Where shouldst thou turn? Reliability Engineering and System Safety, 29(3), 283-299.
Hollnagel, E. (1998). Cognitive reliability and error analysis method. Oxford: Elsevier Science Ltd.
Reason, J. T. (1990). Human error. Cambridge, U.K.: Cambridge University Press.

 

© Erik Hollnagel, 2005

 
