There is an adage in statistics that to a person with one hand in a bucket of ice water and the other in a bucket of hot coals, average temperature has little meaning. The point is that an average can be very misleading. Such may be the case in alarm management, where companies are driven to bring the average alarm rates on their units below published guidelines.
A common metric is alarms per hour, with a target of less than six. If a unit meets this metric, is its alarm system acceptable or “okay”? I personally have had concerns regarding this metric. Not that it is bad, but perhaps too simplistic. It’s only an average, with no target for the distribution around the mean, or standard deviation. The metric also treats all alarms the same, regardless of priorities. Finally, there is no accounting for the size of the span of control.
To determine whether my concerns were valid, I had data pulled on two units that had similar alarm rates over the same 30 days. One unit averaged 4.0 alarms per hour and the other averaged 4.1 alarms per hour. Both are under the six per hour target, and could therefore be considered to have a “good” alarm system. The units were from the same refinery and used the same control system, so neither of those would be a variable.
The first analysis looked at the frequency distribution of the alarms. Figure 1 shows the distribution of daily alarm totals for both units. For example, Unit A had one day at 10 alarms, two days at 60 alarms, etc. Shown on both (the blue line) is the monthly average of just under 100 alarms per day, which is what roughly four alarms per hour amounts to over 24 hours. As can quickly be seen, while the average may be the same, the distribution is not. Unit A has large outliers in alarms per day, even though nine of its thirty days had identical alarm rates. From a mathematical perspective, the difference in distribution is captured in the standard deviation:
Unit A = 86.5
Unit B = 36.7.
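To make the point about spread concrete, here is a minimal Python sketch. The two ten-day series are invented for illustration and are not the refinery data; the helper name `summarize` is hypothetical, and the choice of the population standard deviation is an assumption, since the article does not say which form was used.

```python
from statistics import mean, pstdev


def summarize(daily_counts):
    """Return (mean, population standard deviation) of daily alarm counts."""
    return mean(daily_counts), pstdev(daily_counts)


# Hypothetical ten-day samples, NOT the data behind Figure 1.
# Both series total 1000 alarms, so both average 100 alarms per day.
steady = [90, 100, 110, 95, 105, 100, 100, 90, 110, 100]
bursty = [40, 40, 40, 40, 40, 40, 40, 40, 40, 640]

for name, series in (("steady", steady), ("bursty", bursty)):
    avg, spread = summarize(series)
    print(f"{name}: mean={avg:.1f} alarms/day, std dev={spread:.1f}")
```

Both invented series would pass the same average-based target, yet the second one hides a single very bad day that only the spread reveals.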
The next analysis was to look at the priority distribution of actuated alarms. Like the average alarm rate, the priority distribution was relatively similar for the two units, as seen in Figure 2. While neither achieves the 5%/15%/80% distribution recommended in ISA18.2, they do have more medium than high and more low than medium. Again, the units look very similar.
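A comparison against the 5%/15%/80% guideline can be sketched in a few lines of Python. The alarm counts below are made up, the function names are hypothetical, and reading the guideline as high/medium/low in that order is an assumption based on the sentence above.

```python
# Target shares as quoted in the text, read here as high/medium/low (assumption).
TARGET = {"high": 5.0, "medium": 15.0, "low": 80.0}


def priority_shares(counts):
    """Convert alarm counts per priority into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alarms recorded")
    return {priority: 100.0 * n / total for priority, n in counts.items()}


def gap_from_target(counts, target=TARGET):
    """Percentage-point difference between actual and target share."""
    shares = priority_shares(counts)
    return {priority: shares.get(priority, 0.0) - target[priority]
            for priority in target}


# Hypothetical monthly counts, NOT the data behind Figure 2.
example_counts = {"high": 300, "medium": 900, "low": 1800}

for priority, gap in gap_from_target(example_counts).items():
    print(f"{priority}: {gap:+.1f} percentage points vs. target")
```

A unit can keep the expected ordering (more low than medium, more medium than high) and still sit well away from the guideline, as this invented example does.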
They do not look quite the same if we compare the daily alarm actuations broken out by priority. Figure 3 shows the alarms each day and their priorities. As can be seen, while Unit B has a smattering of high priority alarms throughout the month, almost all of Unit A’s high priority alarms occurred on the same day.
It could reasonably be argued that the “bad” day for Unit A should be excluded from the analysis as an outlier. This would lower the daily average alarm rate, and it would also affect the distribution of actuated alarms, as nearly all high priority alarms occurred on the same day. How many facilities would know to exclude data unless looking at something similar to the figure above? Yes, algorithms could be built to exclude certain days or time periods, but many companies struggle now to determine simple monthly averages of alarm actuations.
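One simple way such an exclusion rule might look is sketched below in Python. The rule (flag any day above a multiple of the median day) and the threshold `k` are illustrative assumptions, not a method taken from this analysis or from any standard, and the daily counts are invented.

```python
from statistics import mean, median


def flag_outlier_days(daily_counts, k=5.0):
    """Return indexes of days whose count exceeds k times the median day.

    The multiplier k is an arbitrary illustrative threshold.
    """
    typical = median(daily_counts)
    return [i for i, n in enumerate(daily_counts) if n > k * typical]


def mean_excluding(daily_counts, excluded):
    """Average the daily counts after dropping the excluded day indexes."""
    skip = set(excluded)
    kept = [n for i, n in enumerate(daily_counts) if i not in skip]
    return mean(kept)


# Hypothetical ten-day sample with one very bad day, NOT the Figure 3 data.
daily = [40, 40, 40, 40, 40, 40, 40, 40, 40, 640]

bad_days = flag_outlier_days(daily)
print("flagged day indexes:", bad_days)
print("mean with all days:", mean(daily))
print("mean without flagged days:", mean_excluding(daily, bad_days))
```

Any rule like this still needs a person to decide whether a flagged day was a genuine upset or a symptom of the alarm system itself, which is the judgment the paragraph above is concerned with.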
While the metrics would say that the two units are at least very similar, they are not. Unit A is a 183-loop Cogeneration unit. Unit B is a 290-loop Fluid Catalytic Cracking (FCC) unit. Most industry personnel would argue that the FCC is the more complex unit to operate, and it is nearly 60% larger (290 loops versus 183). Yet the two are equivalent with respect to the metrics regarding their alarm systems. If we normalized for number of loops, they would not be the same: Unit A’s alarm system would be seen as inferior to Unit B’s. However, current alarm metrics do not account for span of control or for a unit’s size.
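The normalization can be worked through directly. The loop counts come from the text; because the article does not say which unit averaged 4.0 and which averaged 4.1 alarms per hour, this sketch assumes 4.0 for both, and the function name is hypothetical.

```python
HOURS_PER_DAY = 24


def alarms_per_loop_per_day(alarms_per_hour, loops):
    """Normalize an hourly alarm rate by the number of control loops."""
    return alarms_per_hour * HOURS_PER_DAY / loops


# Loop counts from the text; the 4.0 alarms/hour rate for both is an assumption.
unit_a = alarms_per_loop_per_day(4.0, 183)  # Cogeneration unit, about 0.52
unit_b = alarms_per_loop_per_day(4.0, 290)  # FCC unit, about 0.33

print(f"Unit A: {unit_a:.2f} alarms per loop per day")
print(f"Unit B: {unit_b:.2f} alarms per loop per day")
print(f"Unit A generates {unit_a / unit_b:.2f}x the alarms per loop of Unit B")
```

Swapping in 4.1 alarms per hour for either unit shifts the figures only slightly; the smaller unit still produces roughly one and a half times as many alarms per loop.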
So are the two alarm systems of equal quality? They are, if judged by common metrics for alarm system performance. However, it is clear that they are not. It can be seen in the variance in alarms around the average, despite the averages being almost equal. It can be seen in the use of monthly averages that can be unduly influenced by a bad day or two. Finally, the size and complexity of the unit should be factored into the metrics. An undefeated team from the SEC and one from a Division III school are not equal.
There are two points to this analysis. First, using the correct metrics is vital. Using overly simplistic measures can hide the real issue. Second, don’t put all your faith in metrics. Statistics are good and valuable, but in compressing the data for ease of understanding, something is often lost. Don’t abdicate your responsibility for understanding your system to an algorithm.
Would you like this sort of analysis performed on one or more of your process units? Contact us today at Beville@Beville.com
Beville Engineering has been conducting alarm management projects since 1984 and, to date, has completed over 150 alarm rationalizations. Beville was the first to identify that an alarm philosophy is an essential part of alarm rationalization. This led to David Strobhar being referred to as the “father” of alarm management (Rothenberg, D.H., "Alarm Management for Process Control," Momentum Press, New York, 2009, p. 567). Mr. Strobhar is co-editor of the rationalization clause of ANSI/ISA18.2, "Alarm Management for the Process Industries."
Beville is constantly asked how many loops a board operator can handle. While this is a poor metric due to the large number of influencing factors (such as alarms, as illustrated by our first article), we invite you to take part in a very short, 4-6 question survey about span of control. Results will be shared in our next newsletter.
Copyright © 2017 Beville Engineering, Inc., All Rights Reserved