Surveillance of infections depends, of course, on being able to count them. Recent years have seen several studies (here, here and here) revealing differences in how central line infections get counted. Problems of non-comparability of infection data didn’t matter so much when institutions were conducting surveillance for their own internal purposes. Now that infection rates have become a high stakes metric, tied to financial and reputational sanctions, comparability has become much more important. But why differences in measurement occur is not clear, and one suspicion is that “gaming” – where healthcare workers deliberately manipulate the data – might be responsible. We set out to find out more.
We used the opportunity presented by Matching Michigan, a programme modelled largely on the iconic Michigan-Keystone project, to explore what English hospitals did when asked to collect and report their ICU-acquired central line infection rates. Funded by the Health Foundation, a major UK charitable foundation, we used ethnographic methods involving direct observations in intensive care units, interviews with hospital staff, and documentary analysis.
Our findings, reported in the recent Milbank Quarterly, showed that even though all hospitals in the programme were given standardised definitions based on those used by the US CDC, variability in how they gathered and interpreted the data was rife. But gaming played little or no role in explaining what happened. Most variability in fact arose for what we called “mundane” reasons. These included challenges in setting up data collection systems (we identified three distinct systems in use), different practices in sending blood samples for analysis, and difficulties in deciding the source of infections.
One interesting and unusual feature of Matching Michigan was that it asked units to distinguish, based on the CDC definitions, between catheter-associated infections and catheter-related infections. The difference between them relates to the standard of evidence needed to establish whether an infection originated in the central line or not.
To satisfy the catheter-associated definition, only one blood sample (or catheter tip) is needed, plus a clinical judgement about where the infection is coming from. This definition tends to increase sensitivity at the expense of specificity, but that may be good enough if the main purpose is to guide clinical decisions and provide a reasonable estimate of how well infections are being controlled. Satisfying the catheter-related definition, on the other hand, requires two blood samples: one from the central line and one from somewhere else in the body. Both must test positive for the same micro-organism, determined using semi-quantitative or quantitative techniques. While it provides a better standard of proof, it requires a lot more resources; many laboratories in England were not equipped to support it, and didn’t necessarily consider it a clinical priority to do so. A majority of infections reported to the programme therefore relied on the catheter-associated definition.
Because it relied on clinical judgement, the catheter-associated definition invited inevitable ambiguities about what counted as a central line infection. Those judgements were typically made by ICU physicians (not infection preventionists, as in the original Keystone study), who drew on different kinds of evidence in different ways. For instance, physicians varied in their practices for routine screening of catheter tips; in their propensity to initiate treatment in cases of suspected infection; and in the number and kinds of samples they sent to laboratories for analysis. This meant that those charged with counting the infections were working from very different source information.
Despite the absence of evidence of deliberate manipulation, the fact that data had to be reported externally to the programme did appear to have some bearing on “what counted”, though not in consistently predictable ways. For instance, some units decided that patients at low risk of central line infections (such as those who had a line in for just a few hours after cardiac surgery) were not eligible for counting in the programme, while others excluded patients seen as unusual in some way (such as those with multiple lines in place after complex surgery). This meant that neither the denominators nor the numerators for calculating infection rates were counted in exactly the same ways across all the ICUs in the programme.
We concluded that unless hospitals are deploying the same methods to generate the data, using their reported rates to produce league tables of performance or to impose financial sanctions is probably not appropriate. Much more needs to be done to ensure that reported infection rates are credible, useful, fully integrated with clinical priorities, and comparable.