The importance of considering time when evaluating risks of base jumping (and maybe even antibiotics)
This is a guest post by L. Silvia Munoz-Price, MD, PhD. Associate Professor of Medicine at the Medical College of Wisconsin. Enterprise Epidemiologist at Froedtert Health. Milwaukee.
Over a month ago, Eli asked me to write this piece to discuss my recent CID paper on handling time-dependent variables. I knew this had to be done with an analogy, but after several weeks of mulling it over I was still unsure how to explain this concept colloquially. Then, just as I was ready to forget about this post while on a plane to Miami, I had a sudden inspiration as I was about to nap! I really hope this helps everyone understand this statistical concept. If not, then I'm not sure reading the CID paper will help you much either…stick to 2x2 tables (sorry!!).
Setting: Let’s imagine Eli and his wife invited my hubby and me to go base jumping in New Zealand for a week (See figure…Eli take note!!). So, now let’s observe our jumping habits: of course, I would jump once and be done with it for a lifetime. My hubby would probably not jump at all and just enjoy watching crazy people jump. Let’s say Eli decides to jump every day (on two days he jumps twice!) and his wife jumps on three consecutive days.
Study design: Ok. Not to be morbid, but the easiest outcome to evaluate is mortality by the end of the vacation (binary variable; 1: dead or 0: alive). The exposure variable of interest is base jumping.
Option 1: The easiest way to look at this association is to construct a 2x2 table: Did you jump? (yes/no) Did you die? (yes/no). See, the problem with this analysis is that it ignores the intensity of the exposure: Eli, his wife, and I would all be considered a “yes” and only my husband would be a “no”. But is it reasonable to analyze the exposures for Eli, his wife, and me the same way? Intuitively, we could probably say no.
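To make Option 1 concrete, here is a minimal sketch of how the four of us collapse into a 2x2 table. The names and the assumption that everyone survives the week are mine, added only for illustration.

```python
from collections import Counter

# Did each person jump at least once during the week? (from the scenario above)
jumped = {"Eli": True, "Eli's wife": True, "Silvia": True, "Silvia's husband": False}

# Outcome at the end of the vacation. ASSUMPTION: everyone survives.
died = {"Eli": False, "Eli's wife": False, "Silvia": False, "Silvia's husband": False}

def two_by_two(exposed, outcome):
    """Count people in each exposure-by-outcome cell."""
    cells = Counter((exposed[p], outcome[p]) for p in exposed)
    return {
        "jumped_dead": cells[(True, True)],
        "jumped_alive": cells[(True, False)],
        "no_jump_dead": cells[(False, True)],
        "no_jump_alive": cells[(False, False)],
    }

table = two_by_two(jumped, died)
print(table)
```

Notice that one jump and nine jumps land in exactly the same cell, which is the loss of information described above.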
Option 2: A slightly more elaborate way to look at this would be to count the number of jumps per person and enter these numbers in the analysis. So, Eli would have 9, his wife would have 3, I would have 1, and my husband would have 0. What is the problem with this approach? Well, it completely disregards the timing of the exposures, correct? It is like having all those jumps in only one day. We need to ask: when was “that” day on which all those exposures got summed? Was it at the beginning of the week or towards the end? Did the outcome happen at the beginning of the week, in the middle, or at the end? Is it reasonable to analyze all those jumps as if they were clustered within a single day? Intuitively, I would say no. A similar problem arises when using the number of days on which jumps occurred, especially for me. When did my one jump happen: at the beginning of the trip or towards the end?
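A small sketch of why the summed counts lose the timing. The day-by-day schedules are hypothetical (the post does not say on which days Eli jumps twice, or on which day I jump), so treat them as assumptions.

```python
# Each schedule maps day of the week-long trip -> number of jumps that day.
# ASSUMPTION: Eli's two double-jump days are days 5 and 6.
eli = {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 1}

# Two hypothetical versions of my single jump: first day vs. last day.
silvia_early = {1: 1}
silvia_late = {7: 1}

def total_jumps(schedule):
    """Option 2 summary: number of jumps, with all timing discarded."""
    return sum(schedule.values())

def days_jumped(schedule):
    """Alternative summary: number of days with at least one jump."""
    return sum(1 for n in schedule.values() if n > 0)

print(total_jumps(eli), days_jumped(eli))
# The early and late schedules are indistinguishable once summarized.
print(total_jumps(silvia_early) == total_jumps(silvia_late))
print(days_jumped(silvia_early) == days_jumped(silvia_late))
```

Both summaries give the same value whether my jump happens on day 1 or day 7, so neither can tell us whether the exposure came before or after the outcome.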
Option 3: A more elaborate way to determine the association between jumping and mortality is to account for the richness of the exposures: not just the specific days on which the jumps occurred, but also how many jumps occurred each day and from which altitudes they took place. Then we can calculate the hazard of dying on a daily basis based on the previous 24 hours of jumps. Let’s go over this a bit further. The hazard on day 1 would be calculated using 3 people. Assuming we all survived, on day 2 the hazard would be calculated only among the people who jumped (2). On day 3, assuming we all survived, the hazard would be calculated again only among the people who jumped that day (2 jumps). On day 4, assuming we all survived, the hazard would be calculated only among the people who jumped (1). If any of us were to reach the outcome during the observation period, then that person would be removed from the analysis from that point on. This is the concept of time-dependent exposures: you measure the outcome as the exposure occurs over time. This is in contrast to what we usually do in our hospital epi studies, where exposure is treated as a binary variable (yes/no), as the number of days exposed (9 or 3 or 1), or even as the number of jumps performed. More concerning, in those examples the outcome is fixed at the end of the observation period rather than measured as time progresses.
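One way to picture Option 3 is as one record per person per day, with the exposure updated daily and follow-up stopping when the outcome occurs. The sketch below only builds that layout; it does not fit a hazard model. The schedules are assumptions chosen to match the day-by-day counts above (me on day 1, Eli's wife on days 1 to 3, Eli every day, with his double-jump days placed arbitrarily on days 5 and 6), and the function and field names are hypothetical.

```python
DAYS = range(1, 8)  # a 7-day vacation

# day -> number of jumps that day (ASSUMED schedules, see lead-in)
schedules = {
    "Eli": {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 1},
    "Eli's wife": {1: 1, 2: 1, 3: 1},
    "Silvia": {1: 1},
    "Silvia's husband": {},
}

def person_day_rows(schedules, death_day):
    """Build one row per person per day at risk.

    death_day maps person -> day of death; people not listed survive.
    A person contributes no rows after the day the outcome occurs.
    """
    rows = []
    for person, schedule in schedules.items():
        for day in DAYS:
            event = int(death_day.get(person) == day)
            rows.append({"person": person, "day": day,
                         "jumps": schedule.get(day, 0), "died": event})
            if event:
                break  # removed from the analysis after the outcome
    return rows

# ASSUMPTION: everyone survives the week.
rows = person_day_rows(schedules, death_day={})

# Number of people still under observation who jumped on each day.
jumpers_per_day = {
    d: sum(1 for r in rows if r["day"] == d and r["jumps"] > 0) for d in DAYS
}
print(jumpers_per_day)
```

With these assumed schedules, the counts for days 1 to 4 come out as 3, 2, 2, and 1, matching the walk-through above. A table in this person-day layout is the kind of input that a time-to-event model with time-varying covariates would typically take.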
Bringing it home: ANTIBIOTIC EXPOSURES. Antibiotics are such rich exposures. Think about it. They can be given on many different days throughout the hospital stay, and there are many types of antibiotics, with various doses and routes. Outcome variables, such as acquiring a multidrug-resistant organism or even developing an infection with this organism, also vary in time during hospitalization. Is it reasonable to analyze all those antibiotic exposures as if they were clustered within a single day or, even worse, as binary variables? Is it optimal to fix the outcome variable as happening at the end of hospitalization? Intuitively, I would say no to both. There are a couple of examples in the ID literature that compare these analyses, one of them by my co-author Marc Bonten. However, specifically for antibiotics, it is not fully clear to me whether the associations found would justify the additional cost and time of obtaining all this rich information about exposures and outcomes (note: think about taking another look at your cohort datasets using this method).
Let’s end this post here [so that I can take a quick nap before landing] and see what feedback I get on this example. If the feedback is good, then I will explain the biases that can occur when you do not account for time in your analyses, and maybe go over the delayed effect of antibiotics. In the meantime, I will be sipping a mojito with my hubby while enjoying Miami. Salud!