Creating a causal diagram, or directed acyclic graph (DAG), is not an easy task, especially for the inexperienced, and it takes time. But the effort required is not the only reason that DAGs might not be used for a research study.
Listed below are some of the reasons a researcher might give for not wanting to create a DAG or, more generally, for not wanting to use explicit causal thinking in a study whose research question nevertheless relates to the causes of an outcome. This includes broad questions such as looking for risk factors.
Underneath each reason is a response that a researcher might consider before making a final decision.
1. It isn’t a ‘causal inference’ study
Things to consider:
‘Causal inference’ does not mean that a study is looking to establish definitively that a causal effect exists or is clinically meaningful; there is always some uncertainty. Unfortunately, in common English the words ‘cause’ and ‘causal’ tend to be used when there is a firm belief that a causal relationship exists, as opposed to the scientific perspective that some doubt will always remain.
If the study is investigating associations between possible causes and an outcome, then causal inferences will inevitably be made, even if with great uncertainty, or behind ambiguous language such as using only the word ‘association’. There is a large body of evidence from psychology suggesting that we have an innate tendency to view the world and interpret associations in terms of cause and effect [1].
If the reason for the study is to better understand how or why an outcome occurs, or to identify factors that might be modified to alter the outcome, then making inferences about the causes of the outcome is the goal of the study, even if those inferences are very cautious.
If the study is using a model to estimate effect sizes of variables, with associated confidence intervals or p-values, then the aim is to adjust for confounding, and confounding is a causal concept.
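The link between model adjustment and confounding can be illustrated with a small simulation. The sketch below is an illustration only, not taken from any study: the variables `z` (a confounder), `x` (the exposure) and `y` (the outcome), the effect sizes, and the helper functions are all hypothetical. It uses only the Python standard library and compares the slope of `y` on `x` with and without adjustment for `z`.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def crude_slope(x, y):
    """Slope of y on x from a simple linear regression (no adjustment)."""
    return cov(x, y) / cov(x, x)

def adjusted_slope(x, y, z):
    """Slope of y on x from a linear regression that also includes z."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (sxy * szz - sxz * szy) / (sxx * szz - sxz ** 2)

# Assumed (hypothetical) data-generating process: z -> x, z -> y and x -> y.
rng = random.Random(1)
n = 20_000
true_effect = 0.5  # assumed causal effect of x on y
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + rng.gauss(0, 1) for zi in z]
y = [true_effect * xi + 1.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

crude = crude_slope(x, y)
adjusted = adjusted_slope(x, y, z)
print(f"crude slope:    {crude:.2f}")     # mixes the effect of x with that of z
print(f"adjusted slope: {adjusted:.2f}")  # should be close to true_effect
```

Under these assumed values the crude slope is expected to be close to 1.0, roughly double the assumed effect of 0.5, while the adjusted slope should be close to 0.5. The decision to adjust for `z` is justified only by the belief that `z` causes both `x` and `y`, which is a causal assumption whether or not it is stated.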
2. Creating a DAG will take more time than our research funding allows
This might be a very real problem, at least in the short term. It may reflect that those who applied for the funding did not fully appreciate all of the steps involved in analysing the data properly.
3. Uncertainty about the specific causal relationships (arrows) to include in the DAG
This can relate to differing opinions in the research group about whether an arrow can be justified by existing published evidence and, if it cannot, whether it should be included anyway.
The term ‘causal relationship’ in this context does not refer only to strong or meaningful relationships, but to any causal relationship, however small. In health research, there is likely to be some causal relationship between most variables (that is, an effect that is not exactly zero, though at times small enough to be ignored).
Hence, the arrows included should reflect reasoning based on evidence of the strength of associations and/or plausible mechanisms and/or knowledge of which factors tend to precede other factors in time. Published evidence could be included in a manuscript to help justify the inclusion or absence of specific arrows, but the practice is uncommon [2].
4. Unsure which direction an arrow should point when the relationship is bidirectional
This problem can be approached in at least two ways.
The first is to decide in which direction the stronger causal effect is likely to run, and to draw the arrow in that direction.
The second is to use multiple nodes to represent the same variable at different points in time. This more accurately represents the causal mechanisms that occur in reality, in which variables affect each other over time.
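The second approach can be sketched in a few lines of code. The example below is illustrative only: the variables (physical activity and depression), the `_t0` and `_t1` naming, and the `is_acyclic` helper are hypothetical choices, and the check relies on the `graphlib` module in the Python standard library (Python 3.9 or later). It shows that a two-way arrow creates a cycle, whereas the time-indexed version is a valid DAG.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(edges):
    """Return True if the directed graph given as (cause, effect) pairs has no cycle."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
        predecessors.setdefault(cause, set())
    try:
        # static_order() raises CycleError if no valid ordering exists.
        tuple(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False

# Hypothetical example: physical activity and depression affect each other.
bidirectional = [
    ("activity", "depression"),
    ("depression", "activity"),
]

# The same relationship with each variable measured at two time points.
time_indexed = [
    ("activity_t0", "activity_t1"),
    ("depression_t0", "depression_t1"),
    ("activity_t0", "depression_t1"),
    ("depression_t0", "activity_t1"),
]

print(is_acyclic(bidirectional))  # False: a two-way arrow is a cycle, so not a DAG
print(is_acyclic(time_indexed))   # True: arrows only point forward in time
```

Whether two time points are enough, and which measurements are actually available, depends on the study; the sketch only shows why splitting a variable over time removes the cycle.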
5. Concern that a reviewer will disagree about the causal relationships in the DAG
This relates to reasons 3 and 4 where there is uncertainty over the existence and/or direction of the causal relationships expressed in the DAG. It may reflect an expectation that what is presented in a manuscript should be ‘correct’. DAGs express assumptions, however, and there will generally be many equally plausible DAGs for every research question in every study.
6. A DAG is created but it suggests that a variable always adjusted for should not be
Some variables in epidemiological research, such as sex and age, are adjusted for almost routinely, along with variables that are specific to certain subject areas, for example, socioeconomic status, marital status, birth weight, etc. This can be regardless of what the outcome and the variable of interest are.
Hence, if a DAG is created and then suggests that one of these variables is more likely to be a mediator than a confounder, the DAG may be dropped and the variable adjusted for anyway. This can happen if the researcher lacks confidence in their own judgement, or if more senior researchers in the group, who are not familiar with DAGs, wish to keep the adjustment because it is standard practice. They may also not be used to thinking of variables in terms of mediators.
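Why the distinction between a mediator and a confounder matters can be seen in a small simulation. The following is a minimal sketch under assumed values, not an analysis from any study: the exposure `x`, the mediator `m`, the outcome `y`, the coefficients, and the helper functions are all hypothetical. Here `x` affects `y` both directly and through `m`, and the code compares the slope of `y` on `x` with and without adjustment for `m`.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def slope(x, y):
    """Slope of y on x from a simple linear regression."""
    return cov(x, y) / cov(x, x)

def slope_adjusted(x, y, m):
    """Slope of y on x from a linear regression that also includes m."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    return (sxy * smm - sxm * smy) / (sxx * smm - sxm ** 2)

# Assumed (hypothetical) data-generating process: x -> m -> y and x -> y.
rng = random.Random(2)
n = 20_000
direct_effect = 0.3                     # assumed effect of x on y not through m
indirect_effect = 0.6 * 0.5             # (x -> m) times (m -> y)
total_effect = direct_effect + indirect_effect
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [direct_effect * xi + 0.5 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

unadjusted = slope(x, y)
adjusted_for_m = slope_adjusted(x, y, m)
print(f"unadjusted slope:     {unadjusted:.2f}")      # close to the total effect
print(f"adjusted for m slope: {adjusted_for_m:.2f}")  # close to the direct effect only
```

Under these assumptions the unadjusted slope should be close to the total effect of 0.6, while adjusting for the mediator should leave only the direct effect of about 0.3. If the research question concerns the total effect, routine adjustment for `m` would understate it.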
A related concern would be an expectation that a journal reviewer will reject a paper if the analysis fails to adjust for variables that are routinely adjusted for.
It is hoped that resources such as this website can help speed up the adoption of DAGs as standard research tools, and eventually reduce the chance of situations such as this occurring.
[1] Newman G. The Bias Toward Cause and Effect. In: Stone GWM, Sarah J, eds. Psychology of Bias; 2012:69-82.
[2] Tennant PWG, Murray EJ, Arnold KF, et al. Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. International Journal of Epidemiology. 2021;50(2):620-632. doi:10.1093/ije/dyaa213