Critiquing Causality: Philosophical Foundations and Technical Challenges in Machine Learning and Analytics

Causality, the relationship between cause and effect, has long been a central topic in philosophy. Philosophers have scrutinized and debated the nature of causality, questioning its fundamental assumptions and implications. This article delves into the major critiques of causality in philosophy, exploring the complex issues that arise when examining the foundations of cause and effect, as well as the practical challenges of establishing causality in machine learning and data analytics.

One of the most influential critiques of causality comes from David Hume, an 18th-century Scottish philosopher. Hume argued that our belief in causality is not derived from reason but from habit and experience. He contended that we never actually observe causation itself; we only see a succession of events, and our assumption of a necessary connection between cause and effect is a psychological habit rather than a logically or empirically grounded principle (Ducasse, 1966). For instance, consider a government implementing a new educational policy and subsequently observing an improvement in student test scores. We might assume that the new policy caused the improvement. However, according to Hume, we cannot rationally justify this inference beyond our habitual expectations, as we only observe a sequence of events (the policy implementation followed by higher test scores) rather than a direct causal connection. Hume’s skepticism challenges the very notion of causality as an objective and necessary connection. 

Immanuel Kant, another influential 18th-century philosopher, responded to Hume’s skepticism by arguing that causality is an a priori concept that structures our experiences (Langsam, 1994). In his ‘Critique of Pure Reason,’ Kant proposed that causality is a category of understanding that the mind uses to organize the sensory data it receives. According to Kant, causality is not something we derive from experience but a precondition for making sense of experience. Critics of Kant, however, question whether this framework truly addresses Hume’s skepticism or merely sidesteps it.

Modern philosophical critiques often focus on the complexity and multifaceted nature of causality. Causality is not always a simple linear relationship but can be intricate, involving multiple interacting factors. Philosophers like Nancy Cartwright argue that causal laws in the natural and social sciences are often context-dependent and cannot be universally applied. Cartwright’s perspective on causation differs significantly from that of Hume and Kant. While Hume and Kant viewed causation as a single, uniform concept, Cartwright argues that causation is diverse and consists of different kinds of causal relations within various systems (Cartwright, 2004). She emphasizes the specificity of causes and the importance of considering context and specific structures in understanding causation. Cartwright contends that each causal relation requires a detailed and context-specific description. This perspective highlights the limitations of simplistic causal models and underscores the need for more sophisticated approaches that account for complexity and variability.

Advancements in physics, particularly quantum mechanics, have introduced new critiques of classical causality. In the quantum realm, events can occur without clear causal antecedents, and particles can be entangled in ways that defy traditional causal explanations. Winter (2017) explores the intricate relationship between causality and quantum theory, examining key issues such as the EPR paradox, Bell’s inequality, and the measurement problem, all of which challenge traditional notions of causality in the quantum realm. For example, Bell’s theorem shows that the correlations observed between entangled particles cannot be reproduced by any local hidden-variable theory: measurement outcomes on one particle are correlated with outcomes on another in ways that no purely local mechanism can explain, regardless of the distance between them. Such non-local correlations challenge classical notions of causality, which typically assume that causes precede their effects and operate only within a local region of space and time. We encourage interested readers to consult the original paper.

Technical Challenges in Machine Learning and Analytics

In applied machine learning and analytics, various techniques and models aim to identify causal relationships, each with its own strengths and challenges. Propensity Score Matching reduces bias by matching similar units on their propensity scores, which estimate the probability of receiving the treatment; it controls for observed covariates but cannot address unobserved confounders that influence both treatment assignment and outcomes. Causal Bayesian Networks offer a comprehensive view of causal structure by modeling relationships among variables graphically, yet they require substantial domain knowledge to construct accurately. Regression Discontinuity Design yields credible causal estimates for units near a predefined cutoff, but those estimates do not readily generalize to units far from that threshold. Instrumental variables (IV) regression can recover causal effects when a strong, valid instrument is available, but it requires careful model specification to avoid bias. Choosing the appropriate method depends on understanding the data, making assumptions explicit, and aligning the approach with the research objectives in order to uncover causal insights effectively.
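To make the first of these techniques concrete, here is a minimal propensity score matching sketch in Python. It uses synthetic data in which treatment assignment depends only on observed covariates (the method’s key assumption); the covariate names, coefficients, and sample size are invented for illustration rather than taken from any study.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Synthetic data: treatment assignment depends only on observed covariates x1, x2.
rng = np.random.default_rng(0)
n = 5000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-(0.8 * x1 - 0.5 * x2)))
treated = rng.binomial(1, p_treat)
y = 2.0 * treated + 1.5 * x1 + 1.0 * x2 + rng.normal(size=n)  # true effect = 2
df = pd.DataFrame({"x1": x1, "x2": x2, "treated": treated, "y": y})

# Step 1: estimate propensity scores from the observed covariates.
ps_model = LogisticRegression().fit(df[["x1", "x2"]], df["treated"])
df["pscore"] = ps_model.predict_proba(df[["x1", "x2"]])[:, 1]

# Step 2: 1-nearest-neighbor matching (with replacement) on the propensity score.
t_rows, c_rows = df[df["treated"] == 1], df[df["treated"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(c_rows[["pscore"]])
_, idx = nn.kneighbors(t_rows[["pscore"]])

# Step 3: average treatment effect on the treated; valid only if there are
# no unobserved confounders, which matching cannot fix.
att = t_rows["y"].mean() - c_rows.iloc[idx.ravel()]["y"].mean()
print(att)  # should be roughly the true effect of 2
```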

Research that critically examines specific techniques and models provides valuable insights into their strengths and limitations. For example, Baker et al. (2022) investigate the potential biases introduced by the staggered Difference-in-Differences (DiD) method. DiD estimates the effect of a treatment or intervention (e.g., a new policy or a subsidy program) by comparing the change in outcomes (e.g., employment rates or productivity levels) between two groups: one that receives the intervention and a control group that does not. Staggered DiD generalizes this approach to settings where the treatment is rolled out at different times for different units; the comparison group then consists of units with similar characteristics that have not yet been treated (not-yet-treated) or will never be treated (never-treated).
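As a minimal illustration of the canonical two-group, two-period DiD estimator (not the staggered variant examined by Baker et al.), the following Python sketch fits the standard group-by-post interaction on synthetic data with a known treatment effect; all numbers and column names are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000
units = np.arange(n)
group = (units < n // 2).astype(int)   # 1 = eventually treated group
df = pd.DataFrame({
    "unit": np.repeat(units, 2),
    "group": np.repeat(group, 2),
    "post": np.tile([0, 1], n),        # 0 = pre-period, 1 = post-period
})
# Outcome: group difference + common time trend + a true treatment effect of 3.
df["y"] = (
    0.5 * df["group"] + 1.0 * df["post"] + 3.0 * df["group"] * df["post"]
    + rng.normal(scale=1.0, size=len(df))
)

did = smf.ols("y ~ group * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["unit"]}  # cluster SEs by unit
)
print(did.params["group:post"])  # DiD estimate, close to the true effect of 3
```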

According to Baker et al. (2022), staggered DiD designs can introduce biases in assessing the impact of a treatment, particularly when treatment effects are heterogeneous across units (the different individuals or entities receiving the treatment at different points in time) or over time. In this context, treatment effect heterogeneity means that the impact of a treatment or intervention can differ across the groups receiving it and can change across time periods. Imagine a government subsidy program for small businesses. Businesses receiving the subsidy early may see an initial rapid increase in sales, which could diminish over time due to market saturation or competitor adjustments. In contrast, businesses receiving the subsidy later may experience delayed but sustained sales growth by leveraging insights from early recipients and making more informed strategic decisions. According to the paper, the biases introduced by treatment effect heterogeneity can lead to incorrect or misleading conclusions about treatment effects, including estimates with the wrong sign.
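To see how such bias can arise, here is a small, hypothetical Python simulation in the spirit of the examples in Baker et al. (2022): two cohorts adopt the treatment at different times, the effect grows with exposure, and a static two-way fixed-effects DiD regression is fit. The data-generating numbers are invented purely for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_units, n_periods = 200, 20
cohort = np.where(np.arange(n_units) < 100, 5, 15)  # early adopters at t=5, late at t=15

rows = []
for i in range(n_units):
    for t in range(n_periods):
        treated = int(t >= cohort[i])
        effect = (t - cohort[i] + 1) if treated else 0.0  # effect grows with exposure time
        y = 0.05 * i + 0.2 * t + effect + rng.normal(scale=0.5)
        rows.append({"unit": i, "time": t, "cohort": int(cohort[i]),
                     "treated": treated, "y": y})
df = pd.DataFrame(rows)

# Static two-way fixed-effects DiD with a single post-treatment dummy.
# The true average effect on the treated is 6.75 by construction, but because
# already-treated early adopters serve as controls for the late adopters, the
# estimated coefficient is pulled well below that value.
twfe = smf.ols("y ~ treated + C(unit) + C(time)", data=df).fit()
print(twfe.params["treated"])
```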

The paper discusses alternative estimators, such as event study DiD specifications and stacked regression, to address biases in staggered DiD designs. Event study DiD designs incorporate multiple periods before and after the treatment by using leads and lags of the treatment variable rather than a single binary indicator, which allows them to account for dynamic treatment effects. Stacked regression, by contrast, involves creating a dataset for each specific event and then stacking these datasets aligned on event timing. For example, in a study on the effects of minimum wage increases on employment, each minimum wage change in a different state or region would be treated as a separate event (see Cengiz et al., 2019). This method enables the calculation of an average treatment effect across all events using a single set of treatment indicators. By using these alternative estimators, researchers can improve the accuracy of treatment effect estimates in settings with staggered treatment timing and heterogeneous treatment effects, which may enhance the reliability of DiD analyses in empirical research.
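Continuing the simulated panel from the earlier sketch (columns unit, time, cohort, and y), a hypothetical event-study specification might look like the following. The choices to bin leads and lags at ±5 and to use period −1 as the omitted reference are common conventions, not something prescribed by the paper.

```python
import statsmodels.formula.api as smf

# Relative event time replaces the single binary treatment dummy with leads and lags.
df["rel_time"] = (df["time"] - df["cohort"]).clip(-5, 5)  # bin distant leads/lags

# Period -1 (the last pre-treatment period) is the omitted reference category.
event = smf.ols(
    "y ~ C(rel_time, Treatment(reference=-1)) + C(unit) + C(time)", data=df
).fit()

# Coefficients on rel_time >= 0 trace out the dynamic treatment effect,
# while coefficients on rel_time < -1 serve as a pre-trend check.
print(event.params.filter(like="rel_time"))
```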

Causal forests are another machine learning/analytics technique that can be used for practical causal inference. Like other methods built on random forests (a type of machine learning method), causal forests can capture non-linear relationships and higher-order interactions. This means they can find complex patterns in data where the effect of one variable on another is neither simple nor direct, and they can model how multiple factors work together in complicated ways.

With this ability, causal forests can automatically identify and analyze heterogeneous treatment effects, which is a valuable strength compared to regression-based approaches like DiD. Imagine you have a group of people with a certain condition and want to know which treatment works best for each person based on their individual characteristics. Causal forests estimate these personalized effects from data by learning how treatments, such as medications or therapies, interact with the specific traits of each person, and then predicting how well a treatment is likely to work for that individual.
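As a hedged illustration, the sketch below uses the CausalForestDML estimator from the open-source EconML library (one of several available implementations) on synthetic data in which the treatment effect depends on a single characteristic; the variable names and effect sizes are invented for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from econml.dml import CausalForestDML  # assumes the EconML package is installed

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                       # effect modifiers (e.g., patient traits)
W = rng.normal(size=(n, 3))                       # additional confounders
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # treatment depends on X[:, 0]
tau = 1.0 + 2.0 * (X[:, 1] > 0)                   # heterogeneous effect: 3 if X[:, 1] > 0, else 1
Y = tau * T + X[:, 0] + W[:, 0] + rng.normal(size=n)

cf = CausalForestDML(
    model_y=GradientBoostingRegressor(),          # nuisance model for the outcome
    model_t=GradientBoostingClassifier(),         # nuisance model for the treatment
    discrete_treatment=True,
    n_estimators=500,
    random_state=0,
)
cf.fit(Y, T, X=X, W=W)

cate = cf.effect(X)                               # individual-level effect estimates
print(cate[X[:, 1] > 0].mean(), cate[X[:, 1] <= 0].mean())  # should differ (~3 vs ~1)
```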

However, causal forests come with their own set of challenges. The approach requires meticulous tuning and validation, and its theoretical benefits over regression-based methods may not always materialize, especially with certain types of data. For instance, Venkatasubramaniam et al. (2023) compared causal forests with a penalized regression approach to analyze the heterogeneous effects of diabetes treatments. Their study highlighted several drawbacks of causal forests in this context. The predictions from causal forests were notably miscalibrated when validated against real-world UK primary care data, indicating a significant discrepancy between predicted and observed outcomes. Moreover, the causal forest algorithm tended to produce more conservative estimates of treatment effect heterogeneity than the regression approach, potentially leading to an underestimation of the variability in treatment effects. While these limitations may be specific to the study’s context and the relatively low-dimensional dataset used, the authors caution against relying solely on causal forests and instead advocate comparing their outputs with those from regression approaches.

In applied machine learning and analytics, understanding the contextual nuances surrounding causality issues, particularly within domains like social sciences and healthcare, is crucial. These contexts often introduce complexities and confounding factors that significantly influence the outcomes of causal inference models. Therefore, conducting thorough sensitivity analyses and robustness checks is essential. These procedures help assess the stability and reliability of the causal relationships identified by the models, ensuring that the findings maintain integrity across different scenarios and conditions. By integrating such rigorous practices into the analytical process, practitioners can enhance the validity and applicability of their findings, ultimately yielding more trustworthy insights for decision-making and policy formulation.
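One simple robustness check in this spirit is a placebo (permutation) test: re-estimate the model many times with randomly shuffled treatment labels and confirm that the real estimate lies well outside the resulting distribution. The sketch below is a minimal, generic version of this idea, assuming a DataFrame with an outcome y and a treatment indicator treated; the formula and column names are placeholders to adapt to a real specification.

```python
import numpy as np
import statsmodels.formula.api as smf

def placebo_estimates(df, formula="y ~ treated", n_draws=500, seed=0):
    """Re-fit the model with randomly permuted treatment labels.

    The returned placebo estimates should be centered near zero; an observed
    estimate far in the tails suggests the identified relationship is not an
    artifact of chance imbalance.
    """
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_draws):
        shuffled = df.copy()
        shuffled["treated"] = rng.permutation(shuffled["treated"].to_numpy())
        estimates.append(smf.ols(formula, data=shuffled).fit().params["treated"])
    return np.array(estimates)

# Example usage (with a real DataFrame `df` and an already-computed observed_estimate):
# placebo = placebo_estimates(df)
# print((np.abs(placebo) >= abs(observed_estimate)).mean())  # permutation p-value
```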


References:

Baker, A. C., Larcker, D. F., & Wang, C. C. (2022). How much should we trust staggered difference-in-differences estimates? Journal of Financial Economics, 144(2), 370–395. https://doi.org/10.1016/j.jfineco.2022.01.004

Cartwright, N. (2004). Causation: One Word, Many Things. Philosophy of Science, 71(5), 805–819.

Cengiz, D., Dube, A., Lindner, A., & Zipperer, B. (2019). The effect of minimum wages on low-wage jobs. The Quarterly Journal of Economics, 134(3), 1405–1454. https://doi.org/10.1093/qje/qjz014

Ducasse, C. J. (1966). Critique of Hume’s Conception of Causality. The Journal of Philosophy, 63(6), 141–148. https://doi.org/10.2307/2024169

Langsam, H. (1994). Kant, Hume, and Our Ordinary Concept of Causation. Philosophy and Phenomenological Research, 54(3), 625–647. https://doi.org/10.2307/2108584

Venkatasubramaniam, A., Mateen, B. A., Shields, B. M., Hattersley, A. T., Jones, A. G., Vollmer, S. J., & Dennis, J. M. (2023). Comparison of causal forest and regression-based approaches to evaluate treatment effect heterogeneity: an application for type 2 diabetes precision medicine. BMC Medical Informatics and Decision Making, 23(1), 110. https://doi.org/10.1186/s12911-023-02207-2

Winter, B. K. (2017). Causality and quantum theory. arXiv preprint. https://doi.org/10.48550/arXiv.1705.07201
