Interpretable Machine Learning for Exchange Rate Forecasting with Macroeconomic Fundamentals

A study applying interpretable machine learning to forecast and explain the CAD/USD exchange rate, identifying crude oil, gold, and the TSX as key drivers.

1. Introduction

Forecasting exchange rates is notoriously difficult due to the complexity, nonlinearity, and frequent structural breaks in financial systems. Traditional econometric models often struggle to capture these dynamics and to provide clear explanations for their predictions. This study addresses this gap by developing a fundamental-based model for the Canadian–U.S. dollar (CAD/USD) exchange rate within an interpretable machine learning (IML) framework. The primary goal is not only to achieve accurate predictions but also to explain them using macroeconomic fundamentals, thereby increasing trust and yielding actionable insights for policymakers and economists.

The research is motivated by Canada's status as a major commodity exporter, particularly of crude oil: crude accounted for 14.1% of Canada's total exports in 2019, and Canada supplied 61% of U.S. crude oil imports in 2021. Understanding the time-varying impact of such commodities on the exchange rate is therefore crucial.

Key Challenges Addressed:

  • Nonlinearity: Relationships between macroeconomic variables are often nonlinear.
  • Multicollinearity: Many candidate drivers move together, which makes it hard to isolate the effect of any single one.
  • Interpretability: Black-box models give no way to check predictions against economic theory, which undermines trust.

2. Methodology & Framework

The study employs a comprehensive IML pipeline combining predictive modeling with post-hoc interpretation.

2.1 Data & Variables

A set of macroeconomic and financial variables hypothesized to influence the CAD/USD rate was collected. This likely includes:

  • Commodity Prices: Crude oil (WTI), gold, natural gas.
  • Financial Indicators: S&P/TSX Composite Index, interest rate differentials (Canada vs. U.S.).
  • Macroeconomic Fundamentals: GDP growth, inflation differentials, trade balance.

Data is preprocessed (e.g., stationarity transformations, handling missing values) to suit ML models.
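
The preprocessing step can be illustrated with a minimal sketch. It assumes forward-filling for missing observations and log returns as the stationarity transformation; the summary does not say which methods the study actually used, and the price series below is hypothetical.

```python
import math

def forward_fill(values):
    """Replace None entries with the most recent observed value.

    A leading None stays None, so the series should start with an observation.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def log_returns(levels):
    """Convert a series of levels into log returns (one element shorter)."""
    return [math.log(b / a) for a, b in zip(levels, levels[1:])]

# Hypothetical daily WTI prices with one missing observation.
wti = [70.0, None, 72.1, 71.4, 73.5]
wti_filled = forward_fill(wti)
wti_ret = log_returns(wti_filled)
```

Log returns are additive over time, so the entries of `wti_ret` sum to the log of the last price divided by the first.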

2.2 Machine Learning Models

The study likely utilizes powerful, yet complex, ensemble models known for high predictive accuracy:

  • Gradient Boosting Machines (GBM/XGBoost/LightGBM): Effective for capturing nonlinear patterns and interactions.
  • Random Forests: Robust to overfitting, with built-in feature importance measures.
  • Neural Networks: Potentially used for capturing deep, complex temporal dependencies.

Models are trained to predict future exchange rate movements or levels.
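
One way to frame "predicting future movements" as a supervised problem is sketched below. Features observed at time t are paired with the return realised at t + horizon, and the evaluation uses expanding-window splits so that training data always precedes test data. The split scheme, the function names, and the toy data are assumptions for illustration, not details taken from the study.

```python
def make_supervised(features, returns, horizon=1):
    """Pair features observed at time t with the return realised at t + horizon."""
    X = features[:len(features) - horizon]
    y = returns[horizon:]
    return X, y

def expanding_window_splits(n_samples, initial_train, test_size):
    """Yield (train_idx, test_idx) pairs; the training window grows each fold."""
    start = initial_train
    while start + test_size <= n_samples:
        yield list(range(start)), list(range(start, start + test_size))
        start += test_size

# Toy data: 10 periods, two hypothetical features per period.
features = [[float(t), float(t) ** 0.5] for t in range(10)]
returns = [0.01 * ((-1) ** t) for t in range(10)]

X, y = make_supervised(features, returns, horizon=1)
splits = list(expanding_window_splits(len(X), initial_train=5, test_size=2))
```

Any of the models listed above could be fitted on each training fold and scored on the following test fold, which avoids evaluating on data from before the training period.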

2.3 Interpretability Techniques

To open the "black box," the study applies state-of-the-art IML methods:

  • SHAP (SHapley Additive exPlanations): A game-theoretic approach to quantify the contribution of each feature to each individual prediction. It provides both global and local interpretability.
  • Partial Dependence Plots (PDPs): Visualize the marginal effect of a feature on the predicted outcome.
  • Feature Importance Rankings: Derived from model-specific metrics or permutation importance.

These techniques help answer *why* a certain prediction was made.
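
To make two of these techniques concrete, the sketch below implements permutation importance and a partial dependence curve by hand with NumPy. The data are synthetic, the column labels ("oil", "gold", "noise") are hypothetical, and a least-squares fit on quadratic features stands in for the study's ensemble models, so this shows the mechanics only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: columns are hypothetical standardized "oil", "gold", "noise".
n = 500
X = rng.normal(size=(n, 3))
y = 1.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 + 0.3 * X[:, 1] + 0.05 * rng.normal(size=n)

def expand(X):
    """Quadratic feature map so a least-squares fit can capture curvature."""
    return np.column_stack([np.ones(len(X)), X, X ** 2])

# Stand-in for an ensemble model: least squares on the quadratic features.
coef, *_ = np.linalg.lstsq(expand(X), y, rcond=None)

def predict(X):
    return expand(X) @ coef

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def permutation_importance(X, y, n_repeats=10):
    """Average increase in RMSE after shuffling one column at a time."""
    base = rmse(y, predict(X))
    scores = []
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            increases.append(rmse(y, predict(Xp)) - base)
        scores.append(float(np.mean(increases)))
    return scores

def partial_dependence(X, j, grid):
    """Average prediction when feature j is forced to each grid value."""
    out = []
    for v in grid:
        Xv = X.copy()
        Xv[:, j] = v
        out.append(float(predict(Xv).mean()))
    return out

importances = permutation_importance(X, y)
grid = np.linspace(-2.0, 2.0, 5)
pdp_oil = partial_dependence(X, 0, grid)
```

Because the synthetic target contains a squared oil term, the partial dependence curve for "oil" first falls and then rises across the grid, the kind of non-monotonic shape a PDP is meant to reveal.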

3. Empirical Results & Analysis

3.1 Model Performance

The machine learning models demonstrated superior predictive accuracy compared to traditional linear benchmarks (e.g., Vector Autoregression - VAR). Performance was evaluated using metrics like Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and possibly directional accuracy. The results validate the capability of ML to model complex exchange rate dynamics.
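
The three metrics mentioned can be written in a few lines. The sketch below defines directional accuracy as the share of periods in which the predicted and realised returns have the same sign; this is one common definition and an assumption here, and the return values are made up.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def directional_accuracy(actual, predicted):
    """Share of periods where the predicted return has the same sign as the realised one."""
    hits = sum(1 for a, p in zip(actual, predicted) if a * p > 0)
    return hits / len(actual)

# Hypothetical realised and predicted returns.
actual = [0.004, -0.002, 0.001, -0.003]
predicted = [0.003, -0.001, -0.001, -0.002]

scores = {
    "rmse": rmse(actual, predicted),
    "mae": mae(actual, predicted),
    "directional_accuracy": directional_accuracy(actual, predicted),
}
```

RMSE penalizes large misses more heavily than MAE, while directional accuracy ignores magnitude entirely, so the three can rank models differently.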

3.2 Feature Importance & SHAP Analysis

The interpretability analysis yielded clear, economically intuitive insights:

  1. Crude Oil Price: Emerged as the most significant determinant. SHAP values revealed its effect is time-varying, with changes in sign and magnitude aligning with major events in commodity markets (e.g., the 2014 oil price crash, OPEC+ decisions). This aligns with Canada's evolving oil export landscape.
  2. Gold Price: The second most important variable. Gold's role as a safe-haven asset and inflation hedge is a plausible channel for its influence on the CAD.
  3. TSX Stock Index: Ranked third, reflecting domestic economic health and capital flows.

Chart Description (Implied): A SHAP summary plot would show each variable as a row. For crude oil, dots would be spread across both positive and negative SHAP values on the x-axis (impact on prediction), with color indicating the feature's value (e.g., blue for low oil price, red for high). Mixed colors on both sides of zero would point to a non-monotonic relationship. Confirming that the effect varies over time requires plotting the SHAP values against date, since the summary plot has no time axis.

3.3 Ablation Study for Model Refinement

A key innovation is using interpretation outputs (like low-importance features identified by SHAP) to guide an ablation study. Features deemed less important are iteratively removed, and model performance is re-evaluated. This process:

  • Simplifies the model, reducing overfitting and computational cost.
  • Potentially improves predictive accuracy by eliminating noise.
  • Creates a more parsimonious and focused final model, enhancing practical utility.
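
The ablation loop can be sketched as follows. The example uses synthetic data, hypothetical feature names, and an ordinary least-squares model as a stand-in for the ensemble. For a linear model with independent features the SHAP value has the closed form w_i (x_i - mean(x_i)), which keeps the sketch free of extra dependencies. The 5% tolerance used to pick the final feature set is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
names = ["oil", "gold", "tsx", "noise_a", "noise_b"]  # hypothetical feature names

# Synthetic stand-in data: only the first three columns drive the target.
n = 400
X = rng.normal(size=(n, 5))
y = 0.8 * X[:, 0] + 0.4 * X[:, 1] + 0.2 * X[:, 2] + 0.1 * rng.normal(size=n)
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]  # split by row order

def fit(X, y):
    """Ordinary least squares with intercept (stand-in for the ensemble model)."""
    A = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, X):
    return w[0] + X @ w[1:]

def mean_abs_shap(w, X):
    """Mean |SHAP| per feature, using phi_i = w_i * (x_i - mean(x_i)) for a linear model."""
    return np.abs(w[1:] * (X - X.mean(axis=0))).mean(axis=0)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

active = list(range(5))
history = []  # (features kept, hold-out RMSE) for each ablation step
while True:
    w = fit(X_tr[:, active], y_tr)
    score = rmse(y_te, predict(w, X_te[:, active]))
    history.append(([names[i] for i in active], score))
    if len(active) == 1:
        break
    # Drop the feature with the smallest mean |SHAP| on the training set.
    weakest = int(np.argmin(mean_abs_shap(w, X_tr[:, active])))
    active.pop(weakest)

# Keep the smallest feature set whose RMSE is within 5% of the best one seen.
best = min(score for _, score in history)
selected = min((h for h in history if h[1] <= 1.05 * best), key=lambda h: len(h[0]))[0]
```

Recording the hold-out error at every step makes the trade-off explicit: features are dropped only as long as accuracy stays close to the best observed value.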

4. Core Insight & Analyst Perspective

Core Insight:

This paper delivers a powerful one-two punch: it doesn't just prove ML can forecast FX better; it weaponizes interpretability to validate economic theory with data-driven granularity. The finding that oil's impact on CAD/USD is non-linear and regime-dependent isn't just academic—it's a direct challenge to linear, static policy models. This work bridges the often-widening gap between high-finance quant models and central bank econometric suites.

Logical Flow:

The methodology is elegantly recursive: 1) Use robust ML (XGBoost/RF) to capture complex patterns, 2) Use SHAP to "debug" the model's logic, and 3) Feed those insights back via ablation to prune and improve the model. This creates a self-refining analytical engine. It mirrors the philosophy in seminal IML works like Lundberg & Lee's "A Unified Approach to Interpreting Model Predictions" (2017), which introduced SHAP, by making explanation a core part of the model development lifecycle, not an afterthought.

Strengths & Flaws:

Strengths: The ablation study guided by interpretability is a masterstroke for practical model deployment. Focusing on CAD/USD and commodities provides a clean, compelling narrative. The use of SHAP provides both global and local explanations, catering to both policymakers (big picture) and traders (specific scenarios).

Flaws: The paper likely underplays the temporal instability of the derived "explanations": SHAP values can shift materially when the model is retrained on new data. Post hoc explanations also have documented robustness problems; Slack et al.'s "Fooling LIME and SHAP" (2020) shows that they can be manipulated adversarially. The model, while interpretable, may still be a "glass box" rather than a truly causal model: it shows correlation, not causation, a limitation inherent in most IML approaches applied to observational economic data.

Actionable Insights:

  • For Central Banks: This framework is a blueprint for building more transparent and accountable policy models. The Bank of Canada could operationalize this to stress-test different commodity price scenarios with clear attribution.
  • For Asset Managers: The identified non-linear oil-CAD nexus is a tradable insight. It argues for dynamic hedging ratios, not static ones.
  • For Researchers: The template is exportable. Apply it to AUD/commodities, NOK/oil, or emerging market currencies. The next frontier is integrating this with causal discovery methods (e.g., leveraging frameworks from Pearl's causality work) to move beyond explanation towards true causal inference, making the models even more robust for policy simulation.

5. Technical Implementation Details

5.1 Mathematical Formulation

The core predictive model can be represented as:

$y_t = f(\mathbf{x}_t) + \epsilon_t$

where $y_t$ is the exchange rate return or level at time $t$, $f(\cdot)$ is the complex function learned by the ML model (e.g., a gradient boosting ensemble), $\mathbf{x}_t$ is the vector of input features (oil price, gold, TSX, etc.), and $\epsilon_t$ is the error term. The forecast is $\hat{y}_t = f(\mathbf{x}_t)$, so $\epsilon_t = y_t - \hat{y}_t$ is the forecast error.

The SHAP value $\phi_i$ for feature $i$ for a single prediction explains the deviation from the average prediction:

$f(\mathbf{x}) = \phi_0 + \sum_{i=1}^{M} \phi_i$

where $\phi_0$ is the base value (average model output) and $M$ is the number of features. $\phi_i$ is calculated using the classic Shapley value formula from cooperative game theory, considering all possible feature combinations:

$\phi_i = \sum_{S \subseteq \{1,\ldots,M\} \setminus \{i\}} \frac{|S|! \, (M - |S| - 1)!}{M!} [f_{S \cup \{i\}}(\mathbf{x}_{S \cup \{i\}}) - f_S(\mathbf{x}_S)]$

This ensures a fair attribution of the prediction to each feature.
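
The Shapley formula can be evaluated exactly when the number of features is small. The sketch below does so for a toy three-feature function. It assumes that a feature outside the coalition $S$ is replaced by a fixed baseline value, which is only one of several ways to define $f_S$ (SHAP implementations typically average over background data), and both the toy model and its inputs are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline values."""
    M = len(x)

    def value(subset):
        # Features in `subset` take their actual value; the rest take the baseline.
        z = [x[j] if j in subset else baseline[j] for j in range(M)]
        return f(z)

    phi = []
    for i in range(M):
        others = [j for j in range(M) if j != i]
        total = 0.0
        for size in range(M):
            # Weight |S|! (M - |S| - 1)! / M! for coalitions of this size.
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for S in combinations(others, size):
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi.append(total)
    return phi

# Hypothetical toy model with an oil-gold interaction; inputs are [oil, gold, tsx].
def toy_model(z):
    oil, gold, tsx = z
    return 2.0 * oil + 1.0 * gold + 0.5 * oil * gold - 0.5 * tsx

x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
phi_0 = toy_model(baseline)
```

The interaction term contributes 1.0 to the output at `x` and is split equally between oil and gold, and `phi_0 + sum(phi)` reproduces `toy_model(x)`, matching the additive decomposition above. The cost grows as $2^M$, which is why practical SHAP implementations rely on model-specific shortcuts or sampling.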

5.2 Analysis Framework Example

Scenario: Understanding the model's prediction for a strong CAD appreciation on a specific date.

Step-by-Step IML Analysis:

  1. Local SHAP Explanation: Generate force plot or waterfall plot for the specific prediction.
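    • Output: "Prediction: CAD appreciates by 1.5%. Key drivers: WTI Oil (+1.1%), Gold Price (+0.3%), TSX (-0.2% due to a slight drop)." The three listed contributions sum to +1.2%; the remaining +0.3% is attributable to the base value and the other features.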
  2. Contextual Check: Cross-reference with market events.
    • Action: "On this date, OPEC+ announced a production cut, spiking oil prices. The model's high positive SHAP for oil aligns perfectly with this fundamental shock."
  3. PDP Analysis: Examine the PDP for oil prices.
    • Observation: "The PDP shows a steep positive slope at current price levels, confirming the model is in a regime where oil price increases strongly boost the CAD."
  4. Ablation Feedback: If, for many predictions, a feature like "U.S. Industrial Production" has near-zero SHAP values, it becomes a candidate for removal in the next model training iteration to enhance simplicity and robustness.
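
Steps 1 and 4 of this walkthrough can be sketched in plain Python. All numbers are hypothetical and expressed in percentage points; in particular, the base value of 0.3 and the 0.05 threshold are assumptions chosen for illustration, and the helper names are not from any library.

```python
def explain_prediction(base_value, contributions):
    """Return the prediction and the contributions sorted by absolute size."""
    prediction = base_value + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return prediction, ranked

def removal_candidates(shap_rows, names, threshold):
    """Flag features whose mean |SHAP| across predictions is below the threshold."""
    n = len(shap_rows)
    mean_abs = [sum(abs(row[j]) for row in shap_rows) / n for j in range(len(names))]
    return [name for name, m in zip(names, mean_abs) if m < threshold]

# Step 1: local explanation for one prediction (hypothetical values).
prediction, ranked = explain_prediction(
    base_value=0.3,
    contributions={"WTI oil": 1.1, "Gold": 0.3, "TSX": -0.2},
)

# Step 4: SHAP values for several predictions, one row per prediction (hypothetical).
names = ["WTI oil", "Gold", "TSX", "US industrial production"]
shap_rows = [
    [1.10, 0.30, -0.20, 0.01],
    [-0.60, 0.10, 0.25, -0.02],
    [0.40, -0.20, 0.10, 0.00],
]
candidates = removal_candidates(shap_rows, names, threshold=0.05)
```

Using the mean absolute value rather than the plain mean matters here: a feature such as oil, whose contribution changes sign across dates, would otherwise look unimportant.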

6. Future Applications & Research Directions

  • Real-Time Policy Dashboard: Central banks could deploy this IML framework as a live dashboard, showing real-time driver contributions to the exchange rate, aiding in communication and intervention decisions.
  • Multi-Country & Currency Basket Analysis: Extend the framework to model cross-currency relationships or a trade-weighted exchange rate index, identifying common global drivers versus country-specific ones.
  • Integration with Causal Inference: Combine IML with recent advances in causal ML (e.g., Double Machine Learning, Causal Forests) to move from "what is associated?" to "what would happen if we changed X?", enabling counterfactual policy analysis.
  • Alternative Data: Incorporate sentiment analysis from news/social media, shipping traffic data, or satellite imagery of oil storage to improve lead times and predictive power.
  • Explainable AI (XAI) for Regulation: As regulatory scrutiny on AI in finance increases (e.g., EU's AI Act), such interpretable frameworks provide a pathway for compliant and auditable model deployment.

7. References

  1. Lundberg, S. M., & Lee, S. I. (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems (NeurIPS), 30.
  2. Chen, S. S., & Chen, H. C. (2007). Oil prices and real exchange rates. Energy Economics, 29(3), 390-404.
  3. Beckmann, J., Czudaj, R., & Arora, V. (2020). The relationship between oil prices and exchange rates: Revisiting theory and evidence. Energy Economics, 88, 104772.
  4. Ferraro, D., Rogoff, K., & Rossi, B. (2015). Can oil prices forecast exchange rates? An empirical analysis of the relationship between commodity prices and exchange rates. Journal of International Money and Finance, 54, 116-141.
  5. Slack, D., Hilgard, S., Jia, E., Singh, S., & Lakkaraju, H. (2020). Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES).
  6. Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press.
  7. U.S. Energy Information Administration (EIA). (2022). U.S. Imports from Canada of Crude Oil. [Data set].