1. Introduction
Accurate forecasting of the US Dollar to Bangladeshi Taka (USD/BDT) exchange rate is critical for Bangladesh's import-dependent economy, impacting trade balances, inflation, and foreign reserve management. Traditional statistical models often fail to capture the non-linear, complex patterns characteristic of emerging market currencies, especially under economic uncertainty. This study addresses this gap by developing and evaluating advanced machine learning models, specifically Long Short-Term Memory (LSTM) neural networks and Gradient Boosting Classifiers (GBC), using historical data from 2018 to 2023. The research aims to provide robust tools for financial risk mitigation and policy formulation.
2. Literature Review
Deep learning, and LSTM networks in particular, has shown significant promise in financial time series forecasting. Introduced by Hochreiter & Schmidhuber (1997) to address the vanishing gradient problem in RNNs, LSTMs excel at capturing long-term dependencies. The later addition of forget gates (Gers et al., 2000) lets the cell reset its state, which improved adaptability to volatility. Empirical studies, such as those on USD/INR (Rahman et al., 2022), report LSTMs outperforming traditional ARIMA models by 18–22% in directional accuracy. However, research specifically targeting the USD/BDT pair, which must account for Bangladesh's managed-float regime and local macroeconomic shocks, remains limited. This study builds on that nascent line of work.
3. Methodology & Data
3.1 Data Collection & Preprocessing
Daily USD/BDT exchange rate data from January 2018 to December 2023 was sourced from Yahoo Finance. The dataset was cleaned, and features such as normalized daily returns, simple moving averages (SMA), and relative strength index (RSI) were engineered to capture market trends and volatility. The data was split into training (80%) and testing (20%) sets.
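The features named above can be sketched in plain Python as follows. This is an illustrative sketch, not the study's code: the function names, the 14-period RSI default, the simple-average RSI variant, and the chronological ordering of the 80/20 split are all assumptions, and normalization of the returns is left out because the source does not say which scheme was used.

```python
def daily_returns(prices):
    """Simple returns r_t = p_t / p_{t-1} - 1 (one fewer element than prices)."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def rsi(prices, period=14):
    """RSI from simple averages of gains and losses over the last `period` changes.

    Assumption: simple averaging (Wilder's smoothing is a common alternative),
    and RSI is set to 100 when the window contains no losses.
    """
    changes = [curr - prev for prev, curr in zip(prices, prices[1:])]
    out = [None] * len(prices)
    for i in range(period, len(prices)):
        window = changes[i - period:i]  # the `period` changes ending at price i
        avg_gain = sum(c for c in window if c > 0) / period
        avg_loss = sum(-c for c in window if c < 0) / period
        if avg_loss == 0:
            out[i] = 100.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out


def chronological_split(rows, train_fraction=0.8):
    """Split without shuffling: the earliest 80% for training, the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

Keeping the split chronological means the test set contains only observations that come after everything the model was trained on.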
3.2 LSTM Model Architecture
The core forecasting model is a stacked LSTM network. The architecture typically involves:
- Input Layer: Sequences of historical price/feature data.
- LSTM Layers: Two or more layers with dropout for regularization to prevent overfitting.
- Dense Layer: A fully connected layer for output.
- Output Layer: A single neuron for predicting the next period's exchange rate.
The model was trained using the Adam optimizer and Mean Squared Error (MSE) as the loss function.
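Before a network like this can be trained, the series has to be cut into input sequences and next-period targets. The sketch below shows one way to do that, together with the MSE loss named above. The `lookback` length is a hypothetical parameter; the source does not state the window size it used.

```python
def make_windows(series, lookback):
    """Build supervised pairs from a series.

    X[k] holds `lookback` consecutive values and y[k] is the value that
    immediately follows them (the next period's rate).
    """
    X, y = [], []
    for start in range(len(series) - lookback):
        X.append(series[start:start + lookback])
        y.append(series[start + lookback])
    return X, y


def mse(y_true, y_pred):
    """Mean squared error, the training loss named in the text."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, `make_windows([1, 2, 3, 4, 5], 3)` pairs `[1, 2, 3]` with target `4` and `[2, 3, 4]` with target `5`.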
3.3 Gradient Boosting Classifier
For directional prediction (up/down movement), a Gradient Boosting Classifier (GBC) was implemented. It combines an ensemble of weak learners (shallow decision trees) into a strong classifier by adding trees one at a time, each fitted to the errors of the ensemble built so far.
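To make the boosting idea concrete, here is a deliberately small sketch that boosts one-split trees (stumps) on the negative gradient of the log-loss. It is not the study's implementation: all names and hyperparameters are illustrative assumptions, and production libraries generally use deeper trees and more refined leaf updates.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split with the lowest squared error on residuals."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            error = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or error < best[0]:
                best = (error, j, threshold, lm, rm)
    return best[1:] if best else None


def stump_predict(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_gbc(X, y, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to y - p, the negative gradient of the log-loss."""
    p0 = sum(y) / len(y)
    base = math.log(p0 / (1.0 - p0))  # initial log-odds; assumes both classes occur
    scores = [base] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no valid split left
            break
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return base, learning_rate, stumps


def predict_direction(model, row):
    """Return 1 (UP) or 0 (DOWN)."""
    base, learning_rate, stumps = model
    score = base + sum(learning_rate * stump_predict(s, row) for s in stumps)
    return 1 if sigmoid(score) >= 0.5 else 0
```

In this sketch each row of `X` would hold the engineered features for one day and `y` the realised direction of the next move.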
Key reported metrics:
- LSTM Accuracy: 99.449%
- LSTM RMSE: 0.9858
- Profitable Trade Rate (GBC): 40.82%
- ARIMA RMSE (Baseline): 1.342
4. Experimental Results & Analysis
4.1 Performance Metrics
The LSTM model achieved an accuracy of 99.449%, a Root Mean Square Error (RMSE) of 0.9858, and a test loss of 0.8523. Its RMSE is about 26.5% lower than that of the traditional ARIMA baseline (1.342). These results suggest that the LSTM models the temporal dynamics of the USD/BDT exchange rate better than the baseline, although the source does not define how "accuracy" is measured for this regression task, so that figure should be read with caution.
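The error metrics in this comparison are straightforward to compute. The sketch below shows RMSE, one possible definition of directional accuracy (an assumption, since the source does not define its accuracy figure), and the relative RMSE improvement implied by the two reported values.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def directional_accuracy(actual, predicted):
    """Share of steps where the forecast moves in the same direction as the
    realised value, both measured from the previous actual value."""
    hits, total = 0, 0
    for prev, act, pred in zip(actual, actual[1:], predicted[1:]):
        total += 1
        if (act - prev) * (pred - prev) > 0:
            hits += 1
    return hits / total


def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error


# Reported figures: ARIMA RMSE 1.342, LSTM RMSE 0.9858.
improvement = relative_improvement(1.342, 0.9858)  # about 0.265
```

A lower RMSE says nothing by itself about how often the direction of the move is predicted correctly, which is why the two metrics are kept separate here.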
4.2 Backtesting & Trading Simulation
The Gradient Boosting Classifier was backtested in a trading simulation starting with $10,000 in initial capital. Over 49 trades, the model achieved a profitable trade rate of 40.82% (20 winning trades), yet the simulation ended with a net loss of $20,653.25. The reported loss exceeds the starting capital, which implies leverage or an uncapped account balance that the source does not describe. The result highlights a critical point: forecast accuracy does not automatically translate into a profitable trading strategy, because transaction costs, slippage, and risk management play decisive roles. The source does not specify stop-loss or take-profit levels.
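The reported backtest figures can be unpacked with a little arithmetic. The sketch below uses only the numbers stated above; the average win passed to `average_loss_implied` is a hypothetical input, because the source gives no per-trade breakdown.

```python
TRADES = 49
WIN_RATE = 0.4082
NET_PNL = -20653.25

wins = round(TRADES * WIN_RATE)  # 20 winning trades
losses = TRADES - wins           # 29 losing trades

# With W = average win and L = average loss (both positive):
#   net P&L = wins * W - losses * L
# so breaking even requires W / L >= losses / wins.
break_even_payoff_ratio = losses / wins  # 1.45


def average_loss_implied(average_win):
    """Average loss per losing trade consistent with the reported net P&L,
    given a hypothetical average win per winning trade."""
    return (wins * average_win - NET_PNL) / losses
```

Even if every winning trade had earned nothing, `average_loss_implied(0)` shows that the losing trades would have had to lose about 712 dollars each on average; any positive average win pushes that figure higher.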
Chart Description (Implied): A line chart would likely show the value of one taka in US dollars declining from approximately 0.012 (2018) to 0.009 (2023), which corresponds to the USD/BDT rate rising from roughly 83 to roughly 111 taka per dollar. A second chart would plot the cumulative P&L of the GBC trading strategy, showing an initial period of gains followed by a steep drawdown leading to the final net loss.
5. Technical Deep Dive
The core of the LSTM's effectiveness lies in its cell state and gating mechanisms. The key equations for an LSTM cell at time step $t$ are:
Forget Gate: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
Input Gate: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
Candidate Cell State: $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
Cell State Update: $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$
Output Gate: $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
Hidden State Output: $h_t = o_t * \tanh(C_t)$
Where $\sigma$ is the sigmoid function, $*$ denotes element-wise multiplication, $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state and the current input, $W$ and $b$ are weights and biases, $x_t$ is the input, $h_t$ is the hidden state, and $C_t$ is the cell state. This architecture allows the model to selectively remember or forget information over long sequences, which is crucial for financial time series with long-range dependencies.
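The six equations translate almost line for line into code. The following is a minimal sketch of a single time step in plain Python; the `params` layout (a dictionary of weight and bias pairs keyed by gate) is an assumption made for illustration, not a structure taken from the source.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W . v + b for a weight matrix W (list of rows), bias b and vector v."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    `params` maps the gate names 'f', 'i', 'c', 'o' to (W, b) pairs, where each
    W has len(h_prev) rows and len(h_prev) + len(x_t) columns.
    """
    concat = h_prev + x_t                                         # [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(*params["f"], concat)]      # forget gate
    i_t = [sigmoid(z) for z in affine(*params["i"], concat)]      # input gate
    c_tilde = [math.tanh(z) for z in affine(*params["c"], concat)]  # candidate state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = [sigmoid(z) for z in affine(*params["o"], concat)]      # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step. That makes a convenient sanity check for the forget-gate term.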
6. Analytical Framework & Case Example
Framework: The Forex ML Pipeline
This study exemplifies a standard yet effective pipeline for financial ML:
- Problem Framing: Regression (LSTM for price) vs. Classification (GBC for direction).
- Feature Engineering: Creating predictive signals from raw prices (returns, technical indicators).
- Model Selection & Training: Choosing sequence-aware models (LSTM) for temporal data.
- Rigorous Validation: Using time-series cross-validation, not random splits, to avoid look-ahead bias.
- Strategy Backtesting: Translating model predictions into a simulated trading strategy with realistic constraints.
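The validation step in this pipeline can be illustrated with expanding-window walk-forward splits, one common form of time-series cross-validation. This is a sketch under assumptions: the function name and its parameters are hypothetical, and the source does not state that the study itself used this scheme.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs with an expanding training window.

    Every test index comes after every training index, so the model never sees
    data from the future of the period it is evaluated on.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested number of folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))
```

With 10 samples, 3 folds, and a minimum training size of 4, the test folds are indices 4 to 5, 6 to 7, and 8 to 9, each preceded by a training window that ends where the test fold begins.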
Case Example: Signal Generation
A simplified rule based on the LSTM forecast could be: "If the predicted price for tomorrow is > (today's price + a threshold $\alpha$), generate a BUY signal." The GBC directly outputs a class label (1 for UP, 0 for DOWN). The critical lesson from the paper's trading loss is the necessity of a subsequent risk management layer that determines position sizing, stop-loss orders, and portfolio allocation, which was likely absent or simplistic in the simulation.
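The signal rule can be written down directly. In the sketch below, the symmetric SELL branch, the HOLD fallback, and the mapping of GBC labels to BUY and SELL are assumptions added for completeness; the source only states the BUY condition and the meaning of the class labels.

```python
def lstm_signal(today_price, predicted_price, alpha):
    """Threshold rule on the LSTM forecast.

    BUY when the forecast exceeds today's price by more than alpha.
    Assumption: SELL in the mirrored case, HOLD otherwise.
    """
    if predicted_price > today_price + alpha:
        return "BUY"
    if predicted_price < today_price - alpha:
        return "SELL"
    return "HOLD"


def gbc_signal(label):
    """Map the classifier's label (1 = UP, 0 = DOWN) to a trade direction.
    Assumption: UP is traded as BUY and DOWN as SELL."""
    return "BUY" if label == 1 else "SELL"
```

The threshold `alpha` acts as a crude filter against trading on forecast moves that are too small to cover costs; choosing it is part of the risk management layer the text calls for.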
7. Future Applications & Directions
The future of AI in forex forecasting lies in multi-modal, adaptive systems:
- Integration of Alternative Data: Incorporating real-time news sentiment analysis (using NLP models like BERT), central bank communication tone, and geopolitical risk indices, as seen in hedge funds like Two Sigma.
- Hybrid & Attention-Based Models: Moving beyond standard LSTMs to Transformer architectures with self-attention mechanisms (like those in Vaswani et al.'s "Attention Is All You Need"), which can weigh the importance of different time steps more flexibly.
- Reinforcement Learning (RL): Developing RL agents that learn optimal trading policies directly, considering costs and risk-adjusted returns, rather than just predicting prices. This aligns with research from DeepMind and OpenAI in simulated environments.
- Explainable AI (XAI): Implementing techniques like SHAP or LIME to interpret model predictions, which is crucial for regulatory compliance and gaining trust from financial institutions.
- Cross-Market Learning: Training models on multiple currency pairs or asset classes to learn universal patterns of volatility and contagion.
8. References
- Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation.
- Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to Forget: Continual Prediction with LSTM. Neural Computation.
- Rahman et al. (2022). LSTM-based Forecasting for Emerging Market Currencies: A USD/INR Case Study. Journal of Computational Finance.
- Afrin, S., et al. (2021). Forecasting USD/BDT Exchange Rate Using Machine Learning. International Conference on Computer and Information Technology.
- Vaswani, A., et al. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems.
- Yahoo Finance. (2023). USD/BDT Historical Data.
9. Industry Analyst's Perspective
Core Insight: This paper is a classic example of the "accuracy-profitability paradox" in quantitative finance. The authors report an LSTM model with near-perfect 99.45% accuracy on USD/BDT forecasting, yet the accompanying GBC-based trading strategy lost more than twice its starting capital. The real story isn't the model's precision; it's the glaring disconnect between academic metric optimization and real-world trading P&L. It underscores a truth many quants learn the hard way: minimizing RMSE is not the same as maximizing Sharpe Ratio.
Logical Flow: The research follows a standard pipeline: data acquisition, feature engineering, model selection (LSTM/GBC), and performance validation. The logical flaw, however, is in the leap from validation to application. The backtesting appears naive, likely lacking robust transaction cost modeling, slippage, and, most critically, a coherent risk management framework. A 40.82% win rate (20 of 49 trades) breaks even only if the average win is at least 1.45 times the average loss, so a large negative net outcome suggests that losses per losing trade were far larger than gains per winning trade. That is a fatal flaw no amount of LSTM accuracy can fix.
Strengths & Flaws:
- Strengths: Excellent model engineering for a niche, under-researched currency pair (USD/BDT). The comparison against ARIMA provides a clear benchmark. The explicit mention of the trading loss is intellectually honest and more valuable than many papers that only highlight successes.
- Flaws: The trading simulation is essentially an afterthought, revealing a lack of integration between the prediction and execution layers—the very heart of systematic trading. There's no discussion of position sizing (e.g., Kelly Criterion), stop-losses, or portfolio context. Furthermore, while LSTMs are powerful, their black-box nature remains a significant barrier to adoption in regulated financial institutions compared to more interpretable ensembles like Gradient Boosted Trees.
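As an illustration of the position-sizing point, the Kelly Criterion for a simple win/lose bet gives the fraction of capital to risk as $f^* = p - (1 - p)/b$, where $p$ is the win probability and $b$ is the ratio of average win to average loss. The sketch below applies it to the reported win rate; the payoff ratios passed in are hypothetical, since the source does not report them.

```python
def kelly_fraction(win_probability, payoff_ratio):
    """Kelly fraction f* = p - (1 - p) / b for a binary win/lose bet.

    A non-positive result means the bet has no edge and should not be taken.
    """
    return win_probability - (1.0 - win_probability) / payoff_ratio


reported_p = 20 / 49  # the 40.82% profitable trade rate over 49 trades

no_edge = kelly_fraction(reported_p, 29 / 20)  # zero at the break-even payoff ratio
negative = kelly_fraction(reported_p, 1.0)     # below zero when wins equal losses in size
```

Under these assumptions, a strategy with the reported win rate justifies a position only when its average win is more than 1.45 times its average loss.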
Actionable Insights:
- Bridge the Gap with Reinforcement Learning: Instead of treating prediction and trading as separate steps, future work should employ end-to-end Reinforcement Learning (RL). An RL agent, akin to those used by DeepMind for game playing, can learn to optimize for direct trading metrics (e.g., cumulative return, Sortino ratio) from the raw data, inherently factoring in costs and risk.
- Adopt a "Prediction-Execution-Risk" Trinity: Any forecasting research must be evaluated within a triad. The prediction model is just one vertex. Equal rigor must be applied to the execution model (market impact, costs) and the risk model (VaR, expected shortfall, drawdown control).
- Focus on Regime Detection: The USD/BDT, under a managed float, has distinct regimes (stable, intervention, crisis). Models like Markov Switching Models or clustering algorithms should be used to detect the current regime first, then apply the most suitable forecasting model. A one-model-fits-all approach is myopic.
- Prioritize Explainability: To move from academic exercise to trader's tool, implement XAI techniques. Showing a trader that a "sell" signal is 60% driven by a widening trade deficit and 40% by RSI divergence builds trust far more than a 99% accurate black box.
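Two of the risk measures named in the trinity above, drawdown and VaR with expected shortfall, can be sketched in a few lines. The historical-simulation approach and the tail-size rounding used here are assumptions; quantile conventions differ between implementations.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline as a fraction of the peak.
    Assumes the equity values are positive."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def historical_var_es(returns, confidence=0.95):
    """Historical VaR and expected shortfall, both as positive loss figures.

    The tail is the worst (1 - confidence) share of observations, with at
    least one observation; VaR is the smallest loss in that tail and
    expected shortfall is the tail's average loss.
    """
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    n_tail = max(1, int(round(len(losses) * (1.0 - confidence))))
    tail = losses[:n_tail]
    return tail[-1], sum(tail) / len(tail)
```

Applied to the per-trade returns and equity curve of a backtest, these figures would expose a steep drawdown long before the final net loss is reached.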