模型喺訓練期間嘅損失函數通常係均方誤差 (MSE),正如之前所定義,優化器(例如 Adam)會透過調整權重 (W, b) 來最小化呢個誤差。
7. 分析框架:一個實際案例
情境: 一家量化對沖基金希望為歐元/美元開發一個低延遲、注重能源效益的交易訊號。
框架應用:
問題定義: Predict the next 4-hour candle direction (up/down) with >55% accuracy, with a model inference time < 10ms and a goal to reduce training energy by 20% compared to a baseline LSTM.
Data & Preprocessing: 使用5年嘅每小時OHLCV數據。創建特徵:對數回報率、滾動波動率窗口,以及訂單簿失衡代理指標。進行標準化並序列化成50個時間步長嘅窗口。
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. 神經計算, 9(8), 1735–1780.
Sejnowski, T. J., et al. (2020). The Carbon Footprint of AI and Machine Learning. Communications of the ACM.
Bank for International Settlements (BIS). (2019). Triennial Central Bank Survey of Foreign Exchange and OTC Derivatives Markets.
Zhu, J.-Y., et al. (2017). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. IEEE International Conference on Computer Vision (ICCV). (以CycleGAN為創新深度學習架構嘅例子).
Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
TensorFlow Model Optimization Toolkit. (n.d.). Retrieved from https://www.tensorflow.org/model_optimization