Supervised Learning (in Trading)

Definition

Supervised learning is the ML paradigm in which models learn from labelled data: features (X) paired with target outputs (y). In trading, this typically means training a model to predict 'will price go up or down in the next N bars?' (y) from current market features (X). It is the most common ML approach in algorithmic trading because of its clear interpretation and validation framework.

In-depth: Supervised Learning (in Trading)

Supervised learning is the dominant machine learning paradigm in algorithmic trading because of its compatibility with the trading problem structure: 'given current conditions, predict future outcome'.

Problem formulation: given a feature matrix X (rows = time observations, columns = features) and target vector y (rows = time observations, values = future outcome), learn a function f(X) → y that minimises prediction error on out-of-sample data.
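As a rough illustration of this formulation, the sketch below builds X and y from a list of closing prices. The function name build_dataset and the parameters lags and horizon are hypothetical choices for this example, and lagged one-bar returns stand in for a real feature set.

```python
from typing import List, Tuple


def build_dataset(closes: List[float], lags: int = 3, horizon: int = 2
                  ) -> Tuple[List[List[float]], List[int]]:
    """Build (X, y) from a close-price series (illustrative only).

    Each row of X holds the last `lags` one-bar returns known at bar t.
    The matching y value is 1 if the close `horizon` bars later is above
    the close at bar t, else 0.
    """
    # returns[i] is the return from bar i to bar i + 1
    returns = [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]
    X: List[List[float]] = []
    y: List[int] = []
    # Bar t needs `lags` returns ending at t and a close at t + horizon.
    for t in range(lags, len(closes) - horizon):
        X.append(returns[t - lags:t])  # last return in the slice ends at bar t
        y.append(1 if closes[t + horizon] > closes[t] else 0)
    return X, y
```

The final `horizon` bars get no row because their outcome is not yet known, which is the same reason a live model can only be scored after the horizon has elapsed.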

Common feature classes in trading ML:
- Price-derived: recent returns over multiple horizons (1m, 5m, 30m), price relative to moving averages, distance from recent highs/lows
- Volatility: realised volatility over recent windows, GARCH-modelled volatility forecasts, range-based measures
- Volume: relative volume vs rolling average, volume-weighted price levels
- Technical indicators: RSI, MACD, Bollinger Band position, stochastic oscillators (used as features, not as primary signals)
- Time/regime features: time-of-day, day-of-week, days-since-last-news, regime classifications
- Microstructure: bid-ask spread, order book imbalance (if accessible), recent trade direction
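A few of these feature classes can be sketched in plain Python. The function features_at, its window parameter, and the feature names are assumptions made for this example; the point is that every value is computed from bars up to and including t, never later ones.

```python
import math
from typing import Dict, List, Optional


def features_at(t: int, closes: List[float], volumes: List[float],
                window: int = 5) -> Optional[Dict[str, float]]:
    """Compute a handful of features for bar t using only bars 0..t."""
    if t < window:
        return None  # not enough history yet
    recent = closes[t - window:t + 1]  # window + 1 closes ending at bar t
    log_rets = [math.log(recent[i] / recent[i - 1]) for i in range(1, len(recent))]
    mean = sum(log_rets) / len(log_rets)
    var = sum((r - mean) ** 2 for r in log_rets) / (len(log_rets) - 1)
    avg_volume = sum(volumes[t - window:t]) / window  # previous bars only
    return {
        "ret_1": closes[t] / closes[t - 1] - 1.0,         # one-bar return
        "ret_w": closes[t] / closes[t - window] - 1.0,    # return over the window
        "realised_vol": math.sqrt(var),                   # std of log returns
        "dist_from_high": closes[t] / max(recent) - 1.0,  # 0 at the recent high
        "rel_volume": volumes[t] / avg_volume,            # vs rolling average
    }
```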

Common target variables:
- Binary classification: 'up vs down' in next N bars
- Multi-class: 'up significantly / unchanged / down significantly'
- Regression: continuous return over next N bars
- Triple barrier (López de Prado): outcome is whichever of [target_up, target_down, time_horizon] is hit first
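The triple-barrier target can be sketched as follows. This is a simplified version under stated assumptions: it checks closing prices only (a fuller implementation would typically check intrabar highs and lows), uses fixed fractional barriers, and labels a time-horizon exit as 0. The function and parameter names are hypothetical.

```python
from typing import List


def triple_barrier_label(closes: List[float], t: int, up: float, down: float,
                         horizon: int) -> int:
    """Label bar t by whichever barrier is touched first.

    Returns +1 if the upper barrier is hit first, -1 if the lower barrier
    is hit first, and 0 if neither is hit within `horizon` bars.
    `up` and `down` are fractional distances from the entry close (0.01 = 1%).
    """
    entry = closes[t]
    upper = entry * (1.0 + up)
    lower = entry * (1.0 - down)
    last = min(t + horizon, len(closes) - 1)  # do not run past the data
    for i in range(t + 1, last + 1):
        if closes[i] >= upper:
            return 1
        if closes[i] <= lower:
            return -1
    return 0
```

For example, with closes [100, 100.5, 101.2, 99, 98] and 1% barriers, bar 0 is labelled +1 because 101.2 crosses the upper barrier at bar 2 before anything touches 99.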

Algorithm choices in order of typical use in retail/professional trading:
1. Gradient-boosted trees (XGBoost, LightGBM, CatBoost): the workhorse of tabular ML in trading. Handle non-linearities, interactions, mixed feature types well. Easy to interpret via feature importance.
2. Neural networks (MLPs, CNNs, LSTMs): used when feature engineering would be cumbersome (raw price sequences, chart images). Higher capacity but more overfitting risk.
3. Random Forests: less flashy than gradient boosting but more robust to overfitting on small datasets.
4. Logistic regression: simplest model; useful as a baseline. Often surprisingly competitive when features are well-engineered.
5. Support Vector Machines: less common today; out-performed by tree ensembles in most practical applications.
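To make the baseline idea concrete, here is a minimal logistic regression fitted by batch gradient descent, written in plain Python so it runs without dependencies. It is a teaching sketch, not a production model: the names fit_logistic and predict_proba and the learning-rate and epoch defaults are assumptions for this example, and in practice a library implementation would normally be used instead.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: List[float], x: List[float]) -> float:
    """Probability of class 1; w[0] is the bias term."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X: List[List[float]], y: List[int],
                 lr: float = 0.1, epochs: int = 500) -> List[float]:
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi  # gradient of log-loss w.r.t. the logit
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w
```

Any more complex model would then be compared against this baseline's out-of-sample score, so that added complexity has to justify itself.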

Validation challenges specific to trading:
1. Walk-forward validation (not random k-fold): time series cannot be shuffled. Train on period T, validate on T+1, then retrain with T+1 included and validate on T+2, and so on.
2. Class imbalance: many trading targets have one class dominating (e.g. 'no significant move' is more common than 'big move'). Use stratified sampling, class weights, or threshold tuning.
3. Look-ahead bias: every feature must be computed using only data available at prediction time. Subtle bugs include normalising z-scores with statistics that span future bars, and building features from the same future-derived values (such as a future high) that define the target.
4. Non-stationarity: model performance often degrades as the market regime shifts away from training conditions. Periodic retraining is essential.
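The walk-forward scheme in point 1 can be sketched as an index generator. The function name and its parameters are hypothetical. The gap parameter is an addition not described above: it drops the last few training rows so that forward-looking labels near the end of the training window do not overlap the test window.

```python
from typing import Iterator, Tuple


def walk_forward_splits(n_samples: int, train_size: int, test_size: int,
                        expanding: bool = True, gap: int = 0
                        ) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) in chronological order.

    Every test index comes strictly after every train index.
    expanding=True grows the training window from index 0;
    expanding=False rolls a window of at most `train_size` rows.
    """
    if gap >= train_size:
        raise ValueError("gap must be smaller than train_size")
    start = train_size  # first test index of the current fold
    while start + test_size <= n_samples:
        train_start = 0 if expanding else start - train_size
        yield range(train_start, start - gap), range(start, start + test_size)
        start += test_size
```

With 10 samples, train_size=4 and test_size=2, this yields three folds: train 0-3 / test 4-5, train 0-5 / test 6-7, and train 0-7 / test 8-9.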

Integration with MT5 EAs:
- ML inference can run in MQL5 directly for small models (decision trees, simple MLPs)
- For larger models (neural networks, XGBoost), external Python service is more practical
- Communication: MQL5 EA writes feature vectors to file or socket; Python service reads features, runs inference, writes prediction; EA reads prediction
- Latency: file-based round-trip is typically 50-500 ms, which is fine for hourly/daily strategies but problematic for scalping; socket-based is typically 1-10 ms, which is usable for most strategies
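The Python side of the socket-based exchange might look like the sketch below. Everything about the wire format is an assumption made for illustration: one request per line containing comma-separated feature values, and one line back containing the prediction. The predict function is a placeholder standing in for a real trained model, and the host and port are arbitrary.

```python
import socketserver
from typing import List


def predict(features: List[float]) -> float:
    """Placeholder model: swap in a real trained model's inference call."""
    return 1.0 if sum(features) > 0 else 0.0


def handle_line(line: str) -> str:
    """Parse 'f1,f2,...' and return the prediction as text, or 'ERR' on bad input."""
    try:
        features = [float(tok) for tok in line.strip().split(",")]
    except ValueError:
        return "ERR\n"
    return f"{predict(features):.6f}\n"


class InferenceHandler(socketserver.StreamRequestHandler):
    """Answer each newline-terminated request on a connection with one line."""

    def handle(self) -> None:
        for raw in self.rfile:
            reply = handle_line(raw.decode("ascii", errors="replace"))
            self.wfile.write(reply.encode("ascii"))


def serve(host: str = "127.0.0.1", port: int = 5555) -> None:
    """Run the service until interrupted (hypothetical host and port)."""
    with socketserver.TCPServer((host, port), InferenceHandler) as server:
        server.serve_forever()
```

Calling serve() starts the listener; the EA would then connect, send one feature line per bar, and read one line back. Keeping the parsing in handle_line, separate from the socket code, lets the protocol be tested without opening a connection.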

For FxRobotEasy: our flagship AI EAs use supervised learning models trained on multi-year forex data with walk-forward validation. The models predict regime classifications (trend vs chop vs breakout) that inform parameter switching in the MQL5 execution layer.
