§ 02 · Case study · 2024
Diagnostic AI optimisation
Hyperspectral imaging · Alzheimer’s detection
Hotchkiss Brain Institute
Presymptomatic Alzheimer’s detection from hyperspectral blood-sample imaging.

- Client: Hotchkiss Brain Institute
- Service: Diagnostic AI optimisation
- Year: 2024
- Stack: Random Forest · LSTM · HIVE-COTE 2.0 · XGBoost · LightGBM · MiniRocket
At a glance
Synopsis
We worked with the Hotchkiss Brain Institute on presymptomatic detection of Alzheimer’s disease from hyperspectral imaging of blood samples. The engagement was equal parts signal processing, time-series modelling, and the careful methodology that keeps a clinical pipeline honest.
The problem
What was in the way.
Hyperspectral imaging of blood samples produces a noisy, high-dimensional time series. Two failure modes were always within reach. Data leakage between train and evaluation could produce a model that looked spectacular and meant nothing. Outliers could quietly drive whichever architecture happened to be sensitive to them, again producing accuracy that would not survive a real clinical setting.
Anything we shipped had to be defensible against both.
The approach
How we built it.
We built the pipeline with Gaussian smoothing to reduce sensor noise, MinMax scaling for normalisation, and MiniRocket-style feature extraction to lift the signal out of the time series. Splits were constructed to make leakage structurally hard, not just unlikely. Outliers were inspected, not silently absorbed.
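To make the preprocessing and split logic concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the project’s code: the synthetic data, the idea that several scans share a donor ID, and the helper names `gaussian_smooth`, `group_split`, `minmax_fit`, and `minmax_apply` are all hypothetical. It shows Gaussian smoothing per series, a split that keeps every donor on one side only, and MinMax scaling whose statistics come from the training rows alone.

```python
import numpy as np


def gaussian_smooth(signals: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Smooth each row (one series per row) with a normalised Gaussian kernel."""
    radius = int(3 * sigma)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Reflect-pad so the edges are not pulled towards zero.
    padded = np.pad(signals, ((0, 0), (radius, radius)), mode="reflect")
    return np.stack([np.convolve(row, kernel, mode="valid") for row in padded])


def group_split(groups: np.ndarray, test_fraction: float = 0.25, seed: int = 0):
    """Return boolean train/test masks in which no group appears on both sides."""
    rng = np.random.default_rng(seed)
    unique = np.unique(groups)
    rng.shuffle(unique)
    n_test = max(1, int(round(test_fraction * len(unique))))
    test_mask = np.isin(groups, unique[:n_test])
    return ~test_mask, test_mask


def minmax_fit(train: np.ndarray):
    """Compute per-feature minimum and range from the training rows only."""
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
    return lo, span


def minmax_apply(x: np.ndarray, lo: np.ndarray, span: np.ndarray) -> np.ndarray:
    """Scale rows with statistics that were fitted elsewhere (the training set)."""
    return (x - lo) / span


# Synthetic stand-in data: 8 hypothetical donors, 3 scans each, 64 points per scan.
rng = np.random.default_rng(42)
n_donors, scans_per_donor, n_points = 8, 3, 64
groups = np.repeat(np.arange(n_donors), scans_per_donor)
signals = rng.normal(size=(n_donors * scans_per_donor, n_points))

smoothed = gaussian_smooth(signals, sigma=2.0)
train_mask, test_mask = group_split(groups, test_fraction=0.25, seed=0)
lo, span = minmax_fit(smoothed[train_mask])
train_x = minmax_apply(smoothed[train_mask], lo, span)
test_x = minmax_apply(smoothed[test_mask], lo, span)
```

Fitting the scaler on the training rows and splitting by donor rather than by scan are two common ways to make leakage hard by construction: the held-out rows never influence the scaling, and no donor contributes to both sides. Held-out values may fall slightly outside [0, 1], which is expected.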
On the modelling side, we benchmarked GRU, RNN, LSTM, and HIVE-COTE 2.0 against XGBoost, LightGBM, and Random Forest, all on the same splits. Random Forest came out on top, both on raw accuracy and on stability across folds. We picked it on the evidence.
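Benchmarking on the same splits can be sketched as follows, using scikit-learn. This is an assumption-laden illustration, not the study’s harness: the data are synthetic, the donor grouping is hypothetical, and `GradientBoostingClassifier` stands in for the gradient-boosted libraries named above so the example needs only scikit-learn. The point it shows is that the folds are built once and then reused for every candidate, so differences in mean accuracy and fold-to-fold spread come from the models, not from the partitioning.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

# Synthetic stand-in data: 20 hypothetical donors, 3 scans each, 16 features.
rng = np.random.default_rng(0)
n_donors, scans_per_donor, n_features = 20, 3, 16
groups = np.repeat(np.arange(n_donors), scans_per_donor)
y = (np.arange(n_donors) % 2)[groups]  # one label per donor, repeated per scan
X = rng.normal(size=(len(y), n_features)) + y[:, None]  # shift class 1 by +1

# Build the folds once; every model is scored on exactly these partitions.
folds = list(GroupKFold(n_splits=5).split(X, y, groups))

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
    # Mean captures raw accuracy; standard deviation captures stability across folds.
    results[name] = (scores.mean(), scores.std())

for name, (mean, std) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:20s} accuracy {mean:.3f} +/- {std:.3f}")
```

The numbers printed here describe the synthetic data only and say nothing about the clinical result.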
The outcome
What it does now.
The Random Forest model exceeded 98% accuracy on hyperspectral blood-sample classification, with the leakage and outlier controls in place. Equally important, the pipeline is reproducible end-to-end, so future researchers can re-run, extend, and audit the result rather than taking it on trust.
Result
Accuracy
> 98%
Random Forest on hyperspectral blood samples, with leakage and outlier controls
What we did
- Hyperspectral imaging
- Time-series ML
- Feature extraction
- Clinical research
- Reproducibility