Hotchkiss Brain Institute

Presymptomatic Alzheimer’s detection from hyperspectral blood-sample imaging.

[Image: Neural decoding waveforms on a clinical research display, with a gloved researcher’s hand reaching toward the screen. Caption: Neural decoding pipeline · clinical research]

Client: Hotchkiss Brain Institute
Service: Diagnostic AI optimisation
Year: 2024
Stack: Random Forest · LSTM · HIVE-COTE 2.0 · XGBoost · LightGBM · MiniRocket

Synopsis

We worked with the Hotchkiss Brain Institute on presymptomatic detection of Alzheimer’s disease from hyperspectral imaging on blood samples. The engagement was equal parts signal processing, time-series modelling, and the careful methodology that keeps a clinical pipeline honest.

The problem

What was in the way.

Hyperspectral imaging of blood samples produces noisy, high-dimensional time series. Two failure modes were always within reach. Data leakage between the training and evaluation sets could produce a model that looked spectacular and meant nothing. Outliers could quietly drive whichever architecture happened to be sensitive to them, again producing accuracy that would not survive a real clinical setting.

Anything we shipped had to be defensible against both.
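Both failure modes can be checked mechanically. The sketch below is illustrative only and is not the project's code. It assumes each spectrum carries a group label such as a sample or donor ID (the write-up does not say how records were grouped), and it uses the standard modified z-score based on the median absolute deviation to flag outliers. The names `group_kfold` and `flag_outliers` are hypothetical.

```python
from collections import defaultdict
import statistics


def group_kfold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if n_splits < 2 or n_splits > len(by_group):
        raise ValueError("n_splits must be between 2 and the number of groups")

    # Greedy balancing: place the largest groups first, always into the
    # fold that currently holds the fewest records.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        min(folds, key=len).extend(by_group[g])

    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
        yield train, test


def flag_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD) exceeds threshold.

    Flagged records are returned for inspection, not dropped.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]
```

Because whole groups are assigned to a fold, two spectra from the same group can never end up on opposite sides of a split, which is the sense in which leakage becomes structurally hard. scikit-learn's `GroupKFold` offers the same guarantee if that library is available.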

The approach

How we built it.

We built the pipeline with Gaussian smoothing to reduce sensor noise, MinMax scaling for normalisation, and MiniRocket-style feature extraction to lift the signal out of the time-series. Splits were constructed to make leakage structurally hard, not just unlikely. Outliers were inspected, not silently absorbed.
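The first two preprocessing steps can be sketched in plain Python. This is a minimal illustration under assumptions, not the pipeline itself: the kernel width `sigma`, the edge handling, and the function names are all hypothetical, and the MiniRocket-style feature extraction is not shown. The one design point it tries to make concrete is that the scaling bounds are fitted on training rows only, so nothing about the evaluation data reaches the model through normalisation.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Discrete Gaussian weights, normalised to sum to 1."""
    if radius is None:
        radius = max(1, int(3 * sigma))
    weights = [
        math.exp(-(i * i) / (2 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma=1.0):
    """Gaussian-smooth a 1-D signal, repeating edge values at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp to valid indices
            acc += w * signal[j]
        out.append(acc)
    return out


def fit_minmax(train_rows):
    """Per-feature minimum and maximum, computed from training rows only."""
    cols = list(zip(*train_rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_minmax(rows, lo, hi):
    """Scale rows with previously fitted bounds.

    Evaluation values outside the training range fall outside [0, 1].
    """
    return [
        [(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]
        for row in rows
    ]
```

In practice the same steps are usually taken from libraries, for example `scipy.ndimage.gaussian_filter1d` for smoothing and scikit-learn's `MinMaxScaler` for scaling. The stand-ins here only keep the example self-contained.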

On the modelling side, we benchmarked GRU, RNN, LSTM, and HIVE-COTE 2.0 against XGBoost, LightGBM, and Random Forest, on the same splits. Random Forest came out on top, both on raw accuracy and on stability across folds. We picked it on the evidence.
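A comparison of this kind depends on every candidate seeing identical folds. The harness below is a hedged sketch of that idea rather than the benchmark actually run. The two models are toy stand-ins (a majority-class baseline and a 1-nearest-neighbour classifier), not the architectures named above, and the `fit_predict` calling convention is an assumption made for the example. It reports mean accuracy and the standard deviation across folds, which corresponds to the two criteria mentioned: raw accuracy and stability.

```python
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def benchmark(models, X, y, splits):
    """Score every model on the same (train_idx, test_idx) pairs.

    Returns {name: (mean accuracy, standard deviation across folds)}.
    """
    splits = list(splits)  # materialise once so each model sees identical folds
    report = {}
    for name, fit_predict in models.items():
        scores = []
        for train, test in splits:
            preds = fit_predict(
                [X[i] for i in train],
                [y[i] for i in train],
                [X[i] for i in test],
            )
            scores.append(accuracy([y[i] for i in test], preds))
        report[name] = (statistics.mean(scores), statistics.pstdev(scores))
    return report


def majority_class(X_train, y_train, X_test):
    """Toy baseline: always predict the most common training label."""
    label = max(set(y_train), key=y_train.count)
    return [label] * len(X_test)


def nearest_neighbour(X_train, y_train, X_test):
    """Toy 1-NN classifier using squared Euclidean distance."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [
        y_train[min(range(len(X_train)), key=lambda i: dist(X_train[i], x))]
        for x in X_test
    ]
```

A low standard deviation across folds is what "stability" means in this sketch. A model with slightly higher mean accuracy but a much wider spread would be the harder one to defend.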

The outcome

What it does now.

The Random Forest model exceeded 98% accuracy on hyperspectral blood-sample classification, with the leakage and outlier controls in place. Equally important, the pipeline is reproducible end-to-end, so future researchers can re-run, extend, and audit the result rather than taking it on trust.

Result

Accuracy

> 98%

Random Forest on hyperspectral blood samples, with leakage and outlier controls

What we did

Hyperspectral imaging
Time-series ML
Feature extraction
Clinical research
Reproducibility
