§ 02  ·  Case study  ·  2024

Diagnostic AI optimisation

Hyperspectral imaging · Alzheimer’s detection

Hotchkiss Brain Institute

Presymptomatic Alzheimer’s detection from hyperspectral blood-sample imaging.

Neural decoding waveforms on a clinical research display, with a gloved researcher’s hand reaching toward the screen.
Neural decoding pipeline · clinical research

At a glance

Client: Hotchkiss Brain Institute
Service: Diagnostic AI optimisation
Year: 2024
Stack: Random Forest · LSTM · HIVE-COTE 2.0 · XGBoost · LightGBM · MiniRocket

Synopsis

We worked with the Hotchkiss Brain Institute on presymptomatic detection of Alzheimer’s disease from hyperspectral imaging of blood samples. The engagement was equal parts signal processing, time-series modelling, and the careful methodology that keeps a clinical pipeline honest.

01

The problem

What was in the way.

Hyperspectral imaging of blood samples produces a noisy, high-dimensional time series. Two failure modes were always within reach. Data leakage between training and evaluation data could produce a model that looked spectacular and meant nothing. Outliers could quietly drive whichever architecture happened to be sensitive to them, again producing accuracy that would not survive a real clinical setting.

Anything we shipped had to be defensible against both.
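Both failure modes can be turned into explicit checks. The sketch below is illustrative only and is not the project's code: it assumes each sample carries a group ID (for example a donor ID, which is our assumption) and flags outliers with the common median/MAD "modified z-score" rule. The function names and the 3.5 threshold are hypothetical choices.

```python
from statistics import median


def shared_groups(train_groups, test_groups):
    """Return the group IDs (assumed: e.g. donor IDs) present in both splits."""
    return set(train_groups) & set(test_groups)


def flag_outliers(values, threshold=3.5):
    """Return indices whose median/MAD-based robust z-score exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]
```

An empty result from `shared_groups` is the condition a leakage audit would look for. Indices returned by `flag_outliers` are candidates for inspection, not automatic removal.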

02

The approach

How we built it.

We built the pipeline with Gaussian smoothing to reduce sensor noise, MinMax scaling for normalisation, and MiniRocket-style feature extraction to lift the signal out of the time series. Splits were constructed to make leakage structurally hard, not just unlikely. Outliers were inspected, not silently absorbed.
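A dependency-free sketch of the preprocessing steps named above, under stated assumptions: the kernel width, the edge handling, a single global min and max, and the grouping key are all illustrative choices, and the MiniRocket-style feature extraction is left out. In practice, library implementations in SciPy and scikit-learn would likely replace these helpers. The two points it is meant to show are that scaling statistics come from the training data only, and that whole groups are held out together.

```python
import math
import random


def gaussian_kernel(sigma):
    """Discrete Gaussian weights, truncated at about 3 sigma, summing to 1."""
    radius = int(3 * sigma + 0.5)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(series, sigma=1.0):
    """Smooth a 1-D series; edges are handled by repeating the end values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(series)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp to valid range
            acc += w * series[idx]
        out.append(acc)
    return out


def fit_minmax(train_series):
    """Learn the global min and max from training series only."""
    lo = min(min(s) for s in train_series)
    hi = max(max(s) for s in train_series)
    return lo, hi


def apply_minmax(series, lo, hi):
    """Scale with training statistics; held-out values may fall outside [0, 1]."""
    span = (hi - lo) or 1.0
    return [(x - lo) / span for x in series]


def group_split(groups, test_fraction=0.25, seed=0):
    """Split sample indices so that no group appears on both sides."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_groups = set(unique[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx
```

Splitting by group before fitting anything is what makes leakage hard by construction: a scaler or model never sees a sample from a held-out group.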

On the modelling side, we benchmarked GRU, RNN, LSTM, and HIVE-COTE 2.0 against XGBoost, LightGBM, and Random Forest, on the same splits. Random Forest came out on top, both on raw accuracy and on stability across folds. We picked it on the evidence.
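To illustrate what comparing models "on the same splits" can look like, here is a minimal harness that scores every candidate on identical folds and reports both mean accuracy and spread across folds. The two toy classifiers are stand-ins chosen so the sketch runs without dependencies; they are not the models benchmarked in the engagement, and the simple interleaved folds here are not group-aware.

```python
from statistics import mean, pstdev


class MajorityClass:
    """Toy baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestCentroid:
    """Toy model: predicts the class whose mean feature vector is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c]))
                for x in X]


def k_folds(n, k):
    """Yield (train_idx, test_idx) pairs; fold f holds indices f, f+k, f+2k, ..."""
    for f in range(k):
        test = [i for i in range(n) if i % k == f]
        train = [i for i in range(n) if i % k != f]
        yield train, test


def benchmark(models, X, y, folds):
    """Score every model on the same folds; return {name: (mean_acc, std_acc)}."""
    folds = list(folds)  # materialise once so all models share identical splits
    results = {}
    for name, make_model in models.items():
        scores = []
        for train, test in folds:
            model = make_model().fit([X[i] for i in train], [y[i] for i in train])
            preds = model.predict([X[i] for i in test])
            correct = sum(p == y[i] for p, i in zip(preds, test))
            scores.append(correct / len(test))
        results[name] = (mean(scores), pstdev(scores))
    return results
```

Reporting the standard deviation next to the mean is what lets "stability across folds" be compared directly, rather than read off a single headline number.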

03

The outcome

What it does now.

The Random Forest model exceeded 98% accuracy on hyperspectral blood-sample classification, with the leakage and outlier controls in place. Equally important, the pipeline is reproducible end-to-end, so future researchers can re-run, extend, and audit the result rather than taking it on trust.

Result

Accuracy: > 98%

Random Forest on hyperspectral blood samples, with leakage and outlier controls

What we did

  • Hyperspectral imaging
  • Time-series ML
  • Feature extraction
  • Clinical research
  • Reproducibility
