Project Sandbox: Archival Devlog
Cosimaging
An archival coding challenge and personal sandbox exploring SMLM analysis algorithms, GPU acceleration, and zero-dependency pipeline orchestration.
The Engineering Challenge
Single-molecule localisation microscopy (SMLM) workflows are often fragmented: taking raw camera data to cluster statistics typically means moving between ImageJ, MATLAB, and custom Python scripts.
The objective of this personal sandbox project was to see whether a fully integrated, zero-dependency data pipeline (zero-dependency for the user: no external tools and no environment to set up) could be orchestrated within a single environment. The benchmark was to handle millions of data points, implement publication-standard mathematics, and run the entire workflow without requiring command-line execution.
Architectural Pillars
Unified Orchestration
The primary design goal was data continuity. The architecture was built to handle localisation, filtering, drift correction, clustering, and 3D visualisation within a single memory space, eliminating file-formatting friction.
Algorithmic Fidelity
Mathematical modules were coded from scratch to strictly mimic peer-reviewed methods: Thompson-Larson-Webb precision, Costes auto-thresholding, Manders coefficients, Fourier Ring Correlation, and DBSCAN.
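As a reference point for the first method named above, the Thompson-Larson-Webb precision estimate fits in a few lines. This is a minimal sketch, not the sandbox's own code: the function and argument names are illustrative, and `background_std` is assumed to be the per-pixel background noise expressed in photons.

```python
import math


def thompson_precision(sigma_psf_nm, pixel_size_nm, photons, background_std):
    """Lateral localisation precision (nm) after Thompson, Larson and Webb (2002).

    variance = s^2/N + a^2/(12 N) + 8 pi s^4 b^2 / (a^2 N^2)
    with s = PSF standard deviation, a = pixel size, N = photon count,
    b = background noise per pixel.
    """
    if photons <= 0:
        raise ValueError("photons must be positive")
    s2 = sigma_psf_nm ** 2
    a2 = pixel_size_nm ** 2
    shot_noise = s2 / photons
    pixelation = a2 / (12.0 * photons)
    background = 8.0 * math.pi * s2 ** 2 * background_std ** 2 / (a2 * photons ** 2)
    return math.sqrt(shot_noise + pixelation + background)
```

With a 150 nm PSF, 100 nm pixels, 1000 photons, and no background, the estimate comes out at about 4.8 nm; adding background noise or lowering the photon count makes it worse.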
Zero-CLI Execution
Engineered a full graphical interface that leads from raw data arrays to PDF reports, showing that complex imaging pipelines can run without scripts or command-line tools.
Standalone Compilation
To bypass environment setup, the entire Python stack was compiled into a standalone executable with its dependencies bundled, so no separate Python installation is required.
Automated Pipeline
Prototyped a one-click Auto-Analysis wizard to run the entire SMLM pipeline with smart defaults, automatically generating comprehensive PDF + CSV reports.
Batch Processing
Engineered batch capabilities to process hundreds of datasets sequentially using saved workflow parameters, ensuring analytical reproducibility.
Technical Deep-Dive
The sandbox prototyped and validated core processing layers for SMLM, widefield, and confocal data.
🔬 Super-Resolution Microscopy (SMLM) Pipeline
A complete algorithmic workflow designed for single-molecule localisation microscopy, from raw blinking movies to cluster statistics.
- Molecule Localisation — Implemented a batched 2D Gaussian fitting engine (linearised weighted least-squares) to convert raw TIFF blinking movies into precise molecular coordinates. Output calculations included X/Y, photon count, background, PSF width, Thompson-formula precision, and SNR.
- Data Quality Check — Designed an automated grading matrix to evaluate localisation data from A (Excellent) to D (Poor). Included technique-specific thresholds for dSTORM, PALM, and DNA-PAINT.
- Localisation Filters — Built non-destructive, interactive range sliders for per-localisation attributes (Photons, Background, Precision, PSF Sigma, SNR) with real-time 2D view updates.
- Single Peak Removal — Integrated isolated noise spike removal utilising KD-Tree based spatio-temporal nearest-neighbour analysis with Z-axis scaling.
- Drift Correction (DME) — Estimated drift by cross-correlating temporal segments, with parabolic sub-pixel refinement of the correlation peak.
- Image Resolution (FRC) — Coded Fourier Ring Correlation using deterministic odd/even frame splits, Tukey windowed FFTs, and smoothed curves to report resolution at the 1/7 threshold (Nieuwenhuizen et al., 2013).
- Cluster Analysis (DBSCAN) — Integrated density-based spatial clustering featuring separate XY and Z epsilon parameters for 3D topologies, validated via Ripley’s K analysis.
- Positivity Analysis — Developed spatial centroid merging algorithms to classify clusters as single-, double-, or triple-positive across multi-channel datasets.
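The drift-correction step in the list above pairs cross-correlation with parabolic peak refinement. The sketch below shows that idea for a single pair of images using NumPy only. It is an illustration under assumptions, not the sandbox's implementation: `estimate_shift` and `parabolic_offset` are hypothetical names, and each temporal segment is assumed to have already been rendered to a 2D histogram image of the same shape.

```python
import numpy as np


def parabolic_offset(left, centre, right):
    """Vertex position of the parabola through (-1, left), (0, centre), (1, right)."""
    denom = left - 2.0 * centre + right
    if denom == 0:
        return 0.0
    return 0.5 * (left - right) / denom


def estimate_shift(reference, moving):
    """Return (dy, dx), the shift of `moving` relative to `reference` in pixels."""
    ref = np.asarray(reference, dtype=float)
    mov = np.asarray(moving, dtype=float)
    # Circular cross-correlation via FFT; the peak sits at the integer shift.
    xcorr = np.fft.ifft2(np.fft.fft2(mov) * np.conj(np.fft.fft2(ref))).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)

    shift = []
    for axis, (idx, size) in enumerate(zip(peak, xcorr.shape)):
        lower = list(peak)
        upper = list(peak)
        lower[axis] = (idx - 1) % size  # neighbours wrap around the image edge
        upper[axis] = (idx + 1) % size
        delta = parabolic_offset(xcorr[tuple(lower)], xcorr[peak], xcorr[tuple(upper)])
        position = idx + delta
        if position > size / 2:  # map wrapped indices to negative shifts
            position -= size
        shift.append(float(position))
    return tuple(shift)
```

In a segment-wise scheme, each segment's shift against a reference segment would be interpolated over frame number and subtracted from the localisation coordinates. Parabolic refinement is slightly biased on non-parabolic peaks, so it is best treated as an approximation.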
📐 Widefield & Confocal Image Analysis
Mathematical toolsets engineered to process standard TIFF microscopy image arrays.
- Co-localisation — Implemented publication-standard metrics following Manders (1993) and Costes (2004). Utilised Costes automatic thresholding to separate signal from background. Integrated a 200-iteration block-scramble significance test (Costes P-value, significant at P ≥ 0.95) and cytofluorogram scatter generation.
- Auto Mask (Otsu) — Programmed automatic signal/background separation using Otsu’s thresholding to generate binary mask layers for downstream clustering.
- Z-Projection (MIP) — Built a streaming Maximum Intensity Projection (MIP) that processes multi-frame TIFF stacks incrementally to keep memory use low.
- Intensity Line Profile — Engineered interactive point placement tools to measure intensity along user-defined paths.
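To make the co-localisation metrics in the list above concrete, here is a minimal sketch of thresholded Manders coefficients. The function name is illustrative, and the convention is an assumption: M1 sums channel-1 intensity over pixels where channel 2 exceeds its threshold, divided by total channel-1 intensity, and M2 is the mirror image. The thresholds would come from a Costes-style procedure, which is not shown here.

```python
import numpy as np


def manders_coefficients(channel1, channel2, threshold1=0.0, threshold2=0.0):
    """Return (M1, M2) for two images of identical shape."""
    a = np.asarray(channel1, dtype=float).ravel()
    b = np.asarray(channel2, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("channels must have the same shape")

    total_a = a.sum()
    total_b = b.sum()
    # Fraction of channel-1 signal found where channel 2 is above threshold.
    m1 = a[b > threshold2].sum() / total_a if total_a > 0 else float("nan")
    # Fraction of channel-2 signal found where channel 1 is above threshold.
    m2 = b[a > threshold1].sum() / total_b if total_b > 0 else float("nan")
    return float(m1), float(m2)
```

Both coefficients lie between 0 and 1 for non-negative images, and they are not symmetric: a sparse channel can be fully contained in a dense one without the reverse being true.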
🎨 Visualisation & Rendering
- 2D Viewer — Engineered a high-performance rendering canvas featuring Level of Detail (LOD) management, smooth panning, and customisable LUTs for multi-layer data compositing.
- 3D Scatter Viewer — Utilised PyVista and VTK to build a GPU-accelerated 3D scatter viewer that handles datasets of millions of points, with eye-dome lighting.
- Orthogonal Projections — Constructed simultaneous XY, XZ, and YZ viewports for verifying 3D data topologies.
- 3D Volume Projector — Integrated cluster-based density projections with inter-cluster distance measurement tools.
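The inter-cluster distance measurement mentioned above can be illustrated with centroid-to-centroid distances. This is a sketch under assumptions: the helper names are hypothetical, distances are measured between cluster centroids (edge-to-edge distances would need a different approach), and label `-1` is taken to mean noise, as in DBSCAN output.

```python
import numpy as np


def cluster_centroids(points, labels):
    """Return (cluster_ids, centroids), ignoring noise points labelled -1."""
    pts = np.asarray(points, dtype=float)
    lab = np.asarray(labels)
    ids = np.unique(lab[lab >= 0])
    centroids = np.array([pts[lab == k].mean(axis=0) for k in ids])
    # Keep a consistent (n_clusters, n_dims) shape even when no cluster exists.
    return ids, centroids.reshape(len(ids), pts.shape[1])


def centroid_distance_matrix(centroids):
    """Pairwise Euclidean distances between centroids, as a square matrix."""
    diff = centroids[:, None, :] - centroids[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

The matrix is symmetric with zeros on the diagonal, so the distance between clusters `ids[i]` and `ids[j]` is entry `[i, j]`.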
⚡ Automation & Batch Processing
- Auto-Analysis Wizard — Designed a 3-step orchestration framework to run the full algorithmic pipeline automatically: quality check → filter → drift → FRC → cluster → co-localisation.
- Batch Processing — Enabled folder-level ingestion of CSV or TIFF arrays, allowing saved JSON workflows to be replayed with identical parameters.
- Workflow Logs — Recorded every action within the execution environment in a timestamped JSON workflow log, so that an analysis can be replayed and reproduced.
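The logging and replay behaviour described above can be sketched with the standard library alone. Everything named here is hypothetical: the entry fields (`timestamp`, `action`, `params`), the `log_step` and `replay` helpers, and the idea of a registry mapping action names to functions are assumptions for illustration, not the sandbox's actual log schema.

```python
import json
from datetime import datetime, timezone


def log_step(log, action, **params):
    """Append one timestamped, JSON-serialisable entry to the workflow log."""
    log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "params": params,
    })


def replay(log, registry, data):
    """Re-run logged actions in order, using the recorded parameters."""
    for entry in log:
        step = registry[entry["action"]]  # KeyError if the action is unknown
        data = step(data, **entry["params"])
    return data


def save_log(log, path):
    """Write the log to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(log, handle, indent=2)
```

Batch processing then amounts to loading one saved log and calling `replay` on each dataset in a folder, which keeps the parameters identical across runs.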
🧰 Architecture Quality of Life
- Layer System — Built a management tree for multiple data sources (CSV, TIFF, Cluster, Mask) with visibility toggles and per-layer properties.
- Column Mapping — Implemented auto-detection for non-standard CSV column names from ThunderSTORM and rapidSTORM outputs.
- Multi-Channel TIFF Handling — Coded routines to load dual-channel side-by-side TIFFs and auto-split them into separate rendering layers.
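Column auto-detection of the kind listed above is often done with an alias table. The sketch below is an assumption-laden illustration: the canonical names and the aliases are hypothetical, loosely modelled on ThunderSTORM-style headers such as `x [nm]` and `intensity [photon]`, and real exports vary between tools and versions.

```python
import re

# Hypothetical alias table: canonical name -> header spellings to accept.
CANONICAL_ALIASES = {
    "frame": ("frame", "frame_number"),
    "x": ("x", "x [nm]", "x_nm"),
    "y": ("y", "y [nm]", "y_nm"),
    "z": ("z", "z [nm]", "z_nm"),
    "photons": ("photons", "intensity", "intensity [photon]"),
    "background": ("background", "bkg", "bkgstd [photon]"),
    "sigma": ("sigma", "psf_sigma", "sigma [nm]"),
    "precision": ("precision", "uncertainty [nm]", "uncertainty_xy [nm]"),
}


def normalise(name):
    """Lower-case a header and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_columns(header):
    """Return {canonical_name: original_header} for every recognised column."""
    lookup = {
        normalise(alias): canonical
        for canonical, aliases in CANONICAL_ALIASES.items()
        for alias in aliases
    }
    mapping = {}
    for column in header:
        canonical = lookup.get(normalise(column))
        if canonical is not None and canonical not in mapping:
            mapping[canonical] = column  # first match wins
    return mapping
```

Unrecognised columns are simply left out of the mapping, which gives a graphical front end the chance to ask the user to assign them by hand.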
Automation Orchestration
1. Data Ingestion
Raw TIFF stacks or CSV arrays are ingested via drag-and-drop, triggering auto-detection of column mapping and camera presets.
2. Pipeline Execution
The wizard orchestrates the full pipeline run, or modules can be engaged individually for granular algorithmic control.
3. Deterministic Export
Outputs are compiled into publication-ready PDF reports, CSV statistics, high-resolution renders, and JSON workflow logs.
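For the ingestion step, accepting CSV files with different delimiters can lean on the standard library's `csv.Sniffer`. This is a minimal sketch: `detect_delimiter`, the candidate set, and the comma fallback are assumptions for illustration, and the sniffer is a heuristic that can misjudge unusual files.

```python
import csv


def detect_delimiter(sample_text, candidates=",;\t|", default=","):
    """Guess the delimiter from the first few lines of a CSV file."""
    try:
        dialect = csv.Sniffer().sniff(sample_text, delimiters=candidates)
    except csv.Error:
        # The sniffer could not decide; fall back to a comma.
        return default
    return dialect.delimiter


def read_rows(path, sample_size=4096):
    """Read a delimited text file into a list of dictionaries keyed by header."""
    with open(path, newline="", encoding="utf-8") as handle:
        sample = handle.read(sample_size)
        handle.seek(0)
        reader = csv.DictReader(handle, delimiter=detect_delimiter(sample))
        return list(reader)
```

Restricting the candidates to a short list keeps the heuristic from picking a character that only happens to repeat consistently inside the data.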
Technical Specifications
| Specification | Details |
| --- | --- |
| Platform | Windows 10 / 11 (standalone compiled executable) |
| Core Language | Python 3.10+ |
| Acceleration Stack | PyVista/VTK for 3D, CuPy for GPU-accelerated FFT |
| Data Throughput | Validated with 10M+ localisations and 4K×4K TIFF stacks |
| Ingest Formats | CSV (any delimiter), TIFF (8/16/32-bit, multi-frame) |
| Core Libraries | NumPy, SciPy, Pandas, Matplotlib, scikit-learn, scikit-image, ttkbootstrap |



