SEM autofocus

DeepFocus autofocus

Data-driven focus and astigmatism correction for scanning electron microscopy. Train a model on your SEM and sample, then run iterative correction from Python. Based on Schubert et al., Nature Communications (2024).

Overview

DeepFocus uses a phase-diversity approach: the algorithm acquires two images at slightly different working distances (current WD + sigma and WD - sigma), crops several patch pairs from them, and runs a convolutional network that predicts a correction vector (delta WD, delta stigmation X, delta stigmation Y). The corrections are applied to the SEM and the process repeats until focus converges (typically 2–3 iterations).
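The iterate-until-converged loop above can be sketched in a few lines. Here predict and apply_correction stand in for the network inference and the SEM control calls; the names and the Correction container are illustrative, not the SDK API:

```python
from dataclasses import dataclass

@dataclass
class Correction:
    d_wd_um: float    # predicted working-distance correction (µm)
    d_stig_x: float   # predicted stigmation X correction
    d_stig_y: float   # predicted stigmation Y correction

def autofocus_loop(predict, apply_correction,
                   wd_tol_um=0.25, stig_tol=0.25, max_iterations=10):
    # One iteration: acquire a WD ± sigma image pair, predict a correction,
    # apply it to the column; stop once the predicted correction falls below
    # tolerance, i.e. the model considers the image already in focus.
    for i in range(1, max_iterations + 1):
        c = predict()
        apply_correction(c)
        if (abs(c.d_wd_um) < wd_tol_um
                and abs(c.d_stig_x) < stig_tol
                and abs(c.d_stig_y) < stig_tol):
            return True, i
    return False, max_iterations
```

With a well-trained model each pass removes most of the residual aberration, which is why 2–3 iterations usually suffice.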

The model must be trained on your microscope and sample type, so the workflow is: collect paired images with known aberrations, train offline, then deploy the saved TorchScript model for fast inference.

Getting started (first time)

I've never used DeepFocus and the model doesn't know my microscope or my samples. What do I do?

  1. Collect training data on your SEM: focus manually at several sample locations (and optionally at different magnifications via fov_um_list), then run collect_training_data(). This acquires perturbed image pairs (WD±σ) with known correction targets. See Training: Collect data.
  2. Train your first model: run train_model(data_dir=..., config=..., output_path="./deepfocus_model.pt") without pretrained_checkpoint. This trains from scratch (encoder starts from ImageNet, then everything is trained on your data). See Training: Train the model.
  3. Use autofocus: in your scripts, call imaging.autofocus(AutofocusConfig(model_path="./deepfocus_model.pt", ...)). See Using autofocus.

For a new microscope or sample type later: collect a smaller recalibration dataset (e.g. 10 locations, ~100 pairs), then run train_model(..., pretrained_checkpoint="./deepfocus_model.pt", fine_tune=True, output_path="./deepfocus_recal.pt"). This loads your existing model, freezes the encoder, and trains only the last layers on the new data. Use the new deepfocus_recal.pt for autofocus. See Recalibration.

Note: the pretrained model from the DeepFocus paper (Schubert et al.) uses a different architecture; our pipeline uses an EfficientNet-based model and expects data collected via Semphony. Start by training on your own data to get your first .pt, then use that as pretrained_checkpoint for future fine-tuning.

Prerequisites

  • TESCAN SEM with SharkSEM SDK; stigmation must be available via geometric transformations (EnumGeometries / SetGeometry).
  • Python 3.9+ with the Semphony SDK and the optional autofocus extra: pip install semphony[autofocus] (adds PyTorch, torchvision, Pillow).
  • For training: a GPU is recommended; training can take on the order of tens of hours for a full run.

Training: Collect data

Manually focus the SEM at a starting location, then run the data collection script. It will read the current focus (WD, stigmation), apply random aberrations, acquire two images per aberration (WD+sigma and WD-sigma), and save the image paths and ground-truth correction vector to a manifest.

from semphony import SemphonyClient, ImagingDevice, TrainingConfig
from semphony.autofocus.training import collect_training_data

client = SemphonyClient(base_url="...", api_key="...")
device = ImagingDevice(client, device_slug="tescan-001")
run = client.run("deepfocus_collect", resume=True)
with run.session() as s:
    device.attach(s)
    collect_training_data(
        device,
        TrainingConfig(n_locations=32, n_aberrations_per_location=10),
        output_dir="./deepfocus_training_data/",
    )

Output: a directory containing image pairs and manifest.json with sample entries (img_plus, img_minus, target). Approximate time: 32 locations × 10 aberrations = 320 pairs; expect on the order of 2–4 hours depending on dwell time and resolution.
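A minimal sketch of consuming the manifest, assuming only the entry fields named above (img_plus, img_minus, target); a real collection run may attach extra per-entry metadata that this ignores:

```python
import json
from pathlib import Path

def load_manifest(data_dir):
    """Yield (img_plus, img_minus, target) for each training pair.

    target is the ground-truth correction vector:
    (delta WD, delta stigmation X, delta stigmation Y).
    """
    entries = json.loads((Path(data_dir) / "manifest.json").read_text())
    for entry in entries:
        yield entry["img_plus"], entry["img_minus"], tuple(entry["target"])
```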

Training: Train the model

Run the training loop on the collected data. You can do this on a separate machine with a GPU; the only input is the data directory, and the output is a TorchScript model file. For first-time training, do not pass pretrained_checkpoint. For recalibration, pass pretrained_checkpoint="path/to/your/model.pt" and fine_tune=True (see Recalibration).

from semphony.autofocus.training import train_model
from semphony.autofocus import TrainingConfig

train_model(
    data_dir="./deepfocus_training_data/",
    config=TrainingConfig(),
    output_path="./deepfocus_model.pt",
    # pretrained_checkpoint="./deepfocus_model.pt",  # for recalibration only
    # fine_tune=True,
)

Full training (e.g. 1M steps) can take on the order of 44 hours on a single GPU. Monitor validation loss; the script saves the model as a TorchScript file for fast inference.

Recalibration (new microscope or sample)

When you switch to a new microscope or sample type, use a pretrained checkpoint from your previous training: collect a smaller dataset (e.g. 10 locations, ~100 pairs) on the new setup, then run train_model with pretrained_checkpoint set to your existing .pt and fine_tune=True. The encoder is frozen and only the last layers are trained. This typically recovers good performance in about 2 hours on a single GPU.

train_model(
    data_dir="./deepfocus_recal_data/",
    config=TrainingConfig(max_training_steps=50_000, lr_decay_steps=1000, lr_decay_factor=0.95),
    output_path="./deepfocus_recal.pt",
    fine_tune=True,
    pretrained_checkpoint="./deepfocus_model.pt",  # your existing Semphony-trained model
)
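With the settings above, the effective learning rate follows a step decay. A quick sketch of the schedule, assuming lr_decay_steps/lr_decay_factor implement a multiplicative step decay (my reading of the parameter names, not the SDK source):

```python
def lr_at_step(step, base_lr=1e-3, decay_steps=1000, decay_factor=0.95):
    # Multiplicative step decay: the learning rate is multiplied by
    # decay_factor once every decay_steps training steps.
    return base_lr * decay_factor ** (step // decay_steps)
```

Over a 50 000-step recalibration run this decays the learning rate by 0.95^50 ≈ 0.077, i.e. roughly a 13× reduction by the end of fine-tuning.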

Using autofocus

In your experiment script, call imaging.autofocus(config) with an AutofocusConfig that points to your trained model. The method runs the iterative loop (acquire pair → predict correction → apply → repeat) until convergence or max_iterations.

from semphony.autofocus import AutofocusConfig, AutofocusResult

config = AutofocusConfig(
    model_path="./deepfocus_model.pt",
    sigma_wd_um=5.0,
    n_patches=5,
    max_iterations=10,
    use_gpu=True,
)
result = device.imaging.autofocus(config)
# result.converged, result.iterations, result.final_wd_mm, result.final_stigm_x, result.final_stigm_y

Typical convergence: 2–3 iterations. If the model diverges on a new setup, use recalibration (see above).
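In unattended acquisitions it is common to retry before escalating. A small hypothetical wrapper (not SDK API; it only assumes the result object exposes .converged and .iterations as shown above):

```python
def autofocus_with_retry(run_autofocus, retries=2):
    """Run autofocus up to `retries` times and return the first converged
    result; raise if none converges so the caller can stop and recalibrate."""
    last = None
    for attempt in range(1, retries + 1):
        last = run_autofocus()
        if last.converged:
            return last, attempt
    raise RuntimeError(
        f"autofocus did not converge after {retries} attempts "
        f"(last run: {last.iterations} iterations); consider recalibration"
    )
```

In a script, run_autofocus would be e.g. lambda: device.imaging.autofocus(config).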

Architecture diagram

flowchart TD
    subgraph inference [Inference loop]
        CurrentParams[Current WD and stigmation] --> Perturb[Perturb WD by +/- sigma]
        Perturb --> AcquirePair[Acquire 2 images]
        AcquirePair --> CropPatches[Crop N patch pairs]
        CropPatches --> CNN[EfficientNet-B0 CNN]
        CNN --> DeltaF[Predicted correction]
        DeltaF --> Apply[Apply to SEM]
        Apply --> Check{Converged?}
        Check -->|No| Perturb
        Check -->|Yes| Done[Focused image]
    end

Configuration reference

AutofocusConfig (inference)

Field                 Default     Description
model_path            (required)  Path to TorchScript model (.pt)
sigma_wd_um           5.0         WD perturbation in µm
n_patches             5           Patch pairs per iteration
patch_size            512         Patch edge length (px)
max_iterations        10          Max correction iterations
wd_tolerance_um       0.25        Convergence threshold for WD (µm)
stig_tolerance        0.25        Convergence threshold for stigmation
use_gpu               False       Use CUDA if available
correct_stigmation    True        Apply stigmation corrections; set False for WD-only (e.g. TESCAN without stigmator control)

TrainingConfig (training)

Field                              Default     Description
n_locations                        32          Sample locations
n_aberrations_per_location         10          Aberrations per location
wd_range_um                        20.0        WD sampling range (µm)
stig_range                         5.0         Stigmation sampling range (ignored if wd_only=True)
wd_only                            False       If True, only WD is perturbed and set; no stigmation (for SEMs without stigmator control)
sigma_wd_um                        5.0         Perturbation for image pairs (µm)
fov_um_list                        None        Optional list of FOVs (µm) for multiple magnifications; each sample uses one at random
stage_perturbation_fov_fraction    None        Optional (min_frac, max_frac) of FOV; stage moves a random distance in [min, max]×FOV in a random direction (stable across magnifications)
stage_perturbation_xy_mm           None        Optional (max_dx, max_dy) in mm; fixed range. Ignored if stage_perturbation_fov_fraction is set
learning_rate                      1e-3        AdamW learning rate
max_training_steps                 1_000_000   Training steps
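The FOV-fraction stage perturbation can be pictured as follows: pick a distance uniformly in [min_frac, max_frac] × FOV and a uniformly random direction. This sketch mirrors the behaviour described in the table (my reading of it, not the SDK source):

```python
import math
import random

def stage_offset_um(fov_um, frac_range, rng=None):
    """Random stage offset for stage_perturbation_fov_fraction.

    Distance is uniform in [min_frac, max_frac] x FOV; direction is uniform
    on the circle, so the offset scales with magnification automatically.
    """
    rng = rng or random.Random()
    lo, hi = frac_range
    dist = rng.uniform(lo * fov_um, hi * fov_um)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return dist * math.cos(angle), dist * math.sin(angle)
```

Because the offset is a fraction of the FOV rather than an absolute distance, the same setting works across the magnifications listed in fov_um_list.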

Troubleshooting

  • Model diverges or corrections are wrong: Recalibrate with a small dataset on the current microscope/sample (see Recalibration).
  • Inference is slow: Set use_gpu=True in AutofocusConfig if a CUDA GPU is available.
  • No stigmator control (e.g. TESCAN): Use WD-only mode: set correct_stigmation=False in AutofocusConfig when calling imaging.autofocus(), and wd_only=True in TrainingConfig when collecting training data. Only working distance will be corrected.
  • Stigmation geometry not found: Ensure the SEM exposes stigmation via EnumGeometries (e.g. a geometry whose name contains "stigm" or "stig"). If not, use WD-only (see above).
  • get_focus_params / set_focus_params not available: Ensure the device client and control server have the imaging.get_focus_params and imaging.set_focus_params commands registered and the TESCAN controller implements them.
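For reference, the WD-only pairing from the troubleshooting notes in one place, a config sketch using only the field names documented above:

```python
from semphony.autofocus import AutofocusConfig, TrainingConfig

# Data collection: perturb working distance only, no stigmation.
train_cfg = TrainingConfig(wd_only=True)

# Inference: apply WD corrections only.
af_cfg = AutofocusConfig(model_path="./deepfocus_model.pt",
                         correct_stigmation=False)
```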