Training#

Begin a training round in a terminal using the following command template:

omnipose --train --use_gpu --dir <training image directory> \
         --img_filter <img_filter> --mask_filter <mask_filter> \
         --nchan <nchan> --all_channels --channel_axis <channel_axis> \
         --pretrained_model None --diameter 0 --nclasses 2 \
         --learning_rate 0.1 --RAdam --batch_size 16 --n_epochs <n_epochs>

Note

Training should be done only via CLI. If image preprocessing is required, I highly suggest doing it in a script that saves the results to a new folder, rather than attempting preprocessing + training in one script/notebook.
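
For example, a minimal preprocessing sketch along these lines (folder names and the normalization choice are illustrative, not part of Omnipose):

import os
import numpy as np
import tifffile

src, dst = 'raw_images', 'train_images'  # hypothetical folder names
os.makedirs(dst, exist_ok=True)

for name in sorted(os.listdir(src)):
    if not name.endswith('.tif'):
        continue
    im = tifffile.imread(os.path.join(src, name)).astype(np.float32)
    im = (im - im.min()) / (im.max() - im.min() + 1e-8)  # one of many normalization choices
    tifffile.imwrite(os.path.join(dst, name), im)

You would then point --dir at the new folder for training.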

The main commands here are:

omnipose

calls __main__.py in cellpose-omni, which first loads the images in --dir and formats them. Then --train toggles on the training branch (versus evaluation).

--dir

points to a folder of image and label pairs. With --look_one_level_down, you can let --dir point to a folder with subfolders. This can be very useful when training on several distinct subsets of ground truth data.

--diameter

should be set to 0 (and is now 0 by default) to disable rescaling. Anything else will rescale your images relative to a mean diameter of 30 (see Cell diameter), such that --diameter 15 will upscale your image by a factor of 2 along each axis and --diameter 60 will likewise downscale by a factor of 2. If you need automatic diameter estimation, see Diameter and the Size Model.
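
A quick check of the rescaling factor (the reference mean diameter divided by the requested diameter):

mean_diameter = 30  # reference diameter used for rescaling
for diameter in (15, 60):
    print(diameter, '->', mean_diameter / diameter)  # 15 -> 2.0 (upscale), 60 -> 0.5 (downscale)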

--nchan, --nclasses

define the number of image channels and the number of prediction classes. These should always be specified for custom models, as the defaults are --nchan 1 (mono-channel images) and --nclasses 2 (flow and distance field predictions). If you train a model with --nclasses 3 (adding the boundary field) or with multichannel images, these settings will appear in the model file name. Use the same values when running the model, too, both in the CLI and in cellpose_omni.models.CellposeModel().
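
For example, loading a custom model in Python might look like the following sketch (I am assuming the keyword arguments mirror the CLI flags; check the cellpose_omni.models documentation for the exact signature):

from cellpose_omni import models

# match the values used at training time (these numbers are illustrative)
model = models.CellposeModel(gpu=True,
                             pretrained_model='<model path>',  # path to your trained model
                             nchan=2,     # same as --nchan at training
                             nclasses=3)  # same as --nclasses at training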

--all_channels

tells Omnipose to use all nchan channels for segmentation. The relatively complicated --chan and --chan2 settings from Cellpose are still available, but I never use them. I highly recommend preprocessing your training set to have the channels you want to use (and for evaluation, do the same preprocessing in a script/notebook).

--channel_axis

lets you specify where the channel axis sits in your arrays. The conventional ordering for multichannel 2D images is CYX, so --channel_axis defaults to 0. RGB images are channel-last, so they need --channel_axis 2.
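
If your arrays are channel-last, you can either pass --channel_axis 2 or reorder them to CYX yourself; a minimal numpy sketch:

import numpy as np

im = np.zeros((512, 512, 3))     # YXC, e.g., an RGB image
im_cyx = np.moveaxis(im, -1, 0)  # CYX, shape (3, 512, 512)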

Warning

Paths given to --dir or --test_dir must be absolute paths.

Hyperparameters#

It is best for reproducibility to explicitly choose hyperparameters at runtime rather than relying on defaults.

--RAdam

selects the RAdam optimizer (versus the default SGD). I found RAdam to be a bit faster and more stable compared to SGD and other optimizers.

--learning_rate

controls the optimizer step size.

--batch_size

controls the number of images the network sees at each optimization step (the last batch is smaller if the number of images is not evenly divisible by batch_size). A random crop is selected from each image (see --tyx), so only a portion of each image is seen during a given epoch. Smaller batches can sometimes generalize better, and larger batches can be more stable. In my experience this does not make a very large difference in model performance, but larger batches can train faster (see --dataloader).

--tyx

controls the crop size for selecting a sample from each training image (see Image dimensions).

--n_epochs

controls how many times the network is shown the full dataset. I usually do 4000.

--dataloader

toggles on parallel dataloading. Preprocessing batches for training is a CPU bottleneck, and loading batches in parallel helps a lot with that. Use --num_workers to control how many cores participate. This is only a benefit when you have more images in your training set than cores on your machine.
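
Under the hood this is ordinary parallel dataloading; a generic PyTorch sketch of the idea (a toy dataset, not Omnipose's actual dataloader):

import torch
from torch.utils.data import DataLoader, Dataset

class CropDataset(Dataset):
    """Toy stand-in for a dataset that crops and augments on the CPU."""
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        return torch.rand(1, 224, 224)  # pretend this is an expensive crop/augmentation

# num_workers > 0 runs __getitem__ in parallel worker processes
loader = DataLoader(CropDataset(64), batch_size=16, num_workers=4)
for batch in loader:
    pass  # training step would go here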

Model saving#

You can choose how often to save your model with --save_every <n>. By default, the model file is overwritten at each save. To instead save a new model file every n epochs, add --save_each (useful for debugging and comparing performance across epochs).

Training data#

Your training set should consist of at least two tuples of images, labels, and (optionally) label link files.

File naming conventions#

Each tuple of images and labels should be formatted as <base><img_filter>.<img_ext>, <base><mask_filter>.<mask_ext>, and (optionally) <base>_links.txt. base can be any string. The img_filter defaults to an empty string '' and the mask_filter defaults to _masks. These can be arranged in a single training folder:

folder/
├── A.tif
├── A_masks.tif
├── B.tif
├── B_masks.tif
└── ...

Or in subfolders (when using --look_one_level_down):

folder/
├── subfolder_1/
│   ├── A.tif
│   └── A_masks.tif
├── subfolder_2/
│   ├── B.tif
│   ├── B_masks.tif
│   └── ...
└── ...

If you use the --img_filter option (--img_filter img in this case), the suffix only goes on image files:

folder/
├── A_img.tif
├── A_masks.tif
├── B_img.tif
├── B_masks.tif
└── ...
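
A quick sanity check that every image has a matching mask under these conventions (a sketch; adjust the folder and filters to your own settings):

import os

folder, img_filter, mask_filter = 'folder', '_img', '_masks'  # illustrative values
images = [f for f in os.listdir(folder) if f.endswith(img_filter + '.tif')]
for f in images:
    base = f[:-len(img_filter + '.tif')]
    mask = base + mask_filter + '.tif'
    assert os.path.exists(os.path.join(folder, mask)), f'missing {mask}'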

File extensions#

Microscopy images should generally be saved in a lossless format like PNG or TIF. Instance label matrices may likewise be stored as images in either PNG or TIF. Note that TIF supports up to 32 bits per channel whereas PNG only supports 16. That said, if you have more than \(2^{16}-1 = 65535\) labels in one image, you should definitely be cropping your images into several smaller images.
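
When saving label matrices, pick an integer dtype wide enough for the label count; a small sketch with tifffile:

import numpy as np
import tifffile

masks = np.random.randint(0, 500, (512, 512))  # illustrative label matrix
dtype = np.uint16 if masks.max() < 2**16 else np.uint32  # PNG caps out at 16 bits
tifffile.imwrite('A_masks.tif', masks.astype(dtype))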

Image dimensions#

You should aim to make training images of roughly size (512,512). During training, the tyx parameter (224,224 by default) sets the size of the warped image crops in each batch shown to the network. Although the rectangular patch actually sampled from each image has its dimensions randomly scaled by a factor in the range 0.5-1.5, you should aim to have the tyx dimensions roughly half those of the images in the training set. If tyx is much smaller, each image will not be sufficiently covered during an epoch (requiring more epochs to converge). If it is larger, training just slows down and generalizability may suffer.

If an image dimension is substantially larger than 512 px, subdivide it along that axis. For example, (2048,2048) images should be split into 16 (512,512) images (4 along each axis). Smaller images are far easier to annotate correctly.
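
A minimal tiling sketch for subdividing a large image into (512,512) crops (assuming the dimensions divide evenly; file names are illustrative):

import tifffile

im = tifffile.imread('large_image.tif')  # e.g., (2048, 2048)
tile = 512
for i in range(im.shape[0] // tile):
    for j in range(im.shape[1] // tile):
        crop = im[i*tile:(i+1)*tile, j*tile:(j+1)*tile]
        tifffile.imwrite(f'large_image_{i}_{j}.tif', crop)  # split labels the same way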

If your image dimensions are substantially smaller than 512 px, you can instead decrease the tyx parameter. For example, if your training images are around size (256,256), then I would recommend the CLI flag --tyx 128,128.

Note

The tyx tuple elements must be evenly divisible by 8 (for U-net downsampling).
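
A quick way to pick tyx following these guidelines: take half the typical training image size and round to a multiple of 8 (a sketch, not an Omnipose function):

def suggest_tyx(shape, factor=0.5, multiple=8):
    """Half the image size, rounded down to a multiple of 8 for U-net downsampling."""
    return tuple(int(s * factor) // multiple * multiple for s in shape)

print(suggest_tyx((512, 512)))  # (256, 256)
print(suggest_tyx((256, 256)))  # (128, 128)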

Object density#

As a general rule, you want to train on images with densely packed objects. This balances the foreground class against the background class: we want Omnipose to focus on predicting good output in foreground regions rather than zero output in background regions. If your images have a lot of useless background, crop out just the denser regions. This can be done automatically if you can segment clusters/microcolonies of cells: functions in omnipose.utils can process a binary image into crops, which you can then join into an ensemble image using a rectangle packing algorithm. Training on these images lets Omnipose see the same number of cells a lot faster, as it does not waste time looking at too much background.
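
A generic version of the cropping step using scipy (the rectangle packing step is omitted; omnipose.utils provides its own functions for this workflow):

import numpy as np
from scipy import ndimage

# illustrative binary mask of cell clusters; use your own segmentation here
binary = np.zeros((256, 256), dtype=bool)
binary[20:80, 30:90] = True
binary[150:220, 140:230] = True

labels, n = ndimage.label(binary)
pad = 10  # keep some context around each cluster
crops = []
for sl in ndimage.find_objects(labels):
    sl = tuple(slice(max(s.start - pad, 0), s.stop + pad) for s in sl)
    crops.append(binary[sl])  # crop the image and labels the same way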

Ground truth quality#

Garbage in, garbage out. It is better to have fewer images with meticulously crafted, consistent labels than many images with sloppy labels. Your labels should...

  1. be based on supplemental channels wherever the primary channel is ambiguous

  2. be label matrices, not semantic (binary) masks

  3. not miss a single cell

  4. extend to cell boundaries

  5. meet each other at cell interfaces

You will probably spend 10x more time annotating ground truth images than acquiring them, so it is worth the effort to find a membrane dye that does not conflict with the main channel(s) on which your model will be trained. This dye serves purely as a physiological reference for the ground truth of cell extent and cell septation, not as a channel for training the segmentation model.

Tip

If using a transmissive modality like phase contrast, brightfield, or DIC, use the same filter cube as your fluorescence channel. This usually removes any offset between the channels. Otherwise, be sure to perform multimodal registration between the channels.
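
For the registration case, a translational offset can often be recovered with phase cross-correlation; a sketch using scikit-image (assumes the offset is a rigid shift):

import numpy as np
from scipy import ndimage
from skimage.registration import phase_cross_correlation

phase = np.random.rand(512, 512)  # illustrative arrays; load your own channels
fluor = np.random.rand(512, 512)

shift, error, _ = phase_cross_correlation(phase, fluor, upsample_factor=10)
fluor_aligned = ndimage.shift(fluor, shift)  # align fluorescence to phase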

Transfer learning#

You can use --pretrained_model None to train from scratch or --pretrained_model <model path> to start from an existing model. Once a model is initialized and trained, you cannot change its structure. This is defined by nchan (the number of channels used for segmentation), nclasses (the number of prediction classes), and dim (the dimension of the images). You must use precisely the same nchan, nclasses, and dim that were used to train the existing model. See Models for a table of the pretrained model parameters.

Diameter and the Size Model#

The Cellpose pretrained models are trained using resized images so that the cells have the same median diameter across all images. If you choose to use a pretrained model, then this fixed median diameter is used. Omnipose models are generally not trained with rescaling. cyto2_omni is the exception, as its images are extremely diverse in size.

If you choose to train from scratch, you can set the median diameter you want to use for rescaling with the --diameter flag, or set it to 0 to disable rescaling. The cyto, cyto2, and cyto2_omni models were trained with a diameter of 30 pixels and the nuclei model was trained with a diameter of 17 pixels.

If your target image set varies a lot in cell diameter (i.e., the images you want to segment vary unpredictably in size), you may also want to learn a SizeModel() that predicts the diameter from the network style vectors. Add the flag --train_size and this model will be trained and saved as an *.npy file. Omnipose models generally do not come with a SizeModel(), with the exception of cyto2_omni.

Examples#

To train on cytoplasmic images (green cyto and red nuclei) starting with a pretrained model from cellpose_omni (cyto or nuclei):

omnipose --train --dir <train_path> --pretrained_model cyto --chan 2 --chan2 1

You can train from scratch as well:

omnipose --train --dir <train_path> --pretrained_model None

You can also specify the full path to a pretrained model to start from:

omnipose --train --dir <train_path> --pretrained_model <model_path>

To train the bact_phase_omni model from scratch using the same parameters from the Omnipose paper, download the dataset and run

omnipose --train --use_gpu --dir <bacterial_dataset_directory> --mask_filter _masks \
         --n_epochs 4000 --pretrained_model None --learning_rate 0.1 --diameter 0 \
         --batch_size 16  --RAdam --nclasses 3

Training 3D models#


To train a 3D model on image volumes, specify the dimension argument: --dim 3. You may run out of VRAM on your GPU. In that case, you can specify a smaller crop size, e.g., --tyx 50,50,50. The command I used in the paper on the Arabidopsis thaliana lateral root primordia dataset was:

omnipose --use_gpu --train --dir <path> --mask_filter _masks \
         --n_epochs 4000 --pretrained_model None --learning_rate 0.1 --save_every 50 \
         --save_each  --verbose --look_one_level_down --all_channels --dim 3 \
         --RAdam --batch_size 4 --diameter 0 --nclasses 3