The basics in 2D#

This notebook demonstrates how to load images, display them, segment them with Omnipose, and visualize both the segmentation results and the intermediate network output. Here we show the details behind the most typical workflow: single-channel segmentation. The bacterial images used are (1) from my own image library and (2-5) from the DeLTA 2.0 paper. With the latter, we will see how to handle images that are intrinsically grayscale but were exported and published as RGB(A), i.e., the extra channels carry no additional information to aid segmentation. For two-channel segmentation, see the multi_channel_cyto notebook.

Before running this notebook, install the latest version of Omnipose from GitHub. This automatically installs our Cellpose fork.
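For example, one way to do this from a notebook cell (assuming pip and the official GitHub repository):

!pip install git+https://github.com/kevinjohncutler/omnipose.git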

# Import dependencies
import numpy as np
from cellpose_omni import models, core

# This checks to see if you have set up your GPU properly.
# CPU performance is a lot slower, but not a problem if you
# are only processing a few images.
use_GPU = core.use_gpu()
print('>>> GPU activated? {}'.format(use_GPU))

# for plotting
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rcParams['figure.dpi'] = 300
plt.style.use('dark_background')
%matplotlib inline
2024-03-07 12:19:28,408	[INFO]	        _use...torch()	 line 74	** TORCH GPU version installed and working. **
>>> GPU activated? True

How to load your images#

There are several ways to load your image files into a notebook. If you have a specific set of images, put their full paths into a list. For example:

# make it a list even if there is only one file
files = ['path_to_image_1'] 
files = ['path_to_image_1','path_to_image_2']

# you can also add to the list like so:
files = files + ['path_to_image_3']

Alternatively, you can load all the images in a directory. Here are a few templates you can use to build the file list automatically by searching for image files matching a certain extension and keywords in the file name.

from pathlib import Path

basedir = '<path_to_image_folder>'
# use rglob to search subfolders recursively 
files = [str(p) for p in Path(basedir).rglob("*.tif")] 
# change the search string to grab only one channel
files = [str(p) for p in Path(basedir).glob("*C1.png")]
# specify a match anywhere in the file name
files = [str(p) for p in Path(basedir).glob("*488*.png")] 

We can also use the cellpose_omni.io library to grab all the images in the test_files folder. This is very handy for grabbing images with different extensions. Here we are using four RGB(A) images from the DeLTA 2.0 training set (on which the bact_phase_omni model has never been trained) as well as an RGB image acquired in the same lab as much of the Omnipose bact_phase dataset.

from pathlib import Path
import os
from cellpose_omni import io
import omnipose
omnidir = Path(omnipose.__file__).parent.parent
basedir = os.path.join(omnidir,'docs','test_files')
files = io.get_image_files(basedir)

omnidir
PosixPath('/home/kcutler/DataDrive/omnipose')

Next we read in the images from the file list. It's a good idea to display the images before proceeding. Here I happen to be reading in some RGB tiles of grayscale phase contrast images (such as you might use for figures) as well as some single-channel images. As part of the visualization process, the images are rescaled to the range 0-1. Omnipose does the same rescaling internally, so you do not need to rescale your images before running segmentation (e.g., via the CLI).
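For reference, here is a minimal sketch of what that normalization does, assuming simple percentile-based rescaling (the real omnipose.utils.normalize99 has more options, and its exact default percentiles may differ):

import numpy as np

def normalize99_sketch(img, lower=0.01, upper=99.99):
    # map the lower/upper percentiles to 0 and 1
    lo, hi = np.percentile(img, [lower, upper])
    return (img.astype(np.float32) - lo) / (hi - lo)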

from cellpose_omni import io, transforms
from omnipose.utils import normalize99
imgs = [io.imread(f) for f in files]

# print some info about the images.
for i in imgs:
    print('Original image shape:',i.shape)
    print('data type:',i.dtype)
    print('data range: min {}, max {}\n'.format(i.min(),i.max()))
nimg = len(imgs)
print('\nnumber of images:',nimg)

fig = plt.figure(figsize=[40]*2,frameon=False) # initialize figure
print('\n')
for k in range(len(imgs)):
    img = transforms.move_min_dim(imgs[k]) # move the channel dimension last
    if len(img.shape)>2:
        # imgs[k] = img[:,:,1] # could pick out a specific channel
        imgs[k] = np.mean(img,axis=-1) # or just turn into grayscale

    imgs[k] = normalize99(imgs[k])
    # imgs[k] = np.pad(imgs[k],10,'edge')
    print('new shape: ', imgs[k].shape)
    plt.subplot(1,len(files),k+1)
    plt.imshow(imgs[k],cmap='gray')
    plt.axis('off')
Original image shape: (287, 377, 3)
data type: uint8
data range: min 4, max 22

Original image shape: (564, 564, 3)
data type: uint8
data range: min 30, max 203

Original image shape: (783, 908)
data type: uint8
data range: min 0, max 255

Original image shape: (396, 390, 4)
data type: uint8
data range: min 49, max 255

Original image shape: (281, 310)
data type: uint16
data range: min 360, max 64813

Original image shape: (384, 392)
data type: uint16
data range: min 0, max 65535

Original image shape: (334, 321)
data type: uint16
data range: min 2582, max 39614


number of images: 7


new shape:  (287, 377)
new shape:  (564, 564)
new shape:  (783, 908)
new shape:  (396, 390)
new shape:  (281, 310)
new shape:  (384, 392)
new shape:  (334, 321)
[Figure: the seven normalized grayscale images displayed side by side]

Note that the first two images are RGB, the third and fifth are mono-channel, and the fourth is RGBA (the alpha channel encodes transparency). Exporting to RGB is usually done only for making diagrams or for compatibility with non-scientific viewing software. Pro tip: Adobe Illustrator will not interpolate the pixels in your image if you save it as RGB, but it will if you keep it mono-channel. You usually want images presented exactly, without interpolation (i.e., pixelated), since they are your raw data, so you can convert a grayscale image to RGB via im_RGB = [im,im,im] (or, more succinctly, im_RGB = [im,]*3, or *4 for RGBA). However, storing all your images this way wastes space; just do it for the ones you need for a figure.
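As a concrete sketch of that conversion, using np.stack to build a proper channel axis (the random image here is just a stand-in for your data):

import numpy as np

im = np.random.rand(64, 64)                    # stand-in for a grayscale image
im_RGB = np.stack([im]*3, axis=-1)             # (64, 64, 3): three identical channels
alpha = np.ones_like(im)                       # fully opaque alpha channel
im_RGBA = np.stack([im]*3 + [alpha], axis=-1)  # (64, 64, 4)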

Also note that the DeLTA images (1-4) are uint8, so they range from 0 to 2**8-1 = 255. Image 1 only takes values between 4 and 22 out of a possible 0 to 255, meaning it was probably far too dark and not rescaled prior to conversion to 8-bit. In my experience, raw images are typically 14-bit (this depends on your camera) and therefore saved in 16-bit lossless formats like PNG or TIF (Omnipose can detect and segment JPEGs, but you should never use those for anything scientific, even for figures, due to compression artifacts). Using only 22-4+1 = 19 levels of gray to depict the cells causes the distinct 'posterized' effect that you can see if you zoom in on the image.
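If you suspect an image was exported with poor dynamic range, you can count the gray levels it actually uses (here on the first raw file, before the normalization above):

img_raw = io.imread(files[0])
print('distinct gray levels:', np.unique(img_raw).size)
print('data range: min {}, max {}'.format(img_raw.min(), img_raw.max()))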

Initialize model#

Here we use one of the built-in model names. You can print out the available model names, too:

import cellpose_omni
from cellpose_omni import models
from cellpose_omni.models import MODEL_NAMES

MODEL_NAMES
['bact_phase_omni',
 'bact_fluor_omni',
 'worm_omni',
 'worm_bact_omni',
 'worm_high_res_omni',
 'cyto2_omni',
 'plant_omni',
 'bact_phase_cp',
 'bact_fluor_cp',
 'plant_cp',
 'worm_cp',
 'cyto',
 'nuclei',
 'cyto2']

We will choose the bact_phase_omni model.

model_name = 'bact_phase_omni'
model = models.CellposeModel(gpu=use_GPU, model_type=model_name)
2024-03-06 22:38:32,042	[INFO]	[models.py __init__() line 428]	>>bact_phase_omni<< model set to be used
2024-03-06 22:38:32,043	[INFO]	[core.py _use_gpu_torch() line 74]	** TORCH GPU version installed and working. **
2024-03-06 22:38:32,043	[INFO]	[       assign_device() line 85]	>>>> using GPU

Run segmentation#

import time
chans = [0,0] # this means segment based on first channel, no second channel

n = [-1] # make a list of integers to select which images you want to segment
n = range(nimg) # or just segment them all

# define parameters
params = {'channels':chans, # always define this with the model
          'rescale': None, # upscale or downscale your images, None = no rescaling
          'mask_threshold': -1, # erode or dilate masks with higher or lower values
          'flow_threshold': 0, # default is .4, but only needed if there are spurious masks to clean up; slows down output
          'transparency': True, # transparency in flow output
          'omni': True, # we can turn off Omnipose mask reconstruction, not advised
          'cluster': True, # use DBSCAN clustering
          'resample': True, # whether or not to run dynamics on rescaled grid or original grid
          'verbose': False, # turn on if you want to see more output
          'tile': False, # average the outputs from flipped (augmented) images; slower, usually not needed
          'niter': 7, # None lets Omnipose calculate # of Euler iterations (usually <20) but you can tune it for over/under segmentation
          'augment': False, # Can optionally rotate the image and average outputs, usually not needed
          'affinity_seg': False, # new feature, stay tuned...
         }

tic = time.time()
masks, flows, styles = model.eval([imgs[i] for i in n],**params)

net_time = time.time() - tic

print('total segmentation time: {}s'.format(net_time))
total segmentation time: 1.7130911350250244s
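Before plotting, a quick sanity check can be useful: the masks are integer label images with 0 as background and cells labeled 1 through N, so the maximum label gives the cell count.

# count detected cells per image
for f, m in zip([files[i] for i in n], masks):
    print(os.path.basename(f), '->', m.max(), 'cells')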

Plot the results#

from cellpose_omni import plot
import omnipose

for idx,i in enumerate(n):

    maski = masks[idx] # get masks
    bdi = flows[idx][-1] # get boundaries
    flowi = flows[idx][0] # get RGB flows

    # set up the output figure to better match the resolution of the images
    f = 5
    szX = maski.shape[-1]/mpl.rcParams['figure.dpi']*f
    szY = maski.shape[-2]/mpl.rcParams['figure.dpi']*f
    fig = plt.figure(figsize=(szX*4,szY)) # four panels are laid out side by side
    fig.patch.set_facecolor([0]*4)

    plot.show_segmentation(fig, omnipose.utils.normalize99(imgs[i]),
                           maski, flowi, bdi, channels=chans, omni=True,
                           interpolation=None)

    plt.tight_layout()
    plt.show()
[Figures: per-image segmentation panels (input, outlines, masks, and flow field) from plot.show_segmentation]

Save the results#

Often you will want to save your masks before moving on to analysis (that way you can just load them in instead of re-running segmentation). I improved the cellpose.io saving functions quite a bit to be more flexible about where they can save. See the documentation page for the full list of options.

io.save_masks(imgs, masks, flows, files,
              png=False,
              tif=True, # whether to use PNG or TIF format
              suffix='', # suffix to add to files if needed
              save_flows=False, # saves both RGB depiction as *_flows.png and the raw components as *_dP.tif
              save_outlines=False, # save outline images
              dir_above=0, # save output in the image directory or in the directory above (at the level of the image directory)
              in_folders=True, # save output in folders (recommended)
              save_txt=False, # txt file for outlines in imageJ
              save_ncolor=False) # save ncolor version of masks for visualization and editing
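Later, you can load the saved masks back in instead of re-running segmentation. A minimal sketch, assuming in_folders=True placed them in masks subfolders and that the usual *_cp_masks naming convention applies (check your output folder for the exact names):

from pathlib import Path

# find the saved label images and read them back in
mask_files = sorted(Path(basedir).rglob('masks/*_cp_masks.tif'))
masks_reloaded = [io.imread(str(f)) for f in mask_files]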

Debug results#

The RGB flows shown above give some insight into whether there is an issue with the flow field output, but you can also check the boundary and distance outputs:

for idx,i in enumerate(n):

    disti = flows[idx][2] # distance field prediction
    bdlti = flows[idx][4] # boundary logits prediction
    fig = plt.figure(figsize=(12,5),frameon=False)
    plt.imshow(np.hstack([disti,bdlti]))
    plt.axis('off')
    plt.tight_layout()
    plt.show()
[Figures: per-image distance field (left) and boundary logits (right) predictions]

Notes on the above#

The distance field is trained with background pixels set to -5 in older models and to -<mean diameter> in newer models. This makes the desired network output more balanced and gives more distinction between edge pixels (which have values close to 0) and background (which would otherwise also sit at 0). The flow field, being the gradient of the distance field, by definition has unit magnitude everywhere, but we rescale it by 5 for training. This brings the desired flow components into roughly the same range as the boundary output, which is the input to the sigmoid function (hence 'logits') and therefore ranges from about -5 to 5.
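To make this concrete, here is a minimal sketch of that target construction on a toy mask. This is illustrative only: it uses the Euclidean distance transform as a stand-in for the smooth Omnipose distance field, and it is not the actual training code.

import numpy as np
from scipy.ndimage import distance_transform_edt

mask = np.zeros((64, 64), bool)
mask[20:44, 10:54] = True                  # a toy rectangular 'cell'
dist = distance_transform_edt(mask)        # stand-in for the Omnipose distance field

dy, dx = np.gradient(dist)                 # gradient of the distance field
mag = np.sqrt(dy**2 + dx**2) + 1e-8        # avoid division by zero in flat regions
flow = 5 * np.stack([dy, dx]) / mag        # unit-norm flow, rescaled by 5 for training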

What you can see in the images above is that the features of the boundary, distance, and flow fields are all consistent with each other. For example, the flow field has positive divergence where the boundary output is high and negative divergence where the distance field is high. This is by design: I included the boundary field for the sole purpose of improving the prediction accuracy of the flow and distance fields.
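You can check this divergence structure yourself. A sketch for the first image, assuming the raw flow components are stored at flows[idx][1] as a (2, Y, X) array (consistent with the indexing used above, but verify against your cellpose_omni version):

dP = np.asarray(flows[0][1])
# divergence = sum of the partial derivatives of each flow component
div = np.gradient(dP[0], axis=0) + np.gradient(dP[1], axis=1)

plt.figure(figsize=(6,5))
plt.imshow(div, cmap='RdBu') # sources (boundaries) and sinks (skeletons) in opposite colors
plt.axis('off')
plt.show()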