This repository was archived by the owner on Jul 29, 2023. It is now read-only.
* Commented microDL config files
Doc directory added with commented config files explaining the parameters used in microDL workflow
* config documentation update based on review
The microDL config file documentation has been updated based on suggestions from @jennyfolkesson, @Christianfoley and @JohannaRahm.
* Updated readme and config documentation
Readme files for microDL and specific to preprocessing, train, inference modules and config file documentation are updated based on review.
* Update preprocessing readme
Description of details available in json updated.
* Checking error in yaml files
Spaces before comments were removed to eliminate error from yaml files.
* added citation file
* updated documentation and figures for clarity
* Formatted inference readme files
The readme files were linted with markdownlint
* Updated preprocessing config using 2D U-Net
Depth of tiles is specifically defined for 2D U-Net
* Config files for microDL 2.5D U-Net model
Config files are tailored to predict cell membrane using 3D image input and training a 2.5D U-Net model.
* Match preprocessing to training config
Preprocessing channels changed to match the channels mentioned in training and inference config files to avoid confusion.
* moved all config files to the same folder.
* update the paths of configs in the notebooks
* Config file links attached
Attached the links to config files for preprocessing, training and inference for the notebook, 2D model and 2.5D model.
* Check for failed build
Reformatting links on readme to check for failed build.
* Added changes based on review
Clarified changes added to the documentation based on input from @JohannaRahm and @jennyfolkesson.
Co-authored-by: Shalin Mehta <2934183+mattersoflight@users.noreply.github.com>
microDL is a deep learning pipeline for efficient 2D and 3D image translation. We commonly use it to virtually stain label-free images, i.e., to predict fluorescence-like images. Label-free imaging visualizes many structures simultaneously. Virtual staining enables identification of diverse structures without extensive human annotation - the annotations are provided by the molecular markers of the structure. This pipeline was originally developed for 3D virtual staining of tissue and cell structures from label-free images of their density and anisotropy: <https://doi.org/10.7554/eLife.55502>. We are currently extending it to enable generalizable virtual staining of nuclei and membrane in diverse imaging conditions and across multiple cell types. We provide a computationally and memory efficient variant of U-Net (2.5D U-Net) for 3D virtual staining.
You can train a microDL model using label-free images and the corresponding fluorescence channels you want to predict. Once the model is trained using the dataset provided, you can use the model to predict the same fluorescence channels or segmented masks in other datasets using the label-free images.
In the example below, phase images and corresponding nuclear and membrane stained images are used to train a 2.5D U-Net model.
The model can be used to predict the nuclear and membrane channels using label-free phase images.
microDL allows you to design, train and evaluate U-Net models. It supports 2D U-Nets for 2D image translation and 2.5D (3D encoder, 2D decoder) U-Nets for 3D image translation.
Our goal is to enable robust translation of images across diverse microscopy methods.
microDL consists of three modules that are accessible via CLI and customized via a configuration file in YAML format:
* [Preprocessing](micro_dl/preprocessing/readme.md): normalization and the other steps that prepare images and metadata for training
* [Training](micro_dl/train/readme.md): model creation, loss functions (w/wo masks), metrics, learning rates
* [Inference](micro_dl/inference/readme.md): on full images or on tiles that can be stitched to full images
Note: microDL also supports 3D U-Nets and image segmentation, but we don't use these features frequently and they are the least tested.
## Getting Started
### Introductory exercise from DL@MBL
If you are new to image translation or U-Nets, start with [slides](notebooks/dlmbl2022/20220828_DLMBL_ImageTranslation.pdf) from the didactic lecture from [deep learning @ marine biological laboratory](https://www.mbl.edu/education/advanced-research-training-courses/course-offerings/dlmbl-deep-learning-microscopy-image-analysis).
You can download test data and walk through the exercise by following [these instructions](notebooks/dlmbl2022/README.md).
### Using command line interface (CLI)
Refer to the [requirements](#requirements) section to set up the microDL environment.
Build a [Docker](#docker) image to set up your microDL environment if the dependencies are not compatible with the hardware environment at your computational facility.
Format your input data to match the microDL [data format](#data-format) requirements.
Once your data is formatted in a way that microDL understands, you can run preprocessing, training and inference with three command lines.
For config settings, see the module-specific readmes in [micro_dl/preprocessing](micro_dl/preprocessing/readme.md), [micro_dl/train](micro_dl/train/readme.md) and [micro_dl/inference](micro_dl/inference/readme.md).
### Docker

It is recommended that you run microDL inside a Docker container, especially if you're using shared resources like a GPU server. microDL comes with two Docker images, one for Python 3.6 with CUDA 9 support (which is most likely what you'll want), and one for Python 3.5 with CUDA 8.0 support. If you're working at the CZ Biohub, you should be in the Docker group on our GPU servers Fry/Fry2; if not, you can ask anyone in the data science team to add you. The Python 3.6 image is already built on Fry/Fry2, but you can also modify it and build your own Docker image/tag.
Then you can access your notebooks in your browser at:
```buildoutcfg
http://<your server name (e.g. fry)>:<whatever port you mapped to when starting up docker>
```
You will need to copy/paste the token generated in your Docker container.
### Data Format
Input data should be in the format of single page tiff files. If you use zarr files, you can convert them to single page tiff files using the [zarr to single page tiff conversion script](https://github.com/mehta-lab/microDL/blob/master/scripts/hcszarr2single_tif_mp.py).
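If you would rather write your own conversion, the sketch below shows the general idea. It assumes the zarr path points to a single array ordered (time, position, channel, z, y, x); the function name, axis ordering and output extension are illustrative assumptions, not part of microDL.

```python
# Illustrative sketch: write each 2D slice of a zarr array as a single-page tiff
# named per the microDL convention. Assumes a single array ordered
# (time, position, channel, z, y, x); adjust the indexing to your own layout.
import itertools
import os

import numpy as np
import tifffile
import zarr

def zarr_to_single_page_tiffs(zarr_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    arr = zarr.open(zarr_path, mode="r")  # fails if the path is a group, not an array
    n_t, n_p, n_c, n_z = arr.shape[:4]
    for t, p, c, z in itertools.product(range(n_t), range(n_p), range(n_c), range(n_z)):
        frame = np.asarray(arr[t, p, c, z])  # one 2D (y, x) slice
        fname = f"im_c{c:03d}_z{z:03d}_t{t:03d}_p{p:03d}.tif"
        tifffile.imwrite(os.path.join(out_dir, fname), frame)
```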
To train directly on datasets that have already been split into 2D frames, the dataset should have the following structure:
```buildoutcfg
dir_name
    |- im_c***_z***_t***_p***.png
    |- im_c***_z***_t***_p***.png
    |- ...
```
The image naming convention is as follows (the names in parentheses are the corresponding columns in frames_meta.csv):
* **c** = channel index (channel_idx)
* **z** = slice index in z stack (slice_idx)
* **t** = timepoint index (time_idx)
* **p** = position (field of view) index (pos_idx)
If your data is not in the zarr or tiff format supported by the preprocessing module, write a script that converts your data to image files that adhere to the naming convention above, then generate the metadata as sketched below. That will give you the frames_meta.csv file you need for data preprocessing.
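As a hedged illustration of that metadata step, the sketch below builds a frames_meta.csv from files that already follow the naming convention. The four index columns come from the convention above; any additional columns microDL expects (and the exact schema) are documented in the preprocessing readme, so treat this as a starting point rather than the canonical generator.

```python
# Sketch: collect the c/z/t/p indices encoded in the file names into a
# frames_meta.csv. Column names beyond the four indices are assumptions.
import os
import re

import pandas as pd

NAME_RE = re.compile(
    r"im_c(?P<channel_idx>\d+)_z(?P<slice_idx>\d+)_t(?P<time_idx>\d+)_p(?P<pos_idx>\d+)\."
)

def write_frames_meta(image_dir):
    rows = []
    for fname in sorted(os.listdir(image_dir)):
        match = NAME_RE.match(fname)
        if match is None:
            continue  # skip files that don't follow the naming convention
        row = {key: int(val) for key, val in match.groupdict().items()}
        row["file_name"] = fname
        rows.append(row)
    frames_meta = pd.DataFrame(rows)
    frames_meta.to_csv(os.path.join(image_dir, "frames_meta.csv"), index=False)
    return frames_meta
```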
Before preprocessing, make sure the z-stacked images are aligned to be centered at the focal plane at all positions. If the focal planes of image stacks acquired at different positions in a plate are at different z levels, align them using the [z alignment script](https://github.com/mehta-lab/microDL/blob/master/scripts/align_z_focus.py).
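The repository's align_z_focus.py script handles this alignment; the short sketch below only illustrates how a focal plane can be located, using a simple gradient-variance sharpness score on a (z, y, x) stack. The metric and function name are assumptions made for illustration, not microDL's implementation.

```python
# Sketch: score each z slice with a gradient-variance sharpness metric and
# return the index of the sharpest (most in-focus) slice.
import numpy as np

def sharpest_slice(stack):
    """stack: numpy array shaped (z, y, x)."""
    scores = []
    for z_slice in stack.astype(np.float64):
        gy, gx = np.gradient(z_slice)
        scores.append(np.var(gy) + np.var(gx))  # higher variance ~ sharper edges
    return int(np.argmax(scores))

# If different positions report different in-focus indices, shift each stack's
# slice indexing so every position is centered on its own focal plane.
```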
## Failure modes
Although deep learning pipelines solve complex computer vision problems with impressive accuracy, they can fail in ways that are not intuitive to human vision. We think that transparent discussion of the failure modes of deep learning pipelines is necessary for the field to continue advancing. These [slides](notebooks/dlmbl2022/20220830_DLMBL_FailureModes.pdf) from the DL@MBL 2022 course summarize the failure modes of virtual staining that we have identified and some ideas for improving its robustness. If microDL fails with your data, please start a discussion via issues on this repository.
## Requirements
The [requirements.txt](requirements.txt) file lists the dependencies we use for continuous integration, and [requirements_docker.txt](requirements_docker.txt) lists the dependencies used to build the Docker image.
This version (1.0.0) assumes single-page tiff data format and is built on tensorflow 1.13, keras 2.1.6. The next version will directly read zarr datasets and be re-written using pytorch.
From the citation file:

microDL: robust and efficient virtual staining of label-free microscopy data

message: >-
  Please use citation information in this file if you publish results using this pipeline. If you use the pipeline for virtual staining, please also cite our paper:
An example inference config for the 2.5D U-Net membrane model:

```yaml
# Configuration detailing the staining prediction on label-free images

# define the dataset on which you want to use the trained model for virtual staining/segmentation,
# the output formats required and the predicted image quality metrics

# point to the directory where your trained model is saved
model_dir: '/home/Translation_temp_2/'

# directory with images on which you want to perform the prediction, the inference dataset
image_dir: '/home/InferenceData/'

# preprocess_dir contains the preprocessing_info.json file, used to extract information about the normalization step
preprocess_dir: '/home/Processed_temp_1'

# define inference dataset channels
dataset:
  input_channels: [2]  # label-free channel used for prediction by the model
  target_channels: [0]  # target image channel (fluorescence image) to compare how well the prediction worked
  pos_ids: [0, 1, 3, 4, 6, 8, 10]  # may not affect the positions where inference is performed if a data split is defined
  slice_ids: [12, 13, 14]  # slices where inference is performed, same condition as above

# define the output image format
images:
  image_format: 'zyx'  # order of dimensions in the output predicted image
  image_ext: '.tif'  # output images are stored as single-page tiff files
  suffix: '25DUnet_membrane'  # saved output image name suffix
  name_format: sms  # 'sms' corresponds to image naming format 'img_channelname_t***_p***_z***_customfield', default naming convention is 'im_c***_z***_t***_p***'
  pred_chan_name: 'pred'  # suffix added to the saved output image name
  save_to_image_dir: False  # 'False' saves output in the model directory, 'True' in the input image directory
  save_folder_name: predictions  # name of the directory created inside the model dir to save output images
  data_split: test  # which image set in the train/val/test/all split is used for the prediction
  save_figs: True  # whether to save a figure panel comparing the predicted and target images

# metrics computed to assess prediction quality; values are printed on the output figure panel and saved as text files
metrics:
  metrics: [ssim, corr, r2, mae, mse]  # metrics for output image quality check: refer to the readme for details
  metrics_orientations: ['xy']  # 'xy', 'xz' or 'yz' slices, where xz and yz apply to 3D predictions
```
A similar inference config for the 2D U-Net nuclei model:

```yaml
# Configuration detailing the staining prediction on label-free images

# define the dataset on which you want to use the trained model for virtual staining/segmentation,
# the output formats required and the predicted image quality metrics

# point to the directory where your trained model is saved
model_dir: '/home/Translation_temp_2/'

# directory with images on which you want to perform the prediction, the inference dataset
image_dir: '/home/InferenceData/'

# preprocess_dir contains the preprocessing_info.json file, used to extract information about the normalization step
preprocess_dir: '/home/Processed_temp_1'

# define inference dataset channels
dataset:
  input_channels: [2]  # label-free channel used for prediction by the model
  target_channels: [1]  # target image channel (fluorescence image) to compare how well the prediction worked
  pos_ids: [0, 1, 3, 4, 6, 8, 10]  # may not affect the positions where inference is performed if a data split is defined
  slice_ids: [0]  # slices where inference is performed, same condition as above

# define the output image format
images:
  image_format: 'zyx'  # order of dimensions in the output predicted image
  image_ext: '.tif'  # output images are stored as single-page tiff files
  suffix: '2DUnet_nucl'  # saved output image name suffix
  name_format: sms  # 'sms' corresponds to image naming format 'img_channelname_t***_p***_z***_customfield', default naming convention is 'im_c***_z***_t***_p***'
  pred_chan_name: 'pred'  # suffix added to the saved output image name
  save_to_image_dir: False  # 'False' saves output in the model directory, 'True' in the input image directory
  save_folder_name: predictions  # name of the directory created inside the model dir to save output images
  data_split: val  # which image set in the train/val/test/all split is used for the prediction
  save_figs: True  # whether to save a figure panel comparing the predicted and target images

# metrics computed to assess prediction quality; values are printed on the output figure panel and saved as text files
metrics:
  metrics: [ssim, corr, r2, mae, mse]  # metrics for output image quality check: refer to the readme for details
  metrics_orientations: ['xy']  # 'xy', 'xz' or 'yz' slices, where xz and yz apply to 3D predictions
```
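Both inference configs above request the same metrics (ssim, corr, r2, mae, mse). microDL computes these internally during inference, but as a rough illustration of what each one measures, here is a sketch using scikit-image and numpy; the exact definitions and normalization microDL uses may differ.

```python
# Sketch of the listed image-quality metrics on a predicted slice vs. its
# fluorescence target. Illustrative only; microDL's internal definitions may differ.
import numpy as np
from skimage.metrics import structural_similarity

def quick_metrics(target, prediction):
    target = target.astype(np.float64)
    prediction = prediction.astype(np.float64)
    data_range = float(target.max() - target.min())
    ssim = structural_similarity(target, prediction, data_range=data_range)
    mae = float(np.mean(np.abs(target - prediction)))
    mse = float(np.mean((target - prediction) ** 2))
    corr = float(np.corrcoef(target.ravel(), prediction.ravel())[0, 1])  # Pearson correlation
    ss_res = float(np.sum((target - prediction) ** 2))
    ss_tot = float(np.sum((target - target.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination
    return {"ssim": ssim, "corr": corr, "r2": r2, "mae": mae, "mse": mse}
```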