HPL-Modified

Modifications to the Histomorphological Phenotype Learning pipeline. This pipeline generates histomorphological phenotype clusters (HPCs) from tiled H&E images via unsupervised learning.

Original HPL paper by Quiros et al. is here: https://www.nature.com/articles/s41467-024-48666-7.

Authors

Yumi Briones - yb2612@nyu.edu, Yumi.Briones@nyulangone.org
Jennifer Motter - mottej02@nyu.edu, Jennifer.Motter@nyulangone.org
Alyssa Pradhan - amp10295@nyu.edu, Alyssa.Pradhan@nyulangone.org

Repo structure

  • VICReg - README documentation and files to perform HPL-VICReg and HPL-BarlowTwins
  • ViT - README documentation and files to perform HPL-ViT
  • CLIP - README documentation and files to perform HPL-CLIP and HPL-CONCH

Data

All data are from https://github.com/AdalbertoCq/Histomorphological-Phenotype-Learning.

  1. For initial training, we used a 250k subsample of LUAD and LUSC samples: LUAD & LUSC 250K subsample
  2. For complete train, validation, and test sets, we used: LUAD & LUSC datasets
  3. To get original HPL tile embeddings, we used: LUAD & LUSC tile vector representations
  4. To get the original HPL-HPC assignments, we used: LUAD vs LUSC type classification and HPC assignments

Modifications

HPL-VICReg

Point person: Jennifer Motter

Original VICReg paper: https://arxiv.org/pdf/2105.04906

Details: https://github.com/yumibriones/HPL-Modified/blob/main/VICReg/README.md

We changed the self-supervised learning (SSL) method of HPL from Barlow Twins to Variance-Invariance-Covariance Regularization (VICReg).
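For reference, the VICReg objective combines an invariance term (MSE between two views), a variance term (a hinge on each embedding dimension's standard deviation), and a covariance term (decorrelation of dimensions). Below is a minimal PyTorch sketch of that loss following the paper's formulation; the coefficients and epsilon are the paper's defaults and are not necessarily the values used in this repo's training code.

```python
# Minimal sketch of the VICReg loss (Bardes et al., 2021); not the exact code
# used in this repo. z_a and z_b are projector outputs for two augmented views
# of the same tile batch, shape (batch_size, dim).
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_coef=25.0, std_coef=25.0, cov_coef=1.0):
    n, d = z_a.shape

    # Invariance: pull the two views of each tile together.
    sim_loss = F.mse_loss(z_a, z_b)

    # Variance: keep each embedding dimension's std above 1 (hinge).
    std_a = torch.sqrt(z_a.var(dim=0) + 1e-4)
    std_b = torch.sqrt(z_b.var(dim=0) + 1e-4)
    std_loss = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))

    # Covariance: decorrelate embedding dimensions (off-diagonal terms only).
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cov_a = (z_a.T @ z_a) / (n - 1)
    cov_b = (z_b.T @ z_b) / (n - 1)
    off_diag = lambda m: m - torch.diag(torch.diag(m))
    cov_loss = off_diag(cov_a).pow(2).sum() / d + off_diag(cov_b).pow(2).sum() / d

    return sim_coef * sim_loss + std_coef * std_loss + cov_coef * cov_loss
```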

HPL-ViT

Point person: Alyssa Pradhan

Original ViT paper: https://arxiv.org/pdf/2010.11929

Details: https://github.com/yumibriones/HPL-Modified/blob/main/ViT/README.md

We replaced the convolutional neural network (CNN) backbone of HPL with a vision transformer (ViT).
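As an illustration of the backbone swap, the sketch below builds a ViT tile encoder with timm; the specific ViT variant, input size, and use of timm are assumptions for this example and may differ from the backbone actually used here.

```python
# Illustrative ViT backbone as a drop-in tile encoder; model choice is a
# placeholder, not necessarily the configuration used in this repo.
import timm
import torch

# num_classes=0 makes timm return pooled features instead of class logits.
vit_backbone = timm.create_model("vit_small_patch16_224", pretrained=False, num_classes=0)

tiles = torch.randn(8, 3, 224, 224)   # a batch of 224x224 H&E tiles
features = vit_backbone(tiles)        # (8, 384) tile embeddings
print(features.shape)
```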

HPL-CLIP

Point person: Yumi Briones

Original CLIP paper: https://arxiv.org/pdf/2103.00020

Details: https://github.com/yumibriones/HPL-Modified/blob/main/CLIP/README.md

To enable multimodal learning, we integrated Contrastive Language-Image Pre-Training (CLIP) by OpenAI (open_clip implementation) into the HPL pipeline.
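For context, extracting tile embeddings with open_clip looks roughly like the sketch below; the model name, pretrained tag, and tile filename are placeholders rather than the settings used in this repo.

```python
# Minimal sketch of tile embedding extraction with open_clip; model/pretrained
# tag and the tile path are illustrative placeholders.
import open_clip
import torch
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()

tile = preprocess(Image.open("example_tile.png").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    embedding = model.encode_image(tile)                      # (1, 512)
embedding = embedding / embedding.norm(dim=-1, keepdim=True)  # unit-normalize
```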

HPL-CONCH

As a bonus, we generated and clustered image embeddings from CONtrastive learning from Captions for Histopathology (CONCH) by the Mahmood Lab (https://github.com/mahmoodlab/CONCH). This is a CLIP-style model that has been trained on over a million histopathology image-caption pairs. A caveat is that pathological information is included in the captions, so clusters generated from this method will not be completely unsupervised.

Results

We re-ran UMAP and Leiden clustering on the original HPL embeddings, then repeated the analysis for each modification of HPL (i.e., HPL-CLIP, HPL-CONCH, HPL-VICReg, HPL-ViT). Results can be found here: HPL-Modified Results.

Briefly, this is how results were generated:

  1. Extract embeddings from the original HPL pipeline using extract_embeddings_hpl.ipynb.
  2. Extract embeddings from VICReg, ViT, CLIP with the scripts in each folder.
  3. Run UMAP/Leiden clustering on embeddings using run_umap_leiden.py (see the sketch after this list).*
  4. Plot UMAP with clustering results/clinical features overlaid on top using plot_umap.py.*

*If submitting as a batch job on a high-performance computing cluster, use the corresponding scripts in each folder: VICReg, ViT, CLIP. Make sure to adjust file paths accordingly.
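For step 3, a minimal scanpy-based sketch of the UMAP/Leiden step is shown below; the file names, n_neighbors, and resolution are illustrative and may differ from run_umap_leiden.py.

```python
# Minimal sketch of UMAP + Leiden on tile embeddings, assuming they are stored
# as an (n_tiles, n_features) NumPy array; file names and parameters are
# placeholders, not the exact settings of run_umap_leiden.py.
import numpy as np
import scanpy as sc
import anndata as ad

embeddings = np.load("tile_embeddings.npy")            # hypothetical embedding file
adata = ad.AnnData(embeddings)

sc.pp.neighbors(adata, n_neighbors=15, use_rep="X")    # kNN graph on raw embeddings
sc.tl.leiden(adata, resolution=1.0, key_added="hpc")   # Leiden clusters ~ HPCs
sc.tl.umap(adata)                                      # 2D UMAP for visualization

adata.write("tile_embeddings_clustered.h5ad")
```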

Evaluation

We evaluated our models in terms of (1) how similar their clusters are to those from the original HPL pipeline, and (2) how well the clusters separate LUAD from LUSC. Evaluation is performed in evaluation.ipynb.
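The sketch below illustrates these two evaluation angles with scikit-learn: agreement with the original HPL clusters (ARI/NMI) and LUAD vs LUSC separation from cluster membership alone. The metric choices and the one-hot + logistic-regression setup are assumptions for this example and may not match evaluation.ipynb.

```python
# Illustrative evaluation of modified-HPL clusters; metrics and setup are
# assumptions for this sketch, not necessarily what evaluation.ipynb uses.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import OneHotEncoder

def evaluate_hpcs(original_hpc, modified_hpc, tumor_type):
    """original_hpc/modified_hpc: per-tile cluster IDs; tumor_type: 0 = LUAD, 1 = LUSC."""
    # (1) Agreement with the original HPL cluster assignments.
    ari = adjusted_rand_score(original_hpc, modified_hpc)
    nmi = normalized_mutual_info_score(original_hpc, modified_hpc)

    # (2) LUAD vs LUSC separation using cluster membership alone:
    # one-hot encode cluster IDs and score a simple classifier on them.
    X = OneHotEncoder(sparse_output=False).fit_transform(
        np.asarray(modified_hpc).reshape(-1, 1)
    )
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, tumor_type,
                          cv=5, scoring="roc_auc").mean()
    return ari, nmi, auc
```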
