# Further Assessing Current Endeavors in PLMs

## Installation

```bash
git clone git@github.com:keiserlab/face-plm.git
cd face-plm
conda create -n face_plm python=3.10 -y
conda activate face_plm
pip install -e .
pip install pytorch-lightning
```
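To confirm the install worked, a quick import check (the package import name `face_plm` matches the repo's config paths):

```python
# Sanity check that the editable install and pytorch-lightning are importable.
import face_plm
import pytorch_lightning as pl

print(pl.__version__)
```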
To generate embeddings and train models, you first need to sign in to Hugging Face and Weights & Biases (WandB):

```bash
huggingface-cli login
wandb login
```
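For non-interactive environments (e.g., batch jobs), both services can also be authenticated programmatically. A minimal sketch, assuming tokens are stored in the `HF_TOKEN` and `WANDB_API_KEY` environment variables (variable names are a convention here, not required by either library):

```python
import os

from huggingface_hub import login
import wandb

# Programmatic equivalents of the CLI logins above.
login(token=os.environ["HF_TOKEN"])          # Hugging Face access token
wandb.login(key=os.environ["WANDB_API_KEY"]) # WandB API key
```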
You also need to edit the WandB config file at `training_probe/config/wandb_config/base_wandb.yaml`, updating the `entity` and `project` fields so runs are logged to the desired WandB location:

```yaml
_target_: face_plm.probes.utils.WandbRunConfig
run_name: base_run+name
entity: temp_entity # CHANGEME
project: temp_project # CHANGEME
```
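The `_target_` key follows Hydra's instantiation convention, so this file resolves to a `face_plm.probes.utils.WandbRunConfig` object. A minimal sketch of how such a config is typically loaded (the exact usage inside the repo may differ, and the `WandbRunConfig` constructor is assumed to accept the YAML keys as arguments):

```python
from hydra.utils import instantiate
from omegaconf import OmegaConf

# Load the YAML and build the object named by _target_.
cfg = OmegaConf.load("training_probe/config/wandb_config/base_wandb.yaml")
run_config = instantiate(cfg)
print(run_config)
```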
## Generating Embeddings

First, set up the embedding environment:

```bash
bash scripts/setup_embed_env.sh
```
With ESM (requires ESMC/ESM3 access):

```bash
bash scripts/get_all_plm_embedding.sh
```
Without ESM:

```bash
bash scripts/get_all_plm_embedding_no_esm.sh
```
To generate layer-specific Ankh embeddings:

```bash
bash scripts/get_all_layer_ankh_embedding.sh
```
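The scripts above handle embedding generation end to end. For intuition, here is a minimal sketch of per-layer embedding extraction using the `transformers` API; the model id, single-sequence tokenization, and mean pooling are illustrative assumptions, not the repo's exact pipeline:

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

model_id = "ElnaggarLab/ankh-base"  # assumed Hugging Face id for Ankh base
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5EncoderModel.from_pretrained(model_id).eval()

seq = "MKTAYIAKQRQISFVKSHFSRQ"
# Ankh-style tokenization: one token per residue.
inputs = tokenizer([list(seq)], is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states holds the embedding-layer output plus one tensor per
# encoder layer, each of shape (batch, seq_len, hidden_dim).
per_layer = [h.mean(dim=1) for h in out.hidden_states]  # mean-pool over residues
print(len(per_layer), per_layer[0].shape)
```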
## Training Probes

To train a single probe, pass a config name to the training script:

```bash
bash scripts/train_single_model.sh CONFIG_NAME
```

Example config: `esmc_600m-agg_mlp`
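For intuition, here is what an "aggregated embedding → MLP" probe (as the `agg_mlp` config name suggests) might look like. This is a minimal sketch, not the repo's actual module; the architecture and the 1152-dimensional input (ESMC-600M's hidden size) are assumptions:

```python
import torch
import torch.nn as nn

class AggMLPProbe(nn.Module):
    """Small MLP trained on pooled (aggregated) PLM embeddings."""

    def __init__(self, embed_dim: int = 1152, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pooled_embeddings: torch.Tensor) -> torch.Tensor:
        # pooled_embeddings: (batch, embed_dim), e.g. mean-pooled hidden states
        return self.net(pooled_embeddings).squeeze(-1)

probe = AggMLPProbe()
x = torch.randn(4, 1152)  # fake pooled embeddings
print(probe(x).shape)     # torch.Size([4])
```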
To train a model with cross-validation:

```bash
bash scripts/train_cross_val_model.sh CONFIG_NAME
```

Example config: `esmc_600m-agg_mlp`
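A sketch of the cross-validation pattern, assuming a standard k-fold split over precomputed embeddings (the actual fold count and splitting logic are controlled by the repo's configs):

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1152))  # fake embeddings; dim assumed (ESMC-600M)
y = rng.normal(size=100)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    # A fresh probe would be trained on X[train_idx] and scored on X[val_idx].
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val")
```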
To train layer-specific cross-validated probes on Ankh embeddings:

```bash
bash scripts/train_cross_val_model_ankh_multilayer.sh CONFIG_NAME
```

Example config: `ankh_base_layer_specific_0-12`
## Fine-tuning

To fine-tune a PLM with the masked-language-modeling objective, pass a fine-tuning config, e.g.:

```bash
bash scripts/finetune_mlm.sh ankh_large_ft_ec27
bash scripts/finetune_mlm.sh ankh_base_ft_kcat
```
To train cross-validated models by directly fine-tuning the PLM on the downstream task:

```bash
bash scripts/train_cross_val_direct_finetune.sh CONFIG_NAME
```

Example config: `ankh_base_ft_kcat`
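Conceptually, "direct fine-tuning" differs from probing in that the PLM encoder is updated together with the task head rather than kept frozen. A minimal sketch with a stand-in encoder (in the repo, the encoder would be the PLM selected by the config; the mean pooling and regression head are assumptions):

```python
import torch
import torch.nn as nn

class DirectFinetuneModel(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.encoder = encoder                # trainable, not frozen
        self.head = nn.Linear(hidden_dim, 1)  # e.g., kcat regression

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)      # (batch, seq_len, hidden_dim)
        pooled = h.mean(dim=1)   # mean-pool over residues (assumed)
        return self.head(pooled).squeeze(-1)

# Toy stand-in encoder and data, just to make the sketch runnable.
hidden = 64
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True), num_layers=2
)
model = DirectFinetuneModel(encoder, hidden)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 50, hidden)  # fake token embeddings
y = torch.randn(8)              # fake regression targets
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```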
## Non-Torch Probing

To run the non-PyTorch probing pipeline, pass an output directory:

```bash
bash no_torch_probing.sh OUTPUT_DIR
```

Example output directory: `./probe_outputs/`
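As a toy stand-in for what a torch-free probe might do, here is a scikit-learn sketch: a ridge regression over precomputed embeddings, with cross-validated scores written to the output directory. The probe type, embedding dimension, and output filename are all assumptions:

```python
import json
import pathlib

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

out_dir = pathlib.Path("./probe_outputs/")
out_dir.mkdir(parents=True, exist_ok=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1152))  # fake embeddings; dim assumed
y = rng.normal(size=200)

# 5-fold cross-validated R^2 for a linear probe, no torch required.
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2")
(out_dir / "ridge_scores.json").write_text(json.dumps(scores.tolist()))
print(scores.mean())
```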