To install MIC:
git clone https://github.com/keiserlab/metric-ion-classification.git
cd metric-ion-classification
conda env create --name mic-env -f mic-env.yml
conda activate mic-env
pip install .
Installation may take some time, especially regarding conda dependency solving. We strongly recommend using an updated conda with conda-libmamba-solver, in which case installation should take a few minutes.
If you encounter issues intalling LUNA, see https://github.com/keiserlab/LUNA installation instructions.
The dependencies for MIC are listed in mic-env.yml, reproduced here:
- rdkit
- seaborn
- openbabel
- pip
- pymol-open-source
- xopen
- colorlog
- networkx
- scipy
- pytorch
- cpuonly
- numpy=1.21.6
- biopython==1.72
- mmh3==2.5.1
- pdbecif
- joblib
- tqdm
- scikit-learn
- simplejson
- MIC has been successfully installed and tested on Ubuntu 18.04, 20.04, and 22.04, Mac OS Monterey through Sonoma, and on a Windows machine running Ubuntu through WSL.
- MIC ChimeraX bundle requires Microsoft Visual C++ 14.0 or greater (see this link for instructions on how download the latest version)
To run:
mic_predict -o <outfile> -e <path/to/entry_file>[optional] </path/to/pdb_file>
Inputs
- pdb_file / pdb_dir : path to either 1) a single PDB file in the case of running one 1 file or 2) a directory containing multuple PDB files, used with --entries flag
Options:
- -e/--entries: Run individual examples from a single file or multiple PDB files. See below.
- -n/--num_processes: Number of concurrent processes to use for FP generation - default 8
- -o/--outfile: CSV created to store generated fingerprints or MIC predictions. If not provided, output is displayed in the terminal.
- -p/--preds_only: Only output predictions, not full probabilities from SVC.
- -ext/--extended_set: Generate predictions for the extended set of labels (K, Mn,Iod,Fe,Br)
- -het/--hetatm: Use MIC-HETATMC model
- -cpu/--cpu_only: Ignore available GPUs and generate all predictions on CPU.
- -co/--symexp_cutoff: Radius in angstroms around densities kept following symmetry expansion. If running a single PDB file, it is recommended to use 7 for X-ray crystallography or -1 for Cryo-EM structures. Only use other values if you are planning on training a new model - default 7
The following options are NOT recommended to change from their defaults, unless intending to train a new model. Do not change these if running predictions alone.
- -t/--fp_type: Type of fingerprint to generate and use, options are ['prune-eifp']
- -l/--length: length of the fingerprint vector to be generated - default 4096
- -sn/--shell_number: number of shells to use during fingerprint generation - default 18
- -sr/--shell_radius: shell radius to use during fingerprint generation - default 0.25
- -ub/--unblinded: flag to not zero the invariants of spherical densities
- -b/--bit: whether to generate count fingerprints or bit fingerprints - default is count
Entries are used to generate predictions for specific densities within a single PDB file or across multiple PDB files in one MIC run. To run with entries, the first argument to mic_predict should be a path to the directory containing all pdb files, while the entries argument should be a csv in the format 'pdb_id,chain,name,res_number,skip_symexpansion (must be on for Cryo-EM structures)':
4OKE,A,MG,202,
3A09,A,CA,601,
4L9P,B,ZN,601,
7aue,R,CA,501,X
7f53,R,CA,601,X
7f55,R,CA,601,X
7f54,R,CA,601,X
7f58,R,CA,602,X
7piu,R,CA,401,X
7piv,R,CA,401,X
To confirm correct installation:
pip install pytest
python -m pytest
The tests may take a 5-10 minutes to run.
The ChimeraX MIC bundle is available from ChimeraX toolshed. All code can be found in this repository under the 'chimerax' branch.
If you use this software, please cite our preprint.