Drug Discovery @ TEXAS ADVANCED COMPUTING CENTER
Powering Discoveries That Change The World
Frequently Asked Questions
What is DrugDiscovery@TACC?
DrugDiscovery@TACC was formerly a web resource, hosted by TACC and the University of Texas Medical Branch (UTMB), for running molecular docking software. The service has been rewritten, updated with new features, and moved to TACC's UTRC Portal, where the application is listed as AutoDock Vina v1.2.3.
This new application provides controlled access to molecular docking software running on the Lonestar 6 supercomputer at TACC (Texas Advanced Computing Center). Approved users can log in to the UTRC Portal, upload a protein coordinate file (PDB or PDBQT format) along with an active-site specification, and dock that protein against libraries containing up to millions of small, drug-like, or natural product molecules. These libraries were extracted from the ZINC database and kindly provided by Dr. John Irwin. Libraries were filtered to include only commercially available compounds (i.e., those readily available for purchase from established chemical supply companies). This portal was originally developed by researchers at the University of Texas Medical Branch and then ported to TACC. It was rewritten in 2023 and moved to the UTRC Portal.
Note: Computer docking and virtual screening are inexact, but potentially very valuable, tools. This site is intended to provide easy access for researchers who wish to perform small numbers of docking or virtual screening experiments but who do not have the necessary computer resources and/or computational biology background. This interface can handle most protein-ligand docking experiments, but there will always be systems for which this simple interface fails. In those cases, researchers will need to develop the necessary expertise or collaborations to perform the experiment.
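As a rough illustration of what an active-site specification amounts to, the sketch below writes the search box as AutoDock Vina configuration text. The keys (receptor, center_x/y/z, size_x/y/z) are standard AutoDock Vina options, but the function name, the coordinates, and the box size are made up for the example, and the portal's own input form may collect these values differently.

```python
def vina_box_config(receptor, center, size):
    """Return AutoDock Vina config text for a receptor and a search box.

    center and size are (x, y, z) tuples in Angstroms. The function name
    and the example values below are hypothetical, for illustration only.
    """
    if len(center) != 3 or len(size) != 3:
        raise ValueError("center and size must each have three values (x, y, z)")
    if any(s <= 0 for s in size):
        raise ValueError("box dimensions must be positive")
    lines = [f"receptor = {receptor}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    return "\n".join(lines) + "\n"


# Made-up box: centered on a hypothetical active site, 20 Angstroms per side.
config_text = vina_box_config("1ABC_receptor.pdbqt", (12.5, -3.0, 40.25), (20, 20, 20))
print(config_text)
```

Running a quick check like this against the small Test-set library is a cheap way to confirm that the box actually covers the intended site before launching a large screen.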
How do I retrieve my data from the old DrugDiscovery website?
Submit a ticket here.
How can I get access to the UTRC Portal?
Request an account from TACC via the TACC User Portal at https://portal.tacc.utexas.edu/account-request.
What docking software is used?
AutoDock Vina [1], written by Drs. O. Trott and A. Olson at The Scripps Research Institute, is used to perform the actual docking. This open-source program has been made freely available by the authors.
AutoDockTools, written by Drs. G. Morris and A. Olson at The Scripps Research Institute, is used to convert proteins and ligands to the format required by AutoDock Vina. This open-source program has been made freely available by the authors.
[1] "AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization and Multithreading", (2010) O. Trott and A. J. Olson, J. Comp. Chem. 31, 455-461.
How good is the docking software?
Before AutoDock Vina was adopted for this project, it was tested in-house against the Directory of Useful Decoys (DUD) database [1], a database designed specifically for rigorously testing docking algorithms. The testing protocol was taken from an earlier paper [2], written at Wyeth Research, that tested six molecular docking programs against the DUD database.
For each protein listed in the DUD database [1], the enrichment plots shown below compare the performance of AutoDock Vina with that of other molecular docking software (e.g., GLIDE HTVS, PhDOCK, ICM, DOCK, Surflex, FlexX), as well as with random and ideal enrichments. Also listed are the sizes of the DUD decoy library and the number of known binders used for virtual screening against each DUD protein. Data for the non-Vina molecular docking programs were kindly provided by Dr. Cross and were reported earlier by his group [2]. The results indicate that AutoDock Vina is a strong competitor to the other programs, and at the top of the pack in many cases.
[1] "Benchmarking Sets for Molecular Docking", (2006) N. Huang, B. K. Shoichet, and J. J. Irwin, J. Med. Chem. 49, 6789-6801.
[2] "Comparison of Several Molecular Docking Programs: Pose Prediction and Virtual Screening Accuracy", (2009) J. B. Cross, D. C. Thompson, B. K. Rai, J. C. Baber, K. Y. Fan, Y. Hu, and C. Humblet, J. Chem. Inf. Model. 49, 1455-1474.
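For readers unfamiliar with enrichment plots, the following minimal sketch shows how such a curve can be computed from a ranked screen. It is not the protocol used in the tests above: the function names and the ten-compound toy data set are invented for illustration, and it assumes only that lower (more negative) docking scores rank first.

```python
def enrichment_curve(scored):
    """Compute an enrichment curve from docking results.

    scored: list of (score, is_known_binder) pairs; lower score = better.
    Returns a list of (fraction_screened, fraction_of_binders_found) points.
    """
    ranked = sorted(scored, key=lambda pair: pair[0])
    total = len(ranked)
    n_binders = sum(1 for _, is_binder in ranked if is_binder)
    if total == 0 or n_binders == 0:
        raise ValueError("need at least one compound and one known binder")
    curve = []
    found = 0
    for rank, (_, is_binder) in enumerate(ranked, start=1):
        if is_binder:
            found += 1
        curve.append((rank / total, found / n_binders))
    return curve


def ideal_fraction(fraction_screened, total, n_binders):
    """Ideal enrichment: every known binder is ranked ahead of every decoy."""
    return min(1.0, fraction_screened * total / n_binders)


# Toy data (made up): 2 known binders among 10 compounds.
toy = [(-9.1, True), (-8.0, False), (-7.0, True), (-6.5, False), (-6.0, False),
       (-5.5, False), (-5.0, False), (-4.8, False), (-4.2, False), (-3.9, False)]
curve = enrichment_curve(toy)
for screened, found in curve[:3]:
    # Random enrichment is the diagonal: fraction found == fraction screened.
    print(f"screened {screened:.0%}: found {found:.0%}, random {screened:.0%}, "
          f"ideal {ideal_fraction(screened, 10, 2):.0%}")
```

A curve that rises well above the diagonal early on means known binders are concentrated near the top of the ranked list, which is what a virtual screen needs.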
What will the UTRC Portal return to me?
1ABC_receptor.pdb/pdbqt | The original protein receptor file used for the job is copied to this directory under its original name (e.g., "1ABC_receptor.pdb"). If you uploaded a PDB file, the charged PDBQT file generated in this workflow is also returned. |
results.tar.gz | A gzip-compressed tar archive containing the top N (user-specified; up to 1000) binders, each with its top pose in PDBQT format. It also contains a log file listing all results in score order, with the best (most negative) score first. Ligands are referenced by ZINC codes, which can be extracted and pasted into the ZINC web site to return chemical structures, catalog numbers, etc. Finally, it includes a single file containing the top N binders concatenated together, for ease of comparison in molecular visualization software. |
tapisjob.out | A log file containing details from the job run, including any errors that occurred. |
tapisjob.env/tapisjob.sh | The run scripts used on the cluster to run the job; they can be used again to replicate the job if desired. |
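Once results.tar.gz has been unpacked, the returned poses can be re-ranked locally. The sketch below assumes only that each pose file is standard AutoDock Vina PDBQT output, in which the score appears on a "REMARK VINA RESULT:" line. The file names and scores are placeholders, and the layout of the archive and the format of the portal's own log file are not assumed here.

```python
def top_pose_score(pdbqt_text):
    """Return the score (kcal/mol) from the first 'REMARK VINA RESULT:' line.

    Returns None if the text contains no such line.
    """
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # The first number after the colon is the predicted binding score.
            return float(line.split(":", 1)[1].split()[0])
    return None


def rank_by_score(poses):
    """poses: dict mapping a name to PDBQT text. Best (most negative) first."""
    scored = []
    for name, text in poses.items():
        score = top_pose_score(text)
        if score is not None:  # skip files without a Vina result line
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1])


# Placeholder pose files (hypothetical names, made-up scores).
example_poses = {
    "ligand_A.pdbqt": "MODEL 1\nREMARK VINA RESULT:      -7.5      0.000      0.000\nENDMDL\n",
    "ligand_B.pdbqt": "MODEL 1\nREMARK VINA RESULT:      -9.2      0.000      0.000\nENDMDL\n",
    "ligand_C.pdbqt": "MODEL 1\nENDMDL\n",
}
ranked = rank_by_score(example_poses)
for name, score in ranked:
    print(f"{name}\t{score:.1f}")
```

In practice the dictionary would be filled by reading the extracted files from disk rather than from inline strings.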
What small-molecule libraries are available?
- Test-set: A small library of ~30 molecules. Useful to double check that your input receptor file, box center coordinates, and box size values are working correctly.
- Enamine-PC: The Enamine Premium Collection contains 43,339 compounds with the most favorable physicochemical properties (high Fsp3, low LogP and MW). They have been synthesized in Enamine's lead-oriented synthesis program. Including enantiomers, this library contains 84,359 molecules.
- Enamine-AC: The Enamine Advanced Collection contains 445,239 compounds that have lead-like properties (MW ≤ 350, cLogP ≤ 3, and rotB ≤ 7) and/or valuable pharmacophores such as carboxylic, primary amino, and amide groups. All compounds are checked with Enamine's in-house medchem filters. Including enantiomers, this library contains 876,985 molecules.
- Enamine-HTSC: The Enamine HTS Collection contains 2,141,514 diverse screening compounds. The collection encompasses versatile chemotypes developed over a couple of decades of chemical research at Enamine and its partner academic organizations. These compounds frequently have unusual structures and unique properties. Enamine particularly recommends this collection for researchers looking for the most diverse screening set. Including enantiomers, this library contains 3,467,770 molecules.
- ZINC-fragments: This is a ZINC "fragment" subset which was filtered by the following criteria: MW ≤ 250 Da, LogP 1-4, reactivity = Anodyne. The set contains 546,003 fragments.
- ZINC-in-trials: This is the ZINC "in-trials" subset which includes "Compounds that are in clinical trials, including ones that are already drugs." The set contains 9,270 molecules.
Are any changes made to my protein before it is docked?
Before AutoDock Vina can use it, any uploaded PDB protein must be converted to the PDBQT format. This involves stripping all heteroatoms (including waters), re-protonating the structure, and reassigning Gasteiger charges to all atoms. The conversion is done using a tool (prepare_receptor4.py) provided in AutoDockTools. The PDBQT-formatted protein is returned as part of the results for your inspection.
If you wish to avoid these automated conversion steps, provide your protein file already in PDBQT format. None of the above processing steps are performed on protein files already in the PDBQT format. Information on how to generate PDBQT files for proteins (including assigning hydrogens to His residues) can be found in the AutoDockTools manual and tutorial.
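To see in advance what the heteroatom-stripping step will remove from your structure, a check like the following sketch may help. It is not the prepare_receptor4.py implementation: it only drops HETATM records (which is where waters, ligands, and ions are stored in a PDB file) and does not add hydrogens or assign charges. The function name and the three-line example structure are made up.

```python
def strip_heteroatoms(pdb_text):
    """Return PDB text with all HETATM records (waters, ligands, ions) removed.

    Illustration only: protonation and Gasteiger charge assignment,
    which the portal's conversion also performs, are not done here.
    """
    kept = []
    for line in pdb_text.splitlines():
        # The record type occupies the first six columns of a PDB line.
        if line[:6].strip() == "HETATM":
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Made-up fragment: two protein atoms and one water oxygen.
example_pdb = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N\n"
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C\n"
    "HETATM  500  O   HOH A 201      10.000  10.000  10.000  1.00 20.00           O\n"
    "END\n"
)
cleaned = strip_heteroatoms(example_pdb)
print(cleaned)
```

If a cofactor or metal ion that matters for binding shows up among the removed records, preparing the PDBQT file yourself, as described above, avoids losing it.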
How long will the docking take?
Screening times vary according to the library you select. The ~46,000 compound library takes around 2 hours, and the ~650,000 compound library can take up to 24 hours to complete. These times do not include the wait time in the Lonestar 6 queue, which depends on how busy Lonestar 6 is when you submit your job. You can check how busy Lonestar 6 is, and whether it is under maintenance, on the TACC User Portal's systems monitor page.