Frequently Asked Questions

What is DrugDiscovery@TACC?

DrugDiscovery@TACC was formerly a web resource for running molecular docking software, hosted by TACC and the University of Texas Medical Branch (UTMB). The service has been rewritten, updated with new features, and moved to TACC's UTRC Portal, where the application is listed as Autodock-Vina 1.2.3.

This new application provides controlled access to molecular docking software running on the Lonestar 6 supercomputer at TACC (Texas Advanced Computing Center). Approved users can log in to the UTRC Portal, upload a protein coordinate file (PDB or PDBQT format) along with an active-site specification, and dock that protein against libraries containing up to millions of small, drug-like, or natural product molecules. These libraries were extracted from the ZINC database and kindly provided by Dr. John Irwin. Libraries were filtered to include only commercially available compounds (i.e., compounds readily available for purchase from established chemical supply companies). The portal was originally developed by researchers at the University of Texas Medical Branch and then ported to TACC. It was rewritten in 2023 and moved to the UTRC Portal.
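As a rough illustration of what an active-site specification involves: AutoDock Vina itself describes the search region as a box, given by a center and edge lengths in Angstroms. The sketch below derives such a box from the coordinates of a few active-site atoms. The function name, the padding value, and the coordinates are assumptions made for illustration; this FAQ does not state how the portal's form expects the box to be entered.

```python
def bounding_box(coords, padding=5.0):
    """Return (center, size) of a box enclosing the given atoms.

    coords  -- iterable of (x, y, z) tuples in Angstroms
    padding -- extra margin added on every side (hypothetical default)
    """
    xs, ys, zs = zip(*coords)
    # Center is the midpoint of the extremes along each axis.
    center = tuple((max(a) + min(a)) / 2 for a in (xs, ys, zs))
    # Edge length is the extent along each axis plus padding on both sides.
    size = tuple((max(a) - min(a)) + 2 * padding for a in (xs, ys, zs))
    return center, size


# Made-up coordinates standing in for active-site atoms.
site_atoms = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0)]
center, size = bounding_box(site_atoms)
print("center:", center)
print("size:  ", size)
```

A larger padding gives the ligand more room to move but increases the search space, so the value is a judgment call for each system.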

Note: Computer docking and virtual screening are inexact, but potentially very valuable, tools. This site is intended to provide easy access for researchers who wish to perform small numbers of docking or virtual screening experiments but lack the necessary computer resources and/or computational biology background. This interface can handle most protein-ligand docking experiments, but there will always be systems where this simple interface will fail. For those cases, researchers will need to develop the necessary expertise or collaborations to perform the experiment.

How do I retrieve my data from the old DrugDiscovery website?

Submit a ticket here.

How can I get access to the UTRC Portal?

Request an account from TACC via the TACC User Portal at https://portal.tacc.utexas.edu/account-request.

What docking software is used?

AutoDock Vina [1], written by Drs. O. Trott and A. Olson at the Scripps Research Institute, is used to perform the actual docking. This open-source program has been made freely available by the authors.

AutoDockTools, written by Drs. G. Morris and A. Olson at the Scripps Research Institute, is used to convert proteins and ligands to the format required by AutoDock Vina. This open-source program has been made freely available by the authors.

[1] "AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization and Multithreading", (2010) O. Trott and A. J. Olson, J. Comp. Chem. 31, 455-461.

How good is the docking software?

Before AutoDock Vina was adopted for this project, it was tested in-house against the Directory of Useful Decoys (DUD) database [1], a database designed specifically for rigorously testing docking algorithms. The testing protocol was taken from an earlier paper [2], written at Wyeth Research, that tested six molecular docking programs against the DUD database.

For each protein listed in the DUD database [1], the enrichment plots shown below describe the performance of AutoDock Vina against that of other molecular docking software (e.g., GLIDE HTVS, PhDOCK, ICM, DOCK, Surflex, FlexX) as well as random and ideal enrichments. Also listed are the sizes of the DUD decoy library and the number of known binders used for virtual screening against each DUD protein. Data for the non-Vina molecular docking programs were kindly provided by Dr. Cross and were reported earlier by his group [2]. The results shown below indicate that AutoDock Vina was a strong competitor against the other programs, and at the top of the pack in many cases.

DUD Protein	# of decoys	# of binders
AngiotensinConvertingEnzyme	1727	49
AcetylcholineEsterase	3714	105
AdenosineDeaminase	821	23
AldoseReductase	918	26
AmpCbetaLactamase	732	21
AndrogenReceptor	2628	74
CyclinDependentKinase2	1779	50
CatecholOMethyltransferase	430	11
CycloOxygenase1	849	25
CycloOxygenase2	12464	348
DihydrofolateReductase	7145	201
EpidermalGrowthFactorReceptor	14894	444
EstrogenReceptor_agonist	2355	67
EstrogenReceptor_antagonist	1395	39
FibroblastGrowthFactorReceptor1	4205	118
CoagulationFactorXa	5095	142
GlycinamideRibonucleotideTransformylase	753	21
GlycogenPhosphorylase	1850	52
GlucocorticoidReceptor	2797	78
HIVprotease	1885	53
HIVReverseTranscriptase	1437	40
HydroxymethylglutarylCoAReductase	1241	35
HeatShockProtein90	860	24
EnoylACPReductase	3035	85
MineralocorticoidReceptor	535	15
Neuraminidase	1745	49
P38MAPKinase	8387	256
PolyADPRibosePolymerase	1176	33
Phosphodiesterase5A	1809	51
PlateletDerivedGrowthFactorReceptorB	5614	157
PurineNucleosidePhosphorylase	882	25
PPARgamma	2906	81
ProgesteroneReceptor	967	27
RXRalpha	708	20
SAdenosylHomocysteineHydrolase	1159	33
TyrosineKinaseSRC	5793	155
Thrombin	2292	65
ThymidineKinase	784	22
Trypsin	1544	44
VascularEndothelialGrowthFactorReceptor2	2641	74
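
To make the idea of enrichment concrete, here is a minimal sketch of an enrichment-factor calculation: the hit rate among the top-scoring fraction of a library divided by the hit rate in the whole library. It is not the protocol used for the in-house tests. It uses the AngiotensinConvertingEnzyme library sizes from the table, assumes the decoy count does not already include the binders, and the "made_up" ranking is invented purely for illustration.

```python
def enrichment_factor(ranked_is_binder, fraction):
    """Enrichment factor for the top `fraction` of a ranked library.

    ranked_is_binder -- list of booleans ordered best score first,
                        True where the compound is a known binder
    fraction         -- portion of the library inspected, e.g. 0.01 for 1%
    """
    total = len(ranked_is_binder)
    n_binders = sum(ranked_is_binder)
    n_top = max(1, round(total * fraction))
    hits_in_top = sum(ranked_is_binder[:n_top])
    # Hit rate in the top slice relative to the hit rate overall.
    return (hits_in_top / n_top) / (n_binders / total)


# Library sizes from the AngiotensinConvertingEnzyme row of the table.
N_DECOYS, N_BINDERS = 1727, 49

# Ideal ranking: every known binder scores better than every decoy.
ideal = [True] * N_BINDERS + [False] * N_DECOYS

# Invented ranking: 5 of the 49 binders land in the first 18 positions.
made_up = [True] * 5 + [False] * 13 + [True] * 44 + [False] * 1714

for name, ranking in (("ideal", ideal), ("made up", made_up)):
    print(f"{name}: EF at 1% = {enrichment_factor(ranking, 0.01):.1f}")
```

A ranking no better than random gives an enrichment factor near 1, so values well above 1 indicate that known binders are concentrated at the top of the list.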

[1] "Benchmarking Sets for Molecular Docking", (2006) N. Huang, B. K. Shoichet, and J. J. Irwin, J. Med. Chem. 49, 6789-6801.

[2] "Comparison of Several Molecular Docking Programs: Pose Prediction and Virtual Screening Accuracy", (2009) J. B. Cross, D. C. Thompson, B. K. Rai, J. C. Baber, K. Y. Fan, Y. Hu, and C. Humblet, J. Chem. Inf. Model. 49, 1455-1474.

What will the UTRC Portal return to me?

error.txt	An error file that will only be generated if the job is submitted with invalid input parameters. Please check this first if your job fails.
results.tar.gz	A compressed archive containing the top N (user-specified; up to 1000) binders, each with its top pose in PDBQT format. It also contains a log file listing all results in score order, with the best (most negative) score first. The ligands are referenced by ZINC codes, which can be pasted into the ZINC web site to retrieve chemical structures, catalog numbers, etc. Finally, it includes a single file containing the top N binders concatenated together, for ease of viewing in molecular visualization software.
Protein	A file containing the target protein in PDBQT format. The PDB file you uploaded is also returned, if applicable.
vina_1.2.3.0.sif	The container image used by our supercomputer to build the necessary runtime environment.
.agave.log	An internal log file.
job.err	A log file showing stderr.
job.out	A log file showing stdout.
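
If you want to re-rank returned poses yourself, the sketch below shows one way to read scores from PDBQT text. It relies on the "REMARK VINA RESULT:" line that AutoDock Vina writes at the start of each pose in its output files. The ligand names and pose contents are placeholders; the actual file names inside results.tar.gz are not described here, so reading them from disk is left out.

```python
def best_vina_score(pdbqt_text):
    """Return the score (kcal/mol) of the first pose in Vina output, or None."""
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields after the tag: affinity, RMSD lower bound, RMSD upper bound.
            return float(line.split()[3])
    return None


def rank_ligands(poses):
    """Sort {name: pdbqt_text} by score, best (most negative) first."""
    scored = [(best_vina_score(text), name) for name, text in poses.items()]
    scored = [(score, name) for score, name in scored if score is not None]
    return [(name, score) for score, name in sorted(scored)]


# Placeholder poses; real files also contain the atom records.
sample_poses = {
    "ligand_A": "MODEL 1\nREMARK VINA RESULT:      -7.4      0.000      0.000\nENDMDL\n",
    "ligand_B": "MODEL 1\nREMARK VINA RESULT:      -9.2      0.000      0.000\nENDMDL\n",
    "ligand_C": "MODEL 1\nREMARK VINA RESULT:      -8.1      0.000      0.000\nENDMDL\n",
}

for name, score in rank_ligands(sample_poses):
    print(f"{name}\t{score}")
```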

What small-molecule libraries are available?

Enamine-PC: The Enamine Premium Collection contains 43,339 compounds with the most favorable physicochemical properties (high Fsp3, low LogP and MW). They have been synthesized in Enamine's lead-oriented synthesis program. Including enantiomers, this library contains 84,359 molecules.
Enamine-AC: The Enamine Advanced Collection contains 445,239 compounds that have lead-like properties with MW ≤ 350, cLogP ≤ 3, and rotB ≤ 7 and/or valuable pharmacophores such as carboxylic, primary amino, and amide groups. All compounds are checked with Enamine's in-house medchem filters. Including enantiomers, this library contains 876,985 molecules.
Enamine-HTSC: The Enamine HTS Collection contains 2,141,514 diverse screening compounds. The collection encompasses versatile chemotypes developed over a couple of decades of chemical research at Enamine and its partner academic organizations. These compounds frequently have unusual structures and unique properties. Enamine particularly recommends the collection for researchers looking for the most diverse screening set. Including enantiomers, this library contains 3,467,770 molecules.
ZINC-fragments: This is a ZINC "fragment" subset which was filtered by the following criteria: MW ≤ 250 Da, LogP 1-4, reactivity = Anodyne. The set contains 546,003 fragments.
ZINC-in-trials: This is the ZINC "in-trials" subset which includes "Compounds that are in clinical trials, including ones that are already drugs." The set contains 9,270 molecules.

Are any changes made to my protein before it is docked?

Before being used by AutoDock Vina, any uploaded PDB protein is, by necessity, converted to the PDBQT format. This involves stripping all heteroatoms (including waters), re-protonating all atoms, and reassigning Gasteiger charges to all atoms. This conversion is done using tools (prepare_receptor4.py) provided in AutoDockTools. The PDBQT-formatted protein is eventually returned as part of the results for your inspection.

If you wish to avoid these automated conversion steps, provide your protein file already in PDBQT format. None of the above processing steps are performed on protein files already in the PDBQT format. Information on how to generate PDBQT files for proteins (including assigning hydrogens to His residues) can be found in the AutoDockTools manual and tutorial.
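
Because the automated conversion strips all heteroatoms, it can help to check which ones your PDB file contains before uploading, for example a metal ion or cofactor you would rather keep. The sketch below lists the residue names found on HETATM records, using the fixed column positions of the PDB format. It is only a pre-upload check written for illustration, not what prepare_receptor4.py does internally, and the sample records are made up.

```python
def list_heteroatoms(pdb_text):
    """Return the sorted residue names that appear on HETATM records."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            # PDB columns 18-20 hold the residue name (0-based slice 17:20).
            names.add(line[17:20].strip())
    return sorted(names)


def strip_heteroatoms(pdb_text):
    """Return the PDB text with every HETATM record removed."""
    kept = [ln for ln in pdb_text.splitlines() if not ln.startswith("HETATM")]
    return "\n".join(kept) + "\n"


# Made-up records: one protein atom, one water, one zinc ion.
SAMPLE_PDB = (
    "ATOM      1  N   MET A   1      11.104  13.207   9.300  1.00 20.00           N\n"
    "HETATM 1001  O   HOH A 301      12.345  23.456  34.567  1.00 20.00           O\n"
    "HETATM 1002 ZN    ZN A 302      15.000  16.000  17.000  1.00 20.00          ZN\n"
)

print("Heteroatom residues that would be lost:", list_heteroatoms(SAMPLE_PDB))
print(strip_heteroatoms(SAMPLE_PDB), end="")
```

If the list contains something essential to binding, preparing the PDBQT file yourself, as described above, avoids the automated stripping.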

How long will the docking take?

Screening times vary according to the library you select. The ~46,000 compound library takes around 2 hours, and the ~650,000 compound library can take up to 24 hours to complete. These times do not include the wait time in the Lonestar queue, which depends on how busy Lonestar is when you submit your job. You can check how busy Lonestar is, and whether it is in maintenance, on the TACC User Portal's systems monitor page.
