Create a working directory for the data:
mkdir alphafold
cd alphafold
Download and unpack the tarball:
wget http://ekhidna2.biocenter.helsinki.fi/dali/AF-Digest.tar.gz
tar -zxvf AF-Digest.tar.gz
You should find populated subdirectories DAT/ and Digest/ under your current working directory.
The internal identifiers for AF-DB range from a000 to hzzz. If you import other structures to your local DaliLite database, choose identifiers that do not clash with those reserved for AF-DB.
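Before importing your own structures, it can help to test candidate identifiers against the reserved range. The following is a minimal sketch, not part of DaliLite: `is_afdb_reserved` is a hypothetical helper, and it assumes identifiers are four characters drawn from lowercase letters and digits, so that the range a000 to hzzz covers every identifier whose first character is a to h.

```shell
# Hypothetical helper (not part of DaliLite): succeeds if the identifier
# falls in the range a000..hzzz reserved for AF-DB.
# Assumption: identifiers use only lowercase letters and digits.
is_afdb_reserved() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-h][0-9a-z]{3}$'
}

# Report the status of a few candidate identifiers.
for id in e020 hzzz i000 1abc; do
    if is_afdb_reserved "$id"; then
        echo "$id: reserved for AF-DB"
    else
        echo "$id: free"
    fi
done
```

Identifiers reported as free do not clash with the AF-DB digest, but they may still clash with other structures already in your local database.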
Create a BLAST database for the hierarchical search:
makeblastdb -in Digest/AF.fasta -dbtype prot
Run a hierarchical search, for example against the human proteome:
~/DaliLite.v5/bin/dali.pl --hierarchical --oneway --BLAST_DB Digest/AF.fasta \
--pdbfile mystructure.pdb --db Digest/HUMAN.list --repset Digest/HUMAN_70.list \
--dat1 ./ --dat2 ./DAT/ --title "my search" --np 40
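To search a different proteome, the list files in the command above would change. The sketch below only prints the command so that it can be inspected before running. `dali_cmd` is a hypothetical wrapper, and it assumes that every proteome has `Digest/<SHORT>.list` and `Digest/<SHORT>_70.list` files named like the HUMAN ones; check your Digest/ directory to confirm.

```shell
# Hypothetical wrapper: print (do not run) a dali.pl command for one proteome.
# Assumption: list files are named Digest/<SHORT>.list and Digest/<SHORT>_70.list,
# following the HUMAN example.
DALI=${DALI:-$HOME/DaliLite.v5/bin/dali.pl}

dali_cmd() {
    short=$1        # proteome short name, e.g. HUMAN or MOUSE
    query=$2        # query structure file
    np=${3:-1}      # number of processes (value passed to --np)
    printf '%s\n' "$DALI --hierarchical --oneway --BLAST_DB Digest/AF.fasta --pdbfile $query --db Digest/$short.list --repset Digest/${short}_70.list --dat1 ./ --dat2 ./DAT/ --title \"$short search\" --np $np"
}

dali_cmd MOUSE mystructure.pdb 8
```

Printing rather than executing keeps the sketch safe to try; once the paths look right, the printed line can be pasted into the shell.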
The parameters --BLAST_DB, --db, --repset and --dat2 refer to the digest of the AlphaFold Database. The hierarchical search is rather slow.
The 70% identity subsets are considerably smaller than the full sets mainly in plants such as soybean and maize, and in Trypanosoma cruzi (Table 1).
DaliLite assigns each imported structure a four-character identifier. Chains shorter than 30 amino acids are excluded. DaliLite results list both the four-character identifier and the original file name, which is based on the UniProt accession number. Note that DaliLite detects structural similarities between compact, globular domains, so searches with non-compact, non-globular AlphaFold models yield no hits with significant structural similarity.
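Because the original file name is based on the accession number, the accession can be recovered from a hit with plain shell parameter expansion. This is a small sketch: `uniprot_accession` is a hypothetical helper, and it assumes file names of the form `AF-<accession>-F1-model_v1`, as in `AF-P02751-F1-model_v1`.

```shell
# Hypothetical helper: extract the UniProt accession from an AlphaFold
# model file name of the form AF-<accession>-F1-model_v1.
uniprot_accession() {
    name=${1##*/}                 # drop any leading directory
    name=${name#AF-}              # drop the "AF-" prefix
    printf '%s\n' "${name%%-*}"   # keep the text before the next hyphen
}

uniprot_accession AF-P02751-F1-model_v1   # prints P02751
```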
Table 1. Number of models in the full set and in the 70% identity subset.

Short | Scientific name | Common name | Full set | 70% subset |
---|---|---|---|---|
AF | AlphaFold Database | All models | 364717 | 241174 |
ARATH | Arabidopsis thaliana | Arabidopsis | 27400 | 22895 |
CAEEL | Caenorhabditis elegans | Nematode worm | 19645 | 18233 |
CANAL | Candida albicans | C. albicans | 5974 | 5829 |
DANRE | Danio rerio | Zebrafish | 24640 | 20023 |
DICDI | Dictyostelium discoideum | Dictyostelium | 12620 | 11484 |
DROME | Drosophila melanogaster | Fruit fly | 13432 | 13074 |
ECOLI | Escherichia coli | E. coli | 4301 | 4174 |
HUMAN | Homo sapiens | Human | 23332 | 18899 |
LEIIN | Leishmania infantum | L. infantum | 7924 | 7708 |
MAIZE | Zea mays | Maize | 39220 | 27990 |
METJA | Methanocaldococcus jannaschii | M. jannaschii | 1773 | 1740 |
MOUSE | Mus musculus | Mouse | 21558 | 18146 |
MYCTU | Mycobacterium tuberculosis | M. tuberculosis | 3979 | 3896 |
ORYSJ | Oryza sativa | Asian rice | 43581 | 38243 |
PLAF7 | Plasmodium falciparum | P. falciparum | 5186 | 5016 |
RAT | Rattus norvegicus | Rat | 21254 | 18017 |
SCHPO | Schizosaccharomyces pombe | Fission yeast | 5124 | 4961 |
SOYBN | Glycine max | Soybean | 55693 | 31054 |
STAA8 | Staphylococcus aureus | S. aureus | 2882 | 2812 |
TRYCC | Trypanosoma cruzi | T. cruzi | 19053 | 9255 |
YEAST | Saccharomyces cerevisiae | Budding yeast | 6019 | 5615 |
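The size reduction in each proteome can be computed directly from the rows of Table 1. The sketch below uses a hypothetical `retained_pct` filter that reads rows in the pipe-separated layout of the table and prints the percentage of models kept in the 70% identity subset; three rows are copied from the table as sample input.

```shell
# Hypothetical filter: for each pipe-separated table row, print the short
# name and the percentage of the full set retained in the 70% subset.
retained_pct() {
    awk -F'|' '{ gsub(/ /, "", $1); printf "%s %.1f\n", $1, 100 * $5 / $4 }'
}

# Sample rows copied from Table 1.
printf '%s\n' \
    'HUMAN | Homo sapiens | Human | 23332 | 18899 |' \
    'SOYBN | Glycine max | Soybean | 55693 | 31054 |' \
    'TRYCC | Trypanosoma cruzi | T. cruzi | 19053 | 9255 |' | retained_pct
```

A low percentage means that many models in that proteome are more than 70% identical to another model, so the representative subset is much smaller than the full set.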
id | short | original file | chain B start |
---|---|---|---|
cwy4 | DANRE | AF-A0A0R4II06-F1-model_v1 | 1303 |
c8mq | RAT | AF-F1M5Q4-F1-model_v1 | 742 |
e020 | HUMAN | AF-P02751-F1-model_v1 | 999 |
fcn8 | HUMAN | AF-O75369-F1-model_v1 | 1035 |
fh10 | TRYCC | AF-Q4CU46-F1-model_v1 | 1226 |
fiaz | TRYCC | AF-Q4CTN6-F1-model_v1 | 1195 |
finb | TRYCC | AF-Q4DVS3-F1-model_v1 | 1200 |
fjig | TRYCC | AF-Q4DFV2-F1-model_v1 | 1200 |
fjyd | TRYCC | AF-Q4CRW2-F1-model_v1 | 1218 |
flus | TRYCC | AF-Q4CTC1-F1-model_v1 | 1120 |
flw0 | TRYCC | AF-Q4DH14-F1-model_v1 | 1228 |
fned | TRYCC | AF-Q4CSQ4-F1-model_v1 | 1151 |
fnld | TRYCC | AF-Q4CY82-F1-model_v1 | 1243 |
fop7 | TRYCC | AF-Q4CST2-F1-model_v1 | 1208 |
fpak | TRYCC | AF-Q4D802-F1-model_v1 | 1240 |
fpjj | TRYCC | AF-Q4CZ74-F1-model_v1 | 1250 |
fpnd | TRYCC | AF-Q4CTR3-F1-model_v1 | 1209 |
fqdk | TRYCC | AF-Q4CX92-F1-model_v1 | 1248 |
frwd | TRYCC | AF-Q4CUD3-F1-model_v1 | 1186 |
frx2 | TRYCC | AF-Q4CX06-F1-model_v1 | 1250 |
fsm1 | TRYCC | AF-Q4CXH5-F1-model_v1 | 1250 |
ftc2 | TRYCC | AF-Q4CSF7-F1-model_v1 | 1132 |
fuhh | TRYCC | AF-Q4CSJ3-F1-model_v1 | 1231 |
fusl | TRYCC | AF-Q4CT27-F1-model_v1 | 1236 |
fuuc | TRYCC | AF-Q4CTS2-F1-model_v1 | 1230 |
fu0u | TRYCC | AF-Q4CVN2-F1-model_v1 | 1210 |
fvnn | TRYCC | AF-Q4D1P3-F1-model_v1 | 1252 |
f91q | MOUSE | AF-Q8BTM8-F1-model_v1 | 1064 |
ggud | MOUSE | AF-Q8R4Y4-F1-model_v1 | 1126 |