Database creation
1) Download the plasmid sequences available in NCBI RefSeq.
2) Extract the fasta files from the tar.gz archive.
3) Download and extract NCBI taxonomy, which will be fed to pATLAS.
4) Clone this repository:
5) Install its dependencies
6) Configure the database:
7) Run MASHix.py. The output will include a filtered fasta file (master_fasta_*.fas).
8) Run ABRicate with the CARD, ResFinder, PlasmidFinder and VFDB databases.
9) Download the CARD index required by the abricate2db.py script (aro_index.csv).
10) Update the git submodules (git submodule update --init --recursive) and run abricate2db.py, using all the tsv files from the previous steps as input.
11) Dump the database to a SQL file.
These steps are fully automated in the Nextflow pipeline pATLAS-db-creation.
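For anyone running the steps by hand, the ABRicate step (step 8) can be sketched as below. This is a minimal illustration, not the pipeline's actual code. It assumes that abricate is on the PATH, that the four databases are installed under the names card, resfinder, plasmidfinder and vfdb (check with abricate --list), and that one tsv file per database is wanted. The output file names (abr_<db>.tsv) and the input file name master_fasta_example.fas are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed database names; confirm them locally with `abricate --list`.
ABRICATE_DBS = ("card", "resfinder", "plasmidfinder", "vfdb")


def abricate_jobs(fasta, out_dir="."):
    """Return (command, output_path) pairs, one per ABRicate database."""
    fasta = Path(fasta)
    out_dir = Path(out_dir)
    jobs = []
    for db in ABRICATE_DBS:
        # Hypothetical naming scheme for the per-database tsv output.
        out_path = out_dir / f"abr_{db}.tsv"
        jobs.append((["abricate", "--db", db, str(fasta)], out_path))
    return jobs


def run_jobs(jobs):
    """Run each command, redirecting ABRicate's stdout to its tsv file."""
    for cmd, out_path in jobs:
        out_path.parent.mkdir(parents=True, exist_ok=True)
        with open(out_path, "w") as handle:
            # check=True raises CalledProcessError if abricate fails.
            subprocess.run(cmd, stdout=handle, check=True)


if __name__ == "__main__":
    # Placeholder name standing in for the filtered fasta written by MASHix.py.
    for cmd, out_path in abricate_jobs("master_fasta_example.fas", "abricate_out"):
        print(" ".join(cmd), ">", out_path)
```

Running the script only prints the commands; calling run_jobs(abricate_jobs(...)) executes them. The resulting tsv files are the ones passed to abricate2db.py in step 10.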
If you need to add your own plasmids to the pATLAS database without requesting their addition to the pATLAS website, you can provide custom fasta files when building the database using the -i option of MASHix.py. Then follow the steps described above.
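As a rough sketch of how the custom files might be gathered, the helper below collects fasta files from one or more directories and builds only the -i portion of a MASHix.py command line. It assumes that -i accepts several files in a single invocation, as the plural "fasta files" suggests. The function names and the list of file extensions are hypothetical, and the other options MASHix.py requires are deliberately left out, so consult the script's own help before running it.

```python
from pathlib import Path

# Assumed set of extensions to treat as fasta files; adjust as needed.
FASTA_SUFFIXES = {".fas", ".fasta", ".fna", ".fa"}


def collect_fastas(*directories):
    """Return fasta paths found directly inside each directory.

    Paths are sorted within each directory and directories are kept
    in the order given.
    """
    found = []
    for directory in directories:
        for path in sorted(Path(directory).iterdir()):
            if path.is_file() and path.suffix.lower() in FASTA_SUFFIXES:
                found.append(str(path))
    return found


def mashix_input_args(fastas):
    """Build only the `-i` part of a MASHix.py command line."""
    if not fastas:
        raise ValueError("no fasta files found")
    # Other MASHix.py options are intentionally omitted from this sketch.
    return ["-i", *fastas]
```

For example, mashix_input_args(collect_fastas("refseq_plasmids", "my_plasmids")) would list the RefSeq fasta files first, followed by the custom ones; both directory names are placeholders.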