1) Download the plasmid sequences available in NCBI RefSeq.
2) Extract fasta from tar.gz.
3) Download and extract NCBI taxonomy, which will be fed to pATLAS.
4) Clone this repository:
git clone https://github.com/tiagofilipe12/pATLAS
5) Install its dependencies.
6) Configure the database:
createdb <database_name>
pATLAS/patlas/db_manager/db_create.py <database_name>
7) Run MASHix.py. The output will include a filtered fasta file (master_fasta_*.fas).
8) Run ABRicate with the CARD, ResFinder, PlasmidFinder and VFDB databases.
# e.g.
abricate --db card <master_fasta*.fas> > abr_card.tsv
abricate --db resfinder <master_fasta*.fas> > abr_resfinder.tsv
abricate --db vfdb <master_fasta*.fas> > abr_vfdb.tsv
abricate --db plasmidfinder <master_fasta*.fas> > abr_plasmidfinder.tsv
9) Download the CARD index (aro_index.csv) required by the abricate2db.py script.
10) Update the git submodules (git submodule update --init --recursive) and run abricate2db.py, using all the previous tsv files as input.
# e.g.
abricate2db.py -i abr_plasmidfinder.tsv -db plasmidfinder \
    -id 80 -cov 90 -csv aro_index.csv -db_psql <database_name>
11) Dump the database to a SQL file.
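Steps 8 and 11 lend themselves to small shell helpers. The following is a minimal sketch, not part of pATLAS: the function names and the ABRICATE and PG_DUMP override variables are hypothetical, and using pg_dump for step 11 is an assumption based on the database being created with createdb (the steps above do not name a dump tool).

```shell
# Sketch only. ABRICATE and PG_DUMP are hypothetical override variables
# (not defined by pATLAS); set them to "echo" for a dry run.

# Step 8: run ABRicate once per database, writing abr_<db>.tsv for each.
run_abricate_all() {
    fasta="$1"   # the master_fasta_*.fas file produced by MASHix.py
    for db in card resfinder vfdb plasmidfinder; do
        "${ABRICATE:-abricate}" --db "$db" "$fasta" > "abr_${db}.tsv" || return 1
    done
}

# Step 11: dump the database to a plain SQL file.
# Assumption: pg_dump is a suitable tool; the steps above do not specify one.
dump_database() {
    db_name="$1"   # same name that was passed to createdb
    out_sql="$2"   # destination .sql file
    "${PG_DUMP:-pg_dump}" "$db_name" > "$out_sql"
}

# Example calls (placeholders, adjust to your own file and database names):
#   run_abricate_all master_fasta_<suffix>.fas
#   dump_database <database_name> <file_name>.sql
```

The loop produces the same four tsv files as the individual commands in step 8, so they can be passed to abricate2db.py unchanged.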
Steps 1 to 11 are fully automated in the Nextflow pipeline pATLAS-db-creation.
To add your own plasmids to a pATLAS database without requesting their addition to the pATLAS website, provide custom fasta files when building the database through the -i option of MASHix.py, then follow the steps described above.