CRAM to BAM conversion
This workflow is used in IGSR to:
- Download a CRAM file from the archive (ENA, IGSR FTP, etc.)
- Convert it to BAM
- Create an index for the converted BAM
This workflow relies on the Aspera service for fast download of the data from the archives.
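The conversion and indexing steps can be sketched as two samtools calls. This is a minimal illustration and not the pipeline's own code: the helper name `conversion_commands` and its `out_dir` argument are hypothetical, and the commands are only built here, not executed.

```python
from pathlib import Path


def conversion_commands(cram_path, prefix, out_dir="."):
    """Return the argv lists that convert a CRAM file to BAM and index the result.

    Hypothetical helper for illustration; the real workflow may call samtools
    with different options.
    """
    bam = Path(out_dir) / f"{prefix}.bam"
    return [
        # -b asks for BAM output, -o names the output file
        ["samtools", "view", "-b", "-o", str(bam), str(cram_path)],
        # writes the index next to the BAM file as <prefix>.bam.bai
        ["samtools", "index", str(bam)],
    ]


if __name__ == "__main__":
    for argv in conversion_commands("ERR1457180.cram", "ERR1457180"):
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once samtools is installed.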
Dependencies
- Nextflow
This pipeline uses a workflow management system named Nextflow. This software can be downloaded from:
- SAMTools
Downloadable from:
- Aspera connect software:
This ascp client can be obtained from:
http://asperasoft.com/software/transfer-clients/connect-web-browser-plug-in/
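Before running the pipeline it can help to confirm that the three tools are installed. The check below is a sketch that assumes the executables are named `nextflow`, `samtools` and `ascp` and are reachable through `PATH`; the helper name `missing_tools` is hypothetical. A tool installed elsewhere (for example, a samtools binary in a custom folder) will be reported as missing even though the pipeline may still find it through its configuration.

```python
import shutil


def missing_tools(tools=("nextflow", "samtools", "ascp")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All dependencies found.")
```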
How to run the pipeline
First, you need to create a nextflow.config file that Nextflow can use to set the required variables. Here is an example of such a file:
params.samtools_folder='~/bin/samtools-1.9/' // folder containing the samtools binary
// params defaults for ascp client
params.key_file = '/homes/ernesto/.aspera/connect/etc/asperaweb_id_dsa.openssh' // Private-key file name (id_rsa) for authentication
params.transfer_rate = '900M'
params.port = 33001 // TCP port used for SSH authentication
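To make the role of the three ascp parameters concrete, the sketch below maps them onto options of the ascp client: `-i` for the private key, `-l` for the maximum transfer rate and `-P` for the SSH port. How cram2bam.nf actually calls ascp is not shown in this document, so the helper name `ascp_command` and the exact combination of options are assumptions.

```python
def ascp_command(url, dest, key_file, transfer_rate="900M", port=33001):
    """Build an ascp argv list from the values set in the config file.

    Hypothetical helper: the real pipeline may pass additional options.
    """
    return [
        "ascp",
        "-i", key_file,       # private-key file used for authentication
        "-l", transfer_rate,  # maximum transfer rate
        "-P", str(port),      # TCP port used for SSH authentication
        url,                  # remote source, e.g. user@host:/path/file.cram
        dest,                 # local destination path
    ]


if __name__ == "__main__":
    cmd = ascp_command(
        "era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERZ454/ERZ454001/ERR1457180.cram",
        "/path/in/dest/ERR1457180.cram",
        "/homes/ernesto/.aspera/connect/etc/asperaweb_id_dsa.openssh",
    )
    print(" ".join(cmd))
```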
Then, you can start your pipeline by doing:
nextflow -C nextflow.config run $IGSR_CODEBASE/scripts/FILE/cram2bam.nf --file input.txt
Where:
- The -C option allows you to specify the path to the nextflow.config file
- $IGSR_CODEBASE is the folder containing the igsr codebase downloaded from https://github.com/igsr/igsr_analysis.git
- --file is the file with the URLs pointing to the CRAM files to be converted. This file should have content similar to:
url,dest,prefix
era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERZ454/ERZ454001/ERR1457180.cram,/path/in/dest/ERR1457180.cram,ERR1457180
Here, url points to the location of the file to be downloaded, dest is the path on the local machine where it will be downloaded, and prefix is the string used to name the converted BAM file and its index.
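Because a malformed input file only shows up once the pipeline is running, it may be worth validating it first. The sketch below reads the comma-separated file and checks that the `url`, `dest` and `prefix` columns are present and non-empty. The function name `read_entries` is hypothetical and the checks are an assumption about what counts as a valid file, not something the pipeline itself enforces.

```python
import csv
import io

REQUIRED = ("url", "dest", "prefix")


def read_entries(handle):
    """Parse the --file input and return a list of {url, dest, prefix} dicts."""
    reader = csv.DictReader(handle)
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing column(s): {', '.join(missing)}")
    entries = []
    for row in reader:
        # Short rows yield None for absent fields, so treat None as empty
        empty = [col for col in REQUIRED if not (row[col] or "").strip()]
        if empty:
            raise ValueError(
                f"line {reader.line_num}: empty value for {', '.join(empty)}"
            )
        entries.append({col: row[col].strip() for col in REQUIRED})
    return entries


if __name__ == "__main__":
    sample = io.StringIO(
        "url,dest,prefix\n"
        "era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERZ454/ERZ454001/ERR1457180.cram,"
        "/path/in/dest/ERR1457180.cram,ERR1457180\n"
    )
    for entry in read_entries(sample):
        print(entry["prefix"], entry["dest"])
```

For a file on disk, pass `open("input.txt", newline="")` instead of the `StringIO` object.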
Pipeline output
This workflow will create a folder named converted/ with 2 output files:
prefix.bam
BAM file resulting after converting the downloaded CRAM file
prefix.bam.bai
The index created after running samtools index prefix.bam
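After a run, the two expected files can be checked for each prefix. The sketch below assumes the output folder is `converted/` relative to the directory where the pipeline was launched; the helper names `expected_outputs` and `missing_outputs` are hypothetical.

```python
from pathlib import Path


def expected_outputs(prefix, out_dir="converted"):
    """Return the BAM and index paths expected for a given prefix."""
    bam = Path(out_dir) / f"{prefix}.bam"
    return [bam, Path(f"{bam}.bai")]


def missing_outputs(prefix, out_dir="converted"):
    """Return the expected output paths that do not exist as files."""
    return [path for path in expected_outputs(prefix, out_dir) if not path.is_file()]


if __name__ == "__main__":
    for path in missing_outputs("ERR1457180"):
        print("missing:", path)
```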