CRAM to BAM conversion
This workflow is used in IGSR to:
- Download a CRAM file from an archive (ENA, IGSR FTP, etc.)
- Convert it to BAM
- Create an index for the converted BAM
This workflow relies on the Aspera service for fast download of the data from the archives.
Dependencies
- Nextflow
This pipeline uses a workflow management system named Nextflow. This software can be downloaded from:
- SAMTools
Downloadable from:
- Aspera connect software:
This ascp client can be obtained from:
http://asperasoft.com/software/transfer-clients/connect-web-browser-plug-in/
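Before running the pipeline, it can help to confirm that the three dependencies are installed. The following is a minimal sketch, not part of the IGSR codebase: the helper name `missing_tools` is hypothetical, and the executable names `nextflow`, `samtools` and `ascp` are assumed to be the command names under which these tools are installed.

```python
import shutil

# Assumed executable names: "ascp" is the Aspera client named in the text;
# "nextflow" and "samtools" are the usual command names for those tools.
REQUIRED_TOOLS = ("nextflow", "samtools", "ascp")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

A tool reported as missing is not necessarily a problem: the pipeline can be pointed at a specific samtools installation through its configuration, so this check is only a convenience.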
How to run the pipeline
First, create a nextflow.config file that Nextflow can use to set the required variables. Here is an example of such a file:

    params.samtools_folder = '~/bin/samtools-1.9/' // folder containing the samtools binary
    // params defaults for ascp client
    params.key_file = '/homes/ernesto/.aspera/connect/etc/asperaweb_id_dsa.openssh' // Private-key file name (id_rsa) for authentication
    params.transfer_rate = '900M'
    params.port = 33001 // TCP port used for SSH authentication
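To illustrate what the ascp parameters control, here is a minimal sketch of how `key_file`, `transfer_rate` and `port` could map onto an ascp command line. It assumes the standard ascp options `-i` (private key file), `-l` (maximum transfer rate) and `-P` (TCP port). The function `build_ascp_command` is hypothetical, and the command that cram2bam.nf actually issues is not shown in this document and may include other options.

```python
import os


def build_ascp_command(url, dest, key_file, transfer_rate="900M", port=33001):
    """Build an ascp argument list from the values set in nextflow.config.

    This is an illustrative mapping only; it does not run ascp.
    """
    return [
        "ascp",
        "-i", os.path.expanduser(key_file),  # private key used for authentication
        "-P", str(port),                     # TCP port used for SSH authentication
        "-l", transfer_rate,                 # maximum transfer rate
        url,                                 # remote source
        dest,                                # local destination path
    ]


command = build_ascp_command(
    "era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERZ454/ERZ454001/ERR1457180.cram",
    "/path/in/dest/ERR1457180.cram",
    key_file="/homes/ernesto/.aspera/connect/etc/asperaweb_id_dsa.openssh",
)
print(" ".join(command))
```

Building the command as a list rather than a single string keeps each argument separate, so the list could be passed to `subprocess.run` without shell quoting issues.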
Then, you can start the pipeline by running:

    nextflow -C nextflow.config run $IGSR_CODEBASE/scripts/FILE/cram2bam.nf --file input.txt
Where:
- -C allows you to specify the path to the nextflow.config file
- $IGSR_CODEBASE is the folder containing the igsr codebase downloaded from https://github.com/igsr/igsr_analysis.git
- --file is the file with the URLs pointing to the CRAM files to be converted. This file should have content similar to:

    url,dest,prefix
    era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERZ454/ERZ454001/ERR1457180.cram,/path/in/dest/ERR1457180.cram,ERR1457180

  Here, url points to the location of the file to be downloaded, dest is the path on the local machine where it will be downloaded, and prefix is the string used to name the converted BAM file and its index.
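Since a malformed input file is an easy mistake to make, the following sketch checks that a file has the url,dest,prefix header and three non-empty fields per row before it is handed to the pipeline. The function `read_input_rows` is a hypothetical helper, not part of the IGSR codebase, and the validation rules are assumptions based on the example above.

```python
import csv
import io

EXPECTED_COLUMNS = ["url", "dest", "prefix"]


def read_input_rows(handle):
    """Parse the contents of the --file input and return a list of row dicts."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"expected header {EXPECTED_COLUMNS}, got {reader.fieldnames}")
    rows = []
    for row in reader:
        # DictReader fills missing fields with None and stores any extra
        # fields under the None key, so both cases are rejected here.
        if None in row or any(not row[column] for column in EXPECTED_COLUMNS):
            raise ValueError(f"line {reader.line_num}: expected 3 non-empty fields")
        rows.append(row)
    return rows


example = io.StringIO(
    "url,dest,prefix\n"
    "era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERZ454/ERZ454001/ERR1457180.cram,"
    "/path/in/dest/ERR1457180.cram,ERR1457180\n"
)
rows = read_input_rows(example)
print(rows[0]["prefix"])
```

To check a real file, the same function can be called with `open("input.txt", newline="")` in place of the in-memory example.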
Pipeline output
This workflow will create a folder named converted/ with 2 output files:
- prefix.bam: the BAM file resulting from converting the downloaded CRAM file
- prefix.bam.bai: the index created by running samtools index prefix.bam
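The two output files correspond to two samtools steps, which can be sketched as follows. This assumes the conversion is done with `samtools view -b -o`, which writes BAM output, followed by `samtools index`. The function `build_samtools_commands` is hypothetical, and the exact commands used inside cram2bam.nf are not shown in this document.

```python
import os


def build_samtools_commands(dest, prefix, samtools_folder, outdir="converted"):
    """Return the (convert, index) argument lists for one downloaded CRAM file.

    This only builds the command lines; it does not run samtools.
    """
    samtools = os.path.join(os.path.expanduser(samtools_folder), "samtools")
    bam = os.path.join(outdir, prefix + ".bam")
    convert = [samtools, "view", "-b", "-o", bam, dest]  # CRAM in, BAM out
    index = [samtools, "index", bam]                     # writes <bam>.bai by default
    return convert, index


convert, index = build_samtools_commands(
    dest="/path/in/dest/ERR1457180.cram",
    prefix="ERR1457180",
    samtools_folder="~/bin/samtools-1.9/",
)
print(" ".join(convert))
print(" ".join(index))
```

With the example row from the input file, the BAM path becomes converted/ERR1457180.bam, and indexing it produces converted/ERR1457180.bam.bai alongside it.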