Guide for first steps in RNA-seq data analysis from Vinther Lab

##### Requirements

# FastQC (or a similar quality control program, added to the path)
# cutadapt (added to the path)
# preprocessing script from RNAprobR, which can be found here: https://github.com/lkie/RNAprobBash (also contains a guide for downloading and using RNAprobR, including the preprocessing script)
# bowtie2 (added to the path)

##### Workflow

# Download fastq files and genome for the provided example
cd /usr/directory
wget -r http://people.binf.ku.dk/jvinther/data/RNA-seq/data

# Quality control (run from the directory that contains the fastq files)
for i in {17..19}
do
    fastqc "$i".fastq.gz
done

# Adapter trimming with Cutadapt
# The steps below expect the fastq files in /data; adjust the paths to your own setup
cutadapt --version > cutadapt.version # save the cutadapt version for future reference
for i in {17..19}
do
    mkdir -p /data/"$i"
    cd /data/"$i"
    # trim each sample in the background so the samples are processed in parallel
    zcat /data/"$i".fastq.gz | cutadapt -a AGATCGGAAGAGCACACGTCT --nextseq-trim=20 --minimum-length=40 - 2> cutadapt.error | gzip > reads_trimmed.fastq.gz &
done
wait # wait for all background trimming jobs to finish

# Preprocessing
PATH=$PATH:/path/to/RNAprobr/scripts # set the path to the scripts
for i in {17..19}
do
    cd /data/"$i"
    preprocessing.sh -b NNNNNNN -t 15 -1 reads_trimmed.fastq.gz
    gzip ./output_dir/read1.fastq
done

# Build index with bowtie2-build
cd /data/fastafile
bowtie2-build Bacillus_subtilis_168.ASM904v1.fa Bacillus_subtilis_168.ASM904v1

# Mapping with bowtie2
bowtie2 --version > bowtie2.version # save the bowtie2 version for future reference
for i in {17..19}
do
    cd /data/"$i"
    # the index was built in /data/fastafile, so give its full path to -x
    nice bowtie2 --quiet -p 30 -N 1 -x /data/fastafile/Bacillus_subtilis_168.ASM904v1 -U ./output_dir/read1.fastq.gz | gzip > mapped.sam.gz
done
wait

# Collapse on barcodes, output sam.gz file
PATH=$PATH:/path/to/collapse/script
chmod u+x /path/to/collapse/script/collapse.sh
for i in {17..19}
do
    cd /data/"$i"
    # collapse.sh is found through PATH, so it is called without ./
    collapse.sh /data/"$i"/mapped.sam.gz /data/"$i"/output_dir/barcodes.txt > "$i"_debarcoded_file.sam.gz
done
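
# Optional sanity check (not part of the original guide): after trimming,
# preprocessing and mapping it can be useful to see how many reads survive
# each step. The two helper functions below, count_fastq_reads and
# count_mapped_reads, are hypothetical names introduced here for illustration.
# This is a minimal sketch that assumes standard 4-line FASTQ records and
# single-end bowtie2 output with one SAM record per read (the default when
# -k is not set), as produced by the workflow above.

```shell
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_fastq_reads() {
    gzip -dc "$1" | awk 'END { print int(NR / 4) }'
}

# Count mapped reads in a gzipped SAM file.
# Header lines (starting with @) are skipped; a record counts as mapped
# when the 0x4 "unmapped" bit of the FLAG field (column 2) is not set.
count_mapped_reads() {
    gzip -dc "$1" | awk -F '\t' '!/^@/ && int($2 / 4) % 2 == 0 { n++ } END { print n + 0 }'
}

# Example usage with the paths from the workflow (left commented out):
# for i in {17..19}
# do
#     echo "$i trimmed:      $(count_fastq_reads /data/"$i"/reads_trimmed.fastq.gz)"
#     echo "$i preprocessed: $(count_fastq_reads /data/"$i"/output_dir/read1.fastq.gz)"
#     echo "$i mapped:       $(count_mapped_reads /data/"$i"/mapped.sam.gz)"
# done
```

# A large drop between two consecutive counts usually points at the step
# in between (for example a wrong adapter sequence or a wrong index path).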