BEDtools
doing arithmetic with sequences
Assignment
Run some basic sub-commands in BEDtools.
Coverage

For this sub command you will be using the same data as in Week 6, so you should already have it in you data directory. If not or if you deleted it…
cd ../data
curl -O https://gannet.fish.washington.edu/seashell/bu-mox/scrubbed/120321-cvBS/19F_R1_val_1_bismark_bt2_pe.deduplicated.sorted.bam
curl -O https://gannet.fish.washington.edu/seashell/bu-mox/scrubbed/120321-cvBS/19F_R1_val_1_bismark_bt2_pe.deduplicated.sorted.bam.baiYou will also need a bed file with gene information.
cd ../data
curl -O https://eagle.fish.washington.edu/Cvirg_tracks/C_virginica-3.0_Gnomon_genes.bedConvert bam to bed
/home/shared/bedtools2/bin/bedtools bamtobed \
-i ../data/19F_R1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \
> ../output/08-19F.bed
Warning
This is big file. What should you not do with it?
Get coverage of sequence reads on gene regions
Default behavior
After each interval in A, bedtools coverage will report:
The number of features in B that overlapped (by at least one base pair) the A interval.
The number of bases in A that had non-zero coverage from features in B.
The length of the entry in A.
The fraction of bases in A that had non-zero coverage from features in B.
/home/shared/bedtools2/bin/bedtools coverage \
-a ../data/C_virginica-3.0_Gnomon_genes.bed \
-b ../output/08-19F.bed \
> ../output/08-gene-19F-coverage.outIntersect

Lets grab a bed file of Transposable Elements and lncRNAs
cd ../data
curl -O http://owl.fish.washington.edu/halfshell/genomic-databank/cgigas_uk_roslin_v1_gene.gff curl -O http://owl.fish.washington.edu/halfshell/genomic-databank/cgigas_uk_roslin_v1_rm.te.bed
curl -O http://owl.fish.washington.edu/halfshell/genomic-databank/cgigas_uk_roslin_v1_lncRNA.gff/home/shared/bedtools2/bin/bedtools intersect \
-a ../data/cgigas_uk_roslin_v1_gene.gff \
-b ../data/cgigas_uk_roslin_v1_rm.te.bed \
> ../output/08-gene-TE-intersect.out
head -2 ../output/08-gene-TE-intersect.outClosest

/home/shared/bedtools2/bin/bedtools closest \
-a ../data//cgigas_uk_roslin_v1_lncRNA.gff\
-b ./data/cgigas_uk_roslin_v1_gene.gff \
> ../output/08-lnc-gene-closet.out