BEDtools
doing arithmetic with sequences
Coverage
For this sub command you will be using the same data as in Week 6, so you should already have it in you data
directory. If not or if you deleted it…
cd ../data
curl -O https://gannet.fish.washington.edu/seashell/bu-mox/scrubbed/120321-cvBS/19F_R1_val_1_bismark_bt2_pe.deduplicated.sorted.bam
curl -O https://gannet.fish.washington.edu/seashell/bu-mox/scrubbed/120321-cvBS/19F_R1_val_1_bismark_bt2_pe.deduplicated.sorted.bam.bai
You will also need a bed file with gene information.
cd ../data
curl -O https://eagle.fish.washington.edu/Cvirg_tracks/C_virginica-3.0_Gnomon_genes.bed
Convert bam to bed
/home/shared/bedtools2/bin/bedtools bamtobed \
-i ../data/19F_R1_val_1_bismark_bt2_pe.deduplicated.sorted.bam \
> ../output/08-19F.bed
Get coverage of sequence reads on gene regions
Default behavior
After each interval in A, bedtools coverage
will report:
The number of features in B that overlapped (by at least one base pair) the A interval.
The number of bases in A that had non-zero coverage from features in B.
The length of the entry in A.
The fraction of bases in A that had non-zero coverage from features in B.
/home/shared/bedtools2/bin/bedtools coverage \
\
-a ../data/C_virginica-3.0_Gnomon_genes.bed \
-b ../output/08-19F.bed > ../output/08-gene-19F-coverage.out
Intersect
Lets grab a bed file of Transposable Elements and lncRNAs
cd ../data
curl -O http://owl.fish.washington.edu/halfshell/genomic-databank/cgigas_uk_roslin_v1_gene.gff curl -O http://owl.fish.washington.edu/halfshell/genomic-databank/cgigas_uk_roslin_v1_rm.te.bed
curl -O http://owl.fish.washington.edu/halfshell/genomic-databank/cgigas_uk_roslin_v1_lncRNA.gff
/home/shared/bedtools2/bin/bedtools intersect \
\
-a ../data/cgigas_uk_roslin_v1_gene.gff \
-b ../data/cgigas_uk_roslin_v1_rm.te.bed > ../output/08-gene-TE-intersect.out
head -2 ../output/08-gene-TE-intersect.out
Closest
/home/shared/bedtools2/bin/bedtools closest \
\
-a ../data//cgigas_uk_roslin_v1_lncRNA.gff\
-b ./data/cgigas_uk_roslin_v1_gene.gff > ../output/08-lnc-gene-closet.out