Geoduck Larval Transcriptome
In preparation for a new proteomic analysis, here is a transcriptome assembled from the NovaSeq reads.
Here is the job script I ran on Hyak.
#!/bin/bash
## Job Name
#SBATCH --job-name=Geoduck-trinity-01
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (hours:minutes:seconds here; days-hours:minutes:seconds also works)
#SBATCH --time=90:30:00
## Memory per node
#SBATCH --mem=500G
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sr320/analyses/0804_1818
source /gscratch/srlab/programs/scripts/paths.sh
Trinity \
--seqType fq \
--SS_lib_type RF \
--left /gscratch/srlab/sr320/data/NR021_S8_L001_R1_001.fastq,/gscratch/srlab/sr320/data/NR021_S8_L002_R1_001.fastq \
--right /gscratch/srlab/sr320/data/NR021_S8_L001_R2_001.fastq,/gscratch/srlab/sr320/data/NR021_S8_L002_R2_001.fastq \
--CPU 50 --trimmomatic --max_memory 500G
The slurm output. This is what would be printed to the terminal while the job runs. With Trinity, there is a lot of it.
The fasta file (179M).
http://owl.fish.washington.edu/halfshell/bu-mox/analyses/0804_1818/trinity_out_dir/0804_Pgen_larvae.fasta
If you want to see the rest of the output, look here.
[sr320@mox2 util]$ perl TrinityStats.pl /gscratch/srlab/sr320/analyses/0804_1818/trinity_out_dir/0804_Pgen_larvae.fasta
################################
## Counts of transcripts, etc.
################################
Total trinity 'genes': 145131
Total trinity transcripts: 219698
Percent GC: 36.30
########################################
Stats based on ALL transcript contigs:
########################################
Contig N10: 2651
Contig N20: 1765
Contig N30: 1263
Contig N40: 921
Contig N50: 676
Median contig length: 331
Average contig: 533.78
Total assembled bases: 117271312
#####################################################
## Stats based on ONLY LONGEST ISOFORM per 'GENE':
#####################################################
Contig N10: 2384
Contig N20: 1541
Contig N30: 1090
Contig N40: 791
Contig N50: 592
Median contig length: 324
Average contig: 500.45
Total assembled bases: 72630419
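For anyone who wants to sanity check the Nx numbers above, here is a minimal sketch of the usual definition: sort the contig lengths from longest to shortest, add them up, and report the length at which the running total first reaches x% of all assembled bases. The helper names `fasta_lengths` and `nx` are made up for this example and are not part of Trinity, and I am assuming TrinityStats.pl follows this standard definition.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Trinity.

# Print the length of every sequence in a FASTA file, one per line.
fasta_lengths() {
  awk '/^>/ { if (n) print n; n = 0; next }
       { n += length($0) }
       END { if (n) print n }' "$1"
}

# Read one contig length per line on stdin and print the Nx value.
# Usage: nx 50   (defaults to 50 if no argument is given)
nx() {
  sort -rn | awk -v pct="${1:-50}" '
    { len[NR] = $1; total += $1 }
    END {
      threshold = total * pct / 100
      for (i = 1; i <= NR; i++) {
        running += len[i]
        # First contig that pushes the running total past the threshold
        if (running >= threshold) { print len[i]; exit }
      }
    }'
}
```

Running `fasta_lengths 0804_Pgen_larvae.fasta | nx 50` should give something comparable to the "Contig N50" line for all transcripts, assuming the definitions match. A quick check on the averages also holds up: 219698 transcripts at 533.78 bp each is roughly 117.27 million bases, in line with the reported total.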
Written on August 5, 2017