June Bits

This is a running daily log of everything done in June.

June 27 Took gigas Roslin proteins and blasted against Swiss-Prot

/home/shared/ncbi-blast-2.11.0+/bin/blastp \
-query ../genome-features/GCF_002022765.2_C_virginica-3.0_protein.faa \
-db ../blastdb/uniprot_sprot_r2022_02 \
-out ../output/17-Swiss-Prot-Annotation/Cvir_protein-uniprot_blastp.tab \
-evalue 1E-20 \
-num_threads 40 \
-max_target_seqs 1 \
-outfmt 6

output - https://github.com/sr320/ceabigr/blob/main/output/17-Swiss-Prot-Annotation/Cvir_protein-uniprot_blastp_sep.tab
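
The linked file carries a `_sep` suffix that the blastp command itself does not write, and the post-processing step is not recorded here. A minimal sketch of one way to get such a file, assuming `_sep` means the pipe-delimited Swiss-Prot subject ID in column 2 (format `sp|ACCESSION|ENTRY_NAME`) was split into separate tab-delimited columns. The function name `split_sseqid` is hypothetical.

```shell
# Hypothetical helper: split the pipe-delimited subject ID (column 2 of
# BLAST outfmt 6) into separate tab-delimited columns. Reads stdin, writes stdout.
split_sseqid() {
  awk 'BEGIN { FS = OFS = "\t" } { gsub(/\|/, "\t", $2); print }'
}

# Assumed usage, with the file names from the entry above:
# split_sseqid < Cvir_protein-uniprot_blastp.tab > Cvir_protein-uniprot_blastp_sep.tab
```

Restricting the substitution to column 2 leaves any pipes in other fields untouched, which keeps the accession in a predictable column for later joins.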


June 26 Another update from Mac on oyster dye trial, larvae exposure - https://docs.google.com/document/d/1tnhOlAjTnE3xq_jYPnpyv-GLDOd39zkixiHknnmJKyw/edit


June 23

Pacific oyster fluorescent dye exposure trial - https://d.pr/f/HgMwgH


June 21

Yesterday played a bit with Aquamine - bugs / issues with both enrichment and region analysis in C virg.

Went ahead and ran blastp against Swiss-Prot in an effort to maybe incorporate into Aquamine.


June 17

Trying epidiverse / snp

What a mess.

BS-SNPer not doing well with Oly v82 (100000k and bigger / 32 contigs). Not sure of issue.
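
If the aim is to hand BS-SNPer a smaller reference, one option is to subset the assembly to its longest contigs. A sketch assuming a plain FASTA on stdin; the function name `filter_fasta_by_length` and the threshold in the usage line are placeholders, not the values or method used for the v82 subset.

```shell
# Hypothetical helper: keep only FASTA records whose sequence length is
# at least MIN bases. Sequences are written back on a single line each.
filter_fasta_by_length() {
  awk -v min="$1" '
    function emit() { if (name != "" && length(seq) >= min) print name "\n" seq }
    /^>/ { emit(); name = $0; seq = ""; next }   # new record: write out the previous one
    { seq = seq $0 }                             # accumulate wrapped sequence lines
    END { emit() }                               # write out the last record
  '
}

# Assumed usage (file names are illustrative):
# filter_fasta_by_length 100000 < genome.fa > genome_min100k.fa
# grep -c '>' genome_min100k.fa   # count the contigs that remain
```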


June 16

Plan to look at Oly paper revisions and push ceabigr a bit.

Tried BS-SNPer on Oly -> too many chromosomes.

Trying to tackle kallisto in ceabigr; got it.

Ran the following to try to get conversion efficiency on the Oly work:

#!/bin/bash
## Job Name
#SBATCH --job-name=olylambd
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=6-00:00:00
## Memory per node
#SBATCH --mem=100G
#SBATCH --mail-type=ALL
#SBATCH --mail-user=sr320@uw.edu
## Specify the working directory for this job
#SBATCH --chdir=/gscratch/scrubbed/sr320/061622-ecoli


#Script output available at https://gannet.fish.washington.edu/seashell/bu-mox/scrubbed/020320-oly/


# Directories and programs
bismark_dir="/gscratch/srlab/programs/Bismark-0.21.0"
bowtie2_dir="/gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/"
samtools="/gscratch/srlab/programs/samtools-1.9/samtools"
reads_dir="/gscratch/srlab/sr320/data/olurida-bs/decomp/"
genome_folder="/gscratch/srlab/sr320/data/ecoli-MG1655/"

source /gscratch/srlab/programs/scripts/paths.sh


# ${bismark_dir}/bismark_genome_preparation \
# --verbose \
# --parallel 28 \
# --path_to_aligner ${bowtie2_dir} \
# ${genome_folder}




# Align each trimmed read file against the prepared genome
find "${reads_dir}"*_s456_trimmed.fq \
| xargs basename -s _s456_trimmed.fq | xargs -I{} ${bismark_dir}/bismark \
--path_to_bowtie ${bowtie2_dir} \
--genome ${genome_folder} \
-p 4 \
--non_directional \
${reads_dir}{}_s456_trimmed.fq



find *.bam | \
xargs basename -s .bam | \
xargs -I{} ${bismark_dir}/deduplicate_bismark \
--bam \
{}.bam



${bismark_dir}/bismark_methylation_extractor \
--bedGraph --counts --scaffolds \
--multicore 14 \
--buffer_size 75% \
*deduplicated.bam



# Bismark processing report

${bismark_dir}/bismark2report

#Bismark summary report

${bismark_dir}/bismark2summary
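
Once the extractor has run, a conversion efficiency estimate can be pulled from its output. This is a sketch, assuming that reads mapping to the control genome are unmethylated, so any methylated call counts as a conversion failure. It reads the `*.bismark.cov.gz` files written with `--bedGraph` (columns: chromosome, start, end, percent methylation, methylated count, unmethylated count). The function name `conversion_efficiency` is hypothetical.

```shell
# Hypothetical helper: estimate bisulfite conversion efficiency (%) from
# Bismark coverage lines on stdin, as unmethylated calls / all calls.
conversion_efficiency() {
  awk -F'\t' '
    { meth += $5; unmeth += $6 }          # sum call counts over all positions
    END {
      total = meth + unmeth
      if (total == 0) { print "NA"; exit }  # no calls: nothing to estimate
      printf "%.2f\n", 100 * unmeth / total
    }
  '
}

# Assumed usage, one value per sample (file name pattern is illustrative):
# for f in *.bismark.cov.gz; do
#   printf '%s\t%s\n' "$f" "$(gunzip -c "$f" | conversion_efficiency)"
# done
```

Without `--CX` the coverage file should hold CpG positions only, so this estimate covers that context alone; the per-context percentages in the Bismark reports are the other place to look.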

Talked with L & K. Will see if I can get BS-SNPer running on the concatenated genome.


June 15

Today I finally got through the SNP analysis with the RNA-seq data. A large problem was that the standard output files were too big. I ended up having to delete them because, for some reason, they could not be added to the .gitignore file. I posted the results of the filtered SNPs and will wait for feedback on what the next step will be.
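
For next time, a quick way to spot oversized output files before committing. This is only a sketch: the function name `list_large_files` is hypothetical and the threshold in the usage line assumes GitHub's 100 MB per-file limit. It does not explain why the ignore rule failed here; one common cause is that `.gitignore` has no effect on files that are already tracked.

```shell
# Hypothetical helper: list files under DIR larger than MIN_KB kilobytes.
list_large_files() {
  dir="$1"
  min_kb="$2"
  find "$dir" -type f -size +"${min_kb}k" | sort
}

# Assumed usage: list anything over roughly 100 MB in the repository.
# list_large_files . 102400
# If a listed file is already tracked, untrack it but keep it on disk,
# then add an ignore rule for it:
# git rm --cached path/to/file
# echo 'path/to/file' >> .gitignore
```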

Written on June 15, 2022