Today I ran BLASTn on Mox with the assembled C. bairdi transcriptome as the query against the blue crab transcriptome. I also finished up a few more things in my BLAST to GO/GO-slim notebook using UniProt/Swiss-Prot. Additionally, I got some input from Steven on how best to create a poster for GSS (which is this THURSDAY) so that I can get some cool stuff on there without using too many words… which is always a problem for me.


I ran blastn on Mox. (GitHub Issue #484)

First I made a database with the blue crab transcriptome:

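As a rough sketch, building a nucleotide BLAST database with `makeblastdb` looks something like the following. The input FASTA name is a placeholder I made up for illustration, and the `-out` prefix is assumed to match the `-db` path used in the blastn script; this is not necessarily the exact command I ran.

```shell
# Sketch only: the input FASTA name is a hypothetical placeholder.
fasta="bluecrab_transcriptome.fasta"
# Assumed to match the -db path passed to blastn.
db_prefix="/gscratch/srlab/graceac9/blastdb/bluecrab/blastdb/bluecrab"

# -dbtype nucl because a transcriptome is nucleotide sequence.
db_cmd="makeblastdb -in $fasta -dbtype nucl -out $db_prefix"
echo "$db_cmd"

# Only run when the tool and the input file are actually available.
if command -v makeblastdb >/dev/null 2>&1 && [ -f "$fasta" ]; then
    $db_cmd
fi
```

Printing the command first makes it easy to check the paths before anything gets written.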
Then, I used this script to run blast on Mox:

#!/bin/bash
## Job Name
#SBATCH --job-name=1113_Cbairdi_blast_01
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=3-20:30:00
## Memory per node
#SBATCH --mem=100G
##turn on e-mail notification
#SBATCH --mail-type=ALL
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/graceac9/analyses/1113-Cb-blast

# Load Python Mox module for Python module availability

module load intel-python3_2017

/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin/blastn \
-query /gscratch/srlab/graceac9/query/library01/query.fa \
-db /gscratch/srlab/graceac9/blastdb/bluecrab/blastdb/bluecrab \
-max_target_seqs 1 \
-outfmt 6 \
-num_threads 28

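For reference, `-outfmt 6` writes tab-delimited rows with 12 default columns: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore. A minimal sketch of filtering that kind of file by e-value with awk is below. The rows, the file name `example_hits.tsv`, and the 1e-10 cutoff are all made up for illustration and are not from my results.

```shell
# Made-up rows in -outfmt 6 layout (NOT real results); tab-delimited, 12 columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
printf 'q1\ts1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t350\n' > example_hits.tsv
printf 'q2\ts7\t81.0\t120\t20\t2\t10\t129\t40\t159\t1e-5\t90\n' >> example_hits.tsv
printf 'q3\ts2\t75.2\t60\t14\t1\t1\t60\t3\t62\t0.5\t40\n' >> example_hits.tsv

# Keep hits whose e-value (column 11) is at or below an arbitrary example cutoff,
# and print query ID, subject ID, percent identity, and e-value.
strong_hits=$(awk -F'\t' -v cutoff=1e-10 \
    '($11 + 0) <= (cutoff + 0) { print $1, $2, $3, $11 }' example_hits.tsv)
echo "$strong_hits"
```

Adding `+ 0` forces awk to compare the e-values as numbers rather than as strings.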
I just got an email saying the job finished. I used nano to look at the output file. Here are some lines, copied and pasted:

## BLAST to GO GOslim


I finished up a few things in that notebook, basing it on Steven’s notebook.

I took the final output from that file and moved it into my grace-Cbairdi-transcriptome/analyses directory.

I then used a new R script to make a weird pie chart.


Because I’m trying to have something to put on a poster by tomorrow afternoon, I’ll just use Excel, per Steven’s suggestion. I’ll learn the R or Jupyter notebook way once I’m not in a time crunch.

## Trinity assembly stats using

I tacked this on to the end of my notebook where I used Transrate.

Info on what the stats mean: here

Will use some of these stats on my poster.

## Counts of transcripts, etc.
Total trinity 'genes':	79739
Total trinity transcripts:	143172
Percent GC: 46.43

Stats based on ALL transcript contigs:

	Contig N10: 3801
	Contig N20: 2948
	Contig N30: 2431
	Contig N40: 2032
	Contig N50: 1696

	Median contig length: 608
	Average contig: 1010.52
	Total assembled bases: 144678598

## Stats based on ONLY LONGEST ISOFORM per 'GENE':

	Contig N10: 3659
	Contig N20: 2836
	Contig N30: 2306
	Contig N40: 1901
	Contig N50: 1539

	Median contig length: 479
	Average contig: 873.95
	Total assembled bases: 69687682

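As a reminder of what the N50 numbers mean: sort the contigs from longest to shortest, add up their lengths, and N50 is the length of the contig at which the running total first reaches at least half of the total assembled bases. A minimal sketch with toy lengths (made up for illustration, not taken from the assembly, and not necessarily how Trinity implements it):

```shell
# Toy contig lengths (made up for illustration).
lengths="100 200 300 400 500"

# Sort longest first, then walk down the list until the running total
# reaches at least half of the total length.
n50=$(echo "$lengths" | tr ' ' '\n' | sort -rn | awk '
    { len[NR] = $1; total += $1 }
    END {
        running = 0
        for (i = 1; i <= NR; i++) {
            running += len[i]
            if (running >= total / 2) { print len[i]; exit }
        }
    }')
echo "N50: $n50"
```

Here the total is 1500 bases, so the halfway mark is 750; 500 + 400 = 900 crosses it, which makes the N50 for this toy set 400.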
### Additional stuff done at home

## BLASTn with Hematodinium

make_hemat_blastdb.ipynb

Uploaded the BLAST db output to Owl.

Ran BLASTn on Mox.

Output in /gscratch/srlab/graceac9/analyses/1113-Cb-hemat-blastn.

Head of the output file

