alignment - Trying to download a reference genome for STAR using command tools

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I am at the stage of my RNA-seq workflow that requires an alignment tool, and I have chosen STAR. Since I am on Windows, I installed it through an SSH session in PuTTY so that I would be able to use it. The installation was successful, and the next step was to download the reference genome FASTA and GTF files.

I found the FASTA and GTF files on Ensembl. The links are, respectively: (1) ftp://ftp.ensembl.org/pub/release-102/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz (2) ftp://ftp.ensembl.org/pub/release-102/gtf/homo_sapiens/Homo_sapiens.GRCh38.102.gtf.gz

All seemed well, but when I used the head command to check that the file is usable (i.e. that the first 10 lines show sequence data as bases), it printed only the letter N, repeated (N denotes an unknown base). This happened only for the FASTA file; the GTF file seems fine.

I'm not sure what to do next or how to fix this problem. Any help would be greatly appreciated. Thank you!

The code I used is the following:

wget --directory-prefix GENOME_DIR ftp://ftp.ensembl.org/pub/release-102/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz

#To download the ftp file

ls GENOME_DIR

#To check the respective files are in the directory indicated

gunzip GENOME_DIR/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz

#To unzip the file

ls GENOME_DIR

#To check file has been unzipped

FASTA=GENOME_DIR/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa

#To define the variable

head "$FASTA"

#To check the viability of the file.

At this point the command prints nothing but Ns, so I cannot continue. I also tried other links and other sources, and I added the -m switch to the wget command, but the result is the same and I am at a loss. Again, any help would be greatly appreciated.
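
A run of N at the top of a genome FASTA does not by itself show that the download is corrupt: assemblies such as GRCh38 fill unsequenced regions, including the start of chromosome 1, with N, so head can legitimately show only Ns. A minimal sketch of a more informative check follows. It counts the records and the sequence lines that contain real bases anywhere in the file. The function name fasta_summary and the small demo file are hypothetical, made up for illustration, and are not part of the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): summarise a FASTA file
# without relying on what its first few lines happen to contain.
fasta_summary() {
    local fasta="$1"
    local n_records n_base_lines
    # Count sequence records (header lines start with '>').
    n_records=$(grep -c '^>' "$fasta")
    # Count sequence lines that contain at least one real base
    # (lowercase letters are soft-masked bases in "dna_sm" files).
    n_base_lines=$(grep -v '^>' "$fasta" | grep -c '[ACGTacgt]')
    echo "records=$n_records base_lines=$n_base_lines"
}

# Small self-made demo file standing in for the real genome.
tmp_fasta=$(mktemp)
printf '>1 demo\nNNNNNNNN\nNNNNNNNN\nACGTacgt\n>2 demo\nNNNNGGCC\n' > "$tmp_fasta"
fasta_summary "$tmp_fasta"
rm -f "$tmp_fasta"
```

On the real file this would be called as fasta_summary "$FASTA". A non-zero base_lines count suggests the sequence data is present and that the leading Ns are only the masked start of the first chromosome.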

question from:https://stackoverflow.com/questions/65951856/trying-to-download-a-reference-genome-for-star-using-command-tools
