Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



alignment - Trying to download a reference genome for STAR using command-line tools

I am at the alignment stage of my RNA-seq workflow, and I have chosen STAR as the aligner (I installed it through an SSH session in PuTTY, since I am on Windows). The installation was successful, and the next step is to download the reference genome FASTA and GTF files.

I found the FASTA and GTF files on Ensembl. The links are, respectively: (1) ftp://ftp.ensembl.org/pub/release-102/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz (2) ftp://ftp.ensembl.org/pub/release-102/gtf/homo_sapiens/Homo_sapiens.GRCh38.102.gtf.gz

Everything seemed fine, but when I ran the head command on the FASTA file to check that it is usable (i.e. that it shows about 10 lines of bases), the output was just the letter N repeated, which I understand to mean that the data is unknown or unreadable. This happens only with the FASTA file; the GTF file looks fine.

I'm not sure what to do next or how to fix this problem. Any help would be greatly appreciated. Thank you!

The code I used is the following:

# Download the FASTA file over FTP into GENOME_DIR
wget --directory-prefix GENOME_DIR ftp://ftp.ensembl.org/pub/release-102/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz

# Check that the downloaded file is in the directory
ls GENOME_DIR

# Decompress the file
gunzip GENOME_DIR/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz

# Check that the file has been decompressed
ls GENOME_DIR

# Store the path to the FASTA file in a variable
FASTA=GENOME_DIR/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa

# Print the first 10 lines to check the file (the $ expands the variable)
head "$FASTA"

After this point the command only prints a run of Ns, so I cannot continue. I also tried other links and other sources, and I tried the -m switch in the wget command, so I am at a loss. Again, any help would be greatly appreciated.
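
A leading run of Ns is not necessarily a sign of a bad download. Genome assemblies often begin a chromosome with gap padding written as N, so head may simply be showing that region; whether that is the case for this particular file is an assumption worth verifying. One way to check is to skip past the Ns and look for lines that contain real bases. The sketch below uses a hypothetical helper, first_real_bases (not part of STAR or Ensembl), and demonstrates it on a tiny synthetic FASTA rather than the real genome:

```shell
# Hypothetical helper: print the first lines of a FASTA file that contain
# at least one A, C, G or T. Header lines (starting with ">") are skipped.
# $1: path to the FASTA file; $2: number of lines to print (default 5).
# The -i flag matters for soft-masked files such as dna_sm, where repeats
# are written in lowercase.
first_real_bases() {
    grep -v '^>' "$1" | grep -i -m "${2:-5}" '[ACGT]'
}

# Demo on a small synthetic file that starts with N padding
demo=$(mktemp)
printf '>1 demo\nNNNNNNNN\nNNNNNNNN\nacgtNNNN\nGGCCTTAA\n' > "$demo"

head -n 3 "$demo"            # shows only the header and N lines
first_real_bases "$demo" 2   # shows the first two lines with real bases

rm -f "$demo"
```

On the downloaded genome the same check would be first_real_bases "$FASTA" 5. If it prints sequence lines, the file itself is probably fine and the Ns at the top are just padding; if it prints nothing, the download or decompression is worth repeating.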

Question from: https://stackoverflow.com/questions/65951856/trying-to-download-a-reference-genome-for-star-using-command-tools


