Database file installation and configuration
1. Download the database.zip file. The archive has the following directory structure:
database
└── diamond_db
    ├── Animals.dmnd
    ├── Fungi.dmnd
    ├── ko.dmnd
    └── Plants.dmnd
2. Extract database.zip in the software installation directory. Make sure the VGenomics_RS/database/diamond_db directory contains the following 4 files:
├── Animals.dmnd
├── Fungi.dmnd
├── ko.dmnd
└── Plants.dmnd
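
The two steps above can be sketched in Python as follows. This is a minimal sketch, not part of the software itself: the helper names `extract_database` and `missing_db_files` are hypothetical, and it assumes that the installation directory is the `VGenomics_RS` folder and that database.zip has the layout shown above.

```python
import zipfile
from pathlib import Path

# File names taken from the listing above.
EXPECTED_DB_FILES = ("Animals.dmnd", "Fungi.dmnd", "ko.dmnd", "Plants.dmnd")


def extract_database(zip_path, install_dir):
    """Extract database.zip into the installation directory (hypothetical helper)."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(install_dir)


def missing_db_files(install_dir):
    """Return the expected .dmnd files that are absent, sorted by name."""
    # Assumption: install_dir is the VGenomics_RS directory itself.
    db_dir = Path(install_dir) / "database" / "diamond_db"
    return sorted(name for name in EXPECTED_DB_FILES if not (db_dir / name).is_file())
```

For example, calling `extract_database("database.zip", "VGenomics_RS")` and then `missing_db_files("VGenomics_RS")` should return an empty list when all four files are in place.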
Test data details
Test data: ref-based_test_rawdata.zip
The archive contains the following files:
├── chr22_with_ERCC92.fa
├── chr22_with_ERCC92.gtf
├── gene_GO_anno.txt
├── gene_KEGG_anno.txt
├── HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq
├── HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq
├── HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq
├── HBR_Rep2_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq
├── HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq
├── HBR_Rep3_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq
├── UHR_Rep1_ERCC-Mix1_Build37-ErccTranscripts-chr22.read1.fastq
├── UHR_Rep1_ERCC-Mix1_Build37-ErccTranscripts-chr22.read2.fastq
├── UHR_Rep2_ERCC-Mix1_Build37-ErccTranscripts-chr22.read1.fastq
├── UHR_Rep2_ERCC-Mix1_Build37-ErccTranscripts-chr22.read2.fastq
├── UHR_Rep3_ERCC-Mix1_Build37-ErccTranscripts-chr22.read1.fastq
└── UHR_Rep3_ERCC-Mix1_Build37-ErccTranscripts-chr22.read2.fastq
Data description:
File | Description |
---|---|
chr22_with_ERCC92.fa | Reference sequence containing only a single chromosome (chr22) of the human GRCh38 genome from Ensembl, plus the ERCC spike-in sequences. |
chr22_with_ERCC92.gtf | Annotations obtained from Ensembl (Homo_sapiens.GRCh38.86.gtf.gz), for chromosome 22 only. |
gene_GO_anno.txt | GO functional annotation file. |
gene_KEGG_anno.txt | KEGG functional annotation file. |
*.fastq | The test data consists of two commercially available RNA samples: Universal Human Reference (UHR) and Human Brain Reference (HBR). The UHR is total RNA isolated from a diverse set of 10 cancer cell lines. The HBR is total RNA isolated from the brains of 23 Caucasians, male and female, of varying age but mostly 60-80 years old. In addition, a spike-in control was used: an aliquot of the ERCC ExFold RNA Spike-In Control Mixes was added to each sample. The spike-in consists of 92 transcripts that are present in known concentrations across a wide abundance range (from very few copies to many copies). |
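
The FASTQ file names encode the sample group (UHR or HBR), the replicate number, and the read of the pair (read1 or read2). The sketch below shows one way to group the files into paired-end samples. It assumes the naming pattern seen in the file list above; the function name `pair_fastq_files` and the regular expression are illustrative and not part of the software.

```python
import re
from collections import defaultdict

# Pattern derived from the file names listed above, e.g.
# HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq
FASTQ_NAME = re.compile(
    r"^(?P<group>UHR|HBR)_Rep(?P<rep>\d+)_ERCC-Mix(?P<mix>\d+)_.*"
    r"\.read(?P<read>[12])\.fastq$"
)


def pair_fastq_files(file_names):
    """Group FASTQ names into {(group, replicate): {"read1": ..., "read2": ...}}."""
    samples = defaultdict(dict)
    for name in file_names:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # skip anything that is not a FASTQ file of this data set
        key = (match["group"], int(match["rep"]))
        samples[key]["read" + match["read"]] = name
    return dict(samples)
```

Applied to the file list above, this yields six samples (three UHR and three HBR replicates), each with a read1 and a read2 file; other files such as the annotation tables are ignored.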