V-MitoSNP user manual
V-MitoSNP is a web-based interface designed and implemented under the SQL server database system. The database structure for mtSNPs and mtDNA sequences is downloaded from MITOMAP at http://www.mitomap.org/, and the mtSNP rs# IDs are downloaded from NCBI dbSNP, version b123, at ftp://ftp.ncbi.nih.gov/snp/. The restriction enzyme database for RFLP genotyping is downloaded from REBASE, version 601, at http://www.rebase.org. The restriction enzymes are transformed into the MySQL format and copied to a local database. V-MitoSNP uses two different input types: a graphic input interface (described under Part I) and a search input interface (described under Part II).
Part I: The visualization input
The graphic input interface provides color-coded gene functions (Figure 1). For example, green represents complex I type genes (NADH dehydrogenase), including MT-ND1, MT-ND2, MT-ND3, MT-ND4L, MT-ND4, MT-ND5, and MT-ND6. When the mouse cursor is moved over any region of the mtDNA genome graph (red circle in Figure 2), a central real-time display window (red box in Figure 2) opens and shows the gene name, the position range of the selected gene, the total number of SNPs within it, and the number of SNPs related to cancers or diseases.
The MT-ND5 gene is chosen here as an example to illustrate the general results for a gene input (Figures 3~13).
SNP information for an input gene is shown in Figure 3. The map locus, map position, shorthand, a description, the SNP number with or without cancer and disease, and the sequence for the selected SNP (red box) are shown. The numbers of total mtSNPs, cancer-related mtSNPs, and disease-related mtSNPs are shown above the red box. By default, the mtSNPs that have no cancer or disease report in MITOMAP are shown. Clicking “show sequence” displays the flanking sequence (500bp, 250bp on each side of the SNP).
Figure 3. Selectable options for mtSNP information.
Figure 4 shows the SNP information for the input gene, including the NCBI rs# ID, nucleotide position, nucleotide change, amino acid change, and RFLP availability (Yes; No).
Figure 4. SNP results for "SNP" selection in ND5 gene.
Disease- and cancer-related mtSNPs are shown in Figure 5 and Figure 6, respectively. Both figures provide extra information in addition to Figure 4, e.g. homoplasmy and heteroplasmy. V-MitoSNP also retrieves the full name of cancers and diseases via the MITOMAP hyperlink.
Figure 5. SNP results for "Cancer" selection in ND5 gene.
Figure 6. SNP results for "Disease" selection in ND5 gene.
When the check box “show sequence” in Figures 3, 4, 5, and 6 is selected, each SNP is provided with its corresponding flanking sequence (500bp), which users can use for primer design if needed (Figure 7). The SNP is shown in red.
Figure 7. Results of SNP flanking sequence for "Cancer" selection in ND5 gene.
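The flanking-sequence display described above can be illustrated with a short Python sketch. This is not V-MitoSNP's own code. The function name `flanking_window`, the toy sequence, and the choice to clip the window at the sequence ends (rather than wrap around the circular genome) are all assumptions made for illustration. Whether the tool counts the SNP base itself within the 500bp is also not stated in this manual.

```python
def flanking_window(genome, position, flank=250):
    """Return (left, snp_base, right) around a 1-based SNP position.

    Assumption: the window is clipped at the sequence ends instead of
    wrapping around the circular mtDNA genome.
    """
    if not 1 <= position <= len(genome):
        raise ValueError("position outside the sequence")
    index = position - 1                       # convert to 0-based index
    left = genome[max(0, index - flank):index]
    right = genome[index + 1:index + 1 + flank]
    return left, genome[index], right


# Toy sequence (not real mtDNA); the flank is shortened to keep output readable.
toy = "ACGTACGTAC" * 3
left, snp, right = flanking_window(toy, 15, flank=5)
print(f"{left}[{snp}]{right}")   # the SNP base is marked with brackets
```

With the default `flank=250`, the same call would return 250 bases on each side of the SNP, matching the window size described in this manual.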
Figure 8 shows RFLP information after selecting the symbol shown in Figures 4, 5, 6, and 7. Sequence (+) and sequence (-) represent the sense and antisense sequences, respectively. Sequence (+/-) contains both sequence (+) and sequence (-). Sequence (+/-) = 0 and sequence (+/-) = 1 represent the sequence with C in C13580G and with G in C13580G, respectively. Recognition sites of restriction enzymes (commercial and non-commercial) are shown in the 5’->3’ direction.
If the RFLP position shows the symbol, no natural restriction enzyme that distinguishes the alleles of the target SNP can be provided. However, an artificial (mismatched) primer can be designed to create a new restriction site (shown in Figure 13). This is one of the great advantages of V-MitoSNP, since it allows a user to design a mismatched PCR-RFLP and select an enzyme even if natural ones are not available.
Figure 8. Result of ND5 RFLP.
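The idea behind RFLP availability can be sketched in Python: an enzyme is useful for genotyping when its recognition site covers the SNP position in one allele but not in the other. The sketch below is an illustration under stated assumptions, not the V-MitoSNP implementation. The function names and the toy flanking sequences are hypothetical, recognition sites are treated as plain A/C/G/T strings (no degenerate bases), and both strands are checked through the reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand read in the 5'->3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def site_covers_snp(seq, snp_index, site):
    """True if `site` occurs on either strand and overlaps snp_index (0-based)."""
    n = len(site)
    patterns = {site, reverse_complement(site)}
    for start in range(max(0, snp_index - n + 1), snp_index + 1):
        if seq[start:start + n] in patterns:
            return True
    return False


def rflp_status(left, allele0, allele1, right, site):
    """Return (cuts sequence (+/-) = 0, cuts sequence (+/-) = 1)."""
    snp_index = len(left)
    cuts0 = site_covers_snp(left + allele0 + right, snp_index, site)
    cuts1 = site_covers_snp(left + allele1 + right, snp_index, site)
    return cuts0, cuts1


# Toy flanks (not real mtDNA) around a hypothetical C/G SNP, tested with CCGG.
print(rflp_status("AAGC", "C", "G", "GGTT", "CCGG"))
```

An enzyme discriminates the two alleles only when the two returned values differ. When both are False, no natural RFLP is available for that site, which corresponds to the case handled by the mismatched primer design.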
The primer design strategy depends on the RFLP availability for the target SNP. For SNPs with RFLP enzymes, the natural primer follows the default primer design conditions (Figures 9, 10, and 11). Three representative conditions in V-MitoSNP are shown for different mtSNPs in Figures 9, 10, and 11. Non-commercial enzymes are provided and shown in Figure 12. In addition, virtual electrophoresis results for ready-to-use natural and mismatched PCR-RFLPs are shown in Figures 9~11 and Figure 13, respectively.
Figure 9. Natural primer design and its corresponding restriction enzyme and genotype for SNP (T12384C) in ND5 gene. Restriction enzyme cut only at sequence (+/-) = 0.
Figure 10. Natural primer design and its corresponding restriction enzyme and genotype for SNP (T12441C) in ND5 gene. Restriction enzyme cut only at sequence (+/-) = 1.
Figure 11. Natural primer design and its corresponding restriction enzyme and genotype for SNP (C12815T) in ND5 gene. Restriction enzyme cut at both sequence (+/-) = 0 and sequence (+/-) = 1.
Figure 12. Recognition site CCCG (5'->3') for mtSNP C12815T in ND5 is provided by clicking the hyperlink "Non-Commercial Detail" in Figure 11.
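The virtual electrophoresis results mentioned above amount to listing the expected fragment lengths of a PCR product for each allele. The following Python sketch shows that calculation under simple assumptions: the amplicon is linear, digestion is complete, and cut positions are given as the number of bases to the left of each cut. The function names, the 300bp amplicon, and the cut position are hypothetical values chosen for illustration and are not taken from the figures.

```python
def digest(amplicon_length, cut_positions):
    """Fragment lengths (largest first) after cutting a linear amplicon.

    Each cut position is the number of bases to the left of the cut.
    Cuts outside the amplicon are ignored and duplicate cuts are merged.
    """
    cuts = sorted({p for p in cut_positions if 0 < p < amplicon_length})
    edges = [0] + cuts + [amplicon_length]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


def band_pattern(amplicon_length, cuts_allele0, cuts_allele1):
    """Expected bands for sequence (+/-) = 0 and sequence (+/-) = 1."""
    return {
        "sequence (+/-) = 0": digest(amplicon_length, cuts_allele0),
        "sequence (+/-) = 1": digest(amplicon_length, cuts_allele1),
    }


# Hypothetical case: the enzyme cuts only the allele with sequence (+/-) = 0.
for allele, bands in band_pattern(300, [120], []).items():
    print(allele, bands)
```

The three conditions in Figures 9, 10, and 11 correspond to passing a cut list for allele 0 only, for allele 1 only, or for both alleles with different cut positions.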
For SNPs without natural RFLP enzymes, V-MitoSNP provides a mismatched primer design function (Figure 13). In the forward primer, red and blue bases represent the mismatched and user-selected SNPs, respectively.
Figure 13. The mismatched primer design for mtSNP A14233G.
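The principle of a mismatched primer can be sketched as a small search: change one base in the primer region next to the SNP so that a recognition site appears with one allele only. The Python code below is a minimal sketch of that principle and is not the algorithm used by V-MitoSNP. The function names and toy sequences are hypothetical, only the forward strand is checked (which is sufficient for a palindromic site such as CCGG), and primer quality criteria such as melting temperature are ignored.

```python
def site_covers_snp(seq, snp_index, site):
    """True if `site` occurs in `seq` at a position overlapping snp_index."""
    n = len(site)
    return any(
        seq[start:start + n] == site
        for start in range(max(0, snp_index - n + 1), snp_index + 1)
    )


def mismatch_candidates(left, allele0, allele1, right, site):
    """List single-base primer changes that make `site` allele-specific.

    Each result is (offset from SNP, new base, cuts allele0, cuts allele1).
    Only bases upstream of the SNP (the forward primer region) are changed.
    """
    snp_index = len(left)
    results = []
    for pos in range(max(0, snp_index - len(site) + 1), snp_index):
        for base in "ACGT":
            if base == left[pos]:
                continue                      # not a mismatch
            primer_region = left[:pos] + base + left[pos + 1:]
            cuts0 = site_covers_snp(primer_region + allele0 + right, snp_index, site)
            cuts1 = site_covers_snp(primer_region + allele1 + right, snp_index, site)
            if cuts0 != cuts1:                # site present for one allele only
                results.append((pos - snp_index, base, cuts0, cuts1))
    return results


# Toy example (not real mtDNA): an A/G SNP with no natural CCGG site.
print(mismatch_candidates("TTAC", "A", "G", "GGAT", "CCGG"))
```

In this toy case a single change two bases upstream of the SNP creates CCGG only when the SNP base is G, so digestion would distinguish the two alleles.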
Part II: The search input
Under the search input interface, (1) keywords, (2) mt range and (3) mtDNA sequences are acceptable. These inputs are briefly described below.
(1) Keywords (Figure 14) such as gene locus (Figures 15, 16, and 17), disease (Figure 18), and NCBI rs# ID (Figures 19, 20, and 21) are allowed as input.
(2) Range input (Figure 22) can be selected by clicking the color-band graph twice using the “to” and “from” buttons, or by typing the start and end positions of the range directly (Figures 23, 24, and 25). The mtSNPs within the input range are marked in red for cancers and diseases (Figure 26).
(3) The mtDNA sequence input (Figure 27) is accepted in IUPAC format. By default, mtBLAST allows an input within a 10% mismatch range to the rCRS sequence. Unlike NCBI BLAST, mtBLAST is a gene-targeted search for mtDNA sequences. The results for SNPs without cancers/diseases, with cancer-related information, and with disease-related information are shown in Figures 28, 29, and 30. The mtSNPs within the input range are marked in red for cancers and diseases (Figure 31).
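The 10% mismatch tolerance for sequence input can be illustrated with a short Python sketch that compares an IUPAC-coded query with a reference of the same length. This is a simplification and not the mtBLAST algorithm: the comparison is ungapped, the function names are hypothetical, and treating exactly 10% as acceptable is an assumption, since the manual does not say whether the bound is inclusive. The IUPAC table itself is the standard nucleotide ambiguity code.

```python
# Standard IUPAC nucleotide codes and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_fraction(query, reference):
    """Fraction of positions where the query code excludes the reference base."""
    if not query or len(query) != len(reference):
        raise ValueError("query and reference must be non-empty and equal in length")
    mismatches = sum(
        1
        for q, r in zip(query.upper(), reference.upper())
        if r not in IUPAC.get(q, "")          # unknown symbols count as mismatches
    )
    return mismatches / len(query)


def within_tolerance(query, reference, limit=0.10):
    """True if the ungapped mismatch fraction does not exceed `limit`."""
    return mismatch_fraction(query, reference) <= limit


reference = "ACGTACGTAC"                         # toy reference, not rCRS
print(within_tolerance("ACGTACGTAN", reference))  # N is compatible with any base
print(within_tolerance("TCGTACGTAG", reference))  # two mismatches in ten bases
```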
(1). Keyword search module
Figure 14. Keyword search module.
(1)-1. Keywords by gene locus (Figures 15, 16, and 17)
Figure 15. Search result for MT-CO1 gene by default.
Figure 16. Search result for MT-CO1 gene by clicking "Cancer".
Figure 17. Search result for MT-CO1 gene by clicking "Disease".
(1)-2. Keywords by disease (Figure 18)
Figure 18. Search result for ADPD disease.
(1)-3. Keywords by NCBI rs# ID (Figures 19, 20, and 21)
Results for the mitochondrial SNP rs# ID, here for input rs2853516, are shown without cancers and diseases by default (Figure 19). mtSNPs with cancer- and disease-related information are shown in Figures 20 and 21, respectively.
Figure 19. Search mtSNP ID for rs2853516.
Figure 20. Search mtSNP ID for rs2853516 by clicking "Cancer".
Figure 21. Search mtSNP ID for rs2853516 by clicking "Disease".
(2) mt range input.
The range position is selectable by clicking the mtDNA color-band graph, and a real-time display with the positional information is provided. In Figure 22, positions 5303~5803 on the rCRS MITOMAP sequence are chosen. The gene coverage of the input and the results are shown in Figures 23, 24, and 25. All information for mtSNPs within the input range is shown in order of nucleotide position. All mtSNPs within the input range can be graphically displayed in red with or without cancer/disease information (Figure 26).
Figure 22. mt range selection interface.
Figure 23. Default results for mt range input from 5303 to 5803.
Figure 24. Result for mt range input from 5303 to 5803 by "Cancer".
Figure 25. Result for mt range input from 5303 to 5803 by "Disease".
Figure 26. Sequences with polymorphisms marked in the range from 5303 to 5803.
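The range search can be thought of as parsing each SNP shorthand into a position and keeping the entries that fall inside the selected range, ordered by nucleotide position. The Python sketch below illustrates this with the substitution shorthands that appear in Part I of this manual. It is not the database query used by V-MitoSNP. The function names and the example range are hypothetical, and only simple single-base substitutions of the form T12384C are handled.

```python
import re

# Single-base substitution shorthand: reference base, position, new base.
SHORTHAND = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_shorthand(text):
    """Split a shorthand such as 'T12384C' into (position, ref, alt)."""
    match = SHORTHAND.match(text.strip().upper())
    if match is None:
        raise ValueError(f"unsupported shorthand: {text!r}")
    ref, position, alt = match.groups()
    return int(position), ref, alt


def snps_in_range(shorthands, start, end):
    """Return parsed SNPs with start <= position <= end, sorted by position."""
    low, high = sorted((start, end))           # accept the bounds in either order
    parsed = (parse_shorthand(s) for s in shorthands)
    return sorted(snp for snp in parsed if low <= snp[0] <= high)


# Shorthands quoted in Part I of this manual; the range is an arbitrary example.
examples = ["A14233G", "C12815T", "T12384C", "C13580G", "T12441C"]
for position, ref, alt in snps_in_range(examples, 12400, 13000):
    print(position, ref, alt)
```

Because the bounds are sorted before filtering, the result is the same whether the “from” or the “to” position is given first.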
(3) Search human mitochondrial DNA sequence
The sequence in the range from 5303 to 5803 can also be used as input for mtBLAST in V-MitoSNP (Figure 27). The example sequence used here was:
CCCACCATCA TAGCCACCAT CACCCTCCTT AACCTCTACT TCTACCTACG CCTAATCTAC TCCACCTCAA TCACACTACT CCCCATATCT AACAACGTAA AAATAAAATG ACAGTTTGAA CATACAAAAC CCACCCCATT CCTCCCCACA CTCATCGCCC TTACCACGCT ACTCCTACCT ATCTCCCCTT TTATACTAAT AATCTTATAG AAATTTAGGT TAAATACAGA CCAAGAGCCT TCAAAGCCCT C
AGTAAGTTGC AATACTTAAT TTCTGTAACA GCTAAGGACT GCAAAACCCC ACTCTGCATC AACTGAACGC AAATCAGCCA CTTTAATTAA GCTAAGCCCT TACTAGACCA ATGGGACTTA AACCCACAAA CACTTAGTTA ACAGCTAAGC ACCCTAATCA ACTGGCTTCA ATCTACTTCT CCCGCCGCCG GGAAAAAAGG CGGGAGAAGC CCCGGCAGGT TTGAAGCTGC TTCTTCGAAT TTGCAATTCA
Figure 27. Sequence search (5303-5803) for human mitochondrial DNA (mtBLAST).
Gene coverage and the results are shown in Figures 28, 29, and 30. All information for mtSNPs within the input data is shown in the order of nucleotide positions. All mtSNPs with or without cancers/diseases can be graphically displayed in the sequence with red color (Figure 31).
Figure 28. Default results for sequence input from 5303 to 5803 (same as Figure 23).
Figure 29. Results for sequence input from 5303 to 5803 after selecting “Cancer” (same as Figure 24).
Figure 30. Results for sequence input from 5303 to 5803 after selecting “Disease” (same as Figure 25).
Figure 31. Input sequences and sequences with polymorphisms for cancer and disease marked (same as Figure 26).