Using high-resolution annotation of insect mitochondrial DNA to decipher tandem repeats in the control region

In this study, we used a small RNA sequencing (sRNA-seq) based method to annotate the mitochondrial genome of the insect Erthesina fullo Thunberg at 1 bp resolution. The high-resolution annotations cover the full length of both strands of the mitochondrial genome without any gaps or overlaps. Most of the new annotations were consistent with the previous annotations, which had been obtained using PacBio full-length transcripts. Two important findings were that animals transcribe both strands of their mitochondrial genomes in their entirety and that the tandem repeats in the control region of the E. fullo mitochondrial genome contain the repeated Transcription Initiation Sites (TISs) of the heavy strand. In addition, we found that the copy number of the tandem repeats varied greatly within an individual, suggesting that mitochondrial DNA recombination occurs within individuals. In conclusion, the sRNA-seq based method uses 5′ and 3′ end small RNAs to annotate nuclear non-coding and mitochondrial genes at 1 bp resolution, and can be used to identify new steady RNAs, particularly long non-coding RNAs (lncRNAs). The high-resolution annotations of mitochondrial genomes can also be used to study the molecular phylogenetics and evolution of animals, or to investigate mitochondrial gene transcription, RNA processing, RNA maturation and several other related topics. The complete mitochondrial genome sequence of E. fullo, with the new annotations obtained using the sRNA-seq based method, is available in the NCBI GenBank database under accession number MK374364. We make our theories, methods, and the high-quality sRNA-seq and RNA-seq data (SRA: SRP174926) publicly available for extensive use.