posted on 2023-05-26, 05:40 authored by Jinsong Shi, Shuangyong Yan, Wenjing Li, Xiurong Yang, Zhongqiu Cui, Junling Li, Guangsheng Li, Yuejiao Li, Yanping Hu, Shan Gao

Chloroplast and mitochondrial DNA (cpDNA and mtDNA) are separate from nuclear DNA (nuDNA) in a eukaryotic cell. The transcription system of chloroplasts differs from those of mitochondria and eukaryotes. In contrast to that of nuDNA and animal mtDNA, the transcription of cpDNA is still not well understood, primarily because transcription initiation sites (TISs) and transcription termination sites (TTSs) have not been identified at the genome scale. In the present study, we characterized the transcription of chloroplast (cp) genes more accurately and comprehensively using PacBio full-length transcriptome data from Arabidopsis thaliana. The major findings included the discovery of four types of artifacts, the validation and correction of cp gene annotations, the exact identification of TISs that start with G, and the discovery of polyA-like sites as TTSs. Notably, we proposed a new model to explain cp transcription initiation and termination at the whole-genome level. The four types of artifacts, along with degraded RNAs and splicing intermediates, deserve attention from researchers working with PacBio full-length transcriptome data, as these contaminant sequences can lead to incorrect downstream analyses. Cp transcription initiates at multiple promoters and terminates at polyA-like sites. Our study provides new insights into cp transcription and new clues for studying the evolution of promoters, TISs, TTSs and polyA tails of eukaryotic genes.


This work was supported by the West Light Foundation of the Chinese Academy of Sciences to Yanping Hu, and by the Tianjin Science and Technology Program Project (22ZYCGSN00330) to Zhongqiu Cui. The funding bodies played no role in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.