Database | EPD: the Eukaryotic Promoter Database

EPD in 2020: enhanced data visualization and extension to ncRNA promoters
Impact factor: 11.501
Nucleic Acids Res. 2020 Jan 8;48(D1):D65-D69. doi: 10.1093/nar/gkz1014.

Abstract
The Eukaryotic Promoter Database (EPD), available online at https://epd.epfl.ch, provides accurate transcription start site (TSS) information for promoters of 15 model organisms plus corresponding functional genomics data that can be viewed in a genome browser, queried or analyzed via web interfaces, or exported in standard formats (FASTA, BED, CSV) for subsequent analysis with other tools. Recent work has focused on the improvement of the EPD promoter viewers, which use the UCSC Genome Browser as visualization platform. Thousands of high-resolution tracks for CAGE, ChIP-seq and similar data have been generated and organized into public track hubs. Customized, reproducible promoter views, combining EPD-supplied tracks with native UCSC Genome Browser tracks, can be accessed from the organism summary pages or from individual promoter entries. Moreover, thanks to recent improvements and stabilization of ncRNA gene catalogs, we were able to release promoter collections for certain classes of ncRNAs from human and mouse. Furthermore, we developed automatic computational protocols to assign orphan TSS peaks to downstream genes based on paired-end (RAMPAGE) TSS mapping data, which enabled us to add nearly 9000 new entries to the human promoter collection. Since our last article in this journal, EPD was extended to five more model organisms: rhesus monkey, rat, dog, chicken and Plasmodium falciparum.

A promoter is conceptually defined as a transcription start site (TSS) or a transcription initiation region. The Eukaryotic Promoter Database (EPD, https://epd.epfl.ch) was created in 1986 to provide accurate TSS annotation backed by experimental evidence. EPD began as a database of results manually curated from journal publications. With the arrival of next-generation sequencing, it started to integrate promoter data derived from high-throughput transcript mapping data and high-quality gene annotation resources, and it has since extended its collections to ncRNA promoters. The updated database was described in January 2020 in the well-known journal Nucleic Acids Research.
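The abstract notes that EPD promoter collections can be exported in BED format for analysis with other tools. As a minimal sketch of what such downstream processing might look like, the Python below parses BED6-style records and computes a strand-aware upstream window for each one. The sample records, the gene names, the 500 bp window and the helper names `parse_bed6` and `upstream_window` are all hypothetical; the exact column layout of a real EPD export is an assumption here and should be checked against the downloaded file.

```python
from typing import List, NamedTuple, Tuple


class Promoter(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    name: str
    strand: str  # "+" or "-"


# Hypothetical records in BED6 layout: chrom, start, end, name, score, strand.
# These are made-up values for illustration, not real EPD entries.
SAMPLE = """\
track name="example promoters"
chr1\t1000\t1060\tGENE_A_1\t900\t+
chr2\t5000\t5060\tGENE_B_1\t900\t-
"""


def parse_bed6(text: str) -> List[Promoter]:
    """Parse BED6 text, skipping blank lines, comments and track/browser lines."""
    promoters = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name, _score, strand = line.split("\t")[:6]
        promoters.append(Promoter(chrom, int(start), int(end), name, strand))
    return promoters


def upstream_window(p: Promoter, length: int = 500) -> Tuple[int, int]:
    """Return the 0-based half-open interval of `length` bp upstream of the feature.

    "Upstream" lies before `start` on the plus strand and after `end`
    on the minus strand.
    """
    if p.strand == "+":
        return max(0, p.start - length), p.start
    return p.end, p.end + length


if __name__ == "__main__":
    for promoter in parse_bed6(SAMPLE):
        print(promoter.name, promoter.strand, upstream_window(promoter))
```

Handling the strand explicitly matters because a promoter on the minus strand has its upstream region at higher genomic coordinates, which is an easy detail to get wrong when working with exported intervals.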
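[Figure: EPD database website home page]

Since January 2017, EPD has added promoter collections for chicken, dog, rat, rhesus monkey and the malaria parasite Plasmodium falciparum, and has released new versions of the datasets for human, mouse, Drosophila, Arabidopsis and other species. The current promoter entries are summarized in the table below. With the release of the Plasmodium falciparum promoter collection, EPD covers a human pathogen for the first time, an important step in a new direction.

[Table: Organisms covered by EPD and the corresponding total number of promoters]

[Figure: Frequency and positional distribution of core promoter motifs relative to the TSSs of human coding and non-coding genes as defined in EPD]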