Continuously extensible database provides insights into our cells
Our bodies consist of many types of cells, and some genes are expressed only in specific cell types. In the FANTOM5 (Functional Annotation of the Mammalian Genome 5) project, we analyzed over 3,000 human and mouse samples and obtained large-scale data and findings on genes and transcriptional regulation across many cell types. These results are useful both for basic research using mammalian cells and for applied research targeting human health and drug discovery, provided that researchers can easily access the dataset. In this research, we developed a database system, SSTAR (Semantic catalog of Samples, Transcription initiation And Regulators), to provide access to these results from the FANTOM5 project. SSTAR is original both in the way it was developed and in its ease of use. Scientific data, including the FANTOM5 data, tend to grow not only in size but also in the variety of data types, which requires frequent updates to the database system. To address this issue, we used an existing document management system, Semantic MediaWiki, which enables flexible and continuous updates of scientific data such as the FANTOM5 data. The achievement contributes to the research community in two ways: it provides a useful database for research purposes, and it proposes a new development method for scientific databases.
Imad Abugessaisa, Hisashi Shimoji, Serkan Sahin, Atsushi Kondo, Jayson Harshbarger, Marina Lizio, Yoshihide Hayashizaki, Piero Carninci, The FANTOM consortium, Alistair Forrest, Takeya Kasukawa, and Hideya Kawaji, "FANTOM5 transcriptome catalogue of cellular states based on Semantic MediaWiki", Database (Oxford University Press), doi: 10.1093/database/baw105
A press release in Japanese is also available.