The journal is a collaboration between
BGI Shenzhen, the world’s largest genomics center, and the open access publisher
BioMed Central. The first issue of
GigaScience includes a manuscript demonstrating the novel format of the journal. Stephan Beck’s group at University College London published an
article on methods for whole genome analyses of DNA methylation. This article includes a section marked “Availability of supporting data”. This section links to an entry in the article’s references which displays a
DOI link to the supporting data in GigaDB. The supporting data for this particular manuscript is 84 GB in size.
GigaDB will accept datasets up to 14 TB in size, so it is a valuable resource for authors who want to share their data or who are required to do so by U.S. federal mandates such as the
NIH Data Sharing Policy or the
NSF Data Management Plan Requirements. The combination of manuscript, supporting data, and links to software or code for analyzing the data will improve the reproducibility of computational results in experiments involving large datasets, and will also improve the likelihood of new discoveries from repurposed or shared data. The journal also aligns with the goals of the U.S.
White House Office of Science and Technology Policy on access to "Big Data". Scientists in other countries, not just the U.S., are also seeking ways to link data to manuscripts, as well as quality open access publication venues. The fact that
GigaScience’s support database is hosted by an international organization (BGI) attests to the universal importance of, and demand for, such a publication model. Many readers may think that this journal is a genomics journal, but
GigaScience is meant to be more general than just genomics. The journal’s
Aims & Scope page states: “Our scope covers not just 'omic' type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale sharable data.” The first issue of
GigaScience also includes an editorial on GigaDB, a few commentaries, and an interesting
review on the future of DNA sequence archiving, which had already been accessed over 2300 times within two weeks of its publication. This manuscript advocates different levels of DNA sequence compression for archiving purposes: unique and expensive-to-reproduce data would be stored at a low compression rate, while easy-to-reproduce data would be compressed at a higher factor. The issue also includes an editorial piece by Jonathan Eisen, an evolutionary biologist and Open Access advocate. The article is entitled
“Badomics words and the power and peril of the ome-meme”. It is both funny and true, and it cautions the reader against creating and propagating “badomes”: “omics” words that are nonsensical, misleading or just plain silly.

Why is this journal important? It provides a valuable service, since many researchers may have no other means of depositing online the large files produced by their experiments, such as BAM and FASTQ files. Many other journals do provide access to supporting documents or supplemental files, but do not provide storage for files of such a large size. It is my hope that the journal proves successful and sustainable, and that other journals consider duplicating this service.