GigaScience: open access to manuscripts plus datasets and codes!


On July 12, a new peer-reviewed open access journal, titled GigaScience, published its first set of articles. GigaScience represents a novel format for science journals: manuscripts in GigaScience are published with links to software tools used for analysis and to corresponding datasets in the journal’s integrated database GigaDB, allowing authors to store and share huge amounts of data.


GigaScience, a novel, peer-reviewed open access journal
The journal is a collaboration between BGI Shenzhen, the world’s largest genomics center, and the open access publisher BioMed Central. The first issue of GigaScience includes a manuscript demonstrating the novel format of the journal: Stephan Beck’s group at University College London published an article on methods for whole genome analyses of DNA methylation. This article includes a section marked “Availability of supporting data”, which links to an entry in the article’s references that displays a DOI link to the supporting data in GigaDB. The supporting data for this particular manuscript is 84 GB in size.
GigaDB will accept datasets up to 14 TB in size, so it is a valuable resource for authors who want to share their data or who are required to do so by federal mandates such as the U.S. NIH Data Sharing Policy or the U.S. NSF Data Management Plan Requirements. The combination of manuscript, supporting data and links to software or code to analyze the data will improve the reproducibility of computational results in experiments involving large datasets, and will also improve the likelihood of new discoveries from repurposed or shared data. The journal also falls in line with the goals of the U.S. White House Office of Science and Technology Policy on access to "Big Data".
Scientists outside the U.S. are also seeking ways to link data to manuscripts, as well as quality open access publication venues. The fact that GigaScience’s supporting database is hosted by an international organization (BGI) attests to the universal importance of, and demand for, such a publication model. Many readers may think that this journal is a genomics journal, but GigaScience is meant to be more general than just genomics.
The journal’s Aims & Scope page states: “Our scope covers not just 'omic' type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale sharable data.”
The first issue of GigaScience also includes an editorial on GigaDB, a few commentaries, and an interesting review on the future of DNA sequence archiving, which had already been accessed more than 2,300 times within two weeks of its publication. This review advocates different levels of DNA sequence compression for archiving purposes: unique and expensive-to-reproduce data would be stored at a low compression rate, while easy-to-reproduce data would be compressed at a higher factor. The first issue also includes an editorial piece by Jonathan Eisen, evolutionary biologist and Open Access advocate, entitled “Badomics words and the power and peril of the ome-meme”. It is both funny and true, and cautions the reader to avoid the creation and propagation of “badomes”: “omics” words that are nonsensical, misleading or just plain silly.
Why is this journal important? It provides a valuable service, since many researchers may have no other means of depositing online the large datasets their experiments produce, such as BAM files, FASTQ files and other large files. Many other journals do provide access to supporting documents or supplemental files, but do not provide a means of storing files of such a large size. It is my hope that the journal proves successful and sustainable, and that other journals consider duplicating this service.

About the Author

Pamela L. Shaw (@Bioscibrarian) is a Biosciences Librarian at Northwestern University in Chicago, IL. She applies her research background to support basic science research at the Feinberg School of Medicine. A former lab technician with almost 20 years of bench experience, she understands the process of scientific research and publication and is deeply interested in communication technologies. She writes a Biosciences Blog with news and resources of interest to the biosciences and research community.

To learn more about GigaScience, see:

Pamela L. Shaw (2012). GigaScience: open access to manuscripts plus datasets and codes! MyScienceWork
