Academic Torrents: Bringing P2P Technology to the Academic World

Knowledge dissemination made incredibly easy

Knowledge dissemination is both a trending and a challenging topic. For articles, the best way to publish and distribute them is still far from clear, and data sharing comes with its own set of hurdles. Help may have arrived in the form of Academic Torrents, created by two computer science PhD students, Joseph Paul Cohen and Henry Z Lo, who have chosen to apply the widely known peer-to-peer (P2P) technology and, more specifically, BitTorrent to academic output.

Image: academictorrents.com

 

Wondering what could be done to make knowledge dissemination easier, Joseph Paul Cohen thought of using P2P to make the most of the resources researchers already have at hand for sharing their content. Usually, sharing the products of research requires servers, heavyweight institutional software, or reliance on third parties. This can be costly and has numerous flaws: How sure can you be that your paper will still be available tomorrow? Bandwidth can be limited, and what guarantees that your website will stay online without interruption? After he discussed the idea with his colleague Henry Z Lo, the two realized that their solution could help with disseminating articles and fit the needs of data sharing as well.

While a paper can take up a few megabytes, 100 MB of research data is a minimum for most fields and reaching gigabytes is common. For more specific fields or applications, like genetic research, space studies or physics, terabytes of data are the minimum. Even in the age of big data, you still need to get a copy of the files before analyzing the data. "Sending files over a hard drive is still common," comments Joseph Paul Cohen. P2P is about letting peers exchange parts of a file they are downloading (or have already downloaded). This fundamental point enables very fast transmission and avoids relying on a single server for downloads that can take days or more. Moreover, although we hope, for the sake of the scientific community, that the project will last a very long time, the Academic Torrents website does not even need to be online for users to download data, since such a P2P system has no single point of failure.
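To make the idea of exchanging parts of a file more concrete, here is a minimal Python sketch of the piece mechanism. It rests on one known fact, that classic BitTorrent metadata lists a SHA-1 hash for each piece of the file, and on several assumptions made purely for illustration: the 256 KiB piece size, the function names, and the in-memory "peers" are all hypothetical, and no networking is involved. It is not the Academic Torrents implementation, only a way to see why pieces can safely come from any peer, in any order.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece length in bytes; real torrents choose their own


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut the payload into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list[bytes]) -> list[bytes]:
    """SHA-1 digest of each piece, as would be published in the torrent metadata."""
    return [hashlib.sha1(p).digest() for p in pieces]


def verify_piece(index: int, piece: bytes, hashes: list[bytes]) -> bool:
    """Accept a piece from any peer only if its digest matches the published one."""
    return hashlib.sha1(piece).digest() == hashes[index]


def assemble(received: dict[int, bytes], hashes: list[bytes]) -> bytes:
    """Rebuild the file from verified pieces, whatever order or peer they came from."""
    if set(received) != set(range(len(hashes))):
        raise ValueError("missing pieces")
    for i, piece in received.items():
        if not verify_piece(i, piece, hashes):
            raise ValueError(f"piece {i} failed verification")
    return b"".join(received[i] for i in range(len(hashes)))


if __name__ == "__main__":
    dataset = bytes(range(256)) * 4096  # 1 MiB stand-in for a research dataset
    pieces = split_into_pieces(dataset)
    hashes = piece_hashes(pieces)
    # Pretend the odd-numbered pieces came from one peer and the even ones from another.
    received = {i: p for i, p in enumerate(pieces) if i % 2 == 1}
    received.update({i: p for i, p in enumerate(pieces) if i % 2 == 0})
    assert assemble(received, hashes) == dataset
    print(f"{len(pieces)} pieces verified and reassembled")
```

Because every piece is checked against a published hash, a downloader does not have to trust any individual peer, which is what lets the load be spread across everyone who holds a copy.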

Image: academictorrents.com

The idea of sharing data among peers already existed in a website called BioTorrents, which the founders of Academic Torrents discovered after they had started development. BioTorrents does not seem to have attracted enough interest to become a sustainable project; only about one dataset a year seems to be uploaded. The new project, born in November 2013, is much broader, targeting both publications and research data, and has already received much interest from the community. Following the founders’ announcement, a post on Hacker News attracted 50,000 unique visitors in three days, who downloaded 200 TB of data (the equivalent of more than 300,000,000 average-sized MP3s), a volume that Academic Torrents handled easily using nothing more than a $25-per-month hosting service.
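A quick back-of-the-envelope calculation shows what those figures imply. The Python sketch below takes the 200 TB, 50,000 visitors and three days quoted above, and assumes decimal units (1 TB = 10^12 bytes) and downloads spread evenly over the whole period; both assumptions are ours, and the function names are hypothetical.

```python
TOTAL_BYTES = 200 * 10**12      # 200 TB, assuming decimal terabytes
VISITORS = 50_000               # unique visitors quoted in the article
WINDOW_SECONDS = 3 * 24 * 3600  # three days


def per_visitor_gb(total_bytes: int, visitors: int) -> float:
    """Average download volume per visitor, in gigabytes."""
    return total_bytes / visitors / 10**9


def sustained_gbit_per_s(total_bytes: int, seconds: int) -> float:
    """Average aggregate throughput, in gigabits per second, if spread evenly."""
    return total_bytes * 8 / seconds / 10**9


if __name__ == "__main__":
    print(f"{per_visitor_gb(TOTAL_BYTES, VISITORS):.1f} GB per visitor")
    print(f"{sustained_gbit_per_s(TOTAL_BYTES, WINDOW_SECONDS):.2f} Gbit/s sustained")
```

Under these assumptions the result is about 4 GB per visitor and roughly 6 Gbit/s sustained across the swarm, a rate that gives a sense of why the transfers themselves had to be carried by the peers rather than by a single low-cost host.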

While the project is still a newcomer to the tech world, the founders, who have so far done all this without funding (apart from their PhD grants), appear to be at the beginning of a great success. Although they have met with no legal problems so far, they have already given thought to this concern, specifying that all submissions should be clear about their license and contain a "reshare" option.

Last week, MyScienceWork was live-tweeting from the #dataGFII day, where one of the key points about open research data was the need for resilient means of storing data. Academic Torrents may offer part of the answer, since it provides an infrastructure that addresses this resilience problem; the storage itself remains to be provided by peers.