Sharing & Preserving Beautiful Graphs With Your Data

To see the interactive versions of these charts, head over to our blog. If you’re inclined to share, here is the link: http://blog.plot.ly/post/104937857072/sharing-preserving-beautiful-graphs-with-your

We painstakingly craft beautiful, complex, important graphs. Then the data is lost on an old hard drive, desktop, spreadsheet, or email. We can’t reproduce experiments or build on research, and the loss may cast doubt on the research itself. Plotly solves this problem by uniting your data, graphs, and code online. Read on to learn more about lost data and how to preserve data with our free cloud-based product or with Plotly on-premise, on your servers.

Two recent studies looked at how hard it is to track down data from published research. The authors of “The Dawn of Open Access to Phylogenetic Data” examined how the impact factor of the publishing journal influenced their ability to obtain partial datasets (top two plots) and complete datasets (bottom two), either from online archives (left two) or by asking the authors for the data (right two). The shaded sections show a 95% confidence interval. They concluded:

Generally, studies published in journals with a higher impact factor are more likely to both deposit the corresponding (partial or complete) datasets in online archives and to provide those data upon direct request.

See the interactive plot

In another study, researchers sought the data behind 516 studies that were between 2 and 22 years old. They concluded that:

  • The odds of a data set being reported as extant fell by 17% per year.
  • Broken e-mails and obsolete storage devices were the main obstacles to data sharing.
See the interactive plot
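
To get a feel for what a 17% yearly drop in odds means over the age range studied, here is a minimal sketch. It assumes the decline compounds as a constant factor of 0.83 per year, which is one reading of the quoted result rather than the study's own model. The function names are ours.

```python
# Assumption: "fell by 17% per year" is read as a constant multiplicative
# factor of 0.83 applied to the odds each year.
YEARLY_FACTOR = 1 - 0.17


def relative_odds(years: float) -> float:
    """Odds that a data set is extant, relative to the odds at year 0."""
    return YEARLY_FACTOR ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds, defined as p / (1 - p), back to a probability p."""
    return odds / (1 + odds)


if __name__ == "__main__":
    # The study covered papers between 2 and 22 years old.
    for years in (2, 10, 22):
        ratio = relative_odds(years)
        print(f"after {years:2d} years: {ratio:.3f} x the starting odds")
```

Under this reading, the odds are already down to about 0.69 of their starting value after two years, and the decline keeps compounding from there.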

Thus age can serve as a predictor for data loss. Older papers also have more citations, simply because they have had more time to accumulate them. Newer papers are more likely to archive data.

See the interactive plot

The authors of the study plotted above also ran regressions for each publication year individually. They noted a bump in citations for papers that shared data.

The citation benefit was greatest for data published in 2004 and 2005, at about 30%.

See the interactive plot

Data behind published studies should be available, especially if government funding went into the research. The plot below from figshare shows a trend towards open access publication in the Web of Science, a scientific citation indexing service.

See the interactive plot

We’re excited to publish graphs, data, and code together, but publishing research and data together isn’t a new idea. The Journal des sçavans first published research and data together in 1665. A 1914 academic report published this figure advocating publishing data with figures.

Instead of emailing authors for data, we can jointly publish figures, data, and code to reproduce our work. As a recent blog post by Jure Triglav notes,

[T]he problem of the way we create scientific charts and figures should simply be recognized. The mistakes of flattening each figure and compressing it, mangling the data, converting vector illustrations into raster images — all of those should be recognized and addressed.

Tools at various levels of the analysis, visualization, and publication workflow are enabling collaboration and tapping into the potential of the web.

Let’s use the plot below to illustrate how Plotly solves the problem. We’ve added an HTML link to the source. Plots are interactively rendered with D3.js, a JavaScript visualization library, and shared at a URL: https://plot.ly/~Dreamshot/640/. Figures embedded in blogs, Notebooks, and websites can link back to our work using iframes, which let you embed live Plotly graphs on other pages across the internet.

<iframe width="710" height="750" frameborder="0" seamless="seamless" scrolling="no" src="https://plot.ly/~cimar/250.embed?width=710&height=750"></iframe>
See the interactive plot
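
The iframe above follows a simple pattern: the plot's share URL, an `.embed` suffix, and the size as query parameters. A minimal sketch of generating that tag is shown here. The `embed_iframe` helper is hypothetical and not part of any Plotly library, and the URL pattern is inferred only from the snippet above.

```python
from html import escape
from urllib.parse import urlencode


def embed_iframe(plot_url: str, width: int = 710, height: int = 750) -> str:
    """Build an iframe tag for a plot share URL (hypothetical helper).

    Mirrors the snippet above: append ".embed" to the share URL and pass
    the size as query parameters.
    """
    base = plot_url.rstrip("/")
    query = urlencode({"width": width, "height": height})
    # escape() writes "&" as "&amp;" inside the attribute, which browsers
    # decode back to the same URL.
    src = escape(f"{base}.embed?{query}")
    return (
        f'<iframe width="{width}" height="{height}" frameborder="0" '
        f'seamless="seamless" scrolling="no" src="{src}"></iframe>'
    )


if __name__ == "__main__":
    print(embed_iframe("https://plot.ly/~cimar/250/"))
```

Keeping the width and height in one place means the tag attributes and the query string cannot drift apart when you resize an embed.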

From the URL, we can export the data or plot as a PDF, PNG, or SVG file to embed the plot in a paper, PowerPoint, or email. We can see code to make the plot using Python, MATLAB, R, and Julia. We can see the JavaScript Object Notation (JSON) description of the figure, a human-readable format Plotly uses to describe graphs. Any time we share one of those elements of the plot (exported image, code, or interactive plot), we can include the URL and share the whole figure while highlighting one part of it.
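
Here is a minimal sketch of what such a JSON description can look like, assuming the usual shape of a `data` list of traces plus a `layout` object. The trace values and titles are invented for illustration and are not taken from the plot above.

```python
import json

# A made-up figure: one scatter trace plus a layout. Only the top-level
# "data" / "layout" split reflects how Plotly figures are organized;
# every value here is illustrative.
figure = {
    "data": [
        {
            "type": "scatter",
            "mode": "lines+markers",
            "x": [1, 2, 3],
            "y": [4, 1, 7],
            "name": "example trace",
        }
    ],
    "layout": {
        "title": "Example figure",
        "xaxis": {"title": "x"},
        "yaxis": {"title": "y"},
    },
}

# Serialize to human-readable JSON, then read it back unchanged.
text = json.dumps(figure, indent=2, sort_keys=True)
restored = json.loads(text)

if __name__ == "__main__":
    print(text)
```

Because the description is plain JSON, the data travels with the figure: anyone holding the text can recover the exact `x` and `y` values instead of reading them off a raster image.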

Proprietary, paid, or complex software that only runs on a particular OS or browser can bottleneck data and reproducibility. With Plotly, sharing is free and easy. If someone wants to edit the plot or data in Plotly, they can do so without downloading, installing, or paying. It’s all online (or on-premise if you use Plotly on your servers). Here are a few screenshots showing how it works.

Plotly is bringing powerful scientific and technical computing tools to the web. Our interoperable APIs for Python, R, MATLAB, Julia, and Excel let you and your team collaborate, live-stream data, make plots with LaTeX, and craft 3D plots.

We can use LaTeX to add fractions, equations, and symbols to our plots by wrapping them in $ signs. Here is a tutorial.

See the interactive plot
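
As a rough sketch of the `$` wrapping described above, the snippet below builds a layout dictionary whose titles are LaTeX strings. The `latex` helper and the formulas are our own illustration, and the dictionary stands in for whatever layout object your Plotly client expects.

```python
import json


def latex(expr: str) -> str:
    """Wrap a LaTeX expression in dollar signs (hypothetical helper)."""
    return f"${expr}$"


# Raw strings (r"...") keep Python from treating LaTeX backslashes
# such as \f or \n as escape sequences.
layout = {
    "title": latex(r"\frac{\partial u}{\partial t} = \alpha \nabla^2 u"),
    "xaxis": {"title": latex(r"x \; (\mathrm{m})")},
    "yaxis": {"title": latex(r"\sqrt{u}")},
}

if __name__ == "__main__":
    print(json.dumps(layout, indent=2))
```

The raw strings matter more than the helper: without them, `"\frac"` would silently become a form-feed character followed by `rac`.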

Our next plot shows a streaming 3D plot. Made in an IPython Notebook, the plot simulates a chaotic solution to the Lorenz system, also known as the Lorenz attractor or butterfly. You can stream new data to your plots every day, every minute, or every 50 ms. You and your team can see the same updated stream of data and graphs.

See the interactive plot
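
For readers who want to generate this kind of data themselves, here is a minimal sketch of a Lorenz trajectory produced one point at a time. It assumes the classic chaotic parameters (sigma = 10, rho = 28, beta = 8/3) and a simple forward-Euler step. The notebook behind the plot may use different values or a different integrator, and the code that pushes points to a Plotly stream is left out.

```python
from itertools import islice
from typing import Iterator, Tuple

Point = Tuple[float, float, float]


def lorenz_step(p: Point, dt: float, sigma: float = 10.0,
                rho: float = 28.0, beta: float = 8.0 / 3.0) -> Point:
    """Advance the Lorenz system by one forward-Euler step of size dt."""
    x, y, z = p
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dx * dt, y + dy * dt, z + dz * dt)


def lorenz_stream(start: Point = (1.0, 1.0, 1.0),
                  dt: float = 0.01) -> Iterator[Point]:
    """Yield an endless sequence of points along the trajectory."""
    p = start
    while True:
        yield p
        p = lorenz_step(p, dt)


if __name__ == "__main__":
    # Print the first few points; a streaming client would instead send
    # each point to the plot and wait (for example 50 ms) between sends.
    for x, y, z in islice(lorenz_stream(), 5):
        print(f"{x:.4f} {y:.4f} {z:.4f}")
```

A generator suits streaming because the consumer decides the pace: the same source works whether you send a point every 50 ms or every minute.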

For more, see our collection of 3D plots.

We’re @plotlygraphs and on GitHub. We welcome your feedback, thoughts, and suggestions. Many thanks to Marieke Guy of OKFN and Grant R. Vousden-Dishington for helpful suggestions.
