Arctic Data Explorer

Problem

Before the Arctic Data Explorer (ADE), researchers needed to know which datasets were in which repositories. Since datasets are not discoverable through tools like Google, scientists and graduate students had to do considerable legwork to find and share reusable datasets. In fact, scientists often spent months planning data collection expeditions just to get their hands on data that already existed, but couldn’t be found.

Constraints

  • All tools and code must be open source
  • The stack must be shared with the internal search tool at the National Snow and Ice Data Center (NSIDC)
  • The repositories must have a microservice or API feed that meets minimum metadata requirements (title, summary, link to data, and geolocation coverage)
  • The search interface must be scalable to handle hundreds of thousands of datasets from several dozen repositories
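
To make the minimum metadata constraint concrete, here is a minimal sketch of the kind of check a harvester could run on each feed record. It is written in Python for brevity and is not ADE's actual code. The field names (`title`, `summary`, `data_url`, `spatial_coverage`) and the function `missing_metadata` are assumptions made for illustration; the project only specified the four concepts.

```python
from typing import Any, List, Mapping

# Hypothetical field names for the four required concepts:
# title, summary, link to data, and geolocation coverage.
REQUIRED_FIELDS = ("title", "summary", "data_url", "spatial_coverage")


def _is_blank(value: Any) -> bool:
    """Treat None, whitespace-only strings, and empty containers as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, dict)):
        return len(value) == 0
    return False


def missing_metadata(record: Mapping[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty in a feed record."""
    return [name for name in REQUIRED_FIELDS if _is_blank(record.get(name))]
```

A record would be accepted for indexing only when `missing_metadata(record)` returns an empty list; anything else could be reported back to the repository.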
Architecture – midway through the project

Team

Leadership Team – Mark Parsons, Lynn Yarmey

Development Team – Brendan Billingsley, Jonathan Kovarik, Stuart Reed, Michael Brandt, Chris Chalstrom, Kate Heightley, Matt Savoie, and Danielle Harper

I played multiple roles on this product. I started as UX Researcher and later added the role of tactical Product Owner.

Process

Testing began with the ACADIS Advisory Committee in 2012. We used card sorting to determine the most important searchable aspects of a dataset, then built requirements around them.

2013 was filled with pre-launch research: semi-structured interviews, contextual inquiry, surveys, and an A/B test. Each round resulted in developer stories that improved aesthetics, map interaction, and basic search functionality.

The post-launch research in 2014 included a heuristic evaluation, online semi-structured interviews with our international users, and keyword search analytics through Libre and Google Analytics. The results were improved error handling, better documentation (especially the internal API Swagger implementation), more scannable search results, and a redesign of search facet options.

Key Findings

Ideally, we wanted each repository’s feed to be parsed automatically. Our intent was to use open source software called GI-CAT, but the implementation ultimately slowed down the site and made the search tool prone to bugs. In 2014, we decided to remove GI-CAT and instead use internal data translators (and sometimes manual labor from devs) to transform data feeds into the formats we needed. We also documented our APIs and microservices with Swagger and waited for technology to catch up to where we wanted it to be.
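
For readers unfamiliar with the idea of a data translator, the sketch below shows the general pattern: one small function per repository that maps that repository's feed format onto a common record. This is an illustration in Python, not the team's implementation. The repositories (`repo_a`, `repo_b`), their field names, and the `DatasetRecord` type are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Mapping, Tuple


@dataclass(frozen=True)
class DatasetRecord:
    """Common record shape used by the search index (hypothetical)."""
    title: str
    summary: str
    data_url: str
    bounding_box: Tuple[float, float, float, float]  # (west, south, east, north)


def translate_repo_a(raw: Mapping[str, Any]) -> DatasetRecord:
    # Hypothetical feed A stores the bounding box as "west,south,east,north".
    west, south, east, north = (float(part) for part in raw["bbox"].split(","))
    return DatasetRecord(raw["name"], raw["abstract"], raw["link"],
                         (west, south, east, north))


def translate_repo_b(raw: Mapping[str, Any]) -> DatasetRecord:
    # Hypothetical feed B stores the extent as a nested mapping.
    extent = raw["extent"]
    box = tuple(float(extent[key]) for key in ("west", "south", "east", "north"))
    return DatasetRecord(raw["title"], raw["description"], raw["url"], box)


# One translator per repository, looked up by a source identifier.
TRANSLATORS: Dict[str, Callable[[Mapping[str, Any]], DatasetRecord]] = {
    "repo_a": translate_repo_a,
    "repo_b": translate_repo_b,
}


def translate(source: str, raw: Mapping[str, Any]) -> DatasetRecord:
    """Convert a raw feed record from the named source into the common shape."""
    if source not in TRANSLATORS:
        raise ValueError(f"no translator registered for {source!r}")
    return TRANSLATORS[source](raw)
```

The trade-off of this pattern matches what we experienced: each new repository costs some developer time to write a translator, but a slow or broken feed cannot degrade the search tool itself.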

Visualization of statements from Contextual Inquiry and Think Aloud interviews

Results

A more detailed account of the process can be found on the NSIDC website. You may also be interested in the FAQs.

The image below is an overview poster presented at the Research Data Alliance plenary in 2014. Other successes of the Arctic Data Explorer: the National Science Foundation used the software exclusively for an international data visualization workshop, the Jet Propulsion Laboratory and Caltech use it for software development classes, it is an early adopter of data and software Digital Object Identifiers (DOIs), and it is fully open source, including a Ruby gem.

[Image: overview poster presented at the Research Data Alliance plenary, 2014]

What are Scientists Saying about ADE?

  • The variety of data available is “unique” and “gives ideas of things to research next”
  • “What I’ve found is great. This is something I would have had a hard time finding. And to note, I didn’t even know this data was floating around anywhere!”
  • “Being able to scan the spatial coverage of each dataset is very exciting especially since I didn’t even do a spatial search… Also, the map image broke up the visual and need to read all that text.”
  • The Arctic Data Explorer “instantly brought up some stuff for me that I wouldn’t have even bothered to try to look for online because I wouldn’t have had the first idea. See, I don’t know what the Earth Observing Laboratory is. It’s not a repository I’m familiar with, so I wouldn’t have gone looking for this data. I wouldn’t assume that it was out there anywhere.”
  • The results of my search were “much more comprehensive than I was looking for – in a good way!”
  • “Most of the time, I have to find data through the rumor mill and pound the pavement making calls and emails. It seems most data I’m looking for are either not released or just released. Without a network, I wouldn’t be able to find the data I need.”
  • The Arctic Data Explorer “is so much more inclusive (than other ways of searching) – I can find things without going all over. One place is nice.”
  • “It’s incredible that you guys are doing this!”
