Extracting Tables in Sysrev

Sysrev "Group" Labels allow reviewers to extract spreadsheet-like data from documents.


What is a Group Label?

Put simply, a Group Label is a label-of-labels.  Group Labels allow reviewers to extract multiple sets of information on an as-needed basis.  

If you imagine labels as the column headers in a table, then the output of most reviews is a table correlating the extracted data to its source document. But what happens when a single source document has too much information for one row in the table to capture? Techniques such as comma-delimited data are an option, but they add another layer of formatting complexity. This can be particularly cumbersome in large, collaborative projects.

Group Labels allow reviewers to extract multiple rows of data from a single source document.
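To make the difference concrete, here is a minimal Python sketch of the two approaches described above: one comma-delimited row versus several rows for the same document. The field names (`article_id`, `effect`, `severity`), the values, and the helper `split_flat_row` are all made up for illustration; none of this is Sysrev code or Sysrev's data format.

```python
# Hypothetical data: one document with two extracted health effects.
# Comma-delimited approach: everything is squeezed into a single row.
flat_row = {
    "article_id": 1,
    "effect": "nausea, headache",
    "severity": "mild, moderate",
}

# Multi-row approach: one row per extracted item, all tied to the same document.
group_rows = [
    {"article_id": 1, "effect": "nausea", "severity": "mild"},
    {"article_id": 1, "effect": "headache", "severity": "moderate"},
]


def split_flat_row(row, id_key="article_id", sep=","):
    """Expand one comma-delimited row into one row per extracted item."""
    columns = {
        key: [part.strip() for part in value.split(sep)]
        for key, value in row.items()
        if key != id_key
    }
    lengths = {len(parts) for parts in columns.values()}
    if len(lengths) > 1:
        # The fragile part of comma-delimited data: columns can drift out of step.
        raise ValueError("columns hold different numbers of values")
    count = lengths.pop() if lengths else 0
    return [
        {id_key: row[id_key], **{key: parts[i] for key, parts in columns.items()}}
        for i in range(count)
    ]


recovered = split_flat_row(flat_row)
```

Recovering rows from the flat form takes extra parsing and fails as soon as two columns hold a different number of values, which is the kind of formatting complexity that extracting rows directly avoids.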

Creating a Group Label

Screenshot of new Manage Labels UI. A new Group Label "Health Effects" has been created

Creating a Group Label is a direct extension of existing label functionality. Once Add Group Label has been selected, a new label dialogue box will appear. Within the Group Label, users can create as many Boolean, Categorical, and String labels as the task requires.
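One way to picture the label-of-labels idea is as a small data model. The sketch below is only an illustration under my own assumptions: the classes `Label` and `GroupLabel`, the method `validate_row`, and the example labels inside "Health Effects" are hypothetical and do not reflect how Sysrev stores labels internally.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    name: str
    kind: str               # "boolean", "categorical", or "string"
    categories: tuple = ()  # allowed values; only used by categorical labels

    def accepts(self, value):
        """Check whether a value is valid for this label's type."""
        if self.kind == "boolean":
            return isinstance(value, bool)
        if self.kind == "categorical":
            return value in self.categories
        if self.kind == "string":
            return isinstance(value, str)
        raise ValueError(f"unknown label kind: {self.kind}")


@dataclass
class GroupLabel:
    name: str
    labels: list = field(default_factory=list)

    def validate_row(self, row):
        """Return the names of labels whose value in `row` is missing or invalid."""
        return [
            label.name
            for label in self.labels
            if label.name not in row or not label.accepts(row[label.name])
        ]


# Hypothetical contents for the "Health Effects" Group Label.
health_effects = GroupLabel(
    "Health Effects",
    [
        Label("observed", "boolean"),
        Label("severity", "categorical", ("mild", "moderate", "severe")),
        Label("effect", "string"),
    ],
)

problems = health_effects.validate_row(
    {"observed": True, "severity": "mild", "effect": "nausea"}
)
```

Each row a reviewer fills in then corresponds to one dictionary keyed by the inner label names, and a document can carry as many such rows as it needs.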

Review & Data Extraction

Screenshot of Project 31871 Review Dashboard

Shown above is the Review Dashboard for Project 31871, which contains two Group Labels: "Health Effects" and "LD50/LC50". Once a Group Label has been selected, a Table UI will appear above the document.

Screenshot of Project 31871 Review Dashboard with partial data extraction of "Health Effects 2"

As data is extracted, the Table UI will automatically update. Reviewers see their data in real time, which makes it easier to spot and fix irregularities as they work. Switching between Group Labels will change which Table is displayed.

Screenshot of Project 31871 Review Dashboard with partial data extraction of "LD50/LC50 1"

Data

Once a document has been reviewed, you can view its data on the individual article's page. See the demo's extracted data for an example. Note that you will have to join the project to see the underlying PDF.

Export Options

Two methods can be used to export Group Label data. First, the JSON export captures all of the project's data in JSON format. Alternatively, you can export individual Group Labels as CSVs. In our example, this means I can export the data associated with either the Health Effects or the LD50/LC50 Group Label, but not both at once.

New Group Label specific export options. Top: JSON full project data export. Bottom: CSV export of individual Group Labels.
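The per-label CSV export can be pictured with a short Python sketch: each Group Label becomes its own CSV, with one line per extracted row. The `project_data` structure, its field names and values, and the helper `group_label_to_csv` are hypothetical stand-ins for illustration; they are not Sysrev's actual JSON schema or export code.

```python
import csv
import io

# Hypothetical in-memory data: {Group Label name: list of extracted rows}.
# This is NOT Sysrev's JSON export schema, only a stand-in for illustration.
project_data = {
    "Health Effects": [
        {"article_id": 101, "effect": "nausea", "observed": True},
        {"article_id": 101, "effect": "headache", "observed": False},
    ],
    "LD50/LC50": [
        {"article_id": 101, "species": "rat", "value": "250 mg/kg"},
    ],
}


def group_label_to_csv(data, group_name):
    """Serialize the rows of a single Group Label as CSV text."""
    rows = data[group_name]
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


health_csv = group_label_to_csv(project_data, "Health Effects")
ld50_csv = group_label_to_csv(project_data, "LD50/LC50")
```

Because each Group Label has its own set of columns, writing one CSV per label keeps every file rectangular, which is one plausible reason for exporting them separately.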
