How do you know who to trust in a review?

Sysrev just launched a new analytics dashboard.  It is still a work in progress, but if you like to play with new toys (and don't mind when they break), feel free to poke around!  Dashboards are only available for public projects and follow the URL pattern analytics.sysrev.com/?project=<project_id>.

The analytics dashboard for the NIEHS Hallmarks Project (sysrev.com/p/3588) can be found at analytics.sysrev.com/?project=3588.

The first tab on this dashboard is "concordance".  Concordance is a measure of agreement between reviewers.  The inclusion IRR statistic (upper left) tracks how often reviewers agree with each other on article screening decisions.  A value of 50% is what you would expect from random answers; 100% is perfect agreement.  Concordance values can also be seen for each label with more than 50 double-reviewed articles.  If you click on a label, you can view how often concordant answers are given for each possible label value (lower right).
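The dashboard doesn't spell out the arithmetic behind the inclusion IRR number, but a simple agreement-rate version of it can be sketched in a few lines of Python.  The article-to-answers layout below is assumed purely for illustration and is not Sysrev's actual schema or API.

    # Hedged sketch: inclusion agreement rate over double-reviewed articles.
    # The layout article_id -> {reviewer: include_answer} is an assumption
    # made for this example, not Sysrev's real data model.
    def inclusion_irr(answers):
        """Fraction of double-reviewed articles where every reviewer agrees."""
        double_reviewed = {a: r for a, r in answers.items() if len(r) >= 2}
        if not double_reviewed:
            return None  # nothing is double reviewed, so there is no IRR to report
        agreed = sum(1 for r in double_reviewed.values() if len(set(r.values())) == 1)
        return agreed / len(double_reviewed)

    example = {
        "article-1": {"alice": True, "bob": True},    # reviewers agree
        "article-2": {"alice": True, "bob": False},   # reviewers disagree
        "article-3": {"alice": False},                # single-reviewed, ignored
    }
    print(inclusion_irr(example))  # prints 0.5

In this toy example only two articles are double reviewed and the reviewers agree on one of them, so the agreement rate is 50%.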

Clicking on a label will populate the inclusion concordance plots that are farther down on the dashboard.   I selected the gold "include" label above to evaluate user inclusion concordance.

Scrolling down the page on analytics.sysrev.com/?project=3588 shows user concordance values for the selected label (previous figure).

Calculation

Concordance is calculated for each user as follows (a short code sketch appears after the list):

  1. R = the set of articles reviewed by the user and at least one other reviewer.
  2. C = the subset of R where all reviewers agreed.
  3. Concordance = count(C) / count(R).
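
As a rough illustration of those three steps, here is a minimal Python sketch.  The article_id -> {reviewer: answer} layout is again an assumption made for the example, not Sysrev's real data model.

    # Hedged sketch of steps 1-3 above for a single user.
    def user_concordance(answers, user):
        """Concordance for one user: count(C) / count(R)."""
        # Step 1: R = articles the user reviewed along with at least one other reviewer.
        R = {a: r for a, r in answers.items() if user in r and len(r) >= 2}
        if not R:
            return None  # the user shares no articles with other reviewers
        # Step 2: C = subset of R where all reviewers gave the same answer.
        C = {a: r for a, r in R.items() if len(set(r.values())) == 1}
        # Step 3: Concordance = count(C) / count(R).
        return len(C) / len(R)

    example = {
        "article-1": {"alice": True, "bob": True},     # alice and bob agree
        "article-2": {"alice": True, "bob": False},    # alice and bob disagree
        "article-3": {"alice": False, "carol": False}, # alice and carol agree
    }
    print(user_concordance(example, "alice"))  # 0.666..., 2 of alice's 3 shared articles are concordant

Note that with this definition a user's concordance depends on who else happened to review the same articles, which is part of why a low value is ambiguous (see the interpretation below).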

Interpretation

The naive interpretation of concordance is that users who frequently disagree with others are not to be trusted.  Things get more complicated when you consider priors on trust.  

If you are the administrator of a review, then you might trust your own reviews more than your collaborators'.  So what does it mean if you have a low concordance?  Are you untrustworthy?  Are all of your collaborators untrustworthy?

Clearly, better metrics are needed to capture trust.  This is Sysrev's first visualization of reviewer trust.  If you have an idea for better metrics or features to help manage reviewer trust, let us know!