Researchers are increasingly concerned that a lack of reproducibility in research may lead to, among other things, inaccuracies that slow scientific progress and erode public trust in science.
Now, a team of scientists suggests that a prediction market, in which artificially intelligent (AI) agents place predictions, or bets, on hypothetical replication studies, could offer an explainable, scalable way to estimate confidence in published academic work.
Replicating studies and experiments, a vital step in the scientific process, helps build confidence in findings and indicates whether they generalize across contexts, according to Sarah Rajtmajer, assistant professor of information sciences and technology at Penn State University (PSU).
As experiments become more complex, expensive and time-consuming, researchers increasingly lack the resources for robust replication efforts, a situation often referred to as the "replication crisis."
As scientists, we want to do work, and we want to know that our work is good. Our approach to help address the replication crisis is to use AI to help predict whether a finding would replicate if repeated and why.
Sarah Rajtmajer, Assistant Professor in Information Sciences and Technology, Penn State University
Rajtmajer is also a research associate at the Rock Ethics Institute and an associate of the Institute for Computational and Data Sciences (ICDS).
Crowdsourced prediction markets work much like betting shops, except that participants wager on real-world events rather than football games or horse races. Such markets are already used to forecast everything from election outcomes to the spread of infectious diseases.
What inspired us was the success of prediction markets in precisely this task — that is, when you place researchers in a market and give them some cash to bet on outcomes of replications, they’re pretty good at it. But human-run prediction markets are expensive and slow. And ideally, you should run replications in parallel to the market so there is some ground truth on which researchers are betting. It just doesn’t scale.
Sarah Rajtmajer, Assistant Professor in Information Sciences and Technology, Penn State University
A bot-based approach, by contrast, scales and offers a degree of explainability: its findings can be traced to trading patterns and to the features of the articles and claims that drove the bots' behavior.
In the team's method, bots are trained to recognize key features of scholarly research articles, such as reported statistics, authors and institutions, downstream mentions, linguistic cues, and related studies in the literature, and then to assess how confident they are that the finding would hold up in future replication attempts.
Much as a person bets on the outcome of a sporting event, each bot places a bet sized to its level of confidence, so the bots' trades play the same role as the bets made by human participants in traditional prediction markets.
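To make the general idea concrete, here is a minimal sketch in Python of how bot traders might turn paper features into bets that a market then aggregates. It is not the team's actual system; the feature names, weights, scoring rule, and market mechanism are all illustrative assumptions.

```python
# Illustrative sketch only: a toy bot-based "replication market".
# All feature names, weights, and the aggregation rule are hypothetical
# and are not taken from the published system.
from dataclasses import dataclass

@dataclass
class PaperFeatures:
    p_value: float           # reported p-value of the key finding
    sample_size: int         # study sample size
    citation_mentions: int   # downstream mentions in later literature
    hedging_language: float  # 0..1 score for cautious linguistic cues

class ReplicationBot:
    """A toy trader that converts paper features into a confidence score."""
    def __init__(self, name, weights):
        self.name = name
        self.weights = weights  # per-bot emphasis on each feature group

    def confidence(self, f: PaperFeatures) -> float:
        # Heuristic score in [0, 1]; higher means "more likely to replicate".
        score = (
            self.weights["stats"] * (1.0 - min(f.p_value / 0.05, 1.0))
            + self.weights["power"] * min(f.sample_size / 500, 1.0)
            + self.weights["uptake"] * min(f.citation_mentions / 50, 1.0)
            + self.weights["language"] * (1.0 - f.hedging_language)
        )
        return max(0.0, min(1.0, score / sum(self.weights.values())))

    def bet(self, f: PaperFeatures, bankroll: float = 10.0):
        c = self.confidence(f)
        # Stake grows with how far the bot is from "no opinion" (0.5).
        stake = bankroll * abs(c - 0.5) * 2
        return ("REPLICATES" if c >= 0.5 else "FAILS", stake, c)

def market_price(bets):
    """Stake-weighted average confidence: a crude stand-in for a market price."""
    total_stake = sum(stake for _, stake, _ in bets) or 1.0
    return sum(stake * conf for _, stake, conf in bets) / total_stake

if __name__ == "__main__":
    paper = PaperFeatures(p_value=0.03, sample_size=120,
                          citation_mentions=12, hedging_language=0.4)
    bots = [
        ReplicationBot("stats-focused", {"stats": 3, "power": 2, "uptake": 1, "language": 1}),
        ReplicationBot("uptake-focused", {"stats": 1, "power": 1, "uptake": 3, "language": 2}),
    ]
    bets = [bot.bet(paper) for bot in bots]
    print("Market confidence that the finding replicates:",
          round(market_price(bets), 2))
```

In a real system of this kind, the confidence score would come from models trained on many such signals, and the market mechanism would reward bots whose bets line up with eventual replication outcomes.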
C. Lee Giles, David Reese Professor in the College of Information Sciences and Technology at PSU, said that although prediction markets based on human contributors are popular and have been used effectively in a number of fields, applying them to scrutinize research outcomes is new.
“That's probably the interesting and unique thing we're doing here,” said Giles, who is also an ICDS associate. “We have already seen that humans are pretty good at using prediction markets. But, here, we're using bots for our market, which is a little unusual and sort of fun.”
According to the scientists, who reported their results at a recent conference of the Association for the Advancement of Artificial Intelligence, the system produced confidence scores for about 68 of 192 papers, roughly 35%, of the articles that were eventually reproduced in ground-truth replication studies. On that set of articles, its accuracy was about 90%.
Because humans tend to be better at predicting research reproducibility while bots can operate at scale, Giles and Rajtmajer propose that a hybrid approach, with humans and bots working together, may offer the best of both worlds: a system with higher accuracy that remains scalable.
Maybe we can train the bots in the presence of human traders every so often, and then deploy them offline when we need a quick result, or when we need replication efforts at scale. Moreover, we can create bot markets that also leverage that intangible human wisdom. That is something we are working on right now.
Sarah Rajtmajer, Assistant Professor in Information Sciences and Technology, Penn State University
Principal investigators on the project are as follows: Christopher Griffin, Applied Research Laboratory; James Caverlee, professor of computer science at Texas A&M; Jian Wu, assistant professor in computer science at Old Dominion University; Anthony Kwasnica, professor of business economics; Anna Squicciarini, Frymoyer Chair in Information Sciences and Technology; and David Pennock, director of DIMACS and professor of computer science, Rutgers University.
The study received funding from DARPA’s Systematizing Confidence in Open Research and Evidence (SCORE) program.
Source: https://www.psu.edu