JSTOR-Enabled Data Mining Project Signals Next Wave in Research
A team of researchers led by Jevin West and Carl Bergstrom of the University of Washington today released the results of an 18-month study of gender inequality among authors of academic papers. The study is based on an analysis of the authors of more than 1.8 million published research articles available through JSTOR, the not-for-profit digital library.
This project exemplifies the kind of research made possible by new digital technologies, research that JSTOR has supported for more than a decade and that was first publicized in 1999 by the work of Yale University legal scholar and law librarian Fred Shapiro. Shapiro used data from JSTOR to document uses of words that predated the earliest examples recorded in the Oxford English Dictionary.
Fast forward to 2008, when JSTOR launched its self-service Data for Research website, enabling anyone in the world to explore its holdings and to freely create datasets for use in their research. Today the site sees about 700 datasets created and downloaded annually. Larger-scale projects, like the one undertaken by West, Bergstrom, and their co-authors Jennifer Jacquet, Molly King, Shelley Correll, and Theodore Bergstrom, are handled upon request and in close collaboration with JSTOR’s Advanced Technologies Research team.
“It’s beyond exciting to see the digital library we have spent years creating being tapped into by computer scientists, digital humanists, and other researchers around the world,” said Ronald Snyder, Director of the Advanced Technologies Research team.
“By providing us information about millions of papers published over centuries, these data allow us to ask questions about the structure of scholarly communication on unprecedented scales,” said Bergstrom. “We see the gender project as just the beginning,” added West. “The data really is a gold mine, and we are excited to continue to work with JSTOR and utilize this powerful research environment.”
While the research itself is ground-breaking, the benefits of projects like the one just released by the West-Bergstrom team can reach beyond the findings themselves. The West-Bergstrom team also created an interactive tool that allows others to explore the underlying content based on the work they have done. This demonstrates how sharing large corpora of data can also lead to new ways of exploring and discovering scholarship, effectively giving researchers another lens through which to view the published literature.
“Enabling new scholarship that was previously impossible, or nearly so, is at the very heart of our mission to advance education through the use of new technologies,” said Laura Brown, JSTOR Managing Director. “As more scholars and students across disciplines are trained in data mining and textual analysis, we look forward to supporting and advancing their work through our Data for Research Program.”