At JSTOR, our mission is to improve access to knowledge and education for people around the world. We’re excited to share several new initiatives leveraging cutting-edge technology to make scholarly materials more accessible, interactive, and engaging for our users. Here’s a detailed look at the latest advancements we recently shared at ALA 2024 in San Diego.

Project Odyssey: Revolutionizing digital collection stewardship with AI

British Tabulating Machine Company Limited. Hollerith 45 Column Horizontal Electrical Sorting Machine, 1920-1930. Science Museum.

Building on our growing suite of services to support digital collection stewardship, this ITHAKA initiative, currently under the temporary working name of Project Odyssey, explores the use of generative AI to assist libraries and archives in processing their digital collections. This project, still in its proof-of-concept stage, aims to address the significant bottleneck in creating descriptive metadata, which is crucial for preserving collections and making them discoverable. By automating the initial stages of metadata generation, Project Odyssey could drastically reduce the time and resources required for librarians and archivists to process large collections. Early testing has shown that this tool can process documents in a fraction of the time it would take a human, potentially transforming the way digital collections are managed and accessed.

The project emerged from extensive research and interviews with over 60 library deans and directors worldwide. These discussions identified the primary challenges in digital collections stewardship, including the labor-intensive process of creating descriptive metadata and the substantial backlog of unprocessed materials. On average, less than 5% of print archives and special collections are digitized. Project Odyssey aims to unlock the potential of these collections by making them more accessible and discoverable. By using AI to support the initial processing work, libraries can focus on enhancing metadata, preserving collections, and ultimately increasing the accessibility of valuable scholarly materials.

Jason Przybylski, speaking at the ALA conference, emphasized the significance of this initiative by demonstrating the tool’s capabilities. He highlighted that Project Odyssey could process 1,500 documents in just under three hours, a task that would typically take around 250 hours if done manually. This drastic improvement not only addresses the backlog issue but also allows librarians and archivists to focus on more advanced metadata creation and preservation tasks. The goal is to shift the focus from tedious manual processing to higher-order work, ultimately enhancing the preservation, discovery, and accessibility of digital collections.
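To put those figures in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the numbers quoted above and treats "just under three hours" as a full three hours, which is an assumption that slightly understates the tool's speed. The variable names are illustrative only.

```python
# Figures quoted from the ALA demonstration described above.
documents = 1500
ai_hours = 3.0        # "just under three hours", treated here as an upper bound
manual_hours = 250.0  # "around 250 hours" of manual processing

# Average time spent per document, in minutes.
ai_minutes_per_doc = ai_hours * 60 / documents
manual_minutes_per_doc = manual_hours * 60 / documents

# Approximate speedup factor and staff hours freed for higher-order work.
speedup = manual_hours / ai_hours
hours_saved = manual_hours - ai_hours

print(f"AI-assisted: {ai_minutes_per_doc:.2f} minutes per document")
print(f"Manual:      {manual_minutes_per_doc:.2f} minutes per document")
print(f"Speedup:     roughly {speedup:.0f}x")
print(f"Hours saved: roughly {hours_saved:.0f}")
```

Under these assumptions, the manual workflow averages about ten minutes per document, while the AI-assisted pass takes a small fraction of a minute. The estimate covers initial processing only, not any human review that follows.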

JSTOR’s interactive research tool (beta): Enhancing research with AI

JSTOR’s interactive research tool (beta) is designed to revolutionize the way users interact with scholarly content. Leveraging advanced technologies such as OpenAI’s GPT-3.5 and Anthropic’s Claude 3 Haiku, the tool provides customized summaries, recommends related topics and content, and answers user questions based on the content being viewed. This AI-powered tool aims to improve the experience of evaluating and understanding content, making it easier for researchers to find relevant information quickly. Currently available for journal articles, book chapters, and research reports, the tool’s capabilities are continually expanding based on user feedback and ongoing development.

Beth LaPensee, a Senior Product Manager at ITHAKA, described the development process of this tool, emphasizing the guiding principles that ensured it supported and strengthened research abilities rather than detracting from them. The tool provides contextual summaries, inline references, and interactive features that let users ask questions and receive detailed answers based solely on the document they are viewing. This capability transforms how novice researchers and experienced scholars alike engage with academic texts, enabling deeper understanding and more efficient research processes. Safeguarding the content on JSTOR was also key to the development of the tool. To that end, content is processed only internally and through the OpenAI API, whose data security practices guarantee that it is stored only temporarily and not used for training.

Feedback from beta users has been overwhelmingly positive. Over 23,000 active users from 5,500+ unique institutions across 149 countries have engaged with the tool, appreciating its ability to summarize content, provide contextual information, and suggest further reading. For novice researchers, this means gaining a deeper understanding of texts, while experienced researchers can discover new conceptual relationships across different subjects. The tool’s design ensures that users can trace back answers to their original sources, maintaining transparency and trustworthiness.

Constellate: Taking research to new heights with text analysis

Constellate, ITHAKA’s text analysis teaching and learning platform, is designed to help educators and students develop essential data skills. This platform provides access to a large corpus of scholarly content and integrates powerful tools for text analysis, allowing users to build datasets, analyze data using Python or R, and access a wealth of tutorials and webinars. Constellate supports learners at all stages, from beginners to advanced users, and offers reusable and editable learning materials that facilitate both teaching and research. By empowering users with these tools, Constellate aims to bridge the gap between data literacy and academic research.

Amy Gay, Senior Digital Humanities Outreach Manager, highlighted how Constellate addresses the varying needs of learners by offering bite-sized learning modules and comprehensive educational resources. These include key concept tutorials that do not require coding skills, allowing users to learn text analysis concepts through an intuitive interface. Additionally, Constellate provides classes and webinars on fundamental and advanced topics, creating a collaborative environment where educators and students can continuously enhance their skills. This approach ensures that users can effectively integrate text analysis into their research and teaching practices.

The platform also features a collaborative community space, where educators and students can engage in ongoing discussions, share insights, and support each other’s learning journeys. This community-driven approach fosters a supportive environment, making it easier for users to navigate the complexities of text analysis. Constellate’s integration of JupyterLab for Python and RStudio for R in a cloud-based environment further enhances its utility, providing a comprehensive toolkit for data analysis and research.
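As a flavor of the kind of introductory exercise a text analysis notebook might contain, here is a minimal word-frequency sketch in Python. It uses only the standard library and does not represent Constellate's own API or tutorials. The sample sentences, the stop-word list, and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample documents; a real dataset would be far larger.
documents = [
    "Social annotation turns solitary reading into collaborative reading.",
    "Text analysis helps researchers explore large collections of text.",
]

# A tiny illustrative stop-word list, not a standard one.
STOP_WORDS = {"of", "into", "the", "a", "and"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(docs):
    """Count non-stop-word tokens across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts


frequencies = word_frequencies(documents)

# Show the three most frequent terms in the sample.
for word, count in frequencies.most_common(3):
    print(word, count)
```

The same counting step scales to larger datasets, and the resulting frequencies are a common starting point for more advanced techniques.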

Hypothesis and JSTOR: Strengthening engagement with social annotation

In collaboration with Hypothesis, JSTOR offers LMS-integrated social annotation capabilities, transforming how users interact with digital content. Social annotation allows users to add interactive margin notes, highlights, and comments, turning solitary reading into a collaborative experience. This feature fosters deeper understanding and engagement with course materials by enabling real-time discussions and collaborative reading. Students and educators can highlight text, ask questions, and share insights, making learning more interactive and engaging.

Joe Ferraro, CEO of Hypothesis, explained how social annotation bridges the gap between static content and interactive learning. With Hypothesis integrated into JSTOR, students engage with course materials more frequently and meaningfully, which has been shown to improve comprehension and retention. By transforming static content into interactive learning experiences, Hypothesis also fosters a deeper connection to the material and enhances overall academic performance.

See for yourself

At JSTOR, we are committed to leveraging technology to enhance the accessibility and engagement of scholarly materials. Whether through AI-driven research tools, text analysis platforms, or social annotation capabilities, our initiatives aim to support the evolving needs of researchers, educators, and students. We invite you to explore these tools and join us in shaping the future of digital learning and research.

  • Sign up for JSTOR updates to stay informed on Project Odyssey, and explore our growing suite of services to support digital collection stewardship.
  • Sign up to try our interactive generative AI-powered research tool in beta.
  • Teach, learn, and perform text analysis with Constellate.
  • Learn how to use Hypothesis with JSTOR in your LMS.

Together, we can continue to break down barriers to knowledge and foster a richer academic experience.

Note: Project Odyssey is a working name for the project exploration only. The final name and branding will align with ITHAKA’s infrastructure services after research and discovery are complete.

About the author

Victoria Spitz is the Senior Digital Marketing Manager at ITHAKA, where she champions the organization’s mission to expand access to knowledge and education globally. With degrees in public humanities, focusing on nonprofit marketing and museum education, and the history of art and architecture, Victoria brings a unique blend of expertise to her role. As a first-generation college student, she is deeply committed to ITHAKA’s mission, which resonates strongly with her personal and professional values.