Pioneering the future of digital collections stewardship: Highlights from Charleston 2024
At the Charleston Conference 2024, JSTOR presented its vision for managing and preserving digital collections into the future. “Not Just Another AI Session” highlighted how cutting-edge tools—including JSTOR’s digital collection processing prototype and interactive research tool (beta)—can transform access to archival materials while staying true to JSTOR’s nonprofit mission.
Libraries worldwide face overwhelming backlogs of digital collections. The session explored how innovations in digital collections stewardship can help clear those backlogs and amplify the value of rare collections.
Guiding principles for the integration of AI and other advanced technologies
JSTOR’s application of AI and other advanced technologies is grounded in values that prioritize thoughtful implementation and collaboration with the academic community. JSTOR views technology as a tool to enhance the human aspects of research and education. Its approach emphasizes empowering archivists, researchers, and educators, rather than replacing their expertise.
Ensuring equitable access to tools and bridging the digital divide are fundamental priorities. JSTOR is particularly focused on avoiding the creation of an “AI divide” that could deepen existing inequities in academia. By iteratively developing and testing tools in collaboration with the scholarly community, JSTOR builds systems that are aligned with its mission to support education and discovery.
Tackling challenges in digital stewardship
As noted above, libraries around the world face overwhelming backlogs of digital collections, with up to 95% of materials remaining unprocessed. Rebecca Seger, JSTOR’s Vice President of Institutional Participation, described how this problem is compounded by fragmented workflows, limited resources, and the rapid obsolescence of file formats. Many institutions lack the infrastructure to process their collections, relying on piecemeal tools or a lone staff member for digitization and metadata creation.
JSTOR’s research, which included more than 100 in-depth interviews with library leaders, revealed that some libraries measure their backlogs in decades, not months. Without efficient workflows to address these issues, materials often languish on shelves or hard drives, unavailable to the communities that could benefit most from their use.
This urgent need inspired JSTOR to develop tools that target these problems earlier in the workflow, from metadata creation to long-term preservation and discovery.
JSTOR’s suite of solutions
To address these challenges, JSTOR introduced a suite of tools designed to streamline workflows and expand access to digital collections:
- JSTOR Forum: A flexible, cloud-based platform for cataloging, managing, and supporting access to digital collections.
- Digital collection processing prototype: A generative AI-powered tool that accelerates metadata creation, helping clear backlogs and improve efficiency.
- Preserved Collections: Powered by Portico, this service ensures the long-term preservation of digital materials while safeguarding their accessibility.
- Interactive research tool: A tool that enhances user engagement by integrating AI-driven features to facilitate discovery and learning.
These tools work individually and cohesively to create a comprehensive support system for libraries’ digital stewardship efforts.
The digital collection processing prototype
Jason Przybylski, JSTOR’s Lead Manager, Infrastructure Services Outreach, presented the digital collection processing prototype, which leverages generative AI to help process, organize, and create descriptive metadata for digital materials. This tool represents a significant advancement in addressing the metadata bottleneck, offering a fast and scalable solution to process thousands of items in hours rather than months.
Unlike traditional workflows, the prototype generates item-level metadata fields, assigns confidence scores, and provides summaries of entire collections in hours or even minutes. This functionality enables librarians and archivists to incorporate the tool into their workflows and make informed decisions about their collections with greater efficiency. The tool can even process complex materials, such as handwritten documents or annotated images, providing meaningful context and insights that amplify the expertise of those handling the collections.
By assisting with processing materials and creating high-quality metadata, the tool accelerates the digital collection workflow, paving the way for these collections to be preserved and made accessible to broader audiences.
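To illustrate how item-level metadata with confidence scores might fit into a review workflow, here is a minimal Python sketch. It is not the prototype's actual output format or API: the `MetadataField` class, the `triage` function, the 0.8 threshold, and the sample values are all assumptions made up for this example. The idea is simply that high-confidence fields could be accepted as drafts while low-confidence fields are routed to an archivist.

```python
from dataclasses import dataclass


@dataclass
class MetadataField:
    """One generated metadata field for an item (hypothetical structure)."""
    name: str
    value: str
    confidence: float  # assumed to be a score between 0 and 1


def triage(fields, threshold=0.8):
    """Split fields into (accepted, needs_review) using a confidence cutoff.

    The threshold is an illustrative assumption; an institution would
    choose its own cutoff based on collection and risk tolerance.
    """
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Invented sample data for a single item, for illustration only.
item = [
    MetadataField("title", "Letter to the harbor commission", 0.93),
    MetadataField("date", "1887?", 0.41),
    MetadataField("creator", "Unknown", 0.62),
]

accepted, needs_review = triage(item)
print("Accepted as draft:", [f.name for f in accepted])
print("Flagged for archivist review:", [f.name for f in needs_review])
```

In a workflow like this, the archivist's expertise stays in the loop: the scores only decide where human attention goes first.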
JSTOR’s interactive research tool
Beth LaPensee, JSTOR’s Principal Product Manager, introduced JSTOR’s interactive research tool, which was developed to change how users interact with scholarly content. Currently focused on secondary sources like journal articles and book chapters, the tool has helped users evaluate, understand, and discover materials more effectively.
Key features include:
- Contextual overviews and insights: The tool surfaces key points and arguments from texts to support content relevance assessment, and describes how users’ search terms relate to selected documents.
- Topic recommendations: Keywords and concepts to help users explore their research topics further.
- Related content discovery: A list of similar documents based on conceptual connections, enabling users to deepen their understanding.
- Conversational interaction: Users can engage directly with documents by asking questions, receiving answers tied to specific sections of the text.
Since its beta launch in August 2023, the tool has attracted over 40,000 users across 150 countries, with nearly a million interactions. Its ability to support novice researchers and experienced academics has made it a vital resource in the research process. Soon, the tool will be extended to text-based primary sources, then additional primary source formats over time. Building on the success seen with secondary sources, the goal is to enable users to engage more deeply with primary sources while seamlessly connecting across content types.
Building the future of digital access
JSTOR aspires to do more than simply create tools, though. We aim to unlock the full value of archival materials. Kevin Guthrie, President, ITHAKA, likened this moment to JSTOR’s early days, when digitizing journal back issues transformed their accessibility and perceived value. Similarly, these AI-driven tools promise to revolutionize access to primary source materials, allowing libraries to make previously inaccessible collections discoverable at an unprecedented scale.
The transformative potential of these tools lies in their ability to save time and resources while amplifying the impact of collections. Whether it’s reducing the time to process 1,500 documents from months to mere hours, or enabling deeper engagement with scholarly texts, JSTOR’s innovations are reshaping the possibilities of digital stewardship.
Looking ahead
JSTOR’s work is a testament to the power of thoughtful, community-driven innovation. By addressing critical challenges in digital collections management and introducing tools like the digital collection processing prototype and interactive research tool, JSTOR helps libraries, researchers, and educators expand opportunities for preservation and discovery.
As JSTOR continues to refine these tools, we invite your feedback and collaboration. Institutional decision-makers interested in exploring or contributing to these initiatives are encouraged to request early access.
About the author
Maria Papadouris is a Content and Community Engagement Manager at ITHAKA, where she works on bringing the JSTOR community together under the common goal of championing access to knowledge (and having a fun time doing it!). A first-generation Greek American and first-generation college student, Maria studied political science and creative writing, bringing an interdisciplinary approach to issues in the humanities. She is also looking to pursue graduate studies in English literature.