Accelerate impact with purpose-built AI for stewardship
JSTOR Seeklight helps you advance collections processing responsibly—from description to transcription and analysis—while keeping your expertise at the center. Co-created and used by librarians and archivists, it helps you enhance discovery and accessibility for your unique collections.

A tool for practitioners
Pair your expertise with JSTOR Seeklight’s advanced technology to strengthen and streamline workflows while keeping you in control. Created by JSTOR—a mission-driven nonprofit—and recognized with the Society of American Archivists’ C.F.W. Coker Award for innovation in archival practice, it empowers you to fast-track impact while ensuring accurate and accountable processing and publishing.
- Surface items for review with tags for AI-generated data
- Edit and approve outputs with intuitive, integrated workflows
- Keep data protected with built-in, privacy-first practices
Created through close collaboration with our charter participants, including:
Transform discoverability with transcripts for text-based materials
- Enable full-text search and discovery for handwritten, typed, and mixed-media items
- Support accessibility with screen-reader-ready transcripts
- Stay in control by editing or reprocessing any time


Describe collections at scale to move from intake to impact faster
- Create structured, standards-aligned descriptive metadata at the item level
- Efficiently batch-process images, texts, or compound objects across multiple formats
- Edit with integrated review workflows—prioritize work with confidence scores and track progress with tags
Surface collection insights to plan, scope, and describe efficiently
- Identify themes and patterns across collections with concise, structured summaries
- Export insights to enrich finding aids and reports
- Regenerate summaries as collections evolve

Join our community
Collaborate with a global network of institutions shaping JSTOR Seeklight through the JSTOR Digital Stewardship Services Tier 3 charter program. Work alongside your peers and the JSTOR team to test new capabilities, share real-world insights, and guide priorities for the responsible use of AI in digital stewardship.
- Join working groups focused on key topics
- Share outcomes and practices across our community
- Inform product evolution with insights from the field

Reflections from the community
Explore how JSTOR Seeklight is evolving with our community
Frequently asked questions
How does JSTOR Seeklight work?
JSTOR Seeklight is our AI-powered technology that accelerates digital collections processing with tools for metadata generation, text transcription, and collection-level insights. Built on JSTOR’s trusted infrastructure and engineering expertise, and informed by deep community collaboration, JSTOR Seeklight features integrated workflows that ensure expert review and institutional control.
File analysis and grouping
Upon upload, JSTOR Seeklight analyzes filename syntax and folder structure to determine if any of the files are meant to be grouped together into compound objects. If so, it will group the requisite files together and describe them as one singular item.
Classification and routing
Next, JSTOR Seeklight intelligently sorts materials into precise categories and directs each one to the technologies and workflows that can extract the richest insights. Images are handled differently from text, and mixed media follows its own path. Even within text files, processing varies: legal documents demand one approach, while student periodicals require another—each type receives specialized treatment to maximize the information uncovered.
The classification and routing step is essential in ensuring that JSTOR Seeklight has the right instructional context and specialized prompts to generate high-quality descriptive metadata. This approach ensures that AI is applied with the care of a real-world archivist and optimized to account for the nuance of a diverse array of materials.
Processing workflow
Metadata is generated across a core set of descriptive metadata fields (built off of Dublin Core) and presented to the user for review. Users retain full control, refining AI-generated suggestions from JSTOR Seeklight to meet institutional and scholarly standards. Learn more about how JSTOR Seeklight is being developed.
Transcript generation
Moving beyond metadata, JSTOR Seeklight can now generate high-quality machine readable transcripts for materials with text. Users can choose to create transcripts individually, or at scale, and use our rich text editor to apply the right format and styling directly within the interface. Transcripts are available to download in both .txt and .rtf formats.
Which large language models (LLM) is JSTOR Seeklight using?
We are currently using several LLMs, including GPT-4.1 from OpenAI and Google Gemini 2.5. This allows us to optimize for specific tasks appropriately, and is a reflection of our intentional technical architecture, which allows us to swap for better models as needed. We continuously evaluate the offerings of other service providers, as well as more bespoke alternatives, to ensure that we’re delivering the highest quality output.
Will my data be used to train the LLM?
No. Our enterprise agreements with large language model (LLM) providers provide that they do not use data uploaded through their APIs (such as your uploaded materials) to train their LLMs. This is stated in OpenAI’s API terms (see the OpenAI Trust Portal) as well as those from Google Gemini.
How is my content protected?
We employ strict security measures to protect uploaded materials and metadata:
- AI provider data handling: OpenAI stores uploaded data for a maximum of 30 days to check for abuse or misuse before deleting it entirely. Your content is never used to train AI models.
- Data rights and deletion: You retain full rights to your uploaded content and generated metadata. If you delete your project, all associated data is permanently removed from JSTOR’s systems.
- Security and compliance: JSTOR maintains physical, technical, and administrative safeguards to prevent unauthorized access or malicious use. As a SOC2 compliant organization, our data security practices undergo annual independent third-party audits.
- Metadata storage and use: JSTOR stores generated metadata and user interactions to improve system performance.
How does JSTOR Seeklight manage handwritten, historical, and/or non-English language materials?
- Handwritten and historical documents: JSTOR Seeklight is designed to effectively process handwritten documents and historical materials, including generating both descriptive metadata and transcripts. While accuracy may vary depending on factors such as condition, language, legibility, and format, our technology is continuously improving in order to provide the best possible results.
- Non-English language materials: Our LLM service providers handle multiple languages by default, but optimization of JSTOR Seeklight for specific non-English languages is still in progress.
How does JSTOR Seeklight contribute to accessibility?
Transcripts can be used as an accessible text alternative for images of text, such as handwritten letters. This is a technique to meet WCAG SC 1.4.5 Images of Text (AA). You can learn more about SC 1.4.5 on W3C’s website, and more about accessibility requirements, including the April 2026 federal web content mandate for higher education, on the National Archive’s Federal Register.
RTF and TXT files are machine-readable, which means your downloaded transcripts are in an accessible file format and ready to be presented as an accessible alternative wherever you would like to display them. If you choose to convert them into an alternate format, such as a PDF, be sure to check accessibility guidelines for the corresponding file format.
Visit our support site for recommendations on how you can use JSTOR Seeklight-generated transcripts or your own uploaded transcripts to increase your items’ accessibility on JSTOR.
Does JSTOR retain any rights to the files, file contents, metadata, and/or transcripts uploaded to or generated through JSTOR Seeklight?
Institutions will own the rights to the files and content they upload, as well as the transcripts and metadata generated through JSTOR Seeklight. JSTOR is granted limited permissions to leverage the content, metadata, and transcripts to provide, develop, and improve JSTOR Seeklight for accuracy, and for the general benefit of our licensees.
Can users review and edit AI-generated metadata and/or transcripts from JSTOR Seeklight?
Yes. Users have full control over metadata and transcripts generated by JSTOR Seeklight, and can edit AI-powered outputs before finalizing. During review, they can mark records as “Reviewed” to indicate completion. While unreviewed records can still be published, users will receive a prompt reminding them to review their items before finalizing publication. Learn more.
Can institutions customize the AI metadata and/or transcript generation rules in JSTOR Seeklight?
JSTOR Seeklight can be used “out of the box” without requiring any kind of customized prompting. Additionally, contributors are able to add additional context, to ensure that JSTOR Seeklight is processing metadata and/or transcripts with as much existing knowledge as possible. Additional customization options are currently being explored, and we’ll provide more information as plans are developed.
What formats are supported for JSTOR Seeklight?
Currently, JSTOR Seeklight supports text and image-based materials, with the following file types:
- Text: PDFs (.pdf)
- Images: JPEG/JPG (.jpeg, .jpg)
Within these formats, the tool can generate metadata and transcripts for a diverse range of content, including:
- Typed documents, handwritten letters, newspapers, books, and periodicals
- Historical photographs, maps, posters, postcards, and advertisements
We are exploring future expansions to support metadata generation for additional file types, such as audio and video, in the future.
What is JSTOR doing to offset the environmental impacts of AI?
We recognize the environmental impact of AI and closely follow emerging research in this area. Our approach prioritizes making informed, responsible choices that minimize our footprint while supporting our commitment to advancing research and learning through new technologies. Wherever possible, we prioritize thoughtful implementation over scale for scale’s sake. Here are some of the ways we reduce our impact:
- We don’t train our own LLMs. Model training is resource-intensive, so we rely on existing models and focus on using them efficiently.
- We choose the right model for the task. Rather than defaulting to the largest model, we select the most appropriate option, using smaller models when possible to reduce energy use while maintaining quality.
- We keep prompts lean. We craft our system prompts to be as concise as possible while still providing the necessary context to ensure high-quality results. Smaller inputs mean less computational overhead.
What is JSTOR’s overall approach to AI and other emerging technologies?
Technology has always been an incredible accelerator for our mission to improve access to knowledge and education. As a trusted provider of scholarly materials, we have a responsibility to leverage content, technology, and deep subject matter expertise that enables a safe, effective, and affordable path forward for the use of AI.
- We honor our values first and foremost. JSTOR provides users with a credible, scholarly research and learning experience. Our use of technology must enhance that credibility, not undermine it.
- We listen closely and proceed with care. We recognize the concerns associated with AI and other emerging technologies and pursue this work in close collaboration with our community.
- We empower people, not replace them. These tools should not be used to “do the work.” They are designed to help people deepen and expand their work.
- We enable our systems to interact with users in ways that are intuitive. Traditionally, it has been the users’ responsibility to adapt to restricted language and structures to provide computers with inputs. Today, computers can interact effectively with users in natural language, and we should take advantage of that, while balancing the need to support user and institutional preferences.
This work is ongoing. As we learn and grow, we welcome your engagement to help shape AI tools that reflect shared values and advance meaningful access to scholarship.
Learn more about how we are exploring emerging technologies on the JSTOR platform.
Ready to see JSTOR Seeklight in action?
Join us in shaping responsible, AI-assisted digital stewardship—and find out how JSTOR Seeklight can accelerate and enhance your collections work.
Note: Items marked with * are required.
Explore JSTOR Seeklight
Curious about how JSTOR Seeklight works?
Share a few details and we’ll connect you with our team to show how JSTOR Seeklight accelerates responsible, AI-assisted stewardship.
Note: Items marked with * are required.
View image credits from this page

Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.
Nicholas Chevalier. Sergeant Shadwick and Four Studies of Soldiers. From: Journal Made on a Visit to the Disturbed District of Wanganui. 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023214.
John Gibson. Letter from John Gibson to John Udny, Containing Information for Henry Farnum. January 9, 1850. Part of Open: The Metropolitan Museum of Art, Artstor. https://www.jstor.org/stable/community.18604581.

Nicholas Chevalier. 3 Sketches. From: Journal Made on a Visit to the Disturbed District of Wanganui. 5 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023215.
Nicholas Chevalier. Sketches of Major Noake and Col. Nedville. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023221.
Nicholas Chevalier. Sketch of Soldiers Manning a Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7-8 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023228.
Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.

Nicholas Chevalier. Sergeant Shadwick and Four Studies of Soldiers. From: Journal Made on a Visit to the Disturbed District of Wanganui. 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023214.
Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.
Nicholas Chevalier. Sketch of Soldiers Manning a Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7-8 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023228.
Nicholas Chevalier. Sketches of Major Noake and Col. Nedville. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023221.
Nicholas Chevalier. 3 Sketches. From: Journal Made on a Visit to the Disturbed District of Wanganui. 5 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023215.

Unknown. Huia Tail Feathers, Mounted on Gold and Pounamu. circa 1900. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27027947.
Unknown. Portrait of Lady Bledisloe, Dr Cockayne and Others. December 1930. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27113986.
Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.
Nicholas Chevalier. Sketch of Soldiers Manning a Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7-8 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023228.
Nicholas Chevalier. Sketches of Major Noake and Col. Nedville. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023221.
Watson Hazell and Viney Ltd. Poster, “Soldiers” Separation Allowances’. March 1915. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27028327.

Chinese. Hall of Quiet Study. 19th century. Part of Open: The Metropolitan Museum of Art, Artstor.

