Accelerate impact with purpose-built AI for stewardship

JSTOR Seeklight helps you advance collections processing responsibly—from description to transcription and analysis—while keeping your expertise at the center. Co-created and used by librarians and archivists, it helps you enhance discovery and accessibility for your unique collections.

Woman working on a laptop in a bright office, surrounded by scanned historical documents and a JSTOR interface showing a handwritten letter with a ‘Generate Transcript’ button; icons below label actions such as Describe, Transcribe, Summarize, Review, and Discover.

A tool for practitioners

Pair your expertise with JSTOR Seeklight’s advanced technology to strengthen and streamline workflows while keeping you in control. Created by JSTOR—a mission-driven nonprofit—and recognized with the Society of American Archivists’ C.F.W. Coker Award for innovation in archival practice, it empowers you to fast-track impact while ensuring accurate and accountable processing and publishing.

  • Surface items for review with tags for AI-generated data 
  • Edit and approve outputs with intuitive, integrated workflows
  • Keep data protected with built-in, privacy-first practices

Created through close collaboration with our charter participants, including:


Transform discoverability with transcripts for text-based materials

  • Enable full-text search and discovery for handwritten, typed, and mixed-media items
  • Support accessibility with screen-reader-ready transcripts 
  • Stay in control by editing or reprocessing any time
Stack of handwritten pages with an interface bubble labeled ‘Create transcript,’ displaying a snippet of transcribed text and an ‘Edit’ button with a cursor selecting it.
Digitized handwritten documents with an interface showing AI-generated metadata tags such as ‘Swiss,’ ‘ca. 1868,’ and ‘Museum of New Zealand,’ next to a ‘Generate Metadata’ button and a confirmation message reading ‘3 records successfully added.’

Describe collections at scale to move from intake to impact faster

  • Create structured, standards-aligned descriptive metadata at the item level
  • Efficiently batch-process images, texts, or compound objects across multiple formats
  • Edit with integrated review workflows—prioritize work with confidence scores and track progress with tags

Surface collection insights to plan, scope, and describe efficiently

  • Identify themes and patterns across collections with concise, structured summaries
  • Export insights to enrich finding aids and reports
  • Regenerate summaries as collections evolve
Collage of historical items—including handwritten letters, a photograph, a printed pamphlet, and a fan—alongside a digital ‘Project Summary’ form with fields for scope, extent, and language; a cursor clicks a button labeled ‘Download Summary.’

Join our community

Collaborate with a global network of institutions shaping JSTOR Seeklight through the JSTOR Digital Stewardship Services Tier 3 charter program. Work alongside your peers and the JSTOR team to test new capabilities, share real-world insights, and guide priorities for the responsible use of AI in digital stewardship.

  • Join working groups focused on key topics
  • Share outcomes and practices across our community
  • Inform product evolution with insights from the field
Nineteenth-century Chinese painting showing an interior view of a scholar’s study. The open room reveals men writing at desks, surrounded by scrolls, books, and decorative furniture, with potted plants lining a balcony in the foreground.

Reflections from the community

We used JSTOR Seeklight to draft descriptions in minutes. With our ‘AI drafts, people decide’ workflow, staff verified and published in just hours. That shift—from authoring every word to reviewing and governing at scale—is the difference between backlog and discovery.

JSTOR Seeklight is a notable example of innovative and forward-thinking development within archival description… Merging archival principles, technological innovation, and a commitment to sustainable stewardship, JSTOR Seeklight exemplifies the [C.F.W. Coker] award’s criteria.

With JSTOR Seeklight, we’ve processed more than 500 photographs in under 45 minutes. The scale is really life altering in terms of workload, but also impact—we can make materials discoverable faster, better and in places where people are already working.

JSTOR Seeklight lets us imagine better descriptions—faster, and with more depth—than we could achieve otherwise.

Explore how JSTOR Seeklight is evolving with our community

A group of attendees sits at round tables in a warmly lit event space with exposed wooden beams and string lights. A panel of four speakers is seated at the front behind a white table, with a presentation screen behind them. Many attendees have red tote bags hanging from their chairs with a white “J 30 Years” logo, celebrating JSTOR’s 30th anniversary.

Sustainable books models and digital stewardship at Charleston 2025

At this year’s Charleston Conference, JSTOR hosted two lunch events. Highlights included updates on Path to Open, Publisher Collections, and JSTOR Seeklight’s growing role in AI-assisted metadata generation.

An intricately carved ivory sundial and compass in the form of a folding book. The open surface features engraved figures of men in period dress holding banners, with a central compass needle pointing north. Fine calibration lines and Roman numerals mark the sundial’s face, while a plumb line and detailed scene of workers appear on the raised lid.

What is a collections processing tool?

Roger Schonfeld introduces the concept of a collections processing tool—a new, community-driven system that reimagines how special collections are described and discovered. With JSTOR Seeklight, this approach uses digitization and intelligent recognition to make archives more accessible and impactful.

Handwritten letter on blue paper dated “Rome 9th Jan.y 1850” from sculptor John Gibson. He acknowledges a prior letter and quotes prices for his statues of Aurora (£450) and Cupid disguised as a shepherd (£300). The page shows brown stains and folds and is signed “John Gibson.” At the bottom is a pen sketch of a standing male figure with dotted measurement lines and notes about height and a dark line in the marble.

Beyond description: Introducing transcript generation in JSTOR Seeklight

JSTOR Seeklight now generates transcripts for typed and handwritten items, making every word searchable and accessible while keeping editors in control. Available now for Tier 3 participants.

A cluttered metal shelf filled with stacks of old papers and folders. A blue sticky note taped to the shelf reads “SHELVES OF WOE, CONT’D” in black marker.

Illuminating possibility: Early reflections on JSTOR Seeklight from our charter community

This summer, charter participants in JSTOR Digital Stewardship Services shared their early enthusiasm for JSTOR Seeklight, our AI-powered tool for accelerating collection processing. We share their feedback, ideas, and a glimpse of what’s ahead.

Red–orange circular arrow encircling a grey, turbine-like spiral with radial pins and cloudlike shading on a pale background.

What we talk about when we talk about content management

JSTOR Digital Stewardship Services does more than store your collections. Explore content management that drives discovery, use, and impact.

Image of an open book with a magnifying glass over an illustration titled "A useful stream", showing local residents and their livestock fetching water from a stream. Elementary hygiene for the tropics, p. 89.

Building evolving guardrails: How JSTOR Seeklight helps address bias and harm in descriptive metadata

Bias is inherent to description, but description can and should be revised over time. We’re evolving JSTOR Seeklight’s safeguards against bias and harmful content to reflect our values, our engineering ethos, and our commitment to transparency.

Frequently asked questions

How does JSTOR Seeklight work?

JSTOR Seeklight is our AI-powered technology that accelerates digital collections processing with tools for metadata generation, text transcription, and collection-level insights. Built on JSTOR’s trusted infrastructure and engineering expertise, and informed by deep community collaboration, JSTOR Seeklight features integrated workflows that ensure expert review and institutional control. 

File analysis and grouping

Upon upload, JSTOR Seeklight analyzes filename syntax and folder structure to determine if any of the files are meant to be grouped together into compound objects. If so, it will group the requisite files together and describe them as one singular item. 

Classification and routing

Next, JSTOR Seeklight intelligently sorts materials into precise categories and directs each one to the technologies and workflows that can extract the richest insights. Images are handled differently from text, and mixed media follows its own path. Even within text files, processing varies: legal documents demand one approach, while student periodicals require another—each type receives specialized treatment to maximize the information uncovered.

The classification and routing step is essential in ensuring that JSTOR Seeklight has the right instructional context and specialized prompts to generate high-quality descriptive metadata. This approach ensures that AI is applied with the care of a real-world archivist and optimized to account for the nuance of a diverse array of materials. 

Processing workflow

Metadata is generated across a core set of descriptive metadata fields (built off of Dublin Core) and presented to the user for review. Users retain full control, refining AI-generated suggestions from JSTOR Seeklight to meet institutional and scholarly standards. Learn more about how JSTOR Seeklight is being developed.

Transcript generation

Moving beyond metadata, JSTOR Seeklight can now generate high-quality machine readable transcripts for materials with text. Users can choose to create transcripts individually, or at scale, and use our rich text editor to apply the right format and styling directly within the interface. Transcripts are available to download in both .txt and .rtf formats.

Which large language models (LLM) is JSTOR Seeklight using?

We are currently using several LLMs, including GPT-4.1 from OpenAI and Google Gemini 2.5. This allows us to optimize for specific tasks appropriately, and is a reflection of our intentional technical architecture, which allows us to swap for better models as needed. We continuously evaluate the offerings of other service providers, as well as more bespoke alternatives, to ensure that we’re delivering the highest quality output.

Will my data be used to train the LLM?

No. Our enterprise agreements with large language model (LLM) providers provide that they do not use data uploaded through their APIs (such as your uploaded materials) to train their LLMs. This is stated in OpenAI’s API terms (see the OpenAI Trust Portal) as well as those from Google Gemini.

How is my content protected?

We employ strict security measures to protect uploaded materials and metadata:

  • AI provider data handling: OpenAI stores uploaded data for a maximum of 30 days to check for abuse or misuse before deleting it entirely. Your content is never used to train AI models.
  • Data rights and deletion: You retain full rights to your uploaded content and generated metadata. If you delete your project, all associated data is permanently removed from JSTOR’s systems.
  • Security and compliance: JSTOR maintains physical, technical, and administrative safeguards to prevent unauthorized access or malicious use. As a SOC2 compliant organization, our data security practices undergo annual independent third-party audits.
  • Metadata storage and use: JSTOR stores generated metadata and user interactions to improve system performance.
How does JSTOR Seeklight manage handwritten, historical, and/or non-English language materials?
  • Handwritten and historical documents: JSTOR Seeklight is designed to effectively process handwritten documents and historical materials, including generating both descriptive metadata and transcripts. While accuracy may vary depending on factors such as condition, language, legibility, and format, our technology is continuously improving in order to provide the best possible results.
  • Non-English language materials: Our LLM service providers handle multiple languages by default, but optimization of JSTOR Seeklight for specific non-English languages is still in progress.
How does JSTOR Seeklight contribute to accessibility?

Transcripts can be used as an accessible text alternative for images of text, such as handwritten letters. This is a technique to meet WCAG SC 1.4.5 Images of Text (AA). You can learn more about SC 1.4.5 on W3C’s website, and more about accessibility requirements, including the April 2026 federal web content mandate for higher education, on the National Archive’s Federal Register.

RTF and TXT files are machine-readable, which means your downloaded transcripts are in an accessible file format and ready to be presented as an accessible alternative wherever you would like to display them. If you choose to convert them into an alternate format, such as a PDF, be sure to check accessibility guidelines for the corresponding file format.

Visit our support site for recommendations on how you can use JSTOR Seeklight-generated transcripts or your own uploaded transcripts to increase your items’ accessibility on JSTOR.

Does JSTOR retain any rights to the files, file contents, metadata, and/or transcripts uploaded to or generated through JSTOR Seeklight?

Institutions will own the rights to the files and content they upload, as well as the transcripts and metadata generated through JSTOR Seeklight. JSTOR is granted limited permissions to leverage the content, metadata, and transcripts to provide, develop, and improve JSTOR Seeklight for accuracy, and for the general benefit of our licensees.

Can users review and edit AI-generated metadata and/or transcripts from JSTOR Seeklight?

Yes. Users have full control over metadata and transcripts generated by JSTOR Seeklight, and can edit AI-powered outputs before finalizing. During review, they can mark records as “Reviewed” to indicate completion. While unreviewed records can still be published, users will receive a prompt reminding them to review their items before finalizing publication. Learn more.

Can institutions customize the AI metadata and/or transcript generation rules in JSTOR Seeklight?

JSTOR Seeklight can be used “out of the box” without requiring any kind of customized prompting. Additionally, contributors are able to add additional context, to ensure that JSTOR Seeklight is processing metadata and/or transcripts with as much existing knowledge as possible. Additional customization options are currently being explored, and we’ll provide more information as plans are developed.

What formats are supported for JSTOR Seeklight?

Currently, JSTOR Seeklight supports text and image-based materials, with the following file types:

  • Text: PDFs (.pdf)
  • Images: JPEG/JPG (.jpeg, .jpg)

Within these formats, the tool can generate metadata and transcripts for a diverse range of content, including:

  • Typed documents, handwritten letters, newspapers, books, and periodicals
  • Historical photographs, maps, posters, postcards, and advertisements

We are exploring future expansions to support metadata generation for additional file types, such as audio and video, in the future.

What is JSTOR doing to offset the environmental impacts of AI?

We recognize the environmental impact of AI and closely follow emerging research in this area. Our approach prioritizes making informed, responsible choices that minimize our footprint while supporting our commitment to advancing research and learning through new technologies. Wherever possible, we prioritize thoughtful implementation over scale for scale’s sake. Here are some of the ways we reduce our impact:

  • We don’t train our own LLMs. Model training is resource-intensive, so we rely on existing models and focus on using them efficiently. 
  • We choose the right model for the task. Rather than defaulting to the largest model, we select the most appropriate option, using smaller models when possible to reduce energy use while maintaining quality.
  • We keep prompts lean. We craft our system prompts to be as concise as possible while still providing the necessary context to ensure high-quality results. Smaller inputs mean less computational overhead.
What is JSTOR’s overall approach to AI and other emerging technologies?

Technology has always been an incredible accelerator for our mission to improve access to knowledge and education. As a trusted provider of scholarly materials, we have a responsibility to leverage content, technology, and deep subject matter expertise that enables a safe, effective, and affordable path forward for the use of AI. 

  • We honor our values first and foremost. JSTOR provides users with a credible, scholarly research and learning experience. Our use of technology must enhance that credibility, not undermine it. 
  • We listen closely and proceed with care. We recognize the concerns associated with AI and other emerging technologies and pursue this work in close collaboration with our community.
  • We empower people, not replace them. These tools should not be used to “do the work.” They are designed to help people  deepen and expand their work.
  • We enable our systems to interact with users in ways that are intuitive. Traditionally, it has been the users’ responsibility to adapt to restricted language and structures to provide computers with inputs. Today, computers can interact effectively with users in natural language, and we should take advantage of that, while balancing the need to support user and institutional preferences.

This work is ongoing. As we learn and grow, we welcome your engagement to help shape AI tools that reflect shared values and advance meaningful access to scholarship.

Learn more about how we are exploring emerging technologies on the JSTOR platform.

Ready to see JSTOR Seeklight in action?

Join us in shaping responsible, AI-assisted digital stewardship—and find out how JSTOR Seeklight can accelerate and enhance your collections work.

Note: Items marked with * are required.

Explore JSTOR Seeklight

Curious about how JSTOR Seeklight works?

Share a few details and we’ll connect you with our team to show how JSTOR Seeklight accelerates responsible, AI-assisted stewardship.

Note: Items marked with * are required.

View image credits from this page
Woman working on a laptop in a bright office, surrounded by scanned historical documents and a JSTOR interface showing a handwritten letter with a ‘Generate Transcript’ button; icons below label actions such as Describe, Transcribe, Summarize, Review, and Discover.

Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.

Nicholas Chevalier. Sergeant Shadwick and Four Studies of Soldiers. From: Journal Made on a Visit to the Disturbed District of Wanganui. 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023214.

John Gibson. Letter from John Gibson to John Udny, Containing Information for Henry Farnum. January 9, 1850. Part of Open: The Metropolitan Museum of Art, Artstor. https://www.jstor.org/stable/community.18604581.

Stack of handwritten pages with an interface bubble labeled ‘Create transcript,’ displaying a snippet of transcribed text and an ‘Edit’ button with a cursor selecting it.

Nicholas Chevalier. 3 Sketches. From: Journal Made on a Visit to the Disturbed District of Wanganui. 5 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023215.

Nicholas Chevalier. Sketches of Major Noake and Col. Nedville. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023221.

Nicholas Chevalier. Sketch of Soldiers Manning a Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7-8 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023228.

Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.

Digitized handwritten documents with an interface showing AI-generated metadata tags such as ‘Swiss,’ ‘ca. 1868,’ and ‘Museum of New Zealand,’ next to a ‘Generate Metadata’ button and a confirmation message reading ‘3 records successfully added.’

Nicholas Chevalier. Sergeant Shadwick and Four Studies of Soldiers. From: Journal Made on a Visit to the Disturbed District of Wanganui. 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023214.

Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.

Nicholas Chevalier. Sketch of Soldiers Manning a Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7-8 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023228.

Nicholas Chevalier. Sketches of Major Noake and Col. Nedville. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023221.

Nicholas Chevalier. 3 Sketches. From: Journal Made on a Visit to the Disturbed District of Wanganui. 5 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023215.

Collage of historical items—including handwritten letters, a photograph, a printed pamphlet, and a fan—alongside a digital ‘Project Summary’ form with fields for scope, extent, and language; a cursor clicks a button labeled ‘Download Summary.’

Unknown. Huia Tail Feathers, Mounted on Gold and Pounamu. circa 1900. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27027947.

Unknown. Portrait of Lady Bledisloe, Dr Cockayne and Others. December 1930. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27113986.

Nicholas Chevalier. Sketch at Maxwell’s Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 6 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://jstor.org/stable/community.27023216.

Nicholas Chevalier. Sketch of Soldiers Manning a Redoubt. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7-8 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023228.

Nicholas Chevalier. Sketches of Major Noake and Col. Nedville. From: Journal Made on a Visit to the Disturbed District of Wanganui. 7 December 1868. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27023221.

Watson Hazell and Viney Ltd. Poster, “Soldiers” Separation Allowances’. March 1915. Part of Open: Museum of New Zealand – Te Papa Tongarewa, Artstor. https://www.jstor.org/stable/community.27028327.

Nineteenth-century Chinese painting showing an interior view of a scholar’s study. The open room reveals men writing at desks, surrounded by scrolls, books, and decorative furniture, with potted plants lining a balcony in the foreground.

Chinese. Hall of Quiet Study. 19th century. Part of Open: The Metropolitan Museum of Art, Artstor.