JSTOR, a nonprofit service from ITHAKA, invites you to explore the beta release of a generative AI-powered research tool. Developed in collaboration with our community, this interactive tool leverages innovative technology and JSTOR’s trusted corpus to empower people to deepen and expand their research.

Sign up for the beta

Do you have questions, comments, or concerns? We want to hear from you! Please email support@jstor.org to share your feedback.

About JSTOR’s AI research tool (beta)

Our generative AI-powered research tool is designed to help people work more efficiently and effectively. This beta feature will appear on the content page for journal articles, book chapters, and research reports, and as an alternative to JSTOR’s standard keyword search. The tool helps you do the following:

Assess content relevance

The tool generates a summary of what you’re reading to help you quickly assess its relevance, and lets you know how it relates to your search terms.

Deepen your research

Discover related topics, enrich your reading with similar content from the JSTOR corpus, and try new ways of searching.

Be conversational

Use natural, conversational language to ask questions and get quick answers about what you’re reading or researching.

ITHAKA’s approach to generative AI

ITHAKA offers a portfolio of nonprofit services, including JSTOR, Portico, and Constellate, aligned around a shared mission to improve access to knowledge for people around the world as affordably and sustainably as possible. Technology plays a pivotal role in how ITHAKA achieves this aim. Through the research of Ithaka S+R, teachings from Constellate’s text analysis experts, and continuous improvements to the research and learning experience on JSTOR, we are actively exploring the use of generative AI in education and scholarship.

Approaching generative AI together

JSTOR and ITHAKA’s founding president, Kevin Guthrie, shares his thoughts about our mission-driven approach to deploying new technologies to improve the learning and research experience for JSTOR users.

Our approach

Making AI generative for higher education

Ithaka S+R is collaborating with 18 colleges and universities on a multi-year research project to chart a productive path in the use of generative AI in higher education. We will be publishing three public reports related to the project’s findings, as well as news along the way.

Read about this project

Empowering research with generative AI on JSTOR

Just getting started with AI on JSTOR, or need a refresher? Visit our blog for an overview of what the tool can help you do.

Visit the blog

Frequently asked questions

  • What does JSTOR’s generative AI research tool do?

    By incorporating generative AI features into the JSTOR platform, we aim to equip students, faculty, researchers, and librarians with innovative tools that facilitate engagement with complex content and enrich research and learning. This early release harnesses the power of generative AI to offer the following capabilities:

    • Generate a summary with key points and arguments from the text itself, helping users quickly determine if content is relevant to their research
    • Suggest topics and show related content within the JSTOR corpus that is relevant to the text, enabling exploration of additional paths of inquiry
    • Answer questions posed by users based only on the content of the document being viewed
    • Search JSTOR in a new way with a semantic search-powered capability that works better for natural language queries than traditional keyword search

    JSTOR has previously applied machine learning and artificial intelligence technologies to optimize the research experience. For example, we have created a citation graph to link all articles on JSTOR, and used machine learning to improve the relevance of search results and recommendations. As we extend our knowledge and application of new technologies to generative AI, we expect to iterate and evolve as we learn. By volunteering for our limited beta test, you will help us define the long-term scope of this exciting new initiative.

  • What content can the tool be used with?

    At present, the tool can be used with journal articles, book chapters, and research reports found on JSTOR. Images and text-based primary sources on JSTOR are not yet included.

  • What data sources is the tool drawing from to generate content?

    To start, we are using only the contents of the document being viewed to generate responses. Over time, as we learn from and improve upon the accuracy of responses, we might extend this to use the content of other relevant documents in the JSTOR corpus.
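
    For illustration, here is a minimal sketch of how a response can be grounded in a single document: the document text is placed in the prompt alongside an instruction to answer only from that text. This is not JSTOR’s implementation; the helper name, prompt wording, and example inputs are assumptions, and the model name (gpt-3.5-turbo) is taken from the FAQ answer on models below.

    ```python
    # Minimal sketch (not JSTOR's implementation): answer a question using only
    # the text of the document being viewed.
    from openai import OpenAI  # official OpenAI Python SDK

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def answer_from_document(document_text: str, question: str) -> str:
        """Hypothetical helper: ask the model to answer strictly from one document."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": (
                        "Answer the user's question using only the document provided. "
                        "If the document does not contain the answer, say so."
                    ),
                },
                {
                    "role": "user",
                    "content": f"Document:\n{document_text}\n\nQuestion: {question}",
                },
            ],
            temperature=0,  # favor consistent, document-grounded answers
        )
        return response.choices[0].message.content
    ```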

  • Which large language models is JSTOR using for this generative AI-powered tool?

    To jumpstart the learning and experimentation process, the beta release of our generative AI-powered tool uses gpt-3.5-turbo from OpenAI and the open-source all-MiniLM-L6-v2 sentence transformer model. We are actively exploring alternatives and expect to evolve the models we use as this work develops.
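
    As a rough illustration of how a sentence transformer such as all-MiniLM-L6-v2 can support semantic search and related-content suggestions, the sketch below embeds a natural-language query and a few candidate passages, then ranks the passages by cosine similarity. It assumes the sentence-transformers Python library; the query and passages are invented examples, and this is not a description of JSTOR’s production pipeline.

    ```python
    # Minimal sketch: rank passages by semantic similarity to a natural-language query
    # using the open-source all-MiniLM-L6-v2 sentence transformer.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    query = "How did irrigation shape early urban settlements?"  # invented example
    passages = [  # invented examples standing in for candidate texts
        "The article surveys water management in Mesopotamian city-states.",
        "A study of nineteenth-century railway expansion in North America.",
        "A chapter on canal construction and agricultural surplus in early cities.",
    ]

    query_embedding = model.encode(query, convert_to_tensor=True)
    passage_embeddings = model.encode(passages, convert_to_tensor=True)

    # Cosine similarity between the query and each passage, most similar first.
    scores = util.cos_sim(query_embedding, passage_embeddings)[0]
    for score, passage in sorted(zip(scores.tolist(), passages), reverse=True):
        print(f"{score:.3f}  {passage}")
    ```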

  • How do you plan to measure or ensure accuracy?

    We are monitoring and continuously improving accuracy using the following methods:

    • Subject matter experts from a range of academic disciplines conduct in-depth, ongoing evaluations of the tool’s output. These evaluations help us ensure that generated content is useful and accurate.
    • Users provide in-tool feedback that we use to identify areas for improvement. All interactions with the beta tool offer users the opportunity to provide a thumbs up or thumbs down rating. For thumbs down ratings, the user can provide further detail to explain their response.
    • We assess model performance for our specific use cases using industry-standard metrics for Machine Learning (ML) and Natural Language Processing (NLP). Additionally, we continuously integrate new evaluation metrics specifically developed for Large Language Model (LLM) use cases. A minimal sketch of one such standard metric follows this list.
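
    As one illustration of what an industry-standard NLP metric can look like in this context, the sketch below scores a generated summary against a human-written reference with ROUGE, using the rouge-score package. The summaries are invented examples, and this is not a description of JSTOR’s actual evaluation suite.

    ```python
    # Minimal sketch: score a generated summary against a reference summary with ROUGE.
    from rouge_score import rouge_scorer  # pip install rouge-score

    reference = "The article argues that irrigation enabled surplus agriculture in early cities."
    generated = "Irrigation allowed early cities to produce agricultural surpluses, the article argues."
    # Both summaries above are invented examples.

    scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    for name, result in scorer.score(reference, generated).items():
        print(f"{name}: precision={result.precision:.2f} "
              f"recall={result.recall:.2f} f1={result.fmeasure:.2f}")
    ```
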
  • How will my information be kept, shared, and/or used?

    JSTOR handles all personal information, including information provided to this tool, in accordance with our privacy policy. JSTOR does not sell user data, nor does it share content or user data from its platform for the purposes of training third-party large language models.

    Any data you provide to this tool, such as question prompts and other conversation data, will be stored in JSTOR’s internal systems and used in de-identified form to maintain and improve the tool. Your prompts and some or all of the text of the content being viewed are also sent to OpenAI to generate the response. OpenAI does not use this data to further train its models, nor does it retain the data for more than 30 days, in accordance with OpenAI’s API data usage policies.

    Your data will be used in this way only if you opt in to the beta testing program.

  • Will there be fees associated with the generative AI tool?

    As a nonprofit service, JSTOR’s financial model is designed to recover our costs and support sustainable growth to meet the emerging needs of the education community. As we learn more about what it costs to build and maintain generative AI on our platform, we will evaluate how to bring these powerful capabilities forward as equitably and sustainably as possible.

  • How is content protected from unauthorized or malicious use?

    JSTOR maintains physical, technical, and administrative safeguards to protect the content we hold. We are a SOC 2-compliant organization whose data security practices and measures are audited annually by independent third parties.

    Content is processed only internally and with the OpenAI API. Please note that the OpenAI API stores data only temporarily for the purpose of processing and does not use submitted data to train OpenAI’s models or improve its service offerings. For more information on OpenAI’s data security practices, please consult the OpenAI Trust Portal.

  • What is JSTOR’s overall approach to generative AI?

    Technology has always been an incredible accelerator for ITHAKA’s mission to improve access to knowledge and education. As a trusted provider of scholarly materials, we have a responsibility to leverage our content, technology, and deep subject matter expertise to chart a path forward that makes the use of AI safe, effective, and affordable for our constituents.

    • We honor our values first and foremost. JSTOR provides users with a credible, scholarly research and learning experience. Generative AI must enhance that credibility, not undermine it.
    • We will listen closely and proceed cautiously. We recognize the concerns associated with generative AI and are pursuing this work mindful of the very real considerations at hand. Our first step is to deepen our collective understanding through research and through practice, and, as always, in close collaboration with our community. We will use these tools safely and well.
    • We empower people, we do not replace them. These tools should not be used to “do the work.” They should be designed to help people, especially students, learn and do their work more efficiently and effectively.
    • We will enable our systems to interact with users in ways that are comfortable for them. Traditionally, users have had to adapt to restricted languages and structures to provide computers with input; computers can now interact effectively with users in natural language, and we should take advantage of that.
    • We will lead with care. We will deeply consider the aspirations and trepidations of the many communities we serve.

    As we learn about and pursue this latest technology, we look forward to your engagement and insights to ensure we continue to deliver high-quality, trusted, impactful services that improve access to knowledge and education for people everywhere.

Legal notices

Please keep the following in mind as you explore generative AI on JSTOR.

Data collection and use

By using JSTOR’s generative AI tool, you consent to JSTOR collecting any data that you choose to share with the tool. JSTOR retains your conversation history in our logs and uses it in de-identified form, in accordance with our privacy policy, to maintain and improve the tool. We won’t ask you for any personal information, and we request that you not share any in your conversations.

Any data that is sent to OpenAI (which includes your prompt as well as some or all of the text of the content being viewed) is used only for generating the response. OpenAI does not use this data to further train its models, nor does it retain the data for more than 30 days, in accordance with OpenAI’s own API data usage policies.

Beta limitations

This AI tool is currently in beta, so its features are subject to change without notice as we develop it. The current user experience may vary and may not reflect the tool’s future capabilities. Users should not rely on its outputs without conducting independent research. The tool should not be used to seek professional advice, including but not limited to legal, financial, and medical advice.

Offensive materials and language

This evolving beta feature is built on JSTOR’s digital archive, which includes millions of items spanning centuries and representing a wide range of ideas and perspectives. Given the historical nature of the materials on JSTOR, content items reflect the era in which they were produced and/or the perspectives of their creators, including the language, ideas, and other cultural standards of the time. Some items may contain outdated language and ideas that are no longer in use and may be considered offensive, and the beta feature may repeat such language when it summarizes or describes that content. Such responses reflect the views of the original content creators as expressed in the underlying content, not the views of JSTOR or its employees.

Feedback

Any suggestions, ideas, or other information you would like to share regarding the tool may be used to improve, enhance, or develop its features. By submitting feedback, you agree that such feedback becomes the sole property of JSTOR and that you waive any rights, including intellectual property rights, related to the feedback.