|
Abstract:
AI-assisted thematic survey and translation tool for 29,000+ Bahá’í texts, enabling discovery via categorized extracts, source files, and integrated chat exploration.
Notes:
Note: You will need to log in with your own Gmail (Google) account to access this Notebook tool, on any device (desktop, laptop, tablet, phone).
The mindmap (see section "Other features" at the bottom of this page) is an interactive tool. Click on a subject of interest and it will open a new, related chat conversation. See more details and links in the project overview. Tip: It is possible to exclude (some) talks and pilgrims' notes, which may not be authenticated, before starting the chat. Click on "Sources" on the left side, and then unselect the 3 files that start with "ABU" (utterances of 'Abdu'l-Bahá) halfway down the list. This page is adapted from the file "0_readme.txt" (last update June 6, 2025), which can be found at the Explorer link below. The Subjects are online in two formats: here and here. |
Source guide
This notebook is a discovery tool for the Writings of the Central Figures of the Baha'i Faith. It presents the results of the first AI-assisted, corpus-wide translation and thematic survey of over 29,000 of the works cataloged in the "Partial Inventory of the Writings of the Central Figures of the Baha'i Faith" (available at https://blog.loomofreality.org/?page_id=252 and intended to be used as a companion to this notebook).
The contents of this notebook are accessed either directly, by reading the source files in the left-hand panel (which contains both thematic compilations of extracts and bundles of entire translated Tablets), or indirectly, by using the chat window at center (whose answers contain links back to the source files).
The source files refer to the partial inventory numbers (PINs - "BH00001" etc.) which uniquely identify each work in the Partial Inventory. Collectively the PINs comprise the coordinate system that facilitates navigation through the texts, including in particular access to the original Persian/Arabic sources. The PIN format compactly shows the author (BH, BB, or AB for Baha'u'llah, the Bab, or Abdu'l-Baha respectively), whether the work is a reported utterance (U), and a rough indication of relative word count (lower inventory numbers usually belong to longer works).
Authorized translations were used where possible, but the sources are dominated by machine translations (courtesy of Claude 3.5 Sonnet), which may contain inaccuracies and infelicities. Authorized translations of the Baha'i Writings and official thematic compilations can be found at bahai.org/library. The machine translations available here may not conform to Baha'i style. Diacriticals have been dropped. The purpose is not to present polished texts but to surface potentially interesting material as a starting point for further research. Authorized translations are not distinguished from either human- or machine-generated provisional translations.
Follow up using the "Partial Inventory" for more information on the source of each quotation and its translation status (if the translation does not exist in the Partial Inventory, it is probably a machine translation). A formal presentation putting this project in context can be found here: https://youtu.be/59GGen0fl3U. Further details follow.
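The PIN structure described above can be unpacked programmatically. The following is a minimal sketch, not part of the notebook itself: the regular expression is an assumption inferred from the example "BH00001" and the "ABU" file prefix (author code, optional "U", then a run of digits), and the names parse_pin and Pin are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: two-letter author code, optional "U" for a reported
# utterance, then the inventory digits. The digit count is not stated
# in the source guide, so any run of digits is accepted here.
PIN_PATTERN = re.compile(r"^(BH|BB|AB)(U?)(\d+)$")

# Author codes as given in the source guide.
AUTHORS = {"BH": "Baha'u'llah", "BB": "the Bab", "AB": "Abdu'l-Baha"}


class Pin(NamedTuple):
    author: str
    is_utterance: bool
    number: int


def parse_pin(pin: str) -> Optional[Pin]:
    """Return the parts of a PIN, or None if it does not fit the assumed shape."""
    match = PIN_PATTERN.match(pin.strip())
    if match is None:
        return None
    code, utterance_flag, digits = match.groups()
    return Pin(AUTHORS[code], utterance_flag == "U", int(digits))


print(parse_pin("BH00001"))
```

Because lower inventory numbers usually belong to longer works, sorting parsed PINs by their number field gives a rough longest-first ordering within one author.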
Subject categories (preceded by I.A.1 etc.) have been selected to cover a broad range of topics addressed in the Baha'i Writings. The associated files have been populated with relevant quotations, partially by AI and partially by hand: the separator ========= marks the transition from AI-generated to human-generated quotations. The human-generated quotations are scraped from "Loom of Reality", a thematic compilation begun as a personal project in 1993 and available at loom.loomofreality.org. They have not yet been matched to Partial Inventory codes, and may also contain a small amount of material from non-Baha'i sources. Matching these quotations to the Partial Inventory codes may be done in the future; in the meantime redundancies remain. However, it may be instructive to compare, above and below the separator line, what an AI agent running across the entire corpus can yield versus a careful reading of available texts in English translation. Owing to file constraints in NotebookLM, only around 250 of the 651 subject categories are available in the left panel as separate files; most of the remainder are bundled in files titled "zAdditional_Subjects_01.txt" etc. Subject categories have been assigned according to the following top-level schema:
The complete list is online at bahai-library.com/inventory/subjects and bahai-library.com/phelps_loom_reality_topics. See also a spreadsheet showing topic frequency by author, which topics are single-author dominated, and which ones were (subjectively) chosen to include separately in the NotebookLM sources panel owing to the 300-file limit: csv and pdf. Extracts are listed from shortest to longest to encourage browsing and discovery of individual topics, the idea being to start from an accessible spot on the shoreline (shorter and generally pithier quotations) before heading into deeper waters. Note that the AI may have omitted important context before or after a quotation; some follow-up may be needed. Many subject categories are partially redundant or overlapping; conceptual orthogonality was not a goal of the categorization. Each category word or phrase is a net of a different shape that catches a different kind of fish when swept through the ocean. This overlap is particularly noticeable in topics relating to personal spiritual growth and transformation. A rigid classification scheme would reduce redundancies but at the cost of leaving out important material.
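For readers who download a subject file and want to compare the two halves described above, the split at the separator line can be sketched as follows. This is an illustrative sketch only: it assumes the separator sits on a line of its own and consists of five or more "=" characters, and the function name split_subject_file is hypothetical.

```python
import re
from typing import Tuple

# Assumption: the separator is a line made only of "=" characters
# (five or more), as in the "=========" described in the readme.
SEPARATOR = re.compile(r"^={5,}[ \t]*$", re.MULTILINE)


def split_subject_file(text: str) -> Tuple[str, str]:
    """Split a subject file into (AI-selected part, hand-compiled part).

    If no separator line is found, the whole text is returned as the
    first part and the second part is empty.
    """
    match = SEPARATOR.search(text)
    if match is None:
        return text.strip(), ""
    return text[:match.start()].strip(), text[match.end():].strip()


sample = "AI extract one\nAI extract two\n=========\nHand-picked extract\n"
ai_part, human_part = split_subject_file(sample)
print(len(ai_part.splitlines()), len(human_part.splitlines()))
```

Only the first separator line is used as the split point, so any later run of "=" characters stays inside the second part.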
Source files (with names "zSources_BH00001-BH00019" etc.) are full translations of texts, in the order in which they are listed in the Partial Inventory. There are more than 15,000,000 words in more than 29,000 translations bundled in 44 source files. There are many lacunae, untranslated sections, improperly interpreted diacriticals, and missing texts, quite apart from the translations themselves, which may be misleading or inferior to human-produced translations. In the source files, the string "[...]", which often appears in longer texts, indicates a cut point in the translation (translations were done in blocks of 450 words for technical reasons, to minimize timeout errors in the API calls). Words or entire sentences may be repeated around the cut point; check against the original source to confirm. Collections of principal writings and letters of Shoghi Effendi and the House of Justice have been included; many quotations from these sources are also present in the thematic compilations. The primary focus of this notebook, however, is to be a discovery engine for the works of the Central Figures.
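To illustrate where the "[...]" cut points come from, here is a minimal sketch of block-wise processing. It is not the project's actual pipeline: it assumes blocks are cut on whitespace-delimited words and that processed blocks are rejoined with the marker between them. The names split_into_blocks and join_blocks are hypothetical, and the translation step itself is omitted.

```python
from typing import List

BLOCK_SIZE = 450      # words per block, as stated in the readme
CUT_MARKER = "[...]"  # string that marks a cut point in the source files


def split_into_blocks(text: str, block_size: int = BLOCK_SIZE) -> List[str]:
    """Cut a text into consecutive blocks of at most block_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + block_size])
        for i in range(0, len(words), block_size)
    ]


def join_blocks(blocks: List[str]) -> str:
    """Rejoin processed blocks, leaving a visible marker at each cut point."""
    return f" {CUT_MARKER} ".join(blocks)


# A 1,000-word dummy text yields blocks of 450, 450 and 100 words,
# so the rejoined text contains two cut markers.
dummy = " ".join(f"w{i}" for i in range(1000))
blocks = split_into_blocks(dummy)
print([len(b.split()) for b in blocks], join_blocks(blocks).count(CUT_MARKER))
```

A cut that falls mid-sentence gives each block an incomplete sentence to translate, which is one plausible reason words or sentences end up repeated around the marker.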
The chat within NotebookLM, accessible from the middle panel, is constrained to answer from the sources within the notebook. The advantage of this is that answers always cite the source texts, and thus provide a window onto those texts complementary to the thematic compilations. "Hallucinations" and inaccurate quotations, which are a feature of general-purpose chatbots, are thus eliminated. The disadvantage is that the depth and quality of answers (the chat is powered by Google Gemini) may not be equal to those of systems like ChatGPT and Claude. A suggested practice is therefore to regularly compare answers within this sandboxed data set to those provided by other large language models. Because the chat engine accesses both the source files and the thematic compilations that are largely based on them, answers may reference key quotations more than once. The quality and focus of the answers can be improved by creating a custom chat configuration (accessible from the center panel, 500 character maximum). Example: "You are an expert on the Baha'i Teachings. The more passages you can quote from the sources in your answers, the better. The best answers highlight profound concepts, make unexpected connections between ideas, or shed new light on well-understood principles. Don't flatter me or varnish any uncomfortable facts. Avoid flowery language." Sample questions:
2) Give me a comprehensive compilation from the sources on the topic of (...).
3) What are the ten most thought-provoking questions you can ask of the sources? And for each question, what are ten references from the sources that help to answer it?
4) Map out all the interconnected ideas around the concept of (...). What other topics, assumptions, or implications does it silently touch upon, challenge, or depend on?
5) Create a week-long course on the topic of (...) built around extensive reference to the sources and thoughtful questions for group discussion.
6) Construct a debate on the topic of (...) between two imaginary scholars who interpret this concept in opposing ways. What evidence would each one cite from the sources to support their view?
7) Compare and contrast Baha'i and Christian (Islamic, Buddhist, etc.) theology.
8) Recursively connect each source to one another as you give a top down overview.
9) How do the sources contrast the nature and acquisition of human knowledge, particularly in philosophy and conventional sciences, with the innate, universal knowledge imparted by Divine Manifestations, and what implications does this distinction hold for the independent investigation of truth?
10) What is it that causes the soul to sing?
There are other [read-only] features in the right panel - podcasts, mind maps, study guides, timelines - that may be useful for exploring the content from different angles.
|
METADATA | |
Views | 304 views since posted 2025-08-13; last edit 2025-09-03 16:39 UTC; previous at archive.org.../phelps_partial_inventory_explorer |
Language | English |
Permission | compiler |
Share | Shortlink: bahai-library.com/7007 Citation: ris/7007 |