Discover TEI-encoded documents in public GitHub repositories.
Last indexed | Repository | Description | Languages | Matching files |
---|---|---|---|---|
30 May 2020 05:30 UTC | christiancasey/iip-word-lists | Python utility for creating word lists from epidoc files | arc, grc, heb, lat | 8729 |
30 May 2020 05:30 UTC | cltk/chinese_text_cbeta_01 | Chinese Buddhist scriptures from CBETA | san, zho, eng | 8749 |
30 May 2020 05:30 UTC | IraPS/Tolstoy_letters_and_diaries | - | - | 8811 |
30 May 2020 05:30 UTC | cltk/chinese_text_cbeta_02 | Chinese Buddhist scriptures from CBETA | pli, san, eng, zho, x-unknown | 8982 |
30 May 2020 05:30 UTC | BD2K-Aztec/Aztec-TextSummarizing | - | eng, nor | 9929 |
30 May 2020 05:30 UTC | uvalib/valleyshadow | Migration of valleyshadow.lib from Tomcat/Cocoon to Go | eng | 10117 |
05 Apr 2023 11:45 UTC | bodleian/medieval-mss | Medieval Manuscripts in Oxford Libraries: TEI catalogue descriptions | lat, eng, ita, deu, fra, nld, grc, spa, zxx, gle, cym, ell, cor, chu, cat, ces, heb, pro, por, cop, egy, ara, rus, isl, fry, und, cai, gmh, goh, ang, enm, xno | 10595 |
30 May 2020 05:30 UTC | jawalsh/tei_text | - | eng, fra, deu, lat, grc, spa, afr, ita, nld, por, heb, hin, ara, gai, haw, ota, nai, sco, tur, zho, nor, pol, rus, ell, msa, ton, arc, ang, tam, tah, jpn, gae, rom, sve, alg, urd, dan, fij, isl, pli, cym | 10915 |
02 Apr 2023 19:45 UTC | Brown-University-Library/iip-texts | IIP inscriptions encoded in Epidoc XML and supporting files | arc, grc, heb, lat, phn, kat, syc, xcl | 11011 |
31 Dec 2020 08:59 UTC | welfare-state-analytics/riksdagen-corpus-old | Preprocess the proceedings of the Swedish parliament | eng, slv | 11208 |
14 May 2022 17:45 UTC | srophe/srophe-app-data | Repository for Syriaca.org TEI data, used by srophe-eXist-app. | ara, cop, chu, deu, eng, spa, fra, gez, grc, hye, ita, kat, lat, nld, por, rus, sog, syr, ell | 12026 |
20 Jan 2023 13:45 UTC | welfare-state-analytics/riksdagen-corpus | - | eng, slv | 12030 |
15 Jan 2023 05:43 UTC | tolstoydigital/TEI | All of Tolstoy in TEI/XML | rus | 12119 |
30 May 2020 05:30 UTC | lascivaroma/pompei-inscriptions | Corpus of Inscriptions from Pompei, including graffiti. | lat | 12209 |
20 Apr 2021 12:59 UTC | rism-ch/onstage-tei | TEI files fed into onstage | - | 12548 |
30 May 2020 05:30 UTC | CivilWarGovernorsOfKentucky/TestDocuments | Repository for testing automatic Github integration | - | 12776 |
30 Mar 2023 19:46 UTC | BetaMasaheft/Persons | Authority files for each person | eng, gez, fra, ara, heb, grc, cop, por, ita, rus, amh, deu, tir, lat, syr | 13765 |
05 Apr 2023 16:55 UTC | fihristorg/fihrist-mss | Fihrist TEI Catalogue | ara, fas, jpr, pus, und, tur, urd, eng, pan, heb, ota, hin, kur, deu, lat, chg, fra, gez, syr, syc, cop, grc, map, snd, san, mal, jrb, uig, prs, ave, ita, swa, msa, por, hau, inc, amh, zxx | 13967 |
30 May 2020 05:30 UTC | TEI-EAJ/auto_aozora_tei | Project for automated TEI conversion of Aozora Bunko texts | - | 15156 |
30 May 2020 05:30 UTC | obdurodon/CollateOS | Machine-assisted collation and alignment of diplomatic transcriptions of medieval Slavic manuscripts | - | 15872 |
05 Apr 2023 10:47 UTC | BetaMasaheft/Manuscripts | Manuscripts descriptions | eng, gez, amh, ara, tir, lat, ita, ces, deu, fra, heb, cop, grc, rus, syr, spa, hye | 16260 |
21 Jun 2022 15:49 UTC | acdh-oeaw/schnitzler-tagebuch-data | Source data for the diary (1879–1931) of Arthur Schnitzler | - | 16426 |
11 Oct 2021 20:37 UTC | cbeta-git/CBR2X-XML | CBETA CBReader 2X XML | eng, zho, pli, san, x-unknown | 20392 |
30 May 2020 05:30 UTC | BetaMasaheft/coordinates | Records with coordinates from previous gazetteer. Spelling and transcription rules not usable | eng, gez | 21846 |
23 Jun 2021 20:36 UTC | sul-dlss/dlme-metadata | Harvested metadata for the Digital Library of the Middle East project | ara, heb, jrb, syc, arc, jpr, lad, grc, lat, fas, yej, yid | 23744 |
30 May 2020 05:30 UTC | pruizf/disco-ms | Scripts to reproduce results of our DSH paper about the DISCO corpus | spa | 24605 |
06 Apr 2023 05:46 UTC | CivilWarGovernorsOfKentucky/Documents | CWGK Documents in TEI-XML Format | - | 24956 |
05 Dec 2022 20:45 UTC | DARIAH-SI/siParl | Slovenian parliamentary corpus | slv, eng | 25471 |
28 Oct 2020 08:39 UTC | korakot/corpus | Mirror of CC-licensed corpora from AI FOR THAI | - | 27445 |
28 Dec 2024 06:57 UTC | Anterotesis/historical-texts | Collections of English historical texts and data relating to them | lat, eng, sco, fra, cym, frm, roa, deu, ita, mul, zxx, grc, fro, nld, spa, pau | 32851 |
19 May 2022 14:45 UTC | cceh/papyri-wl-data | Source data for the Papyri word lists; input workflow from FileMaker XML | deu, eng, grc, lat | 35085 |
25 Feb 2021 08:46 UTC | textcreationpartnership (all repos) | (textcreationpartnership uses one repository per text; to keep this table short, those repositories have been aggregated into a single entry) | eng | 39344 |
08 Jan 2022 08:44 UTC | ELTE-DH/verskorpusz | TEI XML corpus of Hungarian poems with machine annotation | hun | 45471 |
30 May 2020 05:30 UTC | santheo/FLPS-data | - | - | 50423 |
30 May 2020 05:30 UTC | simondschweitzer/aed-tei | Corpus of Egyptian Texts for the AED - Ancient Egyptian Dictionary | deu, egy, zxx | 55798 |
08 Sep 2022 05:02 UTC | earlyprint/epmetadata | EarlyPrint project standoff metadata | - | 60331 |
14 Dec 2021 20:39 UTC | epigraphic-database-heidelberg/data | Data Dumps of Epigraphic Database Heidelberg | ara, eng, fra, deu, grc, ell, heb, ita, lat | 79201 |
19 May 2022 23:44 UTC | CasebooksProject/casebooks-data | - | eng, lat, und, ell, fra | 79897 |
05 Apr 2023 15:48 UTC | papyri/idp.data | Data from the Integrating Digital Papyrology project | fra, eng, deu, ita, spa, lat, ell, grc, egy, sem, ara, fas, arc, und, heb, cop, syr, pal, syc, kat, got, gcr | 186130 |