Discover TEI-encoded documents in public GitHub repositories.
Last indexed | Repository | Description | Languages | Matching files |
---|---|---|---|---|
05 Apr 2023 15:48 UTC | papyri/idp.data | Data from the Integrating Digital Papyrology project | fra, eng, deu, ita, spa, lat, ell, grc, egy, sem, ara, fas, arc, und, heb, cop, syr, pal, syc, kat, got, gcr | 186130 |
19 May 2022 23:44 UTC | CasebooksProject/casebooks-data | - | eng, lat, und, ell, fra | 79897 |
14 Dec 2021 20:39 UTC | epigraphic-database-heidelberg/data | Data Dumps of Epigraphic Database Heidelberg | ara, eng, fra, deu, grc, ell, heb, ita, lat | 79201 |
08 Sep 2022 05:02 UTC | earlyprint/epmetadata | EarlyPrint project standoff metadata | - | 60331 |
30 May 2020 05:30 UTC | simondschweitzer/aed-tei | Corpus of Egyptian Texts for the AED - Ancient Egyptian Dictionary | deu, egy, zxx | 55798 |
30 May 2020 05:30 UTC | santheo/FLPS-data | - | - | 50423 |
08 Jan 2022 08:44 UTC | ELTE-DH/verskorpusz | TEI XML corpus of Hungarian poems with machine annotation | hun | 45471 |
25 Feb 2021 08:46 UTC | textcreationpartnership (all repos) | (textcreationpartnership uses one repository per text; to keep this table small, those repositories are aggregated into one entry) | eng | 39344 |
19 May 2022 14:45 UTC | cceh/papyri-wl-data | Source data for the Papyri word lists; input workflow from FileMaker XML | deu, eng, grc, lat | 35085 |
26 Dec 2024 13:54 UTC | Anterotesis/historical-texts | Collections of English historical texts and data relating to them | lat, eng, sco, fra, cym, frm, roa, deu, ita, mul, zxx, grc, fro, nld, spa, pau | 32851 |
28 Oct 2020 08:39 UTC | korakot/corpus | Mirror of CC-licensed corpora from AI FOR THAI | - | 27445 |
05 Dec 2022 20:45 UTC | DARIAH-SI/siParl | Slovenian parliamentary corpus | slv, eng | 25471 |
06 Apr 2023 05:46 UTC | CivilWarGovernorsOfKentucky/Documents | CWGK Documents in TEI-XML Format | - | 24956 |
30 May 2020 05:30 UTC | pruizf/disco-ms | Scripts to reproduce results of our DSH paper about the DISCO corpus | spa | 24605 |
23 Jun 2021 20:36 UTC | sul-dlss/dlme-metadata | Harvested metadata for the Digital Library of the Middle East project | ara, heb, jrb, syc, arc, jpr, lad, grc, lat, fas, yej, yid | 23744 |
30 May 2020 05:30 UTC | BetaMasaheft/coordinates | Records with coordinates from previous gazetteer. Spelling and transcription rules not usable | eng, gez | 21846 |
11 Oct 2021 20:37 UTC | cbeta-git/CBR2X-XML | CBETA CBReader 2X XML | eng, zho, pli, san, x-unknown | 20392 |
21 Jun 2022 15:49 UTC | acdh-oeaw/schnitzler-tagebuch-data | Source data for the diary (1879–1931) of Arthur Schnitzler | - | 16426 |
05 Apr 2023 10:47 UTC | BetaMasaheft/Manuscripts | Manuscripts descriptions | eng, gez, amh, ara, tir, lat, ita, ces, deu, fra, heb, cop, grc, rus, syr, spa, hye | 16260 |
30 May 2020 05:30 UTC | obdurodon/CollateOS | Machine-assisted collation and alignment of diplomatic transcriptions of medieval Slavic manuscripts | - | 15872 |
30 May 2020 05:30 UTC | TEI-EAJ/auto_aozora_tei | Project for automated TEI encoding of Aozora Bunko texts | - | 15156 |
05 Apr 2023 16:55 UTC | fihristorg/fihrist-mss | Fihrist TEI Catalogue | ara, fas, jpr, pus, und, tur, urd, eng, pan, heb, ota, hin, kur, deu, lat, chg, fra, gez, syr, syc, cop, grc, map, snd, san, mal, jrb, uig, prs, ave, ita, swa, msa, por, hau, inc, amh, zxx | 13967 |
30 Mar 2023 19:46 UTC | BetaMasaheft/Persons | Authority files for each person | eng, gez, fra, ara, heb, grc, cop, por, ita, rus, amh, deu, tir, lat, syr | 13765 |
30 May 2020 05:30 UTC | CivilWarGovernorsOfKentucky/TestDocuments | Repository for testing automatic Github integration | - | 12776 |
20 Apr 2021 12:59 UTC | rism-ch/onstage-tei | TEI files fed into onstage | - | 12548 |
30 May 2020 05:30 UTC | lascivaroma/pompei-inscriptions | Corpus of Inscriptions from Pompei, including graffiti. | lat | 12209 |
15 Jan 2023 05:43 UTC | tolstoydigital/TEI | All of Tolstoy in TEI/XML | rus | 12119 |
20 Jan 2023 13:45 UTC | welfare-state-analytics/riksdagen-corpus | - | eng, slv | 12030 |
14 May 2022 17:45 UTC | srophe/srophe-app-data | Repository for Syriaca.org TEI data, used by srophe-eXist-app. | ara, cop, chu, deu, eng, spa, fra, gez, grc, hye, ita, kat, lat, nld, por, rus, sog, syr, ell | 12026 |
31 Dec 2020 08:59 UTC | welfare-state-analytics/riksdagen-corpus-old | Preprocess the proceedings of the Swedish parliament | eng, slv | 11208 |
02 Apr 2023 19:45 UTC | Brown-University-Library/iip-texts | IIP inscriptions encoded in Epidoc XML and supporting files | arc, grc, heb, lat, phn, kat, syc, xcl | 11011 |
30 May 2020 05:30 UTC | jawalsh/tei_text | - | eng, fra, deu, lat, grc, spa, afr, ita, nld, por, heb, hin, ara, gai, haw, ota, nai, sco, tur, zho, nor, pol, rus, ell, msa, ton, arc, ang, tam, tah, jpn, gae, rom, sve, alg, urd, dan, fij, isl, pli, cym | 10915 |
05 Apr 2023 11:45 UTC | bodleian/medieval-mss | Medieval Manuscripts in Oxford Libraries: TEI catalogue descriptions | lat, eng, ita, deu, fra, nld, grc, spa, zxx, gle, cym, ell, cor, chu, cat, ces, heb, pro, por, cop, egy, ara, rus, isl, fry, und, cai, gmh, goh, ang, enm, xno | 10595 |
30 May 2020 05:30 UTC | uvalib/valleyshadow | Migration of valleyshadow.lib from Tomcat/Cocoon to Go | eng | 10117 |
30 May 2020 05:30 UTC | BD2K-Aztec/Aztec-TextSummarizing | - | eng, nor | 9929 |
30 May 2020 05:30 UTC | cltk/chinese_text_cbeta_02 | Chinese Buddhist scriptures from CBETA | pli, san, eng, zho, x-unknown | 8982 |
30 May 2020 05:30 UTC | IraPS/Tolstoy_letters_and_diaries | - | - | 8811 |
30 May 2020 05:30 UTC | cltk/chinese_text_cbeta_01 | Chinese Buddhist scriptures from CBETA | san, zho, eng | 8749 |
30 May 2020 05:30 UTC | christiancasey/iip-word-lists | Python utility for creating word lists from epidoc files | arc, grc, heb, lat | 8729 |
28 Aug 2020 08:32 UTC | srophe/draft-data | Repository for TEI records in development. | eng, zho, ara, syr, fra, lat, deu, rus, cop, chu, spa, gez, grc, hye, ita, kat, nld, por, sog, ell | 8684 |
30 May 2020 05:30 UTC | swift-poems-project/tei-transcripts | - | eng | 8669 |
30 May 2020 05:30 UTC | hinrikur/MLT201F-stylometry | Stylometric analysis of short anonymous texts in Icelandic. | - | 8254 |
05 Apr 2023 13:11 UTC | Handrit/Manuscripts | Icelandic Manuscript descriptions using TEI P5 | isl, lat, nor, dan, eng, deu, non, swe, nds, ita, dum, nld | 8166 |
03 Oct 2022 09:03 UTC | hlapin/eRabbinica | - | eng, heb, cop, ara | 8033 |
30 May 2020 05:30 UTC | grtkachenko/SimpleSearchEngine | - | - | 7972 |
30 May 2020 05:30 UTC | OpenGreekAndLatin/Teubner3-grc-dev | - | - | 7946 |
30 May 2020 05:30 UTC | swift-poems-project/swift-transcripts | Transcripts for the Swift Poems Project | eng | 7835 |
08 Feb 2023 06:49 UTC | dsldk/herman-bang | Data for the project Herman Bangs breve (Herman Bang's letters). | dan, fra | 7789 |
28 Dec 2022 10:45 UTC | dsldk/diplomatarium-danicum | Data sources for Diplomatarium Danicum | lat, gmh, xda, eng, deu, nld, xno, fra, gml, nil, gda, sme, dan, reg, swe, dum | 7474 |
12 Aug 2021 12:58 UTC | opentorah/alter-rebbe.org | Digital archive of the early history of Chabad; 19 Kislev Archive. | rus, heb, deu, pol, fra, lit, lat, yid | 7267 |