TEIhub
Discover TEI-encoded documents from GitHub public repositories.

| Last indexed | Repository | Description | Languages | Matching files |
| --- | --- | --- | --- | --- |
| 30 May 2020 05:30 UTC | Tech-Leaderboard/nips_scraper | Scrape from https://papers.nips.cc/ | eng, som, spa, sqi, dan, por | 7234 |
| 30 May 2020 05:30 UTC | MorielV/Digital-Humanities---Ass2 | parsing song lyrics in python. | heb | 7202 |
| 30 May 2020 05:30 UTC | ananana/scientific_authorship_data | - | eng | 7195 |
| 30 May 2020 05:30 UTC | providedh/ACDH_Salzburg_recipes | Parser for the XML Recipes | deu | 7038 |
| 30 May 2020 05:30 UTC | giladax/digi-proj-GUI | - | heb | 6854 |
| 30 May 2020 05:30 UTC | srophe/persons | Public Repository for Syriaca persons projects, including authorities, hagiography, and prosopography | syr, eng, ara, fra, deu, lat, rus, ita, ell, grc | 6711 |
| 30 May 2020 05:30 UTC | uvalib/ead-utils | Tools used to process and ingest EAD xml finding aids into the repository and solr. | eng, spa, fra | 6687 |
| 25 Feb 2023 09:45 UTC | pruizf/disco | Diachronic Spanish Sonnet Corpus. Canonical and minor authors in Spanish (Europe, America and Asia): 15th to 19th century | spa | 6616 |
| 30 May 2020 05:30 UTC | centre-for-humanities-computing/grundtvig-data | Data repository for all data related to the grundtvig center | dan | 6551 |
| 09 Feb 2023 08:49 UTC | telota/jean_paul_briefe | Data of the digital edition "Jean Paul – Sämtliche Briefe digital" | - | 6437 |
| 11 Mar 2023 13:02 UTC | christopheparisse/evalang | Shared data for the Evalang project | fra, eng | 6403 |
| 11 Feb 2023 04:49 UTC | BetaMasaheft/Places | Any place mentioned in the catalogue | eng, orm, ara, gez, amh, lat, oro, fra, som, grc, deu, heb, ita, swa, tir, rus, spa, swe, roh, nor | 6309 |
| 30 May 2020 05:30 UTC | OpenGreekAndLatin/Teubner1-grc-dev | Raw OCR of out-of-copyright Teubner editions | - | 6089 |
| 30 May 2020 05:30 UTC | EAGLE-BPN/Inscriptions-from-Dacia | draft website for inscriptions contributed to EAGLE from UBB Cluj Napoca | ara, eng, fra, deu, grc, ell, heb, ita, lat | 5904 |
| 30 May 2020 05:30 UTC | MariaBarrett/Datamining-project | - | eng | 5524 |
| 30 May 2020 05:30 UTC | utda/text | - | - | 5513 |
| 21 Mar 2023 20:45 UTC | ISicily/ISicily | EpiDoc files for the I.Sicily project | eng, ita, grc, lat, heb, phn, xpu, osc, xly, scx, sxc | 5431 |
| 03 Apr 2023 02:51 UTC | peterwebster/henson | Master data store for the Hensley Henson Journals project, and issue tracker. The application code is kept elsewhere. | - | 5349 |
| 30 May 2020 05:30 UTC | blumenbach/blumenbach-tei | Blumenbach TEI database | deu, eng, fra, nld, dan, ita, rus | 5318 |
| 30 May 2020 05:30 UTC | sros-UNED/disco | Diachronic Spanish Sonnet Corpus. Canonical and minor authors in Spanish (Europe and America): 15th to 19th century | - | 5289 |
| 30 May 2020 05:30 UTC | OpenGreekAndLatin/Teubner2-grc-dev | - | - | 5236 |
| 14 Mar 2021 20:38 UTC | svakulenk0/uva-lsdp-course | Language, Speech and Dialogue Processing course @ University of Amsterdam 2021 | nld, eng | 5198 |
| 30 May 2020 05:30 UTC | sims-mss/openn-xml | TEI-XML files from OPenn | - | 5174 |
| 27 Jan 2023 11:45 UTC | bncolorado/CorpusSonetosSigloDeOro | Corpus of Spanish Golden-Age Sonnets (with metrical annotation) / Corpus de Sonetos del Siglo de Oro (con anotación métrica) | - | 5078 |
| 30 May 2020 05:30 UTC | ldkhanh/CSCI-544-ClassProject | Spanish Poetry Generation using Recurrent Neural Networks | - | 5077 |
| 30 May 2020 05:30 UTC | stevenly/pix2poem | Spanish Poetry Generation using Recurrent Neural Networks | - | 5077 |
| 30 Mar 2023 09:47 UTC | BetaMasaheft/Works | Ethiopian Literature edited in TEI | eng, gez, ara, lat, grc, amh, syr, kat, pal, ita, heb, cop, tir, deu, fra | 5022 |
| 05 Apr 2023 15:48 UTC | cbeta-git/xml-p5a | CBETA XML P5a version | eng, zho, pli, san, x-unknown | 4869 |
| 20 Feb 2023 20:44 UTC | cbeta-org/xml-p5 | CBETA XML P5 version | eng, zho, pli, san, x-unknown | 4866 |
| 15 Jun 2022 11:42 UTC | DILA-edu/word-segment | - | - | 4830 |
| 30 May 2020 05:30 UTC | thsh77/textbase | A collection of markdown texts | - | 4801 |
| 30 May 2020 05:30 UTC | OpenGreekAndLatin/septuagint-dev | Machine-corrected version of Henry Barclay Swete's Septuagint. | ell, lat | 4736 |
| 11 Apr 2023 10:46 UTC | 84000/data-tei | TEI files of the translations | san, bod, zho, eng, pli, lat, jpn | 4719 |
| 30 May 2020 05:30 UTC | Chenlisk/xml-p5a-new | New version of xml-p5a, converted to accommodate "CB校註" | eng, zho, san, pli, x-unknown | 4718 |
| 30 May 2020 05:30 UTC | cbeta-org/xml-p5-2018 | CBETA XML P5 version (2013 - 2018) | eng, zho, san, pli, x-unknown | 4717 |
| 30 May 2020 05:30 UTC | cbeta-git/xml-p5a-2018 | CBETA XML P5a version (2013 - 2018) | eng, zho, san, pli, x-unknown | 4717 |
| 29 Mar 2023 11:45 UTC | DARIAH-ERIC/lexicalresources | Data space of the DARIAH Lexical Resources Working Group | por, lat, fra, spa, eng, deu, bar, mix, gsw, swe, kat, nds, nor, slv, gmh, mig, miy, miz, smd, pit, pie, ine, ell, pol, bul, ibe, rus, ara, und, dan, chu, grc | 4706 |
| 09 Nov 2022 17:49 UTC | whitmanarchive/whitman-correspondence | Data Repo \| Whitman Correspondence TEI | - | 4604 |
| 27 Dec 2021 20:40 UTC | newtfire/newtfire-site | - | eng, ita, fra, lat | 4507 |
| 30 May 2020 05:30 UTC | 84000/data | 84000 XML data files | bod, san, zho, pli, eng, lat, jpn | 4442 |
| 30 May 2020 05:30 UTC | heavenchou/xml_p4 | CBETA XML P4 | pli, san, eng, zho | 4429 |
| 15 Dec 2022 19:44 UTC | Antonomaz/Corpus | Collection of mazarinades encoded in XML-TEI. | fra | 4422 |
| 30 May 2020 05:30 UTC | utkdigitalinitiatives/tdh-migration | TEI migration from P2 SGML/XML to P5. | - | 4403 |
| 30 May 2020 05:30 UTC | cbeta-git/xml_p4 | CBETA XML P4 | pli, san, eng, zho | 4399 |
| 30 May 2020 05:30 UTC | JamesWolfe753/Patrologia-Latina-Corrected | - | grc, lat, heb | 4241 |
| 20 Sep 2020 16:32 UTC | pminhtam/entity-fishing-custom | - | eng, fra, deu, spa, ita, pol | 4239 |
| 30 May 2020 05:30 UTC | grasshoff/vorlesung2019 | - | eng, fra, dan, ita, spa, ces | 4202 |
| 30 May 2020 05:30 UTC | martinmueller39/TCP2ESTC | Experimental relabeling of TCP texts by decade with alignment to ESTC, including four decades at 40-year intervals | eng, unk | 4139 |
| 30 May 2020 05:30 UTC | WesScivetti/Phonesthemes-Project | - | eng | 4052 |
| 24 Nov 2021 01:39 UTC | antonkarl/iceErrorCorpus | An Icelandic Error corpus, annotated for mistakes related to spelling, grammar, and other issues. | - | 4046 |