Discover TEI-encoded documents in public GitHub repositories.
| Last indexed | Repository | Description | Languages | Matching files |
|---|---|---|---|---|
| 27 Aug 2021 04:48 UTC | grace-reed/coralfloofs | - | eng | 965 |
| 29 Jun 2021 20:36 UTC | Kabongosalomon/task-dataset-metric-nli-extraction | This program produces the test data for classification over a set of predefined task#dataset#metrics#software labels. Given input a pdf file, it scrapes the text from the file using the Grobid parser, subsequently generating the test data file for input to the neural network classifier. | eng | 959 |
| 31 Aug 2021 12:58 UTC | grace-reed/smalldb | - | eng | 950 |
| 03 Apr 2023 19:47 UTC | wellcomecollection/wellcome-collection-tei | Manuscript Descriptions encoded according to the Text Encoding Initiative | ara, fas, eng, san, pra, hin, pka, guj, tam, spa, heb, msa, jpn, jav, grc, gez, bbc, btd, syr, zho, egy | 948 |
| 30 May 2020 05:30 UTC | JohannGillium/modern_english_ed | - | eng, fra, lat, ell, deu, grk, grc, heb, ita, rus, san | 948 |
| 24 Mar 2023 15:46 UTC | eeditiones/vangogh | Demo for TEI Publisher based on data of the edition "Vincent van Gogh - The Letters" | nld, fra, eng | 935 |
| 30 May 2020 05:30 UTC | EAGLE-BPN/ISicily | Inscriptions from Sicily | ara, eng, fra, deu, grc, ell, heb, ita, lat | 927 |
| 14 Dec 2022 21:45 UTC | dig-eg-gaz/content | TEI-encoded contents of the Egyptian Gazette | fra, deu, ita, ara | 926 |
| 15 Mar 2026 20:16 UTC | lb42/tei-fr | Automatically exported from code.google.com/p/tei-fr | frm, lat, ita, spa, deu, sco, mul, ell, oci, eus, nld, dan, heb, jpa, ara, oar, pcd, fra, por, bre, tup, eng, cat, grc, bul, lit, bel, pol, srp, sqi, zho, rus, und, isl, kat, all, cel, non, jpn, xml | 925 |
| 30 May 2020 05:30 UTC | scotartt/PerseusDL_canonical | - | eng, lat, grc, fra, deu, ita, tur | 923 |
| 05 Oct 2022 15:57 UTC | eeditiones/eltec | Demo app for data from European Literary Text Collection (ELTeC) | deu, eng, fra, hun, pol, por, ron, slv, spa, lat, gig, ita, ara, ell, glg, nld, cat, eus, grc, nor, jpn | 916 |
| 26 Mar 2026 12:37 UTC | srophe/manuscripts | Repository for Syriac Manuscript Projects | syr, ara, eng, fra, deu, lat, grc, cop, aii | 915 |
| 30 May 2020 05:30 UTC | dramacode/dramacode.github.io | Showcase of detachable formats, shared wiki | fra | 901 |
| 26 Jul 2022 20:47 UTC | bodleian/genizah-mss | Genizah TEI Catalogue | heb, kat, arc, jrb, jpr, ara, spa, lad, und, syc | 900 |
| 05 Apr 2023 10:47 UTC | arthur-schnitzler/arthur-schnitzler-arbeit | Working Environment for the edition of the correspondences of Arthur Schnitzler. No documentation, not intended for a wider audience | deu | 890 |
| 01 Apr 2022 06:50 UTC | PatrickHelling/transform-DHd-IoDHC | - | deu | 885 |
| 30 May 2020 05:30 UTC | Chartes-TNAH/theses | Thesis abstracts (Positions des thèses) of the École des chartes | fra | 871 |
| 09 Sep 2022 11:47 UTC | KONDE-AT/thun-data | XML/TEI encoded transcriptions of the correspondence of Leo Thun Hohenstein. | - | 871 |
| 30 May 2020 05:30 UTC | innocentbadshah/pos-tagging | - | - | 865 |
| 30 May 2020 05:30 UTC | git-kale/POStagging | - | - | 865 |
| 30 May 2020 05:30 UTC | knielbo/LINK | Literary Network sprint repository | eng, mis, unk, spa, lat | 859 |
| 30 May 2020 05:30 UTC | MedKhem/grobid-dictionaries_data | Training data for GROBID-Dictionaries | - | 853 |
| 21 May 2021 13:05 UTC | lascivaroma/latin-lemmatized-texts | Tagged corpora with metadata | - | 849 |
| 30 May 2020 05:30 UTC | cligs/theatreclassique | Texts in XML-TEI from the théâtre classique collection. | - | 848 |
| 14 Jun 2022 17:01 UTC | AaronFive/StageHyperpieces | Code written for my M2 Internship with Philippe Gambette and Céline Fournial | fra | 847 |
| 30 May 2020 05:30 UTC | demery/fornaldarsogur-map | Geolocating Icelandic manuscripts | isl, non, lat, spa, dan, fra, swe, eng, deu, nld | 843 |
| 14 Dec 2021 04:50 UTC | claraimc/poeMAS | - | - | 839 |
| 30 May 2020 05:30 UTC | djbpitt/raising | Flattening and unflattening XML markup | - | 828 |
| 17 Feb 2022 12:53 UTC | dQIOG/pez-letters | Data repository for the Pez correspondence edition | - | 826 |
| 30 May 2020 05:30 UTC | dramacode/tcp5 | Théâtre Classique in regular, corrected TEI P5 markup | fra | 815 |
| 30 May 2020 05:30 UTC | dig-eg-gaz/jat18b | - | fra, deu, ita, ara | 804 |
| 30 May 2020 05:30 UTC | lb42/difdepo | ANR project DIFDEPO: TEI materials and conversions | fra, eng | 800 |
| 27 Jul 2021 12:57 UTC | jdmartin/best-practices-in-coding-for-dh | - | - | 794 |
| 19 Oct 2022 23:58 UTC | alhuber1502/TGA | Archive of files relating to the Thomas Gray Archive, edited by Alexander Huber | eng, grc, lat, fra, ita | 794 |
| 09 Sep 2020 12:34 UTC | rettinghaus/Monatsberichte | Hofmeister Monatsberichte | deu, fra, ita | 793 |
| 04 Apr 2023 14:47 UTC | BetaMasaheft/Institutions | Institutions where Manuscripts are preserved | eng, gez, ita, fra, ara, deu, lat, amh, swe, rus | 791 |
| 06 Apr 2022 18:47 UTC | umd-mith/mishnah-data | TEI data for the Digital Mishnah Project | heb, eng | 788 |
| 31 May 2021 13:45 UTC | wujastyk/GRETIL-mirror | Snapshots of the GRETIL repository of South Asian (Sanskrit, Pali, etc.) etexts | eng, san, hin, tam, xct, deu | 784 |
| 19 Nov 2022 15:46 UTC | moravianlives/ML | Top level repository for Moravian Lives project | eng, deu | 780 |
| 30 May 2020 05:30 UTC | cdli-gh/Semantic-Role-Labeler | A semantic role labeling system for the Sumerian language. A Google Summer of Code '18 initiative. | sux, akk, eng, nld, deu, fra, rus, tur, ita, heb | 778 |
| 30 May 2020 05:30 UTC | srophe/bethqatraye-data | Data repository for bethqatraye | ara, cop, chu, deu, eng, spa, fra, gez, grc, hye, ita, kat, lat, nld, por, rus, sog, syr | 778 |
| 30 May 2020 05:30 UTC | BrotherGrilka/Tomcat7 | - | sux, akk, eng, nld, deu, fra, rus, tur, ita, heb | 778 |
| 30 May 2020 05:30 UTC | edwardclem/nidaba | Sumerian NLP Research | sux, akk, eng, nld, deu, fra, rus, tur, ita, heb | 777 |
| 30 May 2020 05:30 UTC | niekveldhuis/Digital-Assyriology | Tools and Examples for Computational Text Analysis for Assyriologists. | sux, akk, eng, nld, deu, fra, rus, tur, ita, heb | 777 |
| 30 May 2020 05:30 UTC | niekveldhuis/compass | Computational Assyriology | sux, akk, eng, nld, deu, fra, rus, tur, ita, heb | 777 |
| 30 May 2020 05:30 UTC | Mmyrmidons/ManfreddsAirport | - | - | 775 |
| 30 May 2020 05:30 UTC | NedinVa/king_carter_repo | also http://waynegraham.github.io/king_carter_repo/ | eng | 764 |
| 27 Jul 2021 17:01 UTC | npedrazzini/best-practices-for-coding-in-dh | Turing RSE-DH Summer School practical | - | 758 |
| 30 May 2020 05:30 UTC | cltk/grc_text_perseus | Collected Greek files from the Perseus Digital Library | eng, lat, deu, fra, ita, tur, pie | 753 |
| 30 May 2020 05:30 UTC | BrillForward/greek_text_perseus | - | eng, lat, deu, fra, ita, tur, pie | 753 |