Discover TEI-encoded documents in public GitHub repositories.
| Last indexed | Repository | Description | Languages | Matching files |
|---|---|---|---|---|
| 28 Jun 2020 08:33 UTC | djakacki/ML | Top level repository for Moravian Lives project | eng, deu | 751 |
| 30 Jun 2020 12:32 UTC | cpirmann/ML | Top level repository for Moravian Lives project | eng, deu | 751 |
| 09 Jun 2021 08:46 UTC | BuddhaNexus/segmented-sanskrit | - | eng, san | 749 |
| 18 Mar 2023 18:46 UTC | scta-texts/plaoulcommentary | - | eng, lat | 749 |
| 30 May 2020 05:30 UTC | eric-muller/tc | - | - | 737 |
| 30 May 2020 05:30 UTC | mvkushnir/TranscriptionRepair | Using NLP to fix transcription gaps in old books | eng, unk, lat | 737 |
| 30 May 2020 05:30 UTC | conmaol/Fieldwork-XML | Gold Standard | eng, pol | 725 |
| 30 Jun 2020 12:32 UTC | schaumju/ML | Top level repository for Moravian Lives project | eng, deu | 723 |
| 05 Feb 2025 20:55 UTC | aso2101/satavahana-inscriptions-data | Data for Sātavāhana Inscriptions project | ara, cop, chu, deu, eng, spa, fra, gez, grc, hye, ita, kat, lat, nld, por, rus, sog, syr, pra, san, mar, tel, kan, und | 714 |
| 30 May 2020 05:29 UTC | JULIELab/alz | This repository provides the compiled full-text corpus of the Allgemeine Literatur-Zeitung (General Literature Gazette) published from 1785 to 1849 | deu | 714 |
| 20 Dec 2021 20:40 UTC | anaistack/cefr-asag-corpus | A corpus of short answers written by learners of English and graded with CEFR levels | - | 708 |
| 30 May 2020 05:30 UTC | charlietaylor98/vangogh-gang | - | eng | 703 |
| 18 Nov 2020 16:49 UTC | azunig/kul-corpus-lin-course | - | eng, lat, cel, spa, deu, fra, ita, rus, pol, jpn, ell, swa, isl, nor, sco, san, grc | 700 |
| 14 Jul 2021 12:59 UTC | fholstege/NLP_project | Code for our NLP project | eng | 696 |
| 27 Mar 2023 14:51 UTC | arthur-schnitzler/pollaczek-data | Source files for Clara Katharina Pollaczek – Arthur Schnitzler und ich. In Development | deu | 688 |
| 30 May 2020 05:30 UTC | paregorios/srpdemo1 | - | eng, syr, ara, fra, deu, lat | 687 |
| 20 Jan 2023 11:46 UTC | PerseusDL/canonical-latinLit | XML Canonical resources for Latin Literature | eng, lat, deu, fra, ita, grc | 686 |
| 23 Apr 2022 05:41 UTC | cceh/capitularia | Digital Edition of the Frankish Capitularies | deu, eng | 686 |
| 30 May 2020 05:30 UTC | cligs/textbox | Text collections made available by the CLiGS group. | fra, lat, eng, deu, ita, glg, cat, nld, por | 685 |
| 16 Mar 2023 18:50 UTC | Karl-Barth/kbga-edition-data | TEI Data of the Karl Barth-Gesamtausgabe | deu, ell, eng, fra | 683 |
| 10 Apr 2023 18:48 UTC | TST-Project/mss | Working repository for the TST project | san, tam, fra, por, lat, eng, hin, bod, mal, mar | 678 |
| 30 May 2020 05:30 UTC | dlina/project | all data and scripts, see | - | 671 |
| 11 Apr 2023 10:46 UTC | arthur-schnitzler/schnitzler-briefe-data | Source files with correspondence pieces from and to Arthur Schnitzler, encoded in TEI-XML. Data for https://schnitzler-briefe.acdh.oeaw.ac.at/ | deu | 664 |
| 12 Sep 2022 14:09 UTC | kb-dk/SKS_tei | Søren Kierkegaard Skrifter (SKS) in Text Encoding Initiative XML. | - | 654 |
| 25 Mar 2023 11:45 UTC | romanticcircles/rc-tei | - | - | 651 |
| 23 Aug 2022 07:02 UTC | Beth-Mardutho/hugoye-data | Data repository for Hugoye TEI records. | ara, cop, chu, deu, eng, spa, fra, gez, grc, hye, ita, kat, lat, nld, por, rus, sog, syr | 649 |
| 15 Jan 2022 17:01 UTC | wlpotter/csv-to-srophe | A set of XQuery modules for converting CSV data to Srophe-compliant TEI XML records. Developed for Syriaca.org | eng | 641 |
| 27 Mar 2022 04:50 UTC | himmeproject/places | Place data for the Historical Index of the Medieval Middle East | - | 637 |
| 10 Apr 2023 17:47 UTC | cceh/pessoa | Digital Edition of Fernando Pessoa | eng, por, deu, fra | 636 |
| 09 Jun 2025 11:58 UTC | srophe/bibl | - | - | 636 |
| 30 Dec 2020 13:00 UTC | 7h3f0x/Part-of-Speech-Tagging | - | - | 635 |
| 19 Nov 2020 08:32 UTC | karmadolkar/AI-Project | This is the course project for AI | - | 635 |
| 10 Dec 2020 08:51 UTC | aseem09/POS-Tagger | - | - | 635 |
| 07 Dec 2020 12:55 UTC | rishichordia/POS-Tagging | - | - | 635 |
| 01 Dec 2020 08:38 UTC | Padmapriya-09/POS-tagger_BNC-Corpus | - | - | 635 |
| 19 Nov 2020 16:53 UTC | as1ngh/AI-assignment-V-sem | - | - | 635 |
| 24 Apr 2022 11:40 UTC | srophe/e-gedsh | This repository is for the electronic edition of the Gorgias Encyclopedia of the Syriac Heritage. Data posted here is copyright 2016 Beth Mardutho: The Syriac Institute and licensed under the Creative Commons Attribution-NonCommercial 4.0 International Public License (CC-BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/legalcode. | ara, cop, chu, deu, eng, spa, fra, gez, grc, hye, ita, kat, lat, nld, por, rus, sog, syr | 630 |
| 14 Sep 2022 14:53 UTC | IreneVagionakis/CretanInscriptions | An EFES EpiDoc collection of inscriptions pertaining to Cretan institutions (VII-I century BC) | eng, ita, grc, rus, chu, ara, fra, deu, ell, heb, lat | 627 |
| 08 Mar 2023 09:46 UTC | medieval-source-book/texts | TEI-XML files of texts in the GMS. | lat, deu | 625 |
| 24 Mar 2023 16:53 UTC | BucknellDSC/suzette | - | fra, eng | 625 |
| 10 Nov 2022 17:49 UTC | DesenrollandoElCordel/engravings-catalogue | Catalogue of engravings, extracted from Spanish chapbooks (19th c.) and encoded with the standard XML-TEI | - | 620 |
| 03 Sep 2021 12:56 UTC | radardenker/gretil-corpus-tei | - | eng, san | 620 |
| 12 Apr 2021 08:48 UTC | calzada/PARLAMINT-ES-MC | - | spa, eng | 619 |
| 30 May 2020 05:30 UTC | csae8092/dhd-boas-data | dhd-boas-data stands for DHd Book of Abstracts Data and is a first attempt to collectively collect the data of the book of abstracts of the past and current yearly DHd conferences. | eng, deu, fra, ita, spa, por, afr, pol, cat, ind, nld, slv, dan, nor, ron, swe, ces, lit, som, est, swa, vie | 615 |
| 30 May 2020 05:30 UTC | evelyne96/PresentationGen | - | eng, ita | 611 |
| 30 May 2020 05:30 UTC | grmek/oo-projekt-1 | - | slv, eng | 610 |
| 30 May 2020 05:30 UTC | Gucekpuhar/parlament | - | slv, eng | 610 |
| 30 May 2020 05:30 UTC | Data-Science-for-Linguists/Native_and_Non-native_English | Katherine Kairis LING 1340 Term Project | eng, ara, bul, zho, ces, dan, nld, est, fin, fra, deu, hun, ita, jpn, kor, lat, lav, lit, mlt, nor, pol, por, ron, rus, slk, slv, spa, swe, tur, und | 609 |
| 30 May 2020 05:30 UTC | sintakticniSladkorcek/slovenski_parlament | Interesting facts about the sessions of the Slovenian Parliament between the years 1990 and 1992 collected in one place. | slv | 609 |
| 07 Dec 2020 17:07 UTC | katabase/DTS | - | fra, eng | 606 |