NLP for Historical Languages

Joined December 2018
Pinned Tweet
The CLTK has released a new major version (1.0). For a quick introduction to our API: github.com/cltk/cltk/blob/ma… Docs for the new code: docs.cltk.org/ Old docs: legacy.cltk.org/

Classical Language Toolkit retweeted
#ML4AL starts tomorrow @aclmeeting!💫 We thank our sponsors @GoogleDeepMind @scrollprize @athenaRICinfo, all the authors presenting, our Program Organising Committees for their outstanding work. We received the highest number of workshop submissions at ACL2024! 💎 @iassael & John
Classical Language Toolkit retweeted
Working on a way to merge the best parts of @CLTKorg, @spacy_io, and GLiNER right now. I nearly have a working prototype for Latin. The benefit of this is flexibility. You can drop in your own spaCy or GLiNER models easily. For the spaCy pipeline, I'm using @diyclassics's LatinCy, but you could easily switch in your own spaCy pipeline as well! You can disable and activate pipes when it makes sense for your project too!
Classical Language Toolkit retweeted
Each day there is a new development for Latin machine learning, it seems. Check out this LLM pretrained exclusively on Latin. @CLTKorg
the first LLM pretrained exclusively on Latin! @WilliamGao1729 and i are merging modernity with antiquity in a series of experiments. we found a dataset of ~500 billion Latin tokens and couldn't resist. we pretrained using the GPT2 architecture and an H100 from @akashnet
One of our maintainers will attend #LT4HALA2024 @LrecColing 🖐️
I'm going to attend #LT4HALA2024 online on Saturday @LrecColing. If you want to talk about @CLTKorg or other tools usable for ancient languages, I'll be available!
The maintenance of CLTK is quite slow because we don't have enough volunteers to code, to review code, to analyze issues, etc. @clemsciences made the new pre-release in his free time but it is not enough. Who wants to join?
Classical Language Toolkit retweeted
#johdnews The #CfP is now open for our new special collection "Representing the Ancient World through data" 📢 We invite submissions of #datapapers describing your work on Ancient World data📚📊 🗓️Deadline 1 September 2023 Full Call for Papers at 👉 shorturl.at/im157 1/2
Classical Language Toolkit retweeted
LatinCy is an amazing new @spacy_io pipeline for parsing Latin texts natively with the spaCy Python Library. It was created by Patrick J. Burns (@diyclassics). LatinCy: spacy.io/universe/project/la… Here is a quick video on it: youtube.com/watch?v=4vOlGZGZ…
If you want to participate in improving LatinCy, it's here 👇
One of the CLTK maintainers has a small role in the #ALP2023 conference: ancientnlp.com/alp2023/
