#ML4AL starts tomorrow @aclmeeting!💫
We thank our sponsors @GoogleDeepMind @scrollprize @athenaRICinfo, all the authors presenting, and our Program Organising Committees for their outstanding work. We received the highest number of workshop submissions at ACL2024! 💎 @iassael & John
Working on a way to merge the best parts of @CLTKorg, @spacy_io, and GLiNER right now. I nearly have a working prototype for Latin. The benefit of this is flexibility. You can drop in your own spaCy or GLiNER models easily. For the spaCy pipeline, I'm using @diyclassics's LatinCy, but you could easily switch in your own spaCy pipeline as well!
You can disable and activate pipes when it makes sense for your project too!
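For anyone curious what toggling pipes looks like on the spaCy side, here is a minimal sketch. It only uses spaCy's own API and does not show the prototype itself or any GLiNER integration. As an assumption made so the example runs without downloads, a blank Latin pipeline with a rule-based `sentencizer` stands in for a trained pipeline such as LatinCy.

```python
import spacy

# Assumption: a blank Latin pipeline stands in for a trained model
# (e.g. a LatinCy pipeline), so nothing needs to be downloaded.
nlp = spacy.blank("la")
nlp.add_pipe("sentencizer")  # rule-based sentence splitter

all_pipes = list(nlp.pipe_names)  # active pipes right now

# Temporarily switch a pipe off for one block of work.
with nlp.select_pipes(disable=["sentencizer"]):
    pipes_inside_block = list(nlp.pipe_names)

# The pipe is restored automatically when the block ends.
pipes_after_block = list(nlp.pipe_names)

# Or switch it off and on explicitly.
nlp.disable_pipe("sentencizer")
pipes_when_disabled = list(nlp.pipe_names)
nlp.enable_pipe("sentencizer")
pipes_when_enabled = list(nlp.pipe_names)

print(all_pipes, pipes_inside_block, pipes_when_disabled, pipes_when_enabled)
```

With a trained pipeline you would swap `spacy.blank("la")` for `spacy.load(...)` and pass the names of whichever pipes you don't need to `disable`.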
the first LLM pretrained exclusively on Latin!
@WilliamGao1729 and i are merging modernity with antiquity in a series of experiments
we found a dataset of ~500 billion Latin tokens and couldn't resist
we pretrained using the GPT2 architecture and an H100 from @akashnet
I'm going to attend #LT4HALA2024 online on Saturday @LrecColing. If you want to talk about @CLTKorg or other tools usable for ancient languages, I'll be available!
A new alpha release is out
github.com/cltk/cltk/release…. CLTK now supports Python 3.9 through 3.12. Tests as well as contributions are welcome!
The maintenance of CLTK is quite slow because we don't have enough volunteers to code, to review code, to analyze issues, etc. @clemsciences made the new pre-release in his free time but it is not enough. Who wants to join?
#johdnews
The #CfP is now open for our new special collection "Representing the Ancient World through data" 📢
We invite submissions of #datapapers describing your work on Ancient World data📚📊
🗓️Deadline 1 September 2023
Full Call for Papers at 👉 shorturl.at/im157
1/2
The @CLTKorg package is used by digital philologists. I sketched a way to search for patterns inside texts. The idea is described here: github.com/cltk/cltk/discuss…