We solved *Semantic Search-as-you-Type* ⌨️
It is faster than embedding inference ⚡
It is Open-Source 👐
It is in Rust 🦀
We use it for our docs 📜
Read more on how 👇
Tell me if you've already read them 😎
Same if you'd recommend others or if you want more!
ALT - The Coaching Habit, Michael Bungay Stanier
- Architecture Patterns with Python, Harry J.W. Percival & Bob Gregory
- The Lean Startup, Eric Ries
- The Man who Solved the Market, Gregory Zuckerman
- Hacker’s Delight, Henry S. Warren Jr.
New setting on this platform: by default it has switched to only allowing messages from verified users. This sucks if you want people to reach out to you. It also sucks if you want to reach out to people who haven't noticed this change.
Ah yes, the well known "except you, FAANG" clause that's so common in *open source* licenses like GPL, MIT, BSD, Apache2, ...
Here I go again, this can't be for real lol
This is huge: Llama-v2 is open source, with a license that authorizes commercial use!
This is going to change the landscape of the LLM market.
Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face and other providers
Pretrained and fine-tuned models are available with 7B, 13B and 70B parameters.
Llama-2 website: ai.meta.com/llama/
Llama-2 paper: ai.meta.com/research/publica…
A number of personalities from industry and academia have endorsed our open source approach: about.fb.com/news/2023/07/ll…
It's so stupid to try to teach teenagers "entrepreneurship" by having them pitch made-up startup ideas. You make them focus on the one thing they shouldn't be focusing on, and then you rate them by their appeal to investors, instead of users.
this is wild — kNN using a gzip-based distance metric outperforms BERT and other neural methods for OOD sentence classification
intuition: 2 texts similar if cat-ing one to the other barely increases gzip size
no training, no tuning, no params — this is the entire algorithm:
ALT for (x1, _) in test_set:
    Cx1 = len(gzip.compress(x1.encode()))
    distance_from_x1 = []
    for (x2, _) in training_set:
        Cx2 = len(gzip.compress(x2.encode()))
        x1x2 = " ".join([x1, x2])
        Cx1x2 = len(gzip.compress(x1x2.encode()))
        ncd = (Cx1x2 - min(Cx1, Cx2)) / max(Cx1, Cx2)
        distance_from_x1.append(ncd)
    sorted_idx = np.argsort(np.array(distance_from_x1))
    top_k_class = training_set[sorted_idx[:k], 1]
    predict_class = max(set(top_k_class), key=top_k_class.count)
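The snippet above assumes `gzip`, `np`, `k`, `test_set` and `training_set` already exist. Here is a minimal self-contained sketch of the same idea, using only the standard library. The helper names (`compressed_len`, `ncd`, `knn_predict`) and the toy data are my own assumptions for illustration, not the paper's code.

```python
import gzip
from collections import Counter


def compressed_len(text: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode()))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance, same formula as in the snippet."""
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(" ".join([a, b]))
    return (cab - min(ca, cb)) / max(ca, cb)


def knn_predict(text: str, training_set: list[tuple[str, str]], k: int = 3) -> str:
    """Majority label among the k training texts closest to `text`."""
    ranked = sorted(training_set, key=lambda pair: ncd(text, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, only to show the call shape.
training = [
    ("the quick brown fox jumps over the lazy dog near the river bank", "animals"),
    ("stock markets fell sharply as investors weighed inflation data", "finance"),
]
print(knn_predict(training[0][0], training, k=1))
```

Using `Counter` for the vote avoids calling `.count` on a NumPy array, which is what the last line of the snippet would do if `training_set` were an array.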
this paper's nuts. for sentence classification on out-of-domain datasets, all neural (Transformer or not) approaches lose to good old kNN on representations generated by.... gzip aclanthology.org/2023.findin…
A 14-line Python script using gzip outperforming a 345m parameter transformer model is probably the most hilarious result I've seen all year.
aclanthology.org/2023.findin…
RIP create-react-app. The current meta in web apps changes so quickly.
When I need to throw together a simple app I always need to figure it out again.
A good combo for me right now seems to be vitejs.dev/ with daisyui.com/.
"Hey {name}".format(name="twitter")
Templates written by users are not safe to render with something like Jinja or the standard `format`.
If you know of a #python library that can help with this, please share!
With AI prompts everywhere, I'm sure this is an issue.
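A small sketch of why `format` is risky here, and one standard-library option that may be enough for simple cases. The `User` class, its `api_token` field and both `render_*` helpers are hypothetical names made up for this example. Note that `string.Template` uses `$name` placeholders instead of `{name}`, so it is a different template syntax, not a drop-in fix.

```python
from string import Template


class User:
    """Hypothetical object handed to the template renderer."""

    def __init__(self, name: str, api_token: str):
        self.name = name
        self.api_token = api_token


def render_unsafe(template: str, user: User) -> str:
    # str.format lets the template author follow attributes and indexes,
    # so "{user.api_token}" reads data the author was never meant to see.
    return template.format(user=user)


def render_safe(template: str, user: User) -> str:
    # string.Template only substitutes plain $identifiers from the mapping
    # we pass in; safe_substitute leaves anything else in the text untouched.
    return Template(template).safe_substitute(name=user.name)


user = User("twitter", "tok-123")
print(render_unsafe("Hey {user.name}, token: {user.api_token}", user))
print(render_safe("Hey $name, token: ${user.api_token}", user))
```

The second call never touches `api_token`, because the template can only name keys that the code explicitly passes in.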
I had some problems deploying monorepos to something that "just runs it".
Turns out railway.app/ has been really great at doing exactly that. You don't even need to create an account to try it out. Just spin up a database and let it run some code it gets from @github
Python has a different meaning for "precision" than me. A more accurate one actually :-)
It's interesting how I completely misunderstand documentation sometimes...
ALT Screenshot of a python interpreter showing:
>>> import decimal
>>> context = decimal.Context(prec=2)
>>> context.create_decimal_from_float(123.456)
Decimal('1.2E+2')
>>> # I expected 123.45, what about you?
ALT Screenshot of a python interpreter showing:
>>> context.create_decimal_from_float(
... 123.456
... ).quantize(decimal.Decimal(10) ** -2)
Decimal('123.46')
>>> # It's not 123.45 but I'll take it.
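To spell out the difference from the two screenshots: `prec` counts significant digits, while `quantize` fixes the number of decimal places. Here is a minimal sketch, assuming the goal is two decimal places. Building the `Decimal` from a string and passing `ROUND_DOWN` are my additions, shown as one way to get the 123.45 that was expected.

```python
import decimal

value = 123.456

# prec=2 keeps two significant digits in total, not two decimal places.
two_sig = decimal.Context(prec=2).create_decimal_from_float(value)

# quantize to a 0.01 exponent keeps two decimal places; the default
# rounding mode (ROUND_HALF_EVEN) rounds the trailing 6 up.
two_places = decimal.Decimal(str(value)).quantize(decimal.Decimal("0.01"))

# ROUND_DOWN truncates instead of rounding.
truncated = decimal.Decimal(str(value)).quantize(
    decimal.Decimal("0.01"), rounding=decimal.ROUND_DOWN
)

print(two_sig, two_places, truncated)  # 1.2E+2 123.46 123.45
```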
My favorite HTTP client for python is github.com/encode/httpx
A trick that did wonders for me is combining it with pydantic to load HTTPX PROXY settings from the environment through JSON. It's extremely flexible.
ALT A python dictionary containing configuration for complex routing to different HTTP proxies. There are examples with different protocols, domains, ports and combination of those.