Joined November 2008
157 Photos and videos
Blaise Madeline retweeted
1 Mar 2025
THREAD - In the past 24 hours, nearly 35,000 French-speaking accounts have talked about the Trump/Zelensky meeting on X. Our analyses have revealed numerous coordinated behaviors aimed at manipulating public opinion in favor of Russia ⤵️
120
917
1,847
314,900
Blaise Madeline retweeted
[Press release] 2024 legislative elections: the Parti Pirate votes to support the Nouveau Front Populaire against the far right
283
863
5,118
476,173
Blaise Madeline retweeted
11 May 2024
Haha 😂🤔😭
Tumbleweeds.
41
179
2,309
293,531
Blaise Madeline retweeted
You know what the biggest problem with pushing all-things-AI is? Wrong direction. I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.
563
20,502
92,657
3,339,173
Blaise Madeline retweeted
10 Mar 2024
What a gem!!! They made my day 🤩🤣 When Les Goguettes take on Hanouna and Bolloré in song, it's truly a delight! Listen without moderation 😁
294
4,110
10,349
806,291
Blaise Madeline retweeted
The perfect quote to describe LLMs can be found in a 1946 Jean Cocteau movie -- "Réfléchissez pour moi, je réfléchirai pour vous" (think for me, I will reflect for you). What you get from the model is always a reflection of the training data you put in -- itself a by-product of human thinking.
22
88
521
79,643
Blaise Madeline retweeted
15 Feb 2024
OpenAI just dropped Sora, the most advanced AI video generator EVER. People are rubbing their eyes in disbelief. 15 unbelievable videos: 🧵
47
63
408
227,316
Blaise Madeline retweeted
Judith Godrèche says something terrible in her interview: that she feels guilty for having helped glamorize relationships between very young girls and older men. That is obviously a complete inversion of guilt, but she is saying something important.
44
560
3,029
471,563
Blaise Madeline retweeted
RAG is one of the best (and easiest) ways to specialize an LLM over your own data, but successfully applying RAG in practice involves more than just stitching together pretrained models…

What is RAG? At the highest level, RAG is a combination of a pretrained LLM with an external (searchable) knowledge base. At inference time, we can search for relevant textual context within this knowledge base and add it to the LLM’s prompt. Then, the LLM can use its in-context learning abilities to leverage this added context and produce a more factual/grounded output.

Simple implementation. We can create a minimal RAG pipeline using a pretrained embedding model and LLM by:
1. Separating the knowledge base into fixed-size chunks.
2. Vectorizing each chunk with an embedding model.
3. Vectorizing the input/query at inference time and using vector search to find relevant chunks.
4. Adding relevant chunks into the LLM’s prompt.

This simple approach works, but building a high-performing RAG application requires much more. Here are five avenues we can follow to refine our RAG pipeline.

(1) Hybrid Search: At the end of the day, the retrieval component of RAG is just a search engine. So, we can drastically improve retrieval by using ideas from search. For example, we can perform both lexical and vector retrieval (i.e., hybrid retrieval), as well as re-ranking via a cross-encoder to retrieve the most relevant data.

(2) Cleaning the data: The data used for RAG may come from several sources with different formats (e.g., pdf, markdown and more), which could lead to artifacts (e.g., logos, icons, special symbols, and code blocks) that could confuse the LLM. We can solve this by creating a data preprocessing or cleaning pipeline (either manually or by using LLM-as-a-judge) that properly standardizes, filters, and extracts data for RAG.

(3) Prompt engineering: Successfully applying RAG is not just a matter of retrieving the correct context: prompt engineering plays a massive role. Once we have the relevant data, we must craft a prompt that i) includes this context and ii) formats it in a way that elicits a grounded output from the LLM. First, we need an LLM with a sufficiently large context window. Then, we can adopt strategies like diversity and lost-in-the-middle selection to ensure the context is properly incorporated into the prompt.

(4) Evaluation: We must also implement repeatable and accurate evaluation pipelines for RAG that capture the performance of the whole system, as well as its individual components. We can evaluate the retrieval pipeline using typical search metrics (DCG and nDCG), while the generation component of RAG can be evaluated with an LLM-as-a-judge approach. To evaluate the full RAG pipeline, we can also leverage systems like RAGAS.

(5) Data collection: As soon as we deploy our RAG application, we should begin collecting data that can be used to improve the application. For example, we can finetune retrieval models over pairs of input queries with relevant textual chunks, finetune the LLM over high-quality outputs, or even run AB tests to quantitatively measure if changes to our RAG pipeline benefit performance.

What’s next? Beyond the ideas explored above, there are a variety of avenues that exist for improving RAG. Once we have implemented a robust evaluation suite, we can test a variety of improvements using both offline metrics and online AB tests. Our approach to RAG should mature (and improve!) over time as we test new ideas.
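The four steps of the "simple implementation" in the tweet above can be sketched in a few lines of Python. This is an illustration only, not the tweet author's code: the function names (chunk_text, embed, retrieve, build_prompt) are hypothetical, and a bag-of-words word count stands in for a real pretrained embedding model so the sketch runs with the standard library alone. The final LLM call is left out; the sketch stops at the prompt.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 8) -> list[str]:
    """Step 1: split the knowledge base into fixed-size chunks (counted in words)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Step 2: toy stand-in for an embedding model (ASSUMPTION: word counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Step 3: vectorize the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Step 4: add the retrieved chunks to the prompt sent to the LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    chunks = [
        "Paris is the capital of France",
        "The Nile is a river in Africa",
        "Python is a programming language",
    ]
    question = "What is the capital of France?"
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

In a real pipeline, embed would call an actual embedding model and the sorted scan in retrieve would be replaced by a vector index. The refinements listed in the tweet (hybrid search, re-ranking, data cleaning) would then slot in around retrieve and build_prompt.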
16
260
1,349
184,371
Oh damn, it's going fast
1
39
Blaise Madeline retweeted
I tried 14 of the multimodal reasoning examples from the @GoogleDeepMind Gemini paper on @OpenAI's chatGPT-4 (with vision). didn't even transcribe the prompts, I just pasted the images of prompts. GPT-4 gets ~12/14 right. 14 part boring thread.
32
291
2,591
1,382,483
Blaise Madeline retweeted
24 Nov 2023
Please ignore the deluge of complete nonsense about Q*. One of the main challenges to improve LLM reliability is to replace Auto-Regressive token prediction with planning. Pretty much every top lab (FAIR, DeepMind, OpenAI etc) is working on that and some have already published ideas and results. It is likely that Q* is OpenAI attempts at planning. They pretty much hired Noam Brown (of Libratus/poker and Cicero/Diplomacy fame) to work on that. [Note: I've been advocating for deep learning architecture capable of planning since 2016].
227
707
5,673
1,436,864
Blaise Madeline retweeted
23 Nov 2023
Current LLMs are trained on text data that would take 20,000 years for a human to read. And still, they haven't learned that if A is the same as B, then B is the same as A. Humans get a lot smarter than that with comparatively little training data. Even corvids, parrots, dogs, and octopuses get smarter than that very, very quickly, with only 2 billion neurons and a few trillion "parameters."
23 Nov 2023
Animals and humans get very smart very quickly with vastly smaller amounts of training data. My money is on new architectures that would learn as efficiently as animals and humans. Using more data (synthetic or not) is a temporary stopgap made necessary by the limitations of our current approaches.
482
932
7,398
2,524,662
Hello @DeezerFR, I've received 5 password reset request emails within a few hours, from all over the world. Is there an account theft attempt under way on your end?
45
Blaise Madeline retweeted
14 Nov 2023
Want to join the Flip side of the Force? Try to win the brand-new Samsung Galaxy Z Flip5, the most stylish foldable smartphone of its generation 🤙 To enter, it's right here ⬇️ RT this tweet, follow the accounts: @free and @SamsungFR. Prize draw on 17/11. Good luck, everyone! 🍀
1,165
3,892
1,668
156,182
My 4-year-old nephew: - We agree that nothing goes faster than light, right? - Yes. - Then why is it that sometimes, when you press the switch, the light doesn't come on right away?
65
Blaise Madeline retweeted
24 Jun 2023
zeroscope_v2 XL, A watermark-free Modelscope-based video model capable of generating high quality video at 1024 x 576 Model on @huggingface : huggingface.co/cerspense/zer… This model was trained with offset noise using 9,923 clips and 29,769 tagged frames at 24 frames, 1024x576 resolution.
zeroscope_v2_XL is specifically designed for upscaling content made with zeroscope_v2_576w using vid2vid in the 1111 text2video extension by kabachuha. Leveraging this model as an upscaler allows for superior overall compositions at higher resolutions, permitting faster exploration in 576x320 (or 448x256) before transitioning to a high-resolution render. zeroscope_v2_XL uses 15.3gb of vram when rendering 30 frames at 1024x576
43
314
1,492
714,275
Blaise Madeline retweeted
Introducing – Paragraphica! 📡📷 A camera that takes photos using location data. It describes the place you are at and then converts it into an AI-generated "photo". See more here: bjoernkarmann.dk/project/par… or try to take your own photo here: paragraphica.bjoernkarmann.d…
930
4,584
21,923
10,130,096
Blaise Madeline retweeted
Does politics really change our lives? Shouldn't we teach the history of science rather than that of wars and revolutions; take down the statues, for less François 1er and more Steve Jobs or Marie Curie? A #thread 🧵⬇️
70
221
583
153,416
Blaise Madeline retweeted
Reminder for our researcher and student friends: it is FORBIDDEN to use this site, which gives free access to all research articles! This site is often blocked by internet providers, and it is also forbidden to bypass the block with this method. 🧵
473
10,632
29,860
3,327,888