LSE Impact Blog

LSE Impact Blog

44 Photos and videos

Tweets

Marco Fahmi retweeted

LSE Impact Blog @LSEImpactBlog

26 May 2024

✍️Dorothea Strecker, Heinz Pampel, Rouven Schabinger and Nina Leonie Weisweiler, explore how common data repository shutdowns are and suggest what can be done to ensure data preservation in the long-term. #OpenData wp.me/p4m9em-cPR

2,898

Jeni Tennison

Marco Fahmi retweeted

Jeni Tennison @JeniT

26 May 2024

I am really troubled by the proposal from the Tony Blair Institute for a "National Data Trust" (NDT) for health data, particularly by the idea that the next government might actually go for it. institute.global/insights/po…

A New National Purpose: Harnessing Data for Health

institute.global

5,242

giulio quaggiotto

Marco Fahmi retweeted

giulio quaggiotto @gquaggiotto

16 Apr 2024

“When we have a technology that treats something as simple and fundamental as our name as an error, it robs us of our personhood" theatlantic.com/technology/a…

The AI Revolution Is Crushing Thousands of Languages

English is the internet’s primary tongue—a fact that may have unexpected consequences as generative AI becomes central to daily life.

theatlantic.com

333

giulio quaggiotto

Marco Fahmi retweeted

giulio quaggiotto @gquaggiotto

26 Dec 2023

Transformative governance of innovation ecosystems sciencedirect.com/science/ar…

772

National Centre for Youth Substance Use Research

Marco Fahmi retweeted

National Centre for Youth Substance Use Research @NCYSUR

15 Dec 2023

Congratulations team NCYSUR who is being awarded a 1.5 million dollar @nhmrc ideas grant, led by Dr Daniel Stjepanovic, @tianzesun , A/Prof Phong Thai, and Prof. @DavidHammondPhD.

838

Marco Fahmi

Marco Fahmi @dataronin

15 Dec 2023

RT @CREtobacco: Congratulations 🎉 to CRE researcher Dr Carmen Lim @Cwernlim who has been awarded $660,000 by the @nhmrc to develop a progra…

Kevin Gee

Marco Fahmi retweeted

Kevin Gee

@kevg1412

17 Apr 2023

"He's not even a 10x engineer. He's like, 100x, or 1,000x engineer" Gmail creator Paul Buchheit on how Bret Taylor once rewrote Google Maps in a single weekend:

689

9,065

2,366,176

Axios

Marco Fahmi retweeted

Axios

@axios

24 Aug 2023

Behind the hype of generative AI, large companies are struggling to deploy the new technology — hitting cost and data management hurdles that are leaving many of their generative AI projects stuck in pilot phase. trib.al/zwRlIVl

Companies struggle to deploy AI due to high costs and confusion

Behind the hype lie major early hurdles.

axios.com

12,077

Axios

Marco Fahmi retweeted

Axios

@axios

24 Aug 2023

As adoption of generative AI grows, providers are hoping that greater transparency about how they do and don't use customers' data will increase those clients' trust in the technology. trib.al/tCr3DVd

12,064

Rachel Woods

Marco Fahmi retweeted

Rachel Woods

@rachel_l_woods

18 Aug 2023

There's a resurgence of interest in fine tuning LLMs I've yet to see a successful public use case where fine tuning > prompting. But here's where I see fine tuning *mattering*: First, fine tuning is for teaching an LLM specific tasks or behaviors Not teaching an LLM new knowledge. For new knowledge, use Retrieval (store your data in an outside database and strategically pull the right chunks in to give the LLM context to your question) But even in teaching LLMs specific tasks or behaviors - here's the catch... LLMs are remarkably good at picking up tasks and behaviors from just a good prompt THIS is what makes LLMs mind blowing after all So that begs the question. Where is fine tuning actually helpful? Some use cases I could see developing are teaching LLMs tasks that are exceptionally difficult to describe, or fit into ~10 examples you can add to a prompt. One way to think about this: if it would take someone a few weeks doing a task to 'master it' instead of being able to read training materials and get the picture... That *may* be a use case for fine tuning But proceed with caution To truly teach an LLM a new behavior or task, you'll need to treat this like a machine learning project, not just throwing examples in and getting magic in return (which it still blows my mind that ChatGPT does this so well for us). Things like: - Dataset design - Training and test data - Overfitting more as the tooling around fine tuning gets more sophisticated The other obvious use case is cost. If you can get a super small language model to do a task instead of GPT-4, there's meaningful cost savings there. And if you're using a language model to do large scale tasks like triaging your customer support inbox, or analyzing public data for insights The costs can add up. But if you're wondering where the heck to invest in fine tuning... My answer at the moment for most businesses is still: Make sure you can't do it with prompts.

451

164,113

Meredith Whittaker

Marco Fahmi retweeted

Meredith Whittaker

@mer__edith

17 Aug 2023

📢NEW PAPER! Where @davidthewid, @sarahbmyers & I unpack what Open Source AI even is. We find that the terms ‘open’ & ‘open source’ are often more marketing than technical descriptor, and that even the most 'open' systems don't alone democratize AI 1/ papers.ssrn.com/sol3/papers.…

614

1,791

634,910

Bellingcat

Marco Fahmi retweeted

Bellingcat

@bellingcat

7 Aug 2023

New updates on Bellingcat's #github this week. The 'whisperbox' API receives audio or video URLs and returns the video transcripts using OpenAI's Whisper model. Designed by Bellingcat discord member github.com/fspoettel Find the tool at: github.com/bellingcat/whispe…

fspoettel - Overview

fspoettel has 30 repositories available. Follow their code on GitHub.

github.com

67,628

Marco Fahmi

Marco Fahmi @dataronin

5 Aug 2023

Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy sciencedirect.com/science/ar…

Marco Fahmi

Marco Fahmi @dataronin

29 Jul 2023

Tackling generative AI’s sustainability problem diginomica.com/tackling-gene…

Tackling generative AI’s sustainability problem

Although the adoption of generative AI is only in its early stages, reports are starting to emerge about potential sustainability problems if it goes mainstream. As a result, the time is now for...

diginomica.com

Daniel Severo

Marco Fahmi retweeted

Daniel Severo @_dsevero

27 Jul 2023

Best poster award

120

3,288

27,965

2,621,491

Global Investigative Journalism Network

Marco Fahmi retweeted

Global Investigative Journalism Network

@gijn

26 Jul 2023

🗺️ NEW! Can journalists rely on #AI chatbots to successfully geolocate an image? Investigative journalism group @bellingcat conducted 3️⃣ tests to examine the geolocation capabilities of #Bing and #Bard. Reporter @DennisKovtun broke down the findings: ⬇️ buff.ly/3O7PmFu

2,583

Peter Griffin

Marco Fahmi retweeted

Peter Griffin @petergnz

26 Jul 2023

Department of Internal Affairs has put out some guidance today on use of generative AI in the public sector...

8,922

Marco Fahmi

Marco Fahmi @dataronin

23 Jul 2023

What will the federal government do with generative AI? nextgov.com/artificial-intel… via @Nextgov

What will the federal government do with generative AI?

Federal employees are going to see AI tools show up in cloud-based productivity suites sooner rather than later, but it's not clear yet how the trending tech will impact public-facing digital...

nextgov.com

Owen Boswarva

Marco Fahmi retweeted

Owen Boswarva @owenboswarva

21 Jul 2023

US judge finds flaws in artists' lawsuit against #AI companies reuters.com/legal/litigation… #StableDiffusion #genAI #AIlaw #IPlaw #webscraping #openweb Andersen et al v. Stability AI Ltd. et al dockets.justia.com/docket/ca… courtlistener.com/docket/667…

Andersen, Kelly McKernan and Karla Ortiz said in their January complaint that Stability "scraped" billions of images from the internet to teach its Stable Diffusion text-to-image system to create its own images, including some in their styles. They accused the company of infringing their copyrights by using their work without permission.

…

The judge also said the artists were unlikely to succeed on their claim that images generated by the systems based on text prompts using their names violated their copyrights.

"I don't think the claim regarding output images is plausible at the moment, because there's no substantial similarity" between images created by the artists and the AI systems, Orrick said.

ALT Andersen, Kelly McKernan and Karla Ortiz said in their January complaint that Stability "scraped" billions of images from the internet to teach its Stable Diffusion text-to-image system to create its own images, including some in their styles. They accused the company of infringing their copyrights by using their work without permission. … The judge also said the artists were unlikely to succeed on their claim that images generated by the systems based on text prompts using their names violated their copyrights. "I don't think the claim regarding output images is plausible at the moment, because there's no substantial similarity" between images created by the artists and the AI systems, Orrick said.

394

Justin Alvey

Marco Fahmi retweeted

Justin Alvey

@justLV

18 Jul 2023

I “jailbroke” a Google Nest Mini so that you can run your own LLM’s, agents and voice models. Here’s a demo using it to manage all my messages (with help from @onbeeper) 🔊 on, and wait for surprise guest! I thought hard about how to best tackle this and why, see 🧵

1:33

365

2,451

13,828

1,698,176