The official Twitter account of the Global Database of Events, Language, and Tone (GDELT) Project. blog.gdeltproject.org

Joined January 2014
GDELT Project retweeted
Visual Explorer: OCR'ing A Year And A Half Of CSPAN Through Tesseract. To seed further research into the new kinds of insights that could be derived by searching and analyzing the onscreen text of our nation's governance using open OCR tools, today, in collaboration with the Internet Archive's TV News Archive and the multi-party Media-Data Research Consortium, we are releasing a new dataset of nearly a year and a half of Tesseract OCR'd text from CSPAN, running January 1, 2022 through April 30, 2023. The dataset was created by applying Tesseract to each of the every-4-seconds Visual Explorer preview images. In all, 11,192 broadcasts totaling 10,375,897 images representing 41.5 million seconds of airtime were OCR'd by Tesseract, yielding 1.5GB of JSON containing 472MB of OCR'd text. blog.gdeltproject.org/visual…
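The airtime figure follows from the sampling rate: one preview image every 4 seconds. A quick check of that arithmetic, using only the counts quoted in the post (the variable names are illustrative, not taken from the dataset):

```python
# Figures quoted in the post.
BROADCASTS = 11_192
IMAGES = 10_375_897
SECONDS_PER_IMAGE = 4  # Visual Explorer samples one frame every 4 seconds

# Each image stands for 4 seconds of airtime.
airtime_seconds = IMAGES * SECONDS_PER_IMAGE
airtime_hours = airtime_seconds / 3600

# Average broadcast length implied by the totals (derived, not stated in the post).
avg_minutes_per_broadcast = airtime_seconds / BROADCASTS / 60

print(f"{airtime_seconds:,} seconds of airtime")
print(f"about {airtime_hours:,.0f} hours")
print(f"about {avg_minutes_per_broadcast:.0f} minutes per broadcast on average")
```

The product, 41,503,588 seconds, rounds to the 41.5 million seconds stated above.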
GDELT Project retweeted
Fully Autonomous Diplomacy Counter-Messaging Experiments With ChatGPT GDELT Given the ability of Large Language Models (LLMs) like ChatGPT to craft human-like prose, how easily could they be used to fully autonomously watch television news, identify narratives that run counter to US interests and generate articulate and fluent counter-messages for different mediums, ready for distribution and without any human intervention required? Such use cases are extremely ethically fraught, but their inevitable application raises the question of just how easy current tools might make this process and how usable the end results might be. Overall, the results here suggest that ChatGPT and GDELT can be combined today with just a few lines of code to create a fully automated narrative monitoring and counter-messaging system. The results do suggest that ChatGPT 3.5 lacks the ability to fully recreate the unique voice of non-Western media, especially media systems that feature heavily contextualized narration. At the same time, the results above are not that far removed from some past human-driven counter-messaging efforts undertaken by Western nations. Most importantly, proper prompt engineering, additional examples and fine-tuning could readily yield an LLM capable of writing in a more authentic voice. The kind of fully automated counter-messaging workflow presented here raises myriad ethical and moral questions, but the near-certainty of these kinds of workflows proliferating in the immediate term necessitates a better understanding of what such systems might look like, and of their nuances, in order to identify and counter them. In the end, the idea of a fully automated counter-messaging system is no longer science fiction: it is here today and available with just a few lines of code. blog.gdeltproject.org/fully-…
GDELT Project retweeted
The timeline below compares the percentage of airtime across business television news channels since the start of last year that mentioned President Biden versus Elon Musk, showing that twice last year coverage of Musk nearly equaled that of Biden in a reflection of his outsized media persona. blog.gdeltproject.org/biden-…
GDELT Project retweeted
WashPost: The TikTok Fight Is A Generational Fight. The Post's Philip Bump includes a graph of mentions across television news using the TV Explorer: washingtonpost.com/politics/…
GDELT Project retweeted
Spinmeisters Of Russia: The Bucha Massacre Likened To WWII Nazis blog.gdeltproject.org/spinme…
GDELT Project retweeted
Fox News Dominates Mentions Of "Radical" Over The Past Decade As the timeline and graph below show, Fox News has dominated mentions of the word "radical" over the past decade. blog.gdeltproject.org/fox-ne…
GDELT Project retweeted
Being "Canceled" Took Off In 2020 On Television News But Has Been Fading Since 2021. The timeline below tracks total mentions of "canceled" on television news, showing nearly equal mentions through mid-2020, when the term took off on Fox News; it has been declining on Fox since a peak in March 2021. blog.gdeltproject.org/being-…
GDELT Project retweeted
Pandemic Coverage Continues To Fade Across Both Online And Television News blog.gdeltproject.org/pandem…
GDELT Project retweeted
Mentions of "woke" and "wokeness" surged on Fox News from January 2021, but over the last three months have surged on CNN and MSNBC as well. blog.gdeltproject.org/mentio…
GDELT Project retweeted
"Maga" Mentions Fade On CNN & Fox News But Continue On MSNBC blog.gdeltproject.org/maga-m…
GDELT Project retweeted
Mentions of an impending recession continue to fade away on television news channels. blog.gdeltproject.org/recess…
GDELT Project retweeted
Visual Explorer: Creating Visual Networks Of Facial Co-Occurrences On An Episode Of Russian TV News' 60 Minutes – Revisited. Last week we demonstrated using a simplistic facial extraction and visual clustering pipeline to extract the faces from a single episode of "60 Minutes" on the Russian TV news channel Russia 1 and build a co-occurrence graph of who appears alongside whom. To make the pipeline as easy to use as possible, we paired an older face extractor that is less accurate than modern tools but extremely fast with a perceptual hash-based clustering postprocessor that groups faces together to track them across frames. The results suggested considerable promise for this analytic approach, but also demonstrated the existential limitations of such a simple pipeline. Today we revisit that exploration using a modern face extraction and clustering pipeline that yields vastly more accurate results. blog.gdeltproject.org/visual…
GDELT Project retweeted
Adding Confidence Scores To Tracking A Year Of Tucker Carlson On Russia 1's "60 Minutes". Last month, in collaboration with the Internet Archive's TV News Archive, we demonstrated scanning a year of Russia 1's "60 Minutes" for all appearances of Tucker Carlson. Let's repeat that analysis with a more advanced tool that also generates a distance score between the extracted face and the source face, allowing us to post-filter to remove false positives, identify the strongest matches, etc. blog.gdeltproject.org/adding…
GDELT Project retweeted
Sampling Russian television news broadcasts every 4 seconds and pairwise comparing those "visual ngrams" over an entire broadcast yields a powerful tool for cataloging advertising, identifying key advertising trends across the Russian television news landscape and understanding how the ad economy is adjusting in the face of global sanctions. Using more sophisticated tooling for identifying ad content and using signature-based tracing approaches, it would be possible to fully automatically construct a live catalog of advertising activity across Russian television news to understand the brands, industries, products and services being advertised and how that composition has changed over the past year as the impact of sanctions has continued to build. blog.gdeltproject.org/visual…
GDELT Project retweeted
In collaboration with the @internetarchive, the Visual Explorer extracts one frame every 4 seconds from each broadcast to create a "visual ngram" that non-consumptively captures the core visual narratives of the broadcast. What if we took all of those images for a given Russian TV news broadcast and pairwise compared each image to every other image in that broadcast based on pixel-level visual similarity (using a perceptual hash)? The end result would allow us not only to identify contiguous sequences (marking "shot changes") but, most importantly, to identify repeated content that appears multiple times throughout a broadcast, ranging from a clip aired at different points in the broadcast to repeated advertisements. blog.gdeltproject.org/visual…
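A minimal sketch of the pairwise comparison described above, under stated assumptions. The post does not say which perceptual hash is used, so this swaps in a simple average hash; frames are represented as tiny grayscale grids, and the function names and `max_distance` threshold are hypothetical choices for illustration only.

```python
from itertools import combinations

def average_hash(pixels):
    """Hash a grayscale frame: one bit per pixel, set when brighter than the mean.

    Assumption: a stand-in for whatever perceptual hash the real pipeline uses.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

def similar_pairs(frames, max_distance=1):
    """Compare every frame to every other frame; return index pairs that match."""
    hashes = [average_hash(f) for f in frames]
    return [(i, j) for i, j in combinations(range(len(hashes)), 2)
            if hamming(hashes[i], hashes[j]) <= max_distance]

def shot_changes(frames, max_distance=1):
    """Indices where a frame differs from the frame immediately before it."""
    hashes = [average_hash(f) for f in frames]
    return [i for i in range(1, len(hashes))
            if hamming(hashes[i - 1], hashes[i]) > max_distance]

# Toy 2x2 "frames": scene A, a near-duplicate of A, scene B, then A again.
scene_a = [[10, 200], [10, 200]]
scene_a_noisy = [[12, 198], [11, 205]]
scene_b = [[200, 10], [200, 10]]
frames = [scene_a, scene_a_noisy, scene_b, scene_a]

pairs = similar_pairs(frames)
changes = shot_changes(frames)
print(pairs)    # matches between non-adjacent frames indicate repeated content
print(changes)  # boundaries between contiguous runs of similar frames
```

In the toy data, frame 3 matches frames 0 and 1 even though frame 2 intervenes, which is the "repeated content" case. Comparing all pairs is quadratic in the number of frames, so a real broadcast would likely need precomputed hashes and a faster lookup.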
GDELT Project retweeted
Visualizing Who Appears Alongside Whom On An Episode Of Russian TV News' 60 Minutes Who appears alongside whom on television news represents a key editorial decision of what voices to pair. From split-screen displays to the back-and-forth of presenters and guests, understanding co-occurrence patterns on television news offers a powerful lens into the underlying narrative storytelling of a broadcast. What if we could analyze such co-occurrence patterns automatically, generating a network visualization of the faces that appear onscreen in the same frame or subsequent frames over an entire broadcast? blog.gdeltproject.org/visual…
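One way to picture the co-occurrence network: suppose a face extraction and clustering step has already assigned an identifier to each face in each sampled frame. The sketch below is illustrative only; the identifiers, the function name, and the input format are assumptions, not the pipeline used in the post. It counts an edge whenever two faces share a frame or appear in consecutive frames.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(frames):
    """Count face pairs seen in the same frame or in consecutive frames.

    `frames` is a list of sets of face identifiers, one set per sampled frame
    (a hypothetical input format). Each pair counts at most once per frame.
    """
    edges = Counter()
    previous = set()
    for faces in frames:
        faces = set(faces)
        # Pairs sharing the current frame (for example, a split screen).
        pairs = {frozenset(p) for p in combinations(sorted(faces), 2)}
        # Pairs spanning the previous and current frame (back-and-forth cuts).
        pairs |= {frozenset((a, b)) for a in previous for b in faces if a != b}
        for pair in pairs:
            edges[tuple(sorted(pair))] += 1
        previous = faces
    return edges

# Toy broadcast: a split screen, two cuts, an empty frame, then a lone guest.
frames = [{"host", "guest_a"}, {"guest_b"}, {"host"}, set(), {"guest_a"}]
edges = cooccurrence_edges(frames)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The resulting weighted edge list could be loaded into any graph tool for visualization. The empty frame breaks the chain, so the final guest is not linked to the host who appeared two frames earlier.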
GDELT Project retweeted
Yesterday, in collaboration with the @internetarchive's TV News Archive, we announced the availability of more than 1 billion words of transcribed and translated Belarusian, Iranian, Russian and Ukrainian television news broadcasts. How might we examine these transcripts with ChatGPT to understand what a day of Russian television news says about Ukrainian president Volodymyr Zelensky? blog.gdeltproject.org/visual…
GDELT Project retweeted
In collaboration with the @internetarchive, more than a billion words of Belarusian, Iranian, Russian and Ukrainian television news are now accessible for narrative analysis: blog.gdeltproject.org/visual…
GDELT Project retweeted
Rep. Marjorie Taylor Greene's (MTG) disapproval of military support to Ukraine remains popular on Russian state television, such as this excerpt of her CPAC speech and one of her Tucker Carlson appearances. blog.gdeltproject.org/marjor…