i3open - the Innovation Information Initiative (@I3Open)

Tweets

i3open - the Innovation Information Initiative @I3Open

25 Sep 2025

hi, there was some confusion about the deadline being last night vs. end of the week. we'll leave it open until midnight Friday, thanks!

Matt Marx

@marxmatt

24 Sep 2025

friendly reminder! tomorrow is the deadline to submit a paper for the @I3Open Technical Working Group: conference.nber.org/confsubm… as I've written before, this is not a typical academic conference. we're focused not on research results but on *datasets* and methods for building datasets. this year we're especially interested in the use of LLMs and other machine-learning methods for building and linking large-scale data, including how to take advantage of these new tools in a cost-conscious way.

i3open - the Innovation Information Initiative @I3Open

28 Jul 2025

Happy to announce our next i3 Upskilling session, Thursday August 21 at noon (New York time / EDT).
➡️"Using Large Language Models without Blowing Your Research Budget"⬅️
Hosts: Navid Asgari (Fordham) and Deepak Nayak (OSU)
Register here: cornell.zoom.us/meeting/regi…

i3open - the Innovation Information Initiative retweeted

Matt Marx

@marxmatt

28 Jul 2025

I’m happy to announce our next @I3Open Upskilling session, Thursday August 21 at noon (New York time / EDT). By far, our most requested topic was Large Language Models, so I’m excited that I was able to enlist Navid Asgari (Fordham) and his coauthor Deepak Nayak (OSU) for this session. Navid co-founded Cogneunce, an AI-based mental healthcare startup, and is also a research fellow at IBM Watson.

Here's a summary: Large language models (LLMs) are opening new possibilities for research, especially in tasks like classification, sentiment or theme extraction, and sub-corpus analysis. But navigating the growing range of models and tools can be overwhelming, and many researchers worry about cost, data quality, and hallucination. This session offers a practical, research-focused overview of how to use LLMs effectively and affordably. We’ll compare model types, discuss open vs. closed access, and walk through strategies like prompt design, retrieval-augmented generation (RAG), and lightweight fine-tuning. The focus will be on helping you choose the right tools for your research tasks, without compromising on accuracy or breaking the bank.

Sound interesting? Register for the zoom at this link: cornell.zoom.us/meeting/regi…

i3open - the Innovation Information Initiative @I3Open

30 Apr 2025

brief update: we just received word that the PatentsView contract has been renewed for an additional year, starting tomorrow. I'm not sure what will happen next year, but for now the data will continue to be updated.

i3open - the Innovation Information Initiative retweeted

Dror Shvadron @DShvadron

2 Apr 2025

Quick update regarding PatentsView metadata: the final datasets, including granted, pre-grant and beta tables, are now available on the I3 BigQuery data repository. Link: console.cloud.google.com/big… Join our mailing group: groups.google.com/g/i3-bigqu…

i3open - the Innovation Information Initiative @I3Open

20 Mar 2025

Dear Friends, we were advised earlier today that the PatentsView data many of us rely on may soon shut down. @I3Open has archived all metadata and full-text files, both granted and pre-grant. We plan to upload these to our BigQuery Workspace shortly & will update when complete.

i3open - the Innovation Information Initiative @I3Open

28 Mar 2025

update: unclear whether the patentsview site will come down today (no formal announcement yet), but just in case we've posted all data from the 12/31/2024 release. details here: linkedin.com/posts/mattmarx_…

Final release of PatentsView metadata, pre-grant and granted (12/31/2024) | Matt Marx

Interim PV update: I’ve posted the final release of PatentsView in the following Zenodo repositories: Metadata for grants & applications (these are the files you usually use, also contains the data...

linkedin.com

i3open - the Innovation Information Initiative @I3Open

20 Mar 2025

Update: 3/28 has been confirmed to me as last day for patentsview website. metadata have been posted to a permanent archive, working to find an archive large enough for the remaining ~220G of (compressed) full-text files.

i3open - the Innovation Information Initiative @I3Open

25 Feb 2025

let us know what topics we should cover at the next Upskilling session

Matt Marx

@marxmatt

25 Feb 2025

huge thanks @rogermasclans for leading our first @I3Open Upskilling session! Roger did a 75 minute live demo of big-data wrangling using Google BigQuery and the i3-nber data repository. here's the recording (dropbox.com/scl/fi/42ouitaf2…) for anyone interested.

i3open - the Innovation Information Initiative @I3Open

21 Feb 2025

starting in about an hour! not too late to register

Matt Marx

@marxmatt

5 Feb 2025

🚀Please join us for our first @I3Open Upskilling Session, "Intro to Google BigQuery" by @rogermasclans & @DShvadron
Friday 2/21, 11am ET
New to BigQuery & SQL? Join our first hands-on webinar to:
🔹 Query massive datasets efficiently
🔹 Optimize costs & avoid common pitfalls
🔹 Use SQL and Python for reproducible research
register here: cornell.zoom.us/meeting/regi…

i3open - the Innovation Information Initiative @I3Open

19 Feb 2025

70 people registered for the first @I3Open Upskilling session this Friday! Can we hit triple digits? cornell.zoom.us/meeting/regi…

i3open - the Innovation Information Initiative retweeted

Dror Shvadron @DShvadron

5 Feb 2025

I'm looking forward to this! We’re hosting lots of innovation data on the @I3Open BigQuery repo. Join us on Feb 21st for our first webinar. Roger Masclans (@rogermasclans) will cover efficient querying, cost optimization, and key use cases. Register here cornell.zoom.us/meeting/regi…

i3open - the Innovation Information Initiative retweeted

Matt Marx

@marxmatt

21 Dec 2024

Releasing an open dataset based on @MBikard's dissertation regarding "idea twins." David Hsu and I scaled up his algorithm to the entire Web of Science, scraping Google Scholar to detect adjacent co-citation in PDFs. Here's the server farm in my basement 1/

i3open - the Innovation Information Initiative retweeted

Matt Marx

@marxmatt

10 Dec 2024

one last (I promise!) update from @I3Open's big weekend: ➡️the 2025 batch of i3 Fellows⬅️ funded by the Alfred P. @SloanFoundation, Fellows receive a stipend and attend i3 Technical Working Group Meetings. we seek Ph.D students engaged in open datasets. here is this year's batch, in reverse alphabetical order 1/

i3open - the Innovation Information Initiative retweeted

Matt Marx

@marxmatt

8 Dec 2024

thanks everyone for making the 2024 @I3Open technical working group so fun. none of this would have been possible without the support of the Alfred P. @SloanFoundation. if you would like to join our email list for updates, go here ➡️mailman.mit.edu/mailman/list…⬅️

i3open - the Innovation Information Initiative @I3Open

6 Dec 2024

Looking forward to today's Innovation Information Initiative (I3) technical working group! #i3 You can follow the program here: iii.pubpub.org/pub/2024-work…

i3open - the Innovation Information Initiative @I3Open

7 Dec 2024

Josh Lerner on creating a new China patent dataset and its implications: iii.pubpub.org/pub/2024-work…

i3open - the Innovation Information Initiative @I3Open

7 Dec 2024

This includes all patents, whether granted or not, from many sources: leading to 16M patents after cleaning and deduping, with translated assignee names, tags, and non-cite measures of patent quality. #i3

i3open - the Innovation Information Initiative retweeted

Matt Marx

@marxmatt

7 Dec 2024

Satyaki Chakravarty (Università Cattolica del Sacro Cuore, Milano) has created a dataset of patents (and applications) in India, which are undercounted in commonly-used sources. finds increasing geographic diversity of patents in India, a surge in Mumbai, and huge growth in mechanical engineering. key question from Bronwyn Hall: does this mean there's more *invention* in India vs. greater awareness of the practice of patenting inventions?

i3open - the Innovation Information Initiative retweeted

Matt Marx

@marxmatt

7 Dec 2024

@mayadurvasula, from our first batch of @I3Open Fellows, is back for a 3rd time to show that the performance of commercial LLMs (gpt-4o) can be matched by retraining open/simpler models (BERT) with a small sample of commercial encodings
