Björn Schotte  retweeted
Jun 9
Ran /codesage-review with Fable 5 (High effort) over three of my PHP extensions to see how benchmarks compare to reality. Biggest takeaway: AI code review isn't one-and-done, and not much of a change from Opus 4.8. Every repo needed multiple passes, and each pass found real bugs the previous one walked straight past, including bugs the model introduced in its own fixes. Per-repo breakdown below.
Jun 7
Built CodeSage today. It monitors your repos 24/7, flags bugs before they ship, and sends you fixes. Your codebase's senior dev who never sleeps.
Used to spend hours understanding new codebases. Built CodeSage 🧠 Ask your repo questions in plain English → get clear answers. FAISS RAG LLM Node React Open source 👇 github.com/snipeet03/CodeSag… #buildinpublic #OpenSource #DevTools #AI
A revolutionary tool for analyzing code in moments! The CodeSage AI project can now inspect any code, detect errors, and suggest improvements automatically. 🎯 Why does it matter to you? It will save developers hours of work and reduce software bugs before release. #AI #GitHub #OpenSource #TechNews
How to launch a product in the shortest time possible! #startup #founder #ai #codesage #productlaunch
Researchers are using powerful language models to make searching and recommending code easier! They tested different models on various datasets, looking at how well they find the right code snippets. A model called Codesage-small-v2 did really well on one dataset, while BGE-base and GIST-base performed similarly on another. Starcoder2-7B worked across multiple programming languages for matching code and identifying its parts. arxiv.org/abs/2506.15655 #ArtificialIntelligence
Built CodeSage, a 19M param LLM for code understanding. Been hacking on tokenizers and transformers… maybe one day I’ll build a real model haha. Gonna deploy it, do some reinforcement learning, feed it more data. Just need cloud credits lol
5 Dec 2024
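Voyage-code-3: a more accurate, more efficient next-generation code retrieval engine
"Voyage AI has launched a new-generation code retrieval model that, through innovative dimension reduction and quantization techniques, significantly improves retrieval accuracy (13.8% ahead of OpenAI) while sharply reducing storage and compute costs, a breakthrough for the code search field."
1. Voyage AI released the new-generation code retrieval model voyage-code-3, with significantly improved performance:
- 13.80% higher on average than OpenAI's model
- 16.81% higher on average than CodeSage's model
- Supports a longer context length (32K tokens)
2. New features:
- Supports embeddings at several dimensions (2048/1024/512/256)
- Offers several quantization formats, which can greatly reduce storage cost
- Uses Matryoshka ("nesting doll") learning, so one vector can be used flexibly at different lengths
3. Practical advantages:
- Much lower storage cost: 8-bit or 1-bit storage saves 4x or 32x the space, respectively
- Small performance loss: even in the compressed formats, retrieval quality stays at a high level
- Works with several mainstream vector databases, such as Milvus and Qdrant
4. Training and evaluation:
- Trained on larger and more diverse code data
- Covers more than 300 programming languages
- Tested comprehensively on 238 datasets
- Supports several code retrieval scenarios: text-to-code, code-to-code, document-to-code, and others
What this release means for developers and companies:
- Better code retrieval at lower cost
- Much lower storage and compute costs while keeping high performance
- More flexible deployment options, with dimensions and storage formats chosen to fit the need
This is an important advance in code retrieval, with notable breakthroughs in efficiency and cost in particular. They offer the first 200 million tokens free, and developers can get started through the documentation.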
📢 Announcing voyage-code-3 embedding model! 1. more accurate: 14% gain over OpenAI-v3-large 2. flexible dimension (Matryoshka): 256-2048 3. quantized embeddings: float, int8, binary 4. new Pareto frontier: (binary,256 dim.) is 6% better than OpenAI (float,3072 dim.) 🧵🧵
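The storage figures quoted above (4x for 8-bit, 32x for 1-bit) can be checked with a little arithmetic. The following is a minimal sketch, assuming the baseline is a 32-bit float per dimension; it does not call the Voyage AI API, and the helper names `bytes_per_vector`, `truncate`, and `binarize` are hypothetical. The sign-threshold binarization shown here is one common way to produce 1-bit embeddings and may differ from what voyage-code-3 actually does.

```python
import math

# Illustrative only: plain-Python arithmetic, not the Voyage AI client.
# Assumption: the uncompressed baseline stores 32 bits (float32) per dimension.

def bytes_per_vector(dims: int, bits_per_dim: int) -> int:
    """Bytes needed to store one embedding, rounded up to whole bytes."""
    return (dims * bits_per_dim + 7) // 8

def truncate(vec, dims):
    """Matryoshka-style shortening: keep the first `dims` values, then L2-renormalize."""
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head

def binarize(vec):
    """Hypothetical 1-bit quantization: one sign bit per dimension, 8 dimensions per byte."""
    packed = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            packed[i // 8] |= 1 << (7 - i % 8)
    return bytes(packed)

FLOAT32_BYTES = bytes_per_vector(2048, 32)  # 8192 bytes
INT8_BYTES = bytes_per_vector(2048, 8)      # 2048 bytes, 4x smaller
BINARY_BYTES = bytes_per_vector(2048, 1)    # 256 bytes, 32x smaller
```

Truncating to 256 dimensions and binarizing together would take 32 bytes per vector under these assumptions, which is the kind of (binary, 256 dim.) configuration the announcement compares against a full-size float embedding.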
We evaluated various embedding models (@OpenAI, @awscloud CodeSage, CodeRankEmbed, @JinaAI_ v2 code), along with @Voyage AI’s newly released voyage-code-3 (blog.voyageai.com/2024/12/04…), on these datasets:
8 Jul 2024
ChainIDE had the pleasure of being part of the @HackQuest_ x @arbitrum IRL Bootcamp in Kolkata! The energy and passion of the developers were incredible. We're thrilled to have been a guest for this Partner-sharing session! 🎉 We showcased how to use the AI capabilities of ChainIDE-CodeSage for full-stack and AI-driven Dapp development, from front-end to back-end to smart contracts.
AWS AI Labs Introduce CodeSage: A Bidirectional Encoder Representation Model for Source Code Quick read: marktechpost.com/2024/02/21/… The researchers from AWS AI Labs’ introduction of CODE SAGE marks a pivotal shift towards an innovative bidirectional encoder representation model designed specifically for source code. This model pioneers a two-stage training scheme, utilizing a vast dataset far exceeding the scale traditionally employed in this field. The approach is novel, intertwining identifier deobfuscation and a refined version of masked language modeling objectives that move beyond conventional masking techniques. This methodology is crafted to more effectively capture the intricate semantic and structural nuances of programming languages. Paper: arxiv.org/abs/2402.01935 #ArtificialIntelligence @awscloud
18 Feb 2024
🚀 Our latest research paper on code representation learning, CodeSage, outperforms OpenAI text-embedding-3-large on Code2Code search and is on par with it on NL2Code search tasks! Dive into the techniques and insights on the blog: code-representation-learning…

Introducing #CodeSage, a family of embedding models for generating code representations. To appear at #ICLR2024, co-led w/ @DejiaoZhang. 1/5 Paper: arxiv.org/abs/2402.01935 Evaluation code: github.com/amazon-science/Co… Model checkpoints: huggingface.co/codesage
New Embedding Models for Code released by @awscloud! Embedding Models are at the heart of every RAG application. Without good embeddings, retrieving relevant context to answer your user prompts is impossible. 🔍 Super exciting to see Amazon release CodeSage, a family of open code embedding models with an encoder architecture that supports a wide range of source code understanding tasks. 🤗
TL;DR:
📏 Comes in 3 sizes: 130M, 356M, 1.3B
📚 Pre-trained on @BigCodeProject The Stack (237 million code files)
🇪🇺 Fine-tuned on 75 million bimodal (code and natural language) pairs
🔍 Using hard negatives & hard positives improves MAP by > 10%
🔠 Uses the @BigCodeProject StarCoder tokenizer
⚖️ Licensed under Apache 2.0
🥇 Outperforms @OpenAI and others on 0-shot Code Search
🚀 SOTA performance on NL2Code (Natural Language to Code)
🤗 Available on @huggingface and supported in Sentence Transformers
#GLBajaj (GLBITM) is filled with pride to share that Team CodeSage from GL Bajaj have stood as winners in KAVACH 2023 in the "New Age Women Safety App" category. #Kavach2023 #winners #glbajastudents #hackathon #CyberSecurityHackathon #MinistryOfEducation #AICTE #MIC #BPR&D #I4C
CodeSage, Cookie Bytes, Photon in a Double Slit and Little Champs were the four teams who led themselves to glory in the epic Grand Finale of KAVACH 2023! #kavach2023 #cybersecurityhackathon #ministryofeducation #ministryofhomeaffairs #aicte #mic #bpr&d #i4c #kavachhackathon2023
9 Aug 2023
Replying to @violetto96 @shnai0
Ah, didn't know about that. Codesage will be fundamentally different because the user chooses a GitHub repository and version/tag to chat with. Ogpt seems to be a skin on top of an older API version of GPT-4.
19 Feb 2016
Introduction To Programming goo.gl/4rtAmZ @codesage @SaharaHacker @SecureITZim @Neolabtech #Twimbos
