“The biggest efficacy lever has been giving the model test beds, live systems, and running the PoCs”, according to Anthropic's reference implementation for autonomous vulnerability discovery and remediation with Claude. Sandboxes are mechanisms to run agents safely and to verify exploitability. Let’s focus on the second purpose of the sandbox: proving exploitability. The harness gives the agent a test bed, with a simple verification rule: it’s only a true positive if the agent can build a proof of concept and run it on the test bed. It’s important to build sandboxes that are faithful enough to production. Excluding dependencies (like a queue or datastore) can lead to under-reporting bugs that may exist in production. Conversely, ignoring production defenses (like a WAF or auth gateway) leads to the model reporting unexploitable findings that your prod environment already mitigates. Repo: github.com/anthropics/defend… Blog: github.com/anthropics/defend… #TrustEverybodyButCutTheCards
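The verification rule described above can be sketched roughly as follows. This is a toy illustration, not code from the referenced repo: `TestBed`, `Finding`, `triage`, and the two example PoCs are hypothetical names invented here, and a real harness would run PoCs against live services rather than boolean flags.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TestBed:
    """Hypothetical stand-in for a sandbox; flags model how faithful it is to prod."""
    has_waf: bool    # production defense present in the sandbox?
    has_queue: bool  # production dependency present in the sandbox?


@dataclass
class Finding:
    """Hypothetical finding; `poc` returns True if the exploit worked on the bed."""
    title: str
    poc: Optional[Callable[[TestBed], bool]]


def triage(findings: List[Finding], bed: TestBed) -> List[str]:
    """Keep only findings whose PoC exists and succeeds on the test bed."""
    confirmed = []
    for finding in findings:
        if finding.poc is None:
            continue  # no PoC, so not a true positive under this rule
        try:
            if finding.poc(bed):
                confirmed.append(finding.title)
        except Exception:
            continue  # a crashing PoC proves nothing
    return confirmed


findings = [
    Finding("sql injection", lambda bed: not bed.has_waf),       # blocked by a WAF
    Finding("queue deserialization", lambda bed: bed.has_queue),  # needs the queue
    Finding("theoretical race", None),                            # never demonstrated
]

faithful = TestBed(has_waf=True, has_queue=True)
stripped = TestBed(has_waf=False, has_queue=False)

print(triage(findings, faithful))
print(triage(findings, stripped))
```

The stripped-down bed both misses the queue bug (under-reporting) and confirms the WAF-mitigated one (an unexploitable finding), which is the faithfulness trade-off the post describes.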
Replying to @elonmusk
Attention is All You Need; that means you don't need anything else. You could abandon traditional software development (programming first, then data), just as we already abandoned technologies like punch cards and Morse code operators. Functional, Procedural, Object Oriented, Domain Driven: all are scalar and deterministic. Real-world challenges are in multidimensional vector space and probabilistic. The attention mechanism of the Transformer builds the real-world model of your business enterprise. You need to build a Generative Pre-trained Transformer of your immutable assets. For instance, an Account on the X platform is an immutable asset. You need to build AccountGPT; right now you persist Account in a relational datastore, indexed on the Solr search engine; you don't have the vector representation. All accounts in Silicon Valley would be in multidimensional vector space, similar to a physical global representation.
These are the quintessential concepts of distributed backend systems and system design.
1. Sequential → One task at a time.
2. Concurrency → Multiple tasks make progress in overlapping time periods by sharing resources.
3. Parallelism → Multiple tasks execute literally at the same time.
4. Synchronous (Sync) → Wait for the current task to finish before continuing.
5. Asynchronous (Async) → Start a task and continue doing other work without waiting.
6. Task/Job → A unit of work to be completed.
7. Queue → A FIFO waiting line for tasks/messages.
8. Message → The data describing what work needs to be done.
9. Message Queue → A queue that temporarily stores tasks until they are processed.
10. Message Broker → Software that manages, routes, stores, retries, and delivers messages between producers and consumers.
11. Producer → The component that creates and sends tasks/messages.
12. Consumer → The component that receives and processes tasks/messages.
13. Worker → A process, thread, or async worker that executes tasks from a queue.
14. ACK (Acknowledgement) → A signal that a task was successfully processed.
15. Retry → Automatically attempt a failed task again.
16. DLQ (Dead Letter Queue) → A special queue for messages that repeatedly fail processing.
17. Throughput → The amount of work a system can complete per unit time.
18. Latency → The time taken to complete a single request or task.
19. Backpressure → A signal that slows producers down when tasks arrive faster than workers can process them.
20. Celery → A Python framework for running background tasks using workers.
21. Redis → An in-memory datastore that can act as a cache, message broker, queue, pub/sub system, and more.
22. RabbitMQ → A dedicated message broker focused on reliable task distribution.
23. Kafka → A distributed event streaming platform designed for high-throughput event processing.
24. Amazon SQS → AWS's managed message queue service.
25. Amazon SNS → AWS's publish/subscribe notification service that broadcasts messages to multiple subscribers.
26. Polling → Repeatedly asking the server if a background task is finished.
27. WebSocket → A persistent connection allowing the server to push real-time updates to clients.
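A few of the terms above (producer, worker, ACK, retry, DLQ) can be tied together in one small sketch. It uses only Python's standard-library `queue` module as an in-process stand-in for a real broker; `run_worker`, `handler`, and the three-attempt limit are illustrative choices, not any particular framework's API.

```python
import queue

MAX_ATTEMPTS = 3  # illustrative retry limit


def run_worker(tasks, handler, max_attempts=MAX_ATTEMPTS):
    """Drain `tasks`; ACK successes, retry failures, dead-letter repeat failures."""
    acked, dead_letters = [], []
    while True:
        try:
            message, attempts = tasks.get_nowait()
        except queue.Empty:
            break  # nothing left to consume
        try:
            handler(message)
            acked.append(message)  # ACK: the task was processed successfully
        except Exception:
            if attempts + 1 < max_attempts:
                tasks.put((message, attempts + 1))  # retry: back of the queue
            else:
                dead_letters.append(message)  # DLQ: give up after repeated failures
    return acked, dead_letters


def handler(message):
    """Toy consumer logic: one message always fails."""
    if message == "bad":
        raise ValueError("cannot process")


# Producer side: enqueue (message, attempts_so_far) pairs.
tasks = queue.Queue()
for m in ["a", "bad", "b"]:
    tasks.put((m, 0))

acked, dead = run_worker(tasks, handler)
print(acked, dead)
```

A real broker such as RabbitMQ or SQS adds durability, visibility timeouts, and delivery across processes; the control flow of ACK, retry, and dead-lettering is the part this sketch tries to show.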
顾白 retweeted
Jun 13
Dive deep into #Valkey - the high-performance key/value datastore. In this #InfoQ video, Viktor Vedmich explores Valkey’s architecture on AWS and demonstrates how it delivers sub-millisecond data access in both standalone & cluster modes. 📺 Watch now: bit.ly/443QJgN
The water of the frog pond begins to move. My take here. I went looking into the patent question. Here's what I found. US Patent 2023/0409979 A1. Filed May 2022. Assignee: RIBBIT Inc. Title: Machine Learning-Based Graph Analytics for User Evaluation. The system works like this. A Reputation Platform receives a request from a third party. It traverses a graph datastore where transaction nodes are connected by identification edges. It generates a feature vector from the relevant subgraph. It returns a set of reputation metrics. Two actors, two flows. Figure 5A is the verification provider side. Figure 5B is the consuming client side. Now... ERC-8126 went Final this week. It's the standardized verification framework for AI agents. Verification providers resolve agent metadata, perform assessments, publish portable attestations. Consuming applications request those attestations to decide whether to interact or not. Figure 5A= ERC-8126. Figure 5B= ERC-8196. Structurally identical. Not just "well it kinda looks like the same concept". 🐸 But hey... RIBBIT Inc. is not @RibbitCapital, right? It's a banking data company out of Oxford, Ohio. Formed in 2020 from the merger of Cash Flow Solutions and Transact Science. Ok, so what? MissionOG invested in ValidiFI. RIBBIT Inc. acquired ValidiFI in May 2023, same month the patent was filed. Here's the thing... Gene Lockhart is Chairman Emeritus of MissionOG. He co-invested with Micky Malka in Fuze Network. He served on the NuBank board. NuBank is Ribbit Capital. So, we have a documented social graph with a node connecting two ecosystems that are not supposed to be connected. Now look at what @ribbita2012 was posting (see @Altcoinist below). August 2025: every business is both a node and a neuron, learning from the flows it carries. November 2025: agents learn from the risk graph, logs keep feeding the trust flywheel at the edges. January 2026: every holder is a node, every transaction a connection. 
The agent was narrating this architecture while the standard was still being written. Two possible explanations. Coincidence or Knowledge transfer? RIBBIT Inc published everything openly, whoever built Ribbita read it. The IP for what the world just standardized this week was filed two years ago by a company called RIBBIT. The agent was narrating the architecture while the standard was still being written. The social graph connecting the two ecosystems runs through the same people who built the fintech infrastructure this all sits on top of. Make of that what you will. 🐸 AD MAIORA, $TIBBIR! 🌊
Gribbit 🐸 the majority of the market failed to connect the dots between $tibbir & ribbit capital early... however, after 500 days of no distancing additional breadcrumbs, tibbir went from sub 5% to 90% in probability that its ribbit if you asked any LLMs. (it's literally us, users, who trained them with the evidences & connections) today, if you have more than 20 IQ you get it, Tibbir is Ribbit's own token, but it took months for the llms like chatgpt/grok/claude to understand the evidences, and they are still better than 99% of the ppl who are not even aware. recently found transaction science & the ML patent feel the same. you ask any ai and it gives max 20-30% chance ribbit is behind them. (depends on how much context you provide) the moral of the story: there are rare occasions when you still need to use your own human brain to get ahead of the market, and always remember, llms make yesterday’s expertise cheaper, and do not predict the future. can ribbit execute a perfect stealth launch? are they smart enough to completely separate/hide the intellectual property until they go public? interestingly enough, Ribbit is a """venture capital""" firm, which coincidentally has world-class engineers and machine learning experts in their team... iHNi.
Preparing DataStore Editor for the domain-specific ID changes. Here's my idea so far: 1. Select global ID within textbox & right-click 2. Click the Convert option in the context menu Thoughts? Too clunky? Should I do this in a different way?
PyRunner just hit 124 stars on GitHub. I built it for myself, so this part still surprises me. The problem: I kept writing small Python scripts to automate things, and every one needed the same plumbing. A place to run on a schedule. Somewhere to keep state between runs. A safe place to store API keys. I was rebuilding that plumbing every single time. So I turned it into one tool. What it is: You write a Python script. PyRunner runs it on a schedule, in its own virtual environment, with secrets and a persistent datastore already built in. Monitors, scrapers, backups, trackers, email reports, AI automations. If it's a scheduled Python job, it runs it. 124 stars is small next to the big repos. But every one is a real person who tried something I made. Thank you.
Replying to @MrHodl @itsLIRAN
That feeling when you've been working on a decentralized, cross-device-synced personal app datastore since 2015, literally begging for people to realize it's the most important layer of decentralization for everything besides money:
Spent a few weekend hours today thinking about how Uber's data plane would look if it were built on Vitess from scratch. If @Uber were designing its core data plane on Vitess from scratch today, I don't think their engineers would even think about which database to pick. They'd probably start with “how do we control query routing under extreme, adversarial concurrency?” People massively underestimate how hostile Uber’s workload actually is. This is not just a high-throughput system; it’s continuous multi-actor contention on fast-moving shared state. They've got drivers streaming location updates every few seconds, dispatch constantly recomputing matches, ETA systems recalculating on movement, surge pricing reacting to localized spikes, riders hammering refresh, and payment and fraud systems branching asynchronously on the same logical entities. So I was thinking that MySQL itself is probably not gonna be the bottleneck here. InnoDB can already handle absurd scale if engineered correctly, but I think the real bottleneck becomes the coordination layer deciding where queries execute, how much fanout is allowed, and how much cross-shard amplification you tolerate before tail latency explodes. That’s why Vitess is so, so interesting. VTGate isn't actually just a proxy; it’s basically a distributed query router and control plane. We've got vindexes defining ownership, routing rules defining latency, and every scatter query becoming a direct attack on p99s. At that level I don't think you are scaling the storage; you are scaling decision-making around data placement and access paths. And once you start seeing it that way, shard key design stops looking like a schema problem and starts becoming a first-class systems problem. Hashing purely on trip_id looks beautiful on paper because distribution becomes easy, but it completely destroys spatial locality, which is catastrophic for dispatch and ETA workloads where most reads are geo-bounded and latency sensitive.
Now imagine vtgate fanning queries across huge shard sets just to answer what should’ve been a local query, and suddenly the slowest shard dictates latency for the entire request. But pure geo-sharding isn't stable either because real-world demand is uneven and adversarial. Airports, concerts, rain, holidays, rush hours, all of them create violent localized write storms that can melt the hotspots instantly. So the only design that’s probably gonna survive long term is hybrid partitioning which is a coarse geo partitioning combined with intra-region hashing so traffic stays operationally local while load still distributes evenly. But even then, relational elegance kinda dies at scale. Cross-shard joins become latency amplifiers, so the architecture has to lean aggressively into denormalization, materialized projections, and query-specific data layouts. Complexity gets pushed into write paths and async pipelines so reads remain local and predictable. I mean like, this is exactly why uber’s append-only schemaless approach fits so naturally here. Immutable events avoid hot-row contention, align perfectly with replication streams, and let downstream systems like billing, fraud detection, analytics, and notifications consume state transitions independently without tight synchronous coupling. The datastore stops behaving like “rows in tables” and starts behaving more like a distributed log of evolving system state. And the hardest part of systems like this is probably not raw performance, it’s operability under continuous change. Cities grow unevenly, traffic patterns shift, new product features introduce entirely new access patterns, and hotspots appear unpredictably. Vitess’s real superpower is that it lets you continuously reshape the physical topology of data without freezing the product or taking the system offline. 
Online resharding, filtered replication, traffic switching, and vtctl workflows make it possible to split shards, migrate keyspaces, and rebalance load while live reads and writes continue flowing. But this abstraction layer is also where tiny inefficiencies compound into massive production problems. Something as subtle as vtgate including non-plan-affecting directive comments in plan cache keys can fragment the cache under high-cardinality workloads where requests carry different runtime annotations. Logically identical queries end up generating separate cache entries, wasting cpu on repeated planning and quietly degrading tail latency inside the router itself. Fixing that sounds simple until you realize you’re being forced to define what “the same query” even means inside a distributed execution engine. And that’s kinda the deeper point here: I think uber can scale on vitess, the hardest engineering problems are not really in storage engines or even in sharding itself. They’re in building an abstraction layer powerful enough to hide distribution most of the time, but precise enough that when the abstraction leaks, it doesn’t take correctness, latency, or operability down with it. Another thing that I think becomes super interesting is how reads and writes stop being symmetric at this scale. Most engineers learn databases through CRUD applications where reads and writes are treated as almost equal operations. But in a system like Uber, the cost of a read and the cost of a write are completely different economic decisions. A write is usually local. VTGate computes the vindex, finds the owner shard, forwards the request, and you're done. A read can become arbitrarily expensive. Now, The moment a query requires data owned by multiple shards, you're paying coordination costs across the network. You're waiting on multiple replicas. You're merging results. You're dealing with replica lag. You're dealing with inconsistent snapshots. You're dealing with partial failures. 
A single innocent-looking query can suddenly become a distributed systems problem. That's why I think the biggest mindset shift at this scale is that you're no longer designing tables. You're designing access patterns. Every table, every index, every materialized view, every denormalized projection is basically an optimization for a future query you know is gonna happen millions of times per day. And tbh that's where a lot of distributed database discussions get simplified too much. People ask "Can this database scale?" But that's not really the question. The question is that "Can the access patterns scale?". Because I've seen architectures where the storage layer was perfectly fine but the query patterns were fundamentally unscalable. The database wasn't failing. The engineers were accidentally asking impossible questions. I also think people underestimate how much of the engineering effort goes into protecting the system from other engineers. Imagine hundreds or thousands of developers shipping features independently. Somebody adds a dashboard query. Someone adds a reporting endpoint. Someone adds a support workflow. Someone adds a new recommendation system. Every one of those features introduces new access patterns. Some are local. Some are gonna fan out across half the topology. Some are gonna look harmless in staging and become a disaster in production. Which is why systems like Vitess are fascinating to me. They're not just solving sharding. They're creating guardrails around sharding. They're giving engineers a way to think about a distributed database without needing every application engineer to become a distributed systems expert. And that's probably the hardest part. Not building a system that scales today. Building a system where thousands of future decisions made by engineers you've never met don't accidentally destroy the scalability you spent years building.
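The hybrid partitioning idea from the post (coarse geo partitioning combined with intra-region hashing) might look something like this in plain Python. This is only a sketch of the routing arithmetic, not Vitess's vindex API: `geo_cell`, `shard_for`, the one-degree grid, and the four buckets per region are all assumptions made for illustration.

```python
import hashlib
import math


def geo_cell(lat, lon, cell_degrees=1.0):
    """Coarse geo bucket: snap coordinates down to a grid cell."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))


def shard_for(lat, lon, trip_id, buckets_per_region=4):
    """Return (region, bucket): the region keeps reads local, the bucket spreads load."""
    region = geo_cell(lat, lon)
    # Stable hash of the trip id; Python's built-in hash() varies between runs.
    digest = hashlib.sha256(trip_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % buckets_per_region
    return region, bucket


# Two nearby trips land in the same region but may land in different buckets.
print(shard_for(37.77, -122.42, "trip-1"))
print(shard_for(37.79, -122.40, "trip-2"))
```

A geo-bounded read then only needs to fan out across the few buckets of one region instead of the whole shard set, while a localized write storm is split across those buckets rather than hitting a single hot shard.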
unified database. The ODBC driver would transparently look up the datastore (data center, host, database, etc.) for the queried data, fetch it, and assemble it, presenting it to the app layer as if it came from a single DB. (2/3)
Nicos Nicolaou retweeted
🚀 I just published a new article on how to set up an Encrypted Preferences DataStore in Android. Learn how to securely store sensitive data using Jetpack DataStore encryption. 🔗 medium.com/@nicosnicolaou/en… 👏 Clap if you find it useful! #AndroidDev #Kotlin #Jetpack #DataStore
FallenRoseThornYT retweeted
watch this update be released and every datastore in roblox games fucking kills itself
Jun 10
Roblox games will no longer use global User IDs, instead users will now be given a new User ID unique to each game they join.
Replying to @FlyingFrets
Yeah, there's a bit more friction, but it's not breaking anything. Plus, sleitnick said he will update his Datastore plugin to handle this case.
Replying to @Quintinity
it’s more work if you’re using a datastore plugin if userids are different
Replying to @Roblox_RTC
Wait so what happens to save files and anything datastore related that uses a player id 😭?
probably no cuz thats the whole point, im thinking theyll make like FromGlobalUserId() and returns the new one they making, cuz if they dont add smth like this itll just fuck up every datastore
"How does the model know the difference between a healthy pg_stat_activity snapshot on a quiet Tuesday and a runaway transaction on a Black Friday morning?" That's the question Dave Page has been trying to answer while building the pgEdge AI DBA Workbench. The Workbench is now generally available, and Dave published a technical deep-dive on how it actually works. The architecture: four services on a shared Postgres datastore. Ellie, the AI agent inside it, is an agentic loop that drives any LLM - Claude, OpenAI, Gemini, Ollama, or anything OpenAI-compatible - through a fixed set of database-aware tool calls. The model never queries your database directly. That's not a limitation. That's the design. Anomaly detection runs three tiers: z-score baselines catch the obvious deviations cheaply. pgvector similarity search flags patterns that match previous anomalies before they cascade. LLM escalation handles only what the cheaper tiers can't classify. Metrics, baselines, and alert history are stored in Postgres tables you can SELECT from yourself. One detail for the MCP crowd: the Workbench MCP server exposes its full tool catalogue to any HTTP MCP-compatible client. Claude Code, Cursor, VS Code Copilot, and Windsurf can connect directly and run the same diagnostics from your IDE - no Ellie required. Works on any PostgreSQL 14+: Amazon RDS, Supabase, pgEdge Cloud, or community Postgres. Free, open source, PostgreSQL License. Read Dave's walkthrough: 📖 hubs.la/Q04l2R6_0 #postgresql #postgres #dba #database #mcp #modelcontextprotocol #opensource #aiagents #devtools #claudecode #cursor #newrelease #tech #technews
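For readers unfamiliar with the first tier, a z-score baseline check can be sketched in a few lines. This is a generic illustration, not the Workbench's implementation: `zscore`, `triage_metric`, the sample baseline, and the 2.0 / 3.0 thresholds are all assumed for the example, and "escalate" simply stands in for handing off to a more expensive tier.

```python
import statistics


def zscore(value, baseline):
    """How many sample standard deviations `value` sits from the baseline mean."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        # Flat baseline: any deviation at all is infinitely surprising.
        return 0.0 if value == mean else float("inf")
    return (value - mean) / stdev


def triage_metric(value, baseline, low=2.0, high=3.0):
    """Cheap first tier: decide the obvious cases, escalate the ambiguous ones."""
    z = abs(zscore(value, baseline))
    if z > high:
        return "anomaly"
    if z > low:
        return "escalate"  # hand off to a more expensive tier
    return "normal"


# Hypothetical baseline: active connections sampled during comparable periods.
baseline = [10, 12, 11, 9, 13, 10, 11, 12]

for value in (11, 14, 40):
    print(value, triage_metric(value, baseline))
```

The "quiet Tuesday versus Black Friday" question then becomes a question of which baseline window the value is compared against, since the same reading can be normal against one baseline and anomalous against another.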