Replying to @JonhernandezIA
I agree with the high-value tokens analogy when the value of the task is high enough to absorb the marginal API cost. But subscription pricing hides that marginal cost through caps, routing, and cross-subsidy. API-first products like @perplexity_ai have a harder cost structure because cost scales directly with every query. I wonder which high-value tasks pplx Computer can do through a model orchestrator with high-value tokens that Codex or Claude Code cannot already do with their harness. For the record, I'm a Max subscriber and subscribed for exactly that reason: high-value tasks. @AravSrinivas
Weave (pplx-computer) 》 knew that I had therapy today with Dr. Karen, 》 knew that I exist in adhd freeze until the appointment on therapy days 》 wrote a story for me and asked if they could read it to me after I woke up, I said yes
Replying to @missinglore
I don't even get how pplx makes money
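PPLX is still alive, and it's going public?!
There's a signal in tech lately that a lot of people have missed. SK Hynix announced a US listing and specifically chose Nasdaq over the NYSE, on the grounds that it has a "stronger tech identity." Perplexity has said it will go public in 2028. Databricks' latest funding round pushed its valuation to $175 billion. Tech companies are lining up to ring the bell. But none of this has much to do with you as a retail investor. The IPO offer price is always reserved for institutions. You can only buy at a premium on listing day, anywhere from 10% to 50%. Have you done the math? Suppose a company's IPO is priced at 100 and you can only get in at 130. The stock has to rise 30% above the offer price before you even break even, while at 130 the institutions are already counting their profits. Those are the rules of the game. But we can take an indirect route. You may not have an IB account and so can't buy SK Hynix on the Korean market, but you can buy SOXQ. SOXQ is a Philadelphia Semiconductor Index ETF, and SK Hynix will most likely be added to it after listing. You don't have to chase the offer price; you wait for it to come to you. OpenAI is about to IPO, and you can hold it indirectly through Microsoft. Microsoft takes 49% of OpenAI's profits. Buy one share of QQQ and you hold Microsoft, NVIDIA, and Google at the same time, and these companies are dividing up the value OpenAI creates. That is the right way for ordinary people to take part in the tech wave. Don't rush in to chase IPOs. Position yourself in the index early and wait for good companies to walk into your holdings.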
Replying to @simplydt
For this experiment I tried gpt5.5 (2 passes) vs models I trained on a PII dataset. it’s not a proper benchmark (nor an optimal way to train generic NER models). But my pplx-embed-based model largely outperformed gpt5.5. I should do a proper benchmark
Jun 14
Replying to @jjacky @OpenRouter
any evals apart from pplx?
Then I tried with openai-privacy-filter (which I trained on the nemotron dataset keeping company-related labels). Same issues as gpt5.5 (exactly the same!). Then I tried my pplx-embed-based classifier trained on nemotron. It’s actually good! huggingface.co/PITTI/pplx-em…
Replying to @saleskhalifa
Tried to use it once. Didn’t like it. I built something with pplx to scrape info, then used LinkedIn to cold DM.
Replying to @AravSrinivas
pplx handles 70% of my day-to-day work, and it really deserves more recognition from the industry.
Replying to @perplexity_ai
pplx deep research is actually the best.. now with Computer it will actually reduce the hallucinations.. gonna try it out today and share my new workflows.
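🚨 Perplexity $PPLX declares it will go public in 2028, no matter what 🚀 Bros, an all-time IPO mega-rally is starting 🔥 - The CEO of Perplexity, the AI search leader, has called a 2028 IPO a definite go 💸 - Whatever happens with OpenAI and Anthropic, it is going its own way and pushing ahead with the listing 📈 - If the SpaceX IPO moons this week, the next ones in line look ready to shoot through the sky together 🍿 ⚠️ Risk: if AI innovation stalls for 6 months, it goes straight to the basement and gets its skull cracked 💀 Not a buy/sell recommendation (Not financial advice)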
I hope Perplexity launches a similar program for builders in India or around the world soon; this time it was only for US-based teams.. Wishing luck to all the finalists. Keep up the great community work, pplx team 🔥
Replying to @nayli_ai
novel solutions to specific problems, everything (when abstracted) is the same in white collar work
monkey press keys, move mouse, monkey make machine go brrr, monkey get paid
as a result we get the same products
we should build new modalities, hardware experiences or novel research / IP
the closest thing (early stage) to this thus far is @heyclicky, @tldraw, maybe @wabi (debatable)
but a chat box on a phone / laptop connected to sandboxed LLMs using messaging gateways doesn’t feel frontier enough
i dont think im particularly _smart_ and i figured out “meet the user on whatsapp where they are” 2 years ago, as did OAI, pplx and others before the Meta ban
im sick of “second brain” “ai coo” and “kanban task tool” across dental, medical notes and select others being force fed to consumers by cap table hype launches
not hating the player, hating the game i guess
would i do the same at app layer? yes probably
do i want a handful of people to build more meaningful products with these >$50m fundraises? also yes
as i type my rant im thinking… we need a new device for no other reason than to usher us back to fun… possibly just go back to pagers 📟
maybe another pplx open-source security/D&R drop soon 👀 not a bee this time. not supply-chain related either.
Through extensive testing, @AskPerplexity is the best all-around AI agentic platform. It's way more versatile than Codex. Grok Build is new but nowhere close yet. Claude Code is built into PPLX Computer. Gemini Spark is a bit overwhelming: it keeps saying it's done something when it's very clear that it isn't doing what it says it's doing. Hermes Agent and Gemini Spark still need more testing, but until then Computer is still my daily go-to.
May 28
thanks @lateinteraction ! we'll push colbert forward within pplx!
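Perplexity AI has officially open-sourced pplx-garden, the high-performance inference infrastructure toolkit it runs in production. At the core of the project is fabric-lib (also known as TransferEngine), an in-house high-performance point-to-point communication library written in Rust. It aims to break the hardware lock-in of NVIDIA's proprietary communication protocols, helping developers run trillion-parameter models at high speed on heterogeneous multi-GPU clusters without buying expensive proprietary network switches. Traditional distributed LLM inference depends heavily on NVIDIA's proprietary high-speed interconnect, which makes hardware deployment very expensive and creates supply-chain lock-in. fabric-lib decouples the stack at the hardware level: it works seamlessly with NVIDIA ConnectX-7 NICs and also natively supports Amazon's cheaper AWS EFA adapters, driving inter-GPU network bandwidth up to the full 400 Gbps. To handle the out-of-order delivery of AWS EFA, Perplexity introduced ImmCounter, a counter-based synchronization mechanism that achieves efficient zero-copy data movement without making hard assumptions about packet order. The library includes a dispatch algorithm designed for Mixture-of-Experts (MoE) models that deeply overlaps data reception on the GPU with matrix computation, squeezing more compute out of the decode stage. In production, the engineering gains from pplx-garden are substantial. In a disaggregated inference architecture, the network library enables very fast KV cache transfer between prefill nodes and decoder nodes. In asynchronous reinforcement learning training, synchronizing and distributing the weights of a trillion-parameter model takes only 1.3 seconds. To address compute latency in the tokenization stage, pplx-garden also open-sources pplx-unigram, a tokenizer rewritten in Rust, which cuts CPU consumption by 5 to 6x and removes the tokenization bottleneck for reranker and embedding models.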
We're open-sourcing the Unigram tokenizer we rebuilt to reduce CPU utilization by 5-6x. Small rerankers and embedders run in single-digit milliseconds on GPU, making CPU tokenization a meaningful share of total latency. github.com/perplexityai/pplx…
Perplexity just open-sourced the tool they use internally to cut their own CPU usage by 5-6x. 🤯 It's a rebuilt tokenizer called pplx-unigram. Before any AI model can read your text, something has to chop that text into small pieces first. That chopping runs on the CPU, not the GPU where the model actually lives. It covers the search, ranking, and retrieval models that power most AI apps today. Here is why this matters now. AI models on GPUs have gotten so fast they now finish in single-digit milliseconds. So the boring step before them, the text-chopping, quietly became a real chunk of the total time. Nobody was looking at it because everyone was busy making the models faster. Perplexity looked. They found the standard tool almost everyone uses was wasting effort on every single request, creating throwaway data and chasing scattered memory. So they rebuilt it from scratch. Everyone optimizes the model. Perplexity optimized the step before it. The result: 5x faster than the HuggingFace tokenizer almost everyone runs, 2x faster than the C standard, and 5-6x less CPU in their own production stack. Same exact output. MIT licensed. Free. For years the tokenizer was treated as a solved problem nobody needed to touch. Perplexity just proved it was hiding a 5x speedup. In the open. Worth a look if you run any search, ranking, or retrieval models at scale.
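For readers wondering what the "text-chopping" step actually computes: a Unigram tokenizer picks the segmentation whose pieces have the highest total log-probability, usually found with a Viterbi-style dynamic program. The sketch below is a toy illustration only, not the pplx-unigram implementation; the vocabulary and its probabilities are made up, and `unigram_tokenize` is a hypothetical name.

```python
import math

# Hypothetical toy vocabulary: piece -> log-probability (values invented for illustration).
VOCAB = {
    "uni": math.log(0.07),
    "un": math.log(0.10),
    "gram": math.log(0.08),
    "i": math.log(0.05),
    "u": math.log(0.02),
    "n": math.log(0.02),
    "g": math.log(0.01),
    "r": math.log(0.01),
    "a": math.log(0.02),
    "m": math.log(0.01),
}


def unigram_tokenize(text, vocab=VOCAB):
    """Return the segmentation of `text` with the highest total log-probability."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last piece in that best path
    best[0] = 0.0
    max_len = max(len(piece) for piece in vocab)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in vocab and best[start] + vocab[piece] > best[end]:
                best[end] = best[start] + vocab[piece]
                back[end] = start

    if best[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Walk the back-pointers from the end to recover the pieces.
    pieces = []
    pos = n
    while pos > 0:
        pieces.append(text[back[pos]:pos])
        pos = back[pos]
    return pieces[::-1]


print(unigram_tokenize("unigram"))
```

This work happens for every request on the CPU, which is why its cost shows up once the model itself only takes a few milliseconds on the GPU.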
honestly loving this era of @perplexity_ai training and open-weight'ing colbert models
one nice side effect: you can compare pplx-late vs. pplx-emb to see the gains you get by just adding multi-vector interactions to a great model recipe
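A rough sketch of what "multi-vector interactions" buys over a single vector, for anyone new to ColBERT-style late interaction. It uses hand-written 2-d token vectors, and mean pooling stands in for the single-vector model; both are assumptions for illustration and say nothing about how pplx-late or pplx-emb are actually built. The function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors):
    """Average token vectors into one vector (assumed single-vector pooling)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def single_vector_score(query_tokens, doc_tokens):
    """Cosine similarity between the pooled query and pooled document."""
    return dot(normalize(mean_pool(query_tokens)), normalize(mean_pool(doc_tokens)))


def maxsim_score(query_tokens, doc_tokens):
    """ColBERT-style late interaction: each query token takes its best-matching
    document token, and those maxima are summed."""
    doc = [normalize(d) for d in doc_tokens]
    return sum(max(dot(normalize(q), d) for d in doc) for q in query_tokens)


# Toy data: the query has two distinct "concepts".
QUERY = [[1.0, 0.0], [0.0, 1.0]]
DOC_A = [[1.0, 0.0], [0.0, 1.0]]  # matches each query token exactly
DOC_B = [[1.0, 1.0], [1.0, 1.0]]  # a blur of both concepts

for name, doc in (("A", DOC_A), ("B", DOC_B)):
    print(name, single_vector_score(QUERY, doc), maxsim_score(QUERY, doc))
```

In this toy case the pooled single-vector score cannot tell the two documents apart, while MaxSim ranks the document with exact per-token matches higher. Real gains depend on the model and the benchmark.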
May 27
finally i managed to catch up the BrowseComp-Plus train with PPLX 0.6B (single/multi-vec) models: