I do enjoy it but not like before, and that could be because I just did a 3-month stint that no 18-year-old today could do, and I'm 57. I probably hurt my body, but I have serious goals. I still code, but I use AI for the boring boilerplate stuff, like the JSON mapping. You do the strategic thinking and put the pieces together. I was a little negative, sorry; I'm still bitter about all the jobs being moved to India and how my last contract went down. These are amazing times. I've written two scanners, a dashboard for my scanner, a download of option trading positions using a black market API for Robinhood, a database of positions, IV tracking, and robust option scoring with 114 pricing fields. The second scanner is state of the art: no loops, all filtering done via vectorization. It starts with a cartesian join of options, filters on non-pricing columns, prices, then filters on pricing vars. 1600 Black-Scholes pricings per second, up from 6. All in exactly 2.5 months, and I coded a lot of it, but AI supercharged me. Use them, enjoy it, build, and celebrate that you are the architect and can avoid sitting there tapping a keyboard for 16 hours a day. Come up with the brilliant ideas; you have time.
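The scanner itself is not shown in the post, so the following is only a minimal sketch of the general idea: price a whole strike × expiry grid with Black-Scholes in one vectorized pass, then filter with a boolean mask. The function names, the sample grid, the 5.0 price threshold, and the pure-NumPy normal CDF (Abramowitz-Stegun formula 7.1.26) are all assumptions made for illustration.

```python
import numpy as np

def norm_cdf(x):
    """Standard normal CDF using the Abramowitz-Stegun 7.1.26 erf approximation."""
    z = np.abs(x) / np.sqrt(2.0)
    t = 1.0 / (1.0 + 0.3275911 * z)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
           + t * (-1.453152027 + t * 1.061405429))))
    erf_abs = 1.0 - poly * np.exp(-z * z)
    return 0.5 * (1.0 + np.sign(x) * erf_abs)

def bs_call(spot, strike, t, rate, vol):
    """Black-Scholes European call price; every argument may be an array."""
    sqrt_t = np.sqrt(t)
    d1 = (np.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * np.exp(-rate * t) * norm_cdf(d2)

# Hypothetical grid: broadcasting a column of strikes against a row of
# expiries plays the role of the cartesian join, with no Python loop.
strikes = np.array([90.0, 100.0, 110.0])[:, None]   # shape (3, 1)
expiries = np.array([0.25, 0.5, 1.0])[None, :]      # shape (1, 3), in years

prices = bs_call(100.0, strikes, expiries, 0.05, 0.2)  # shape (3, 3)

# Vectorized filter on a pricing variable (threshold is made up).
keep = prices > 5.0
print(prices.round(4))
print(keep)
```

Because every step is an array expression, the same code prices three contracts or three million; only the input shapes change.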
The problem with VLIWs is not that there isn't parallelism in your code that's visible to the compiler (there's a lot). The problem is mostly that most parallelism crosses control flow boundaries. Most instruction-level parallelism looks less like "these 6 ops can all run at once" and more like "these 3 ops can run in parallel, and if x is true, so can these other 3." And x often is "does this loop continue for another iteration?" which is much of the problem that vectorization somewhat handles. And let's not pretend that you don't need a smart compiler to get good vector performance. In many cases, you still need raw SIMD intrinsics to do a lot of stuff the compiler isn't smart enough for.
Vectorization and broadcasting are two of the biggest reasons NumPy feels so powerful. With vectorization, NumPy can operate on entire arrays at once. And broadcasting allows NumPy to work with arrays of different shapes by automatically expanding dimensions when possible. #LearningInPublic
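A small sketch of both ideas, using made-up temperature readings (the array values and variable names are illustrative only):

```python
import numpy as np

# Hypothetical readings: 2 days x 3 sensors, in Celsius.
temps_c = np.array([[20.0, 22.0, 19.0],
                    [25.0, 27.0, 24.0]])   # shape (2, 3)

# Vectorization: one expression converts every element, no Python loop.
temps_f = temps_c * 9 / 5 + 32

# Broadcasting: the (3,) vector of per-sensor means is stretched across
# both rows of the (2, 3) array automatically.
sensor_means = temps_c.mean(axis=0)        # shape (3,)
centered = temps_c - sensor_means          # shape (2, 3)

print(temps_f)
print(centered)
```

Broadcasting compares shapes from the trailing dimension backwards; dimensions are compatible when they are equal or one of them is 1.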
Day 8 of My AI & Robotics Challenge
The model from the day before wasn't generalizing well: accuracy was at 0.59 (underfitting). The reasons were simple:
1. The dataset was quite small, around 337 patients.
2. The features were mainly 0 or 1, so the age column was dragging the model into unstable terrain. In fact, this is why I need number 4.
3. The features were not engineered. Engineering them took the column count to 21, using polynomials (w1X0**2) and pairwise interactions (w1X0X1).
4. Gradient descent needed regularization so that parameters like the weights (w) are kept small, because they get large. The bias (b) is not penalized: it doesn't interact with any of the features, so shrinking it unfairly skews generalization.
I was careful with the feature engineering so we don't overfit, and the pairwise interactions had to be meaningful.
feature_names = [ "age", "sex", "fever", "cold", "rigor", "fatigue", "headache", "bitter_tongue", "vomitting", "diarrhea", "convulsion", "anemia", "jaundice", "cocacola_urine", "hypoglycemia", "prostration" ]
As a refresher: I built a malaria severity classifier from scratch in pure Python/NumPy. Here is what I learned fixing a 59% accuracy model 🧵
The model had 21 features (an extra 5) but only 337 patients. Without regularization, it memorized the training data instead of learning patterns.
Fix: L2 regularization adds a penalty for large weights, forcing the model to stay simple and generalize. Fixed with these two lines:
• Cost: (λ/2m) · Σw²
• Gradient: (λ/m) · w
Feature engineering unlocked nonlinear patterns a linear model normally wouldn't see. Added:
• age², fever² polynomial terms
• fever×rigor, fever×fatigue, anemia×jaundice interaction terms
Logistic regression is linear, but in a higher-dimensional space it can approximate curves.
Feature scaling was silently killing accuracy. age² could be 2500, while fever is 0 or 1. Gradient descent spends all its time fighting that scale mismatch.
Fix: z-score normalization: subtract the mean, divide by the std. Nearly every value then lands between about -3 and 3.
Swapped Python loops for NumPy vectorization. Previously: nested for loops, one multiplication at a time. Later on: X @ w + b (np.dot), one line that runs in C and operates on all patients simultaneously, and you could see that it was really fast. Same math. 10x faster.
Video1: shows 100K iterations and the slow gradient descent from the 90K mark
Video2: shows the same but much faster, from 1
Image3: shows the new accuracy, 69.14%
NumPy's precompiled C code is the goat. #MachineLearning #Python #AIChallenge #BuildInPublic #ICRA
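The fixes described in this post (engineered features, z-score scaling, L2 regularization that skips the bias, and a vectorized X @ w + b) can be sketched roughly as below. This is not the author's code: the synthetic data, function names, learning rate, and λ are all placeholders standing in for the real 337-patient dataset.

```python
import numpy as np

def zscore(X):
    """Scale each column: subtract its mean, divide by its std."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0            # guard against constant columns
    return (X - mu) / sigma

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(X, y, w, b, lam):
    """Logistic cost and gradients with an L2 penalty on w only (not on b)."""
    m = X.shape[0]
    p = sigmoid(X @ w + b)             # all patients at once, no Python loop
    eps = 1e-12                        # avoids log(0)
    cost = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    cost += (lam / (2 * m)) * np.sum(w ** 2)      # (λ/2m) · Σw²
    dw = X.T @ (p - y) / m + (lam / m) * w        # adds (λ/m) · w
    db = np.mean(p - y)                           # bias is not regularized
    return cost, dw, db

# Synthetic stand-in data (made up): age plus one binary symptom.
rng = np.random.default_rng(0)
age = rng.integers(1, 80, size=200).astype(float)
fever = rng.integers(0, 2, size=200).astype(float)
y = ((fever == 1) & (age > 30)).astype(float)

# Engineered features: a polynomial term and an interaction term.
X = zscore(np.column_stack([age, fever, age ** 2, age * fever]))

w, b = np.zeros(X.shape[1]), 0.0
first_cost, _, _ = cost_and_grads(X, y, w, b, lam=1.0)
for _ in range(500):
    cost, dw, db = cost_and_grads(X, y, w, b, lam=1.0)
    w -= 0.1 * dw
    b -= 0.1 * db

print(f"cost: {first_cost:.4f} -> {cost:.4f}")
```

Scaling after feature engineering matters here: a raw age² column would otherwise dominate the gradient, which is the scale mismatch the post describes.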
Day 7 of My AI & Robotics Challenge
So recently I came down with the flu. I had a lot of mosquito neighbours and some symptoms, so an idea came to me. Initially I was looking for datasets on Ebola, but then switched focus due to the situation and started hunting for a malaria dataset. Luckily for me, I found one at sciencedirect.com. It covers 337 patients who were attended to at Federal Polytechnic Ilaro Medical Centre, Ogun State, Nigeria. I plugged it into my logistic regression model and bam! Here are the results.
Using PCA, I squashed the 16 features feature_names = [ "age", "sex", "fever", "cold", "rigor", "fatigue", "headache", "bitter_tongue", "vomitting", "diarrhea", "convulsion", "anemia", "jaundice", "cocacola_urine", "hypoglycemia", "prostration" ] into two, to get the scatter plot in the picture below. There is also a histogram that shows the features gradient descent deemed important enough to emphasize.
After 100,000 iterations and a learning rate of 1e-2 = 0.01, I was able to get these weights and bias:
W => [ 0.04047652 0.10717873 -0.07651753 0.41998187 0.20428735 0.35223481 0.98503464 -0.23410595 0.03787681 0.60841417 -0.52318886 -0.04022689 0.14792391 0.46173496 0.94782756 -0.75417869]
b => -3.174992324429669
The accuracy is around 194/337 = 57.57%, which is not that good, probably underfitting.
Now it's time to test whether I actually have malaria. According to the features above, this is my own data: [26, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]. Let's test and see if I truly have malaria. Result: [0], I don't have malaria😅.
dataset: sciencedirect.com/science/ar…
github (model in the logistic regression folder, plus dataset): github.com/NorVirae/ml-class…
#MachineLearning #Python #AIChallenge #BuildInPublic #ICRA
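The self-test at the end of this post can be retraced with the published weights and bias. The sketch below assumes the model takes the raw, unscaled features and uses a 0.5 decision threshold; neither detail is stated in the post.

```python
import numpy as np

# Weights and bias exactly as published in the post.
w = np.array([0.04047652, 0.10717873, -0.07651753, 0.41998187,
              0.20428735, 0.35223481, 0.98503464, -0.23410595,
              0.03787681, 0.60841417, -0.52318886, -0.04022689,
              0.14792391, 0.46173496, 0.94782756, -0.75417869])
b = -3.174992324429669

# The author's own feature vector, in feature_names order.
x = np.array([26, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)

z = x @ w + b                      # linear score
p = 1.0 / (1.0 + np.exp(-z))       # sigmoid -> estimated probability
pred = int(p >= 0.5)               # assumed threshold of 0.5

print(f"z = {z:.4f}, p = {p:.4f}, prediction = {pred}")
```

Under those assumptions the score is about -0.33 and the probability about 0.42, so the predicted class is 0, which agrees with the [0] reported in the post.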
Omotunde Lawal retweeted
In this article, we will cover three essential NumPy tricks to optimize your code: vectorization and broadcasting, in-place operations, and leveraging memory views instead of copies. kdnuggets.com/3-numpy-tricks…
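As a rough illustration of the in-place trick mentioned here (not taken from the linked article), the sketch below updates an array without allocating a new buffer. The array size and values are arbitrary.

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
addr_before = a.__array_interface__["data"][0]   # address of the data buffer

a *= 2.0                  # augmented assignment works in place
np.add(a, 1.0, out=a)     # ufuncs can write into an existing array via out=

addr_after = a.__array_interface__["data"][0]

b = a * 2.0               # a plain expression allocates a new array

print(addr_before == addr_after)   # same buffer was reused
print(np.shares_memory(a, b))      # b lives in separate memory
```

In-place updates avoid temporary arrays, which mostly pays off on large arrays; note that they also overwrite the original values.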
James Preston retweeted
Accelerating NeurASP with vectorization and caching Alexander Philipp Rader, Alessandra Russo arxiv.org/abs/2606.10787 [𝚌𝚜.𝙰𝙸 𝚌𝚜.𝙻𝙾]
Vectorization is the deliberate act of translating the chaotic freedom of a raw sketch into a disciplined language of clean curves and geometric precision, without losing the spontaneous rhythm of the first draft. Check superimposition below 🫶🏾
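Recursive @Recursive_SI, the startup team of Yuandong Tian @tydsh, has released an interim result: an automated AI research system.
In this system, the AI can carry out the whole research loop on its own: propose an idea → implement it → run experiments → validate → pick the next experiment based on the results.
The results show that on AI training and systems engineering tasks with clear goals, fast feedback, and quantifiable metrics, an automated research system can already produce incremental optimizations that beat the human community's existing solutions.
The article has three main cases, all building on earlier work by AK @karpathy:
First, NanoChat Autoresearch. This is Karpathy's test scenario for automated research: a single GPU and a 5-minute budget to train a small language model to a lower validation loss, with BPB as the metric. Recursive started from the same initial solution, searched on H100 first, then moved to B200 for evaluation.
The result: the previous best solution from the autoresearch@home community, after removing some small reward hacks, averages 0.9372 BPB over 10 random seeds. The solution found by the Recursive system reaches 0.9109 BPB, an improvement of 0.0263 BPB.
More interesting still, starting from a weaker vanilla Transformer AdamW baseline, it also gets from 1.059 BPB to 0.9344 BPB, beating the community's best solution.
The article is careful to say that this does not amount to fully "independent discovery," because the underlying model may already know the publicly available tricks. But it at least shows that the system can assemble various training techniques into an effective stack.
The optimizations it found are not a single trick but a pile of things stacked together: architecture, short-context memory, auxiliary loss, attention, optimizer, weight decay schedule, compiler settings, and so on.
One of the biggest highlights is the short-context memory mechanism: a hashed bigram/trigram embedding table is mixed into the attention value path through a gate, letting a small model exploit local n-gram information cheaply. This ties in with the hash table ideas in DeepSeek Engram and NanoGPT Speedrun.
Second, NanoGPT Speedrun. This one is more interesting, because the human community has already been optimizing it for more than two years. The task is to train a small GPT model on FineWeb to a fixed validation loss of 3.28 using a single 8-GPU HGX H100 node, and see who is fastest. The human community has already cut training time from about 45 minutes in mid-2024 to 79.7 seconds.
Recursive kept optimizing from the current leading solution and cut the time to 77.5 seconds while still meeting the leaderboard's significance requirement. It looks like a saving of only 2.2 seconds, but getting any gain on a task humans have optimized for this long really is squeezing out the last drop.
It also started from an early solution of about 15 minutes and got to about 185 seconds within a few days, close to the human leaderboard's level of about 180 seconds in May 2025.
Likewise, this may not be entirely independent discovery, but it shows that an automated research system can reproduce and combine many human engineering optimizations.
Third, SOL-ExecBench. This one moves down from model training to GPU kernel optimization. The benchmark contains 235 kernel-writing tasks derived from real workloads, such as matrix multiplication, reduction, normalization, attention components, quantization, and fused blocks. The goal is to write correct and faster kernels on B200 GPUs.
Recursive ran the 235 kernels jointly so that the system could reuse patterns across related tasks, such as memory movement, tiling, reduction, vectorization, and fusion.
The result is an average NVIDIA SOL-ExecBench score of 0.754, against a previous leaderboard best of 0.699. In other words, it reduced the gap to the hardware's theoretical limit by 18%.
But the model's reward hacking is especially severe here. Some candidate solutions did not write faster kernels but exploited evaluator loopholes instead, such as caching outputs, relying on persistent state, and exploiting gaps in the timing harness. So the authors stress: as search systems get stronger, evaluators must get stronger too.
Because once the objective function in automated research is not written strictly enough, the machine will very earnestly help you cheat, like an intern with no sense of ethics who is very good at gaming KPIs.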
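The headline numbers in this post can be checked with a few lines of arithmetic. The 18% figure works out if a score of 1.0 is taken to mean the hardware's theoretical limit; that reading of the SOL-ExecBench score is an assumption here, not something the post spells out.

```python
# SOL-ExecBench: share of the remaining gap to the limit that was closed,
# assuming a score of 1.0 corresponds to the hardware's theoretical limit.
prev_best = 0.699
new_score = 0.754
gap_before = 1.0 - prev_best
gap_after = 1.0 - new_score
gap_reduction = (gap_before - gap_after) / gap_before

# NanoChat: BPB improvement over the community's best solution.
bpb_gain = 0.9372 - 0.9109

# NanoGPT Speedrun: seconds saved over the previous record.
seconds_saved = 79.7 - 77.5

print(f"gap reduction: {gap_reduction:.1%}")
print(f"BPB gain: {bpb_gain:.4f}")
print(f"seconds saved: {seconds_saved:.1f}")
```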
When BARC asked data leaders which tasks they perform to prepare unstructured data for AI, classification ranked first at 60% and vectorization last at 17%. This makes sense. After all, you need the map before the voyage. Dig into more of their findings from Harnessing Unstructured Data for AI Innovation: bit.ly/4vkYURs
Finished NumPy today. Topics covered: 1. NumPy arrays and why they are preferred over Python lists 2. Indexing and Slicing 3. Fancy and Boolean Indexing 4. Vectorization and Broadcasting 5. Axes and operations => Biggest takeaway: NumPy slicing returns a view, not a copy. Unlike with Python lists, modifying a view also modifies the original data. Always make a copy before changing anything in the data, or you'll end up with some unexpected changes. Some topics still challenged me, so I'll be sharing those in separate posts as I work through them and deepen my understanding. I think explaining what I learn is one of the best ways to truly understand it. Here are some code snippets. #Python #NumPy #DataScience #LearningInPublic
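The snippets attached to the post are not reproduced in this feed, so here is a separate minimal sketch of the view-versus-copy takeaway. Variable names and values are illustrative.

```python
import numpy as np

data = np.arange(6)            # [0 1 2 3 4 5]

view = data[1:4]               # basic slicing returns a view
view[0] = 99                   # writes through to data[1]

safe = data[1:4].copy()        # explicit copy owns its own memory
safe[1] = -1                   # data is left alone

picked = data[[0, 2]]          # fancy indexing returns a copy
picked[0] = 123                # data[0] is left alone

# Python lists behave differently: a slice is always a new list.
items = [0, 1, 2, 3]
part = items[1:3]
part[0] = 99                   # items is left alone

print(data)
print(np.shares_memory(data, view), np.shares_memory(data, picked))
print(items)
```

`np.shares_memory` is a handy way to check whether two arrays overlap in memory when it is unclear whether an operation returned a view.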
What you work on and who you work with are extremely important "You can save a lot of time by picking the right area to work in. Picking the right people to work with is the next most important piece. Third comes how hard you work. They are like three legs of a stool. If you shortchange any one of them, the whole stool is going to fall. You can’t easily pick one over the other." I started @SkawrSearch with my college friend Saleh, and now we have a great team (10 talented people). We originally wanted to build a simple marketplace app, and we thought that a better UX would help us succeed. Search was identified as the main area of improvement. Initially, we used Algolia in our app, and the results were promising, they had very cool features like synonyms and typo tolerance. We started by trying to add layers on top of Algolia and it became really fun and addictive to experiment with different ways of doing search. At some point, it became difficult to balance between search and all the other features. The team was not big then, so ultimately we had to make a choice on what to make the biggest priority. There are a lot of moving parts to building a good marketplace app, you don't have to just do search well, you need the messaging to be good, the bidding and offers, escrow and disputes, even logistics. I built a heavy operational business back in 2014 (Jybly) and we had issues with scaling despite having very loyal customers. We knew that whatever we did we wanted to go big with it, and anything less than global was going to be small for us. So we thought about it for a while and ultimately made a huge decision. We decided to make marketplaces better by building the search engine for all marketplaces. This way, we make the experience better for all users not just those who use one single marketplace. 
So we built skawr.com, a search engine where users can search for products for free, we also did not want to take a commission from any transactions that would result from the search. There are many reasons for this but the main reason is we don't want to be a partner for a few businesses we want to go really big with this and ultimately have all the products in the world on one search engine and commissions would affect this experience. We would be incentivized to push users to buy even when not ready and the experience would be more annoying. We have already eliminated all the people who just want to search for info and recipes and all of that by just focusing on products, so no need to make anything worse by pushing people to buy prematurely since they are all potential buyers. If we do our job well, which is to match every user with the products that they want, they will ultimately buy when the time is right, but this process is different for everyone, and we want everyone to have an enjoyable experience. To do that, we know that we need to invest more in customizing the experience for each individual, especially after the initial query, this is where the discrepancy between users really shows up. Requerying and the system understanding what results you want to see next based on how you interact with the first few results shown is both an art and a science, the best at this are the biggest social media platforms like Instagram, TikTok, and of course, my favourite X. We need to invest a lot in the intersection between math, technology and psychology to deliver a great experience here. We are not at that stage yet, but we are really excited about the improvements we are introducing soon. We are confident that we will reach the top. A big part of this confidence comes from the team that we have and the spirit but we are greedy and we want to recruit more talented people, especially engineers. 
But not just any engineers, we want engineers who are naturally curious and know about everything, sales, marketing, product, UX, psychology, mathematics... etc. If this fits you, reach out. Recently, we were discussing internally whether to use pricing as part of the embeddings in the vectorization process. I am personally excited about the potential of this but our lead data scientist had a very interesting approach that is a lot more feasible. So for the foreseeable future, we will not be going with this option, but I will keep thinking about better ways of doing it. If these kinds of challenges excite you then I think maybe we can be a good fit for you. Also, if you are an e-commerce business and want to use our technology locally, not just have your products featured on our search engine, then that option is available right now. But why not do both? You have people finding out about your products organically on skawr.com, and then you can also provide your users with a better search experience by using our SaaS technology for e-commerce at your own store. We are also developing our web analytics tool so that it can work synergistically to help you improve the experience, identify gaps, and grow. More products and services will come soon. We see search and analytics as part of a bigger effort to improve CRO but not in a very tactical way. Me and @abdallah_963 will share our own framework for strategic CRO soon, it will also involve areas like branding and pricing as part of the pillars not just experiments with traffic and UI. This is just the beginning and more to come inshallah.
Apache Spark execution is bottlenecked by JVM execution overhead and GC pauses. Google Lightning Engine compiles Spark physical query plans into native C instructions optimized for SIMD vectorization -> 4.9x cloud.google.com/blog/produc…