Replying to @stloyd @flowphp
You know where bug reports belong ;) On GitHub issues.
3
200
What if I said that you can (with some tricks & hacks) even read PDFs in pure #PHP, transform them into @flowphp data frame for further usage (csv, excel, database upsert, ...)?
4
629
Hmm maybe we should integrate FlowPhp with krakjoe.github.io/ort/ 🤔 That would bring ML capabilities to php ETL pipelines

2
388
The #flowphp #symfony Http Foundation Bridge has received a DX improvement, making it much easier to stream directly to the client. Streaming benefits include reduced memory usage and no need for temporary files. It also works with #laravel out of the box.
1
12
532
In December, I led a beginner's data processing workshop in #PHP at a charity event hosted by our local software architects and developers community. Participants quickly spotted room for improvement in the #flowphp documentation, and we've made it our mission to enhance it.
1
1
3
694
This is what we need at @flowphp documentation to run live examples in the browser with ACE editor and autocompletion included!
This is PHP running in JS through php-wasm. 😮 I might have a use case for this; thanks for suggesting it. @marcelpociot
4
315
Listening to @norbert_tech talking about Parquet and Dremel, which he researched and implemented in his #FlowPHP library. Really interesting stuff, many new things learned 😁. #PHPConPoland2024
1
9
580
What's the easiest way to convert one file format into another with #PHP? The answer is the #FlowPHP Command Line Interface.
The new Flow CLI package helps to deal with local/remote files directly through the terminal:
- flow schema - reads the file schema
- flow count - reads the number of rows
- flow run - runs a processing pipeline defined in a PHP file
- flow convert - converts the input into the output
Thanks to the underlying Flow Filesystem Abstraction, we can mount external storage and execute all of the above commands against remote locations. More details at: github.com/flow-php/flow/blo…
1
1
9
555
I have recently been working on adding a command line interface to #FlowPHP. A proper one, that would let us:
- browse files
- read schemas
- convert files
- execute transformation pipelines
My ultimate goal is to implement a custom SQL parser for Flow that would allow us to quickly run an SQL query against local/remote files. Something like:
$ flow query "SELECT * FROM parquet_file('/path/to/file.parquet') as d WHERE d.active = true" --output-file=./path/to/file.csv
This would filter the Parquet file on the fly and write the output to a CSV file.
7
347
I think we've all been asked at least once by a business user to add an "export" feature to a system. It's a relatively easy task, assuming the input and output formats are the same, and the data to export is pre-generated. But what happens when the business requests:
- The ability to save in different formats
- Dynamic column selection
- Dynamic filter options
- Masking certain columns based on user access level
Good news: lnkd.in/ej6kv4U5 for #flowphp allows data to be streamed from Symfony and likely Laravel applications with zero effort.
- Minimal memory consumption
- Stream data from remote filesystems
- Apply dynamic transformations on the fly
- Out-of-the-box support for column masking
2
13
583
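The export requirements listed in the post above (dynamic column selection, masking, low memory use) can be illustrated with a minimal sketch in plain PHP. This is not the Flow bridge or its API: `stream_csv` and its parameters are hypothetical names made up for this illustration, and the only idea it shows is that rows are written one at a time from an iterable, so the full data set never has to sit in memory.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Flow): writes rows to an open stream as CSV,
 * keeping only the selected columns and masking the restricted ones.
 *
 * @param iterable<array<string, mixed>> $rows    rows produced lazily, e.g. by a generator
 * @param list<string>                   $columns columns to export, in output order
 * @param list<string>                   $masked  columns whose values must be hidden
 * @param resource                       $out     any writable stream
 */
function stream_csv(iterable $rows, array $columns, array $masked, $out): void
{
    // Header row; delimiter, enclosure and escape are passed explicitly.
    fputcsv($out, $columns, ',', '"', '');

    foreach ($rows as $row) {
        $line = [];
        foreach ($columns as $column) {
            $line[] = \in_array($column, $masked, true)
                ? '***'                    // masked for this user
                : ($row[$column] ?? '');   // missing columns become empty cells
        }
        // Each row is written immediately instead of being collected first.
        fputcsv($out, $line, ',', '"', '');
    }
}
```

In a web context the same function could be pointed at a stream opened on `php://output`, so the response is sent while rows are still being read.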
#flowphp can now not only read from associative arrays but also write to them. Super helpful for testing and debugging. Find more in the documentation: flow-php.com/documentation/d…
2
175
I think everyone has experienced outdated documentation at least once. It's hard to keep docs fully aligned with the codebase, especially in open-source projects where people contribute of their own free will, in their free and precious time.
#flowphp takes a slightly different approach than conventional static, handwritten docs. Flow is developed with the Live Documentation approach. What does it mean? It means that the Flow PHP codebase is decorated with dedicated attributes.
This is how we assign a DSL function to a module and type:
#[DocumentationDSL(module: Module::CORE, type: DSLType::EXTRACTOR)]
And this is how we link it with an example (one or many):
#[DocumentationExample(topic: 'data_source', example: 'array')]
All examples in our codebase are fully executable code that is executed on the CI/CD. This way we make sure that:
- documentation stays up to date with the codebase
- examples stay up to date, can be copied and pasted, and must work as expected
- we create a programmatically accessible database of all our functions/examples/modules, which can be used later to generate, for example, a sitemap to help search engines index our content
1
10
424
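The attribute-based linking described in the Live Documentation post above relies on PHP 8 attributes, which can be read back through reflection. The sketch below shows that mechanism only. `DocExample`, `from_array_sketch` and `collect_examples` are hypothetical stand-ins invented for this illustration; Flow's real `DocumentationExample` attribute and its tooling may be defined quite differently.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a documentation attribute (not Flow's actual class).
#[Attribute(Attribute::TARGET_FUNCTION | Attribute::IS_REPEATABLE)]
final class DocExample
{
    public function __construct(
        public readonly string $topic,
        public readonly string $example,
    ) {
    }
}

// A hypothetical DSL function linked to one example through the attribute.
#[DocExample(topic: 'data_source', example: 'array')]
function from_array_sketch(array $rows): array
{
    return $rows;
}

/**
 * Reads every DocExample attached to a function and returns "topic/example" paths.
 *
 * @return list<string>
 */
function collect_examples(string $function): array
{
    $paths = [];
    $reflection = new ReflectionFunction($function);

    foreach ($reflection->getAttributes(DocExample::class) as $attribute) {
        $instance = $attribute->newInstance(); // builds the DocExample object
        $paths[] = $instance->topic . '/' . $instance->example;
    }

    return $paths;
}
```

A documentation generator could walk all DSL functions this way and build the kind of function/example index the post mentions.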
The @PHPConPoland agenda is ready! If you're interested in data processing, or you can see how your organization suffers from excessive fragmentation and a lack of discipline in how its data is organized, drop by my talk. I'll try to show what a fantastic file format Parquet is and what we can learn from it when implementing our own systems 😁 Because loading everything into a database isn't always the best option, and certainly not the cheapest 😅 I'll present everything in a language-agnostic way, so even if you don't use #php or #flowphp, you should still take away plenty of interesting information 😊 2024.phpcon.pl/pl/#agenda
1
5
268
Tracking memory leaks in #php: someone recently reported a potential memory leak in FlowPHP. By default, I looked at a @blackfireio profile, which wasn't very helpful. It showed that memory was leaking, but without the Blackfire extension it wasn't, so the profiler itself was pretty much generating the leaks, which I assume is related to how data is collected. Then I played with memprof, but I couldn't figure out anything from the output (mainly because that memory leak was a false report). Then github.com/BitOne/php-meminf… was recommended and I immediately loved it! It gave me precisely what I needed as a very pleasant CLI output.

3
3
25
2,237
A small but important improvement was just merged into FlowPHP landing page. All examples are going to display output (if there is one) 😁
1
11
521
Guys like @norbert_tech are pushing #PHP forward, fingers crossed for him and his #FlowPHP project 🤞.
Replying to @dr4goonis
Emulate? Lol no, I want to replace it 😅 Jokes aside, my goal is to bring a tool for data processing/analyzing to PHP. I spent the last few years designing/building data lakehouses with tools like Apache Spark (Scala) or Delta Lake from DataBricks. My biggest problem was that even the most straightforward aggregations/transformations required adding new technologies/tools to our tech stack, which came with an additional price: hire more developers who can use those tools, or invest in our teams and let them learn how to use them.
So yeah, instead of forcing devs to learn pandas or pyspark or forcing businesses to throw more money, I would like to give PHP teams the option to build data transformation pipelines in the language they know. I feel that PHP is a perfect combination of flexibility and strictness and I'm trying to benefit from it. It's something that can be used for both rapid and enterprise development.
Also, from my experience, at the end of the day, data accessibility is way more important than clean code, frameworks, and other irrelevant things devs like to argue about. However, due to a lack of data processing tools, PHP devs are not even considering designing data architecture. I want to change that. I'm tired of looking at massive SQL queries devs are using to generate charts, I'm tired of reading blog posts about how ORMs are killing performance while devs are using them to export/import data, I'm tired of looking at heavily abused file_get_contents - those are all use cases for ETLs
5
649
FlowPHP just got mentioned in PHP Annotated 🥳 Thank you @pronskiy for spreading the word about probably the best data processing framework (ETL) in #PHP!
PHP Annotated – October 2023 🎃 blog.jetbrains.com/phpstorm/…
4
16
2,365
I wasn't really following news about building UIs. Last week I was trying to set up a small landing page for FlowPHP, and after a few hours of playing with tools I got discouraged by the amount of magic and knowledge one needs to have in order to set up a simple skeleton. Then I came across the #NoBuild approach through @dhh and I immediately loved it 😍 CSS and JS used to be simple, let's make them simple once again 🤩
3
472
Snappy is a Google compression library used by the Parquet format to reduce output file size. It's very popular, especially when working with Apache Spark. So far, in order to read Parquet files compressed with Snappy in PHP, we had to install an extension, which is not always an easy task since it's not available on PECL. Not anymore: the FlowPHP Parquet library will now try to use the \snappy_compress and \snappy_uncompress functions from the extension, and if it doesn't find them, it will fall back to a pure PHP implementation. The PHP implementation is significantly slower, but it's still reasonable to use it in production where the extension can't be installed. Full support for Parquet files in #PHP is coming 🎉 You can find the library here: github.com/flow-php/snappy
1
18
3,011
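The "use the extension if present, otherwise fall back to pure PHP" behaviour described in the Snappy post above is a common pattern that can be sketched with `function_exists`. This is only an illustration of the pattern, not the flow-php/snappy code: `prefer_native` is a hypothetical helper, and the fallback closure is a placeholder that does not decode anything.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the named native function when it is available
 * (for example because an extension is loaded), otherwise the given fallback.
 */
function prefer_native(string $nativeFunction, callable $fallback): callable
{
    return \function_exists($nativeFunction) ? $nativeFunction : $fallback;
}

// Resolve the decoder once. The fallback here is a placeholder for this sketch;
// a real one would implement the Snappy format in PHP.
$uncompress = prefer_native(
    'snappy_uncompress',
    static function (string $data): string {
        throw new \RuntimeException('Pure PHP decoder is not implemented in this sketch.');
    }
);
```

Callers then use `$uncompress($bytes)` without caring which implementation was picked, which is what lets the slower pure PHP path act as a drop-in replacement where the extension cannot be installed.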