Rumble (formerly Sparksoniq) 1.0.0 "Linden Oak" beta
June 3, 2019
Submitted by Ghislain Fourny.
We are happy to announce that Rumble, formerly known as Sparksoniq, has made it to beta, as we found it stable and usable enough.
Rumble is designed for large, heterogeneous, nested JSON datasets that do not fit into DataFrames. With JSONiq (XQuery's little brother) as its query language, it nevertheless provides the same high-level experience as Spark SQL, seamlessly mapping FLWOR expressions to Spark transformations.
It is open source and available for download from http://rumbledb.org/ as a small standalone 2 MB jar to be used with spark-submit on a cluster. It works just as well locally, spreading the workload across all your CPU cores.
- Billions of lines can now also be manipulated as sequences of strings with text-file() and FLWOR expressions. This makes it easy to convert, for example, CSV to your own JSON over a large number of records.
- More pushdowns to Spark.
- Richer function library.
- More bugs fixed (character escaping, serialization, etc.).
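
To give a rough idea of the CSV-to-JSON conversion mentioned in the first item above, here is a minimal sketch in plain Python. It is only an analogy, not Rumble's API or JSONiq syntax: it treats the input as a sequence of strings, one per line, and turns each line into a JSON object, which is roughly what a FLWOR expression over text-file() would express. The function name, the column names and the sample lines are all made up for illustration, and the naive comma split assumes no quoted fields.

```python
import json
from typing import Iterable, Iterator, Sequence


def csv_lines_to_json(lines: Iterable[str], columns: Sequence[str]) -> Iterator[str]:
    """Turn each CSV line into one serialized JSON object (hypothetical helper)."""
    for line in lines:
        # Naive split: assumes fields contain no quoted or embedded commas.
        fields = line.rstrip("\n").split(",")
        # Pair each column name with the field at the same position.
        yield json.dumps(dict(zip(columns, fields)))


# Made-up sample input standing in for the lines of a large text file.
sample_lines = ["1,Ada,London", "2,Linus,Helsinki"]
records = list(csv_lines_to_json(sample_lines, ["id", "name", "city"]))

for record in records:
    print(record)
```

Because the helper is a generator that handles one line at a time, the same per-line logic can in principle be applied independently to each record, which is the property that lets an engine like Spark distribute such a conversion.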