Java DataFrame library 1.0 GA release
DFLib
DFLib ("DataFrame Library") is a lightweight pure Java implementation of a common DataFrame
data structure.
DataFrames exist in Python (pandas), R, Spark and other languages and frameworks. DFLib's DataFrame is specifically
intended for Java and JVM languages.
With the DataFrame API, you get essentially the same data manipulation capabilities you may be used to in SQL (such as
joins, etc.), only you apply them in-memory and over dynamically defined "table" objects. While SQL is "declarative",
DataFrame allows step-by-step transformations that are somewhat easier to understand and much easier to compose.
DataFrame is extremely versatile and can be used to model a variety of data tasks. ETL, log analysis, and spreadsheet
processing are just some examples. DFLib comes with connectors for many data formats:
CSV, Excel, RDBMS, Avro, Parquet, and JSON, and can be easily adapted to other formats (e.g. web-based ones like
Google Sheets).
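To make the idea of a step-by-step, in-memory join over "table" objects concrete, here is a minimal sketch in plain Java. It deliberately does not use DFLib's API: the class `JoinSketch` and its methods `row`, `innerJoin` and `example` are hypothetical names invented for this illustration, and a "table" is modeled simply as a list of column-name-to-value maps. It only shows the kind of operation a DataFrame library packages behind a much more convenient API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain java.util collections, NOT the DFLib API.
public class JoinSketch {

    // Builds one row from alternating column names and values.
    public static Map<String, Object> row(Object... namesAndValues) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            r.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return r;
    }

    // Inner join on a shared column: index the right table by key,
    // then probe that index with every row of the left table.
    public static List<Map<String, Object>> innerJoin(
            List<Map<String, Object>> left,
            List<Map<String, Object>> right,
            String key) {

        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> r : right) {
            index.computeIfAbsent(r.get(key), k -> new ArrayList<>()).add(r);
        }

        List<Map<String, Object>> joined = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : index.getOrDefault(l.get(key), List.of())) {
                Map<String, Object> merged = new LinkedHashMap<>(l);
                merged.putAll(r); // the shared key column keeps its original position
                joined.add(merged);
            }
        }
        return joined;
    }

    // Two transformations composed one after another: join, then filter.
    public static List<Map<String, Object>> example() {
        List<Map<String, Object>> people = List.of(
                row("id", 1, "name", "Ann"),
                row("id", 2, "name", "Bob"),
                row("id", 3, "name", "Cid"));
        List<Map<String, Object>> salaries = List.of(
                row("id", 1, "salary", 120),
                row("id", 3, "salary", 90));

        // Step 1: join the two tables on "id" (Bob has no salary row and drops out).
        List<Map<String, Object>> joined = innerJoin(people, salaries, "id");

        // Step 2: keep only the rows with salary >= 100.
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> r : joined) {
            if ((Integer) r.get("salary") >= 100) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [{id=1, name=Ann, salary=120}]
    }
}
```

Each step produces a new table that can be inspected or passed to the next step, which is the composability the paragraph above contrasts with a single declarative SQL statement.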
DFLib provides integration with Apache ECharts to visualize DataFrame data. Charts are generated in the form of HTML/JavaScript code and work in Jupyter as well as in regular web applications.
While DFLib works in any Java application, it has a special integration with Jupyter Notebook, a browser-based interactive environment for data exploration and analysis that is popular among data scientists and data engineers. In fact, our community maintains a Java "kernel" for Jupyter as a sister project to DFLib.
Project Links
Presentation Videos
- DataFrame, a Swiss Army Knife of Java Data Processing, JUG Milano, Italy 09/2024
- Data visualization with Apache ECharts and DFLib, New York Java SIG, 07/2024