
Text data browser for NLP and LLM researchers and developers

Published at
2 days ago


A lightweight app that makes browsing and analyzing text data a breeze.

Key Features

🔍 Intuitive Navigation: Effortlessly browse local (or remote) data in HuggingFace, JSONL, and other formats.
⚡ Efficient Browsing: Stream large local (or remote) datasets without loading them into memory (or downloading them in full).
🚀 Powerful Analysis: Easily filter and sort data for better insights.
💻 Pretty-Print Code: Human-friendly visualization of code embedded in your data.

Experience seamless data browsing and analysis with Datahawk 🦅!
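To make the streaming and filtering ideas above concrete, here is a minimal Python sketch of reading JSONL lazily, one record at a time, and stopping as soon as enough matches are found. It is not Datahawk's implementation; the function names `stream_jsonl` and `take_matching` and the sample records are hypothetical and exist only for illustration.

```python
import io
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def stream_jsonl(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one parsed record per non-empty line, never holding the whole file."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def take_matching(
    records: Iterable[Record],
    predicate: Callable[[Record], bool],
    limit: int,
) -> List[Record]:
    """Collect at most `limit` records that satisfy `predicate`, then stop reading."""
    matches: List[Record] = []
    for record in records:
        if predicate(record):
            matches.append(record)
            if len(matches) >= limit:
                break
    return matches


# Hypothetical in-memory sample standing in for a large file on disk.
sample = io.StringIO(
    '{"id": 1, "text": "short"}\n'
    '{"id": 2, "text": "a considerably longer example"}\n'
    "\n"
    '{"id": 3, "text": "another long piece of text"}\n'
)

# Filter: keep records whose text is longer than 10 characters.
long_rows = take_matching(stream_jsonl(sample), lambda r: len(r["text"]) > 10, limit=2)
# Sort: longest text first (only the small filtered page is sorted in memory).
long_rows.sort(key=lambda r: len(r["text"]), reverse=True)
print([r["id"] for r in long_rows])
```

With a real file, passing an open file handle (for example `open(path, encoding="utf-8")`) in place of `sample` would work the same way, since file objects also iterate line by line.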

Alternatives include: Lilac, HuggingFace Dataset Viewer.

Instructions

Install

Installation requires Python 3.8 or later.

pip install datahawk

Run

Launch the app from anywhere as:

datahawk

This will start the application at localhost:5009.

Specify a custom port number as:

datahawk -p PORT

This will start the application at localhost:PORT.
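If the default port is already in use, one option is to ask the operating system for a free port and pass it with `-p`. The sketch below is only an illustration: the helper names `find_free_port` and `datahawk_command` are hypothetical and not part of Datahawk, and the only Datahawk-specific detail it relies on is the `datahawk -p PORT` form shown above.

```python
import socket
from typing import List


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def datahawk_command(port: int) -> List[str]:
    """Build the documented `datahawk -p PORT` invocation as an argument list."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return ["datahawk", "-p", str(port)]


port = find_free_port()
command = datahawk_command(port)
print(" ".join(command))

# To actually launch the app (requires datahawk to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Note that the port is released before the app starts, so another process could in principle claim it in between; for local use this is rarely a problem.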

Usage

Usage is quite intuitive! You can find on-screen instructions by hovering over the information icons ℹ️.

License

Datahawk is released under the MIT license; see the LICENSE file.

Acknowledgements