csvsight

v0.1.0

Published 3 days ago

Eyeball a CSV in seconds: rows × columns, per-column type, null %, ranges, and top values. Zero dependencies, no pandas.

Vulnerabilities: 0 High · 0 Medium · 0 Low

yyfjj

csv csv-stats data profiling data-quality summary inspect eda cli devtools zero-dependency

csvsight

Eyeball a CSV in seconds. Someone drops a CSV export on you and you just want to know: how many rows, what columns, which ones are mostly empty, what's the range of that amount field? csvsight prints exactly that. Zero dependencies — no pandas, no REPL.

npx csvsight data.csv

data.csv — 10,432 rows × 5 columns  (comma, utf-8)

#  column   type    nulls         unique  detail
─  ───────  ──────  ────────────  ──────  ─────────────────────────────────────────
1  id       int     0 (0.0%)      10,432  min 1 · max 10432 · mean 5216.5
2  email    string  12 (0.1%)     10,411  e.g. "[email protected]" · len 9–48
3  amount   float   34 (0.3%)     2,015   min 0.01 · max 9999 · mean 42.3
4  status   string  0 (0.0%)      3       active (61%) · churned (28%) · trial (11%)
5  country  string  120 (1.1%)    47      US (40%) · GB (12%) · DE (7%)

Why

You don't need a DataFrame to answer "what's in this file?" — but the usual tools make you spin one up anyway:

pandas means pip install pandas, a Python session, and remembering the API for .describe() / .isna().sum() / .nunique().
csvkit is lovely but pulls in a handful of dependencies.
Excel chokes on big files and isn't in your terminal.

csvsight is one command on a CSV. It auto-detects the delimiter, infers each column's type, counts 10+ spellings of "missing" (NULL, N/A, nan, -, none, empty, …), and shows ranges for numbers and value distributions for categorical columns.
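As a rough illustration of the first and third steps, here is a minimal Python sketch of delimiter detection and missing-value counting. It is not csvsight's actual code. The `MISSING` set holds only the spellings listed above plus the empty string, and `detect_delimiter`, `is_missing`, and `count_missing` are hypothetical helpers. The detection heuristic (pick the candidate that occurs the same number of times on every sampled line) is an assumption, and it ignores quoted fields that contain a delimiter.

```python
import csv
import io

# Assumed token list: the spellings named in the README plus the empty string.
# csvsight's real list is longer and may differ.
MISSING = {"", "null", "n/a", "nan", "-", "none", "empty"}
CANDIDATES = [",", "\t", ";", "|"]


def detect_delimiter(text, sample_lines=20):
    """Pick the candidate that appears equally often on every sampled line."""
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_lines]
    best, best_count = ",", 0
    for cand in CANDIDATES:
        counts = {ln.count(cand) for ln in lines}
        # Keep a candidate only if its per-line count is consistent
        # and higher than the best one seen so far.
        if len(counts) == 1:
            count = counts.pop()
            if count > best_count:
                best, best_count = cand, count
    return best


def is_missing(value):
    """Case-insensitive match against the assumed missing-value spellings."""
    return value.strip().lower() in MISSING


def count_missing(text):
    """Return {column: number of missing cells} for a CSV with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=detect_delimiter(text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            # Short rows yield None for absent cells; treat those as empty.
            if is_missing(row[name] or ""):
                counts[name] += 1
    return counts


sample = "id;status;amount\n1;active;9.5\n2;N/A;\n3;none;NULL\n"
print(detect_delimiter(sample))  # ;
print(count_missing(sample))     # {'id': 0, 'status': 2, 'amount': 2}
```

Falling back to a comma when no candidate is consistent keeps the sketch predictable on single-column files. A real tool would more likely parse with each candidate and compare field counts.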

Usage

csvsight data.csv              # profile a file
cat data.csv | csvsight        # or read from stdin
csvsight data.tsv              # delimiter auto-detected (comma, tab, semicolon, or pipe)

| Option | Description |
|---|---|
| --delimiter <c> | force the field delimiter |
| --no-header | treat the first row as data (columns named col1, col2, …) |
| --top <n> | top N values for categorical columns (default 3) |
| --json | emit the analysis as JSON instead of the table |

What it reports per column

type — int / float / string (inferred from the non-null values)
nulls — count and percentage, recognizing many "missing" spellings
unique — distinct non-null values
detail — numbers get min · max · mean; low-cardinality columns get their value distribution; free-text columns get an example and length range
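The per-column fields above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not csvsight's implementation: `infer_type` and `profile_column` are hypothetical names, the regular expressions are one plausible definition of int and float, and the `max_categories=20` cutoff for "low-cardinality" is invented here because the README does not state the real threshold. The input is assumed to be the column's non-null cells as strings.

```python
import re
from collections import Counter

# Assumed definitions of "int" and "float"; the real tool may accept other forms.
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def infer_type(values):
    """Return 'int', 'float', or 'string' for a list of non-null strings."""
    if not values:
        return "string"
    if all(INT_RE.fullmatch(v) for v in values):
        return "int"
    if all(FLOAT_RE.fullmatch(v) for v in values):
        return "float"
    return "string"


def profile_column(values, top=3, max_categories=20):
    """Summarize one column; `values` holds the non-null cells as strings."""
    kind = infer_type(values)
    unique = len(set(values))
    if not values:
        detail = ""
    elif kind in ("int", "float"):
        nums = [float(v) for v in values]
        mean = sum(nums) / len(nums)
        detail = f"min {min(nums):g} · max {max(nums):g} · mean {mean:g}"
    elif unique <= max_categories:
        # Low cardinality: most common values with their share of non-null cells.
        shares = Counter(values).most_common(top)
        detail = " · ".join(f"{v} ({100 * n / len(values):.0f}%)" for v, n in shares)
    else:
        # Free text: one example plus the shortest and longest cell length.
        lengths = [len(v) for v in values]
        detail = f'e.g. "{values[0]}" · len {min(lengths)}–{max(lengths)}'
    return {"type": kind, "unique": unique, "detail": detail}


print(profile_column(["1", "2", "3", "4"]))
# {'type': 'int', 'unique': 4, 'detail': 'min 1 · max 4 · mean 2.5'}
print(profile_column(["active", "active", "trial", "churned"])["detail"])
# active (50%) · trial (25%) · churned (25%)
```

Checking int before float means a column of whole numbers is reported as `int`, and a single decimal value anywhere in the column promotes it to `float`.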

Install

npx csvsight data.csv       # Node >= 18
pip install csvsight        # Python >= 3.8 (byte-for-byte port)

npm: https://www.npmjs.com/package/csvsight
PyPI: https://pypi.org/project/csvsight/
GitHub: https://github.com/jjdoor/csvsight · csvsight-py

License

MIT
