
gittech. site
for different kinds of informations and explorations.
egoist/sitefetch: Fetch an entire site and save it as a text file
sitefetch
Fetch an entire site and save it as a text file (to be used with AI models).
Install
One-off usage (choose one of the followings):
bunx sitefetch
npx sitefetch
pnpx sitefetch
Install globally (choose one of the followings):
bun i -g sitefetch
npm i -g sitefetch
pnpm i -g sitefetch
Usage
sitefetch https://egoist.dev -o site.txt
# or better concurrency
sitefetch https://egoist.dev -o site.txt --concurrency 10
Match specific pages
Use the -m, --match
flag to specify the pages you want to fetch:
sitefetch https://vite.dev -m "/blog/**" -m "/guide/**"
The match pattern is tested against the pathname of target pages, powered by micromatch, you can check out all the supported matching features.
Content selector
We use mozilla/readability to extract readable content from the web page, but on some pages it might return irrelevant contents, in this case you can specify a CSS selector so we know where to find the readable content:
sitefetch https://vite.dev --content-selector ".content"
Plug
If you like this, please check out my LLM chat app: https://chatwise.app
API
import { fetchSite } from "sitefetch"
await fetchSite("https://egoist.dev", {
//...options
})
Check out options in types.ts.
License
MIT.