Privacy Policy

Last updated: March 5, 2026

What pdfmux is

pdfmux is an open-source Python library that runs entirely on your machine. It processes PDF files locally, and no data is sent to any server unless you explicitly configure an optional cloud extractor.

Data we collect

None. The pdfmux library does not collect, transmit, or store any personal data, usage data, or telemetry. Your PDFs stay on your machine unless you enable an optional cloud extractor, as described under "Optional cloud extractors".

Website (pdfmux.com)

This website is a static site hosted on GitHub Pages. We do not use cookies, analytics, or tracking scripts. GitHub may collect standard server logs (IP address, browser type) as part of hosting; see GitHub's privacy statement for details.

Optional cloud extractors

If you install pdfmux[llm] and configure a Gemini API key, pdfmux will send page images to Google's Gemini API for extraction. This is opt-in: nothing is sent unless you configure an API key yourself. Google's API terms govern how the data in those requests is handled.

PyPI

When you install pdfmux by running pip install pdfmux, your request goes through PyPI. We do not have access to any data from pip installs.

Third-party integrations

pdfmux offers optional integrations with LangChain and LlamaIndex. These integrations run locally and do not transmit data on their own. Any data transmission depends on how you configure those frameworks in your application.

Contact

Questions about this policy? Open an issue on GitHub.