Last updated: March 5, 2026
pdfmux is an open-source Python library that runs entirely on your machine. It processes PDF files locally — no data is sent to any server unless you explicitly configure an optional cloud extractor.
pdfmux collects no data. The library does not collect, transmit, or store any personal data, usage data, or telemetry. Your PDFs stay on your machine.
This website is a static site hosted on GitHub Pages. We do not use cookies, analytics, or tracking scripts. GitHub may collect standard server logs (IP address, browser type) as part of hosting — see GitHub's privacy statement for details.
If you install pdfmux[llm] and configure a Gemini API key, pdfmux will send page images to Google's Gemini API for extraction. This is opt-in: nothing is sent unless you configure an API key yourself. Data handling for those requests is governed by Google's API terms.
When you install pdfmux via pip install pdfmux, the package is downloaded from PyPI (the Python Package Index), not from us. We do not have access to any data from pip installs.
pdfmux offers optional integrations with LangChain and LlamaIndex. These integrations run locally and do not transmit data on their own. Any data transmission depends on how you configure those frameworks in your application.
Questions about this policy? Open an issue on GitHub.