This is a CLI tool to efficiently download a local copy of the pwned password hash data from the very awesome HIBP pwned passwords api-endpoint using all the good bits; multiprocessing, async-processes, local-caching, content-etags and http2-connection pooling to probably make things as fast as is Pythonly possible.
query
for compromised password values from the compressed file data-store!download
operations into a --data-path
without re-clobbering api-source.pipx install hibp-downloader
Sample download activity log; host with 32 cores on 500Mbit/s connection.
...
2024-05-16T10:18:01-0400 | INFO | hibp-downloader | prefix=f80c7 source=[lc:13616 et:3 rc:1002358 ro:25 xx:1] processed=[17836.6MB ~414462H/s] api=[918req/s 17597.4MB] runtime=36.4min
2024-05-16T10:18:02-0400 | INFO | hibp-downloader | prefix=f81af source=[lc:13616 et:3 rc:1002558 ro:25 xx:1] processed=[17840.1MB ~414454H/s] api=[918req/s 17600.9MB] runtime=36.4min
2024-05-16T10:18:02-0400 | INFO | hibp-downloader | prefix=f826f source=[lc:13616 et:3 rc:1002758 ro:25 xx:1] processed=[17843.6MB ~414454H/s] api=[918req/s 17604.4MB] runtime=36.4min
2024-05-16T10:18:03-0400 | INFO | hibp-downloader | prefix=f833f source=[lc:13616 et:3 rc:1002958 ro:25 xx:1] processed=[17847.1MB ~414450H/s] api=[918req/s 17607.9MB] runtime=36.4min
api.pwnedpasswords.com
lc
: 13616 from local-cache (lc) - request-responses handled locally without hitting the network.et
: 3 etag-matched (et) - request-responses that confirmed our local data was up-to-date and did not require a new download.rc
: 1002958 from remote-cache (rc) - request-responses that were downloaded to local, but came from the remote-server cache.ro
: 25 from remote-origin (ro) - request-responses that were downloaded to local, and the download needed to be fetched from remote origin source.xx
: 1 failed responses - request-responses that failed (and successfully retried).All rights reserved.