Pinterest Image Downloader (pinterest-dl)

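This library helps scrape and download images from Pinterest. Using Selenium for automation, it lets users extract images from a specified Pinterest URL and save them to a chosen directory.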

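It includes a CLI for direct use and a Python API for programmatic access. The tool supports scraping images from public and private boards and pins using browser cookies. It can also save the scraped URLs to a JSON file for future access.

Disclaimer:
This project is independent and is not affiliated with Pinterest. It is designed for educational purposes only. Be aware that automated scraping of a website may conflict with its terms of service. The repository owner accepts no liability for misuse of this tool. Use it responsibly and at your own legal risk.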

This project was inspired by pinterest-image-scraper.

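Features

✅ Scrape images directly from a Pinterest URL.
✅ Download images asynchronously from a list of URLs. (see pull request)
✅ Save scraped URLs to a JSON file for future access.
✅ Incognito mode keeps your scraping discreet.
✅ Access verbose output for effective debugging.
✅ Firefox browser support.
✅ Insert each image's alt text into the downloaded image as a metadata comment, making images easier to search.
✅ Scrape private boards and pins using browser cookies. (see pull request)
✅ Scrape images using the reverse-engineered Pinterest API. (This will be the default behavior. You can use a WebDriver instead by specifying --client chrome or --client firefox.) (see pull request)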

Known Issues

Incompatible with Pinterest URLs that contain search queries.
Not rigorously tested on Linux and Mac. Please create an issue to report any bugs.

Requirements

Python 3.10 or later
Chrome or Firefox browser

Installation

Using pip (recommended)

pip install pinterest-dl

Clone from GitHub

git clone https://github.com/sean1832/pinterest-dl.git
cd pinterest-dl
pip install .

CLI Usage

General command structure

pinterest-dl [command] [options]

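Examples

Scrape images in anonymous mode:

In anonymous mode (no login required), scrape images from the Pinterest URL https://www.pinterest.com/pin/1234567 into the ./images/art directory, limited to 30 images with a minimum resolution of 512x512. Save the scraped URLs to a JSON file.

pinterest-dl scrape "https://www.pinterest.com/pin/1234567" "images/art" -l 30 -r 512x512 --json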
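
If you want to drive the CLI from a script, the command above can be assembled as an argument list. The following is a minimal sketch: `build_scrape_command` is a hypothetical helper, not part of pinterest-dl, and it uses only the flags shown in the example above.

```python
import shlex


def build_scrape_command(url, output_dir, limit=None, resolution=None, save_json=False):
    """Assemble a `pinterest-dl scrape` argument list (hypothetical helper)."""
    cmd = ["pinterest-dl", "scrape", url, output_dir]
    if limit is not None:
        cmd += ["-l", str(limit)]
    if resolution is not None:
        width, height = resolution
        cmd += ["-r", f"{width}x{height}"]
    if save_json:
        cmd.append("--json")
    return cmd


command = build_scrape_command(
    "https://www.pinterest.com/pin/1234567",
    "images/art",
    limit=30,
    resolution=(512, 512),
    save_json=True,
)

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would then run the scrape without going through a shell, which avoids quoting problems in the URL or directory name.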

Get browser cookies:

Obtain the browser cookies for a Pinterest login and save them to the cookies.json file, running in headful mode (with a browser window).

pinterest-dl login -o cookies.json --headful

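Tip

You will be prompted for your Pinterest email and password. The tool saves the browser cookies to the specified file for future use.

Scrape a private board:

Scrape images from a private Pinterest board using the cookies stored in the cookies.json file.

pinterest-dl scrape "https://www.pinterest.com/pin/1234567" "images/art" -l 30 -c cookies.json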

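Tip

You can use the --client option to scrape with the chrome or firefox WebDriver. This is slower but more reliable. It opens the browser in headless mode to scrape images. You can also use the --headful flag to run the browser in windowed mode.

Download images:

Download images from the art.json file into the ./downloaded_imgs directory, with a minimum resolution of 1024x1024.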

pinterest-dl download art.json -o downloaded_imgs -r 1024x1024

Commands

1. Login

Log in to Pinterest with your credentials to obtain browser cookies for scraping private boards and pins.

Syntax:

pinterest-dl login [options]

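Options:

-o, --output [file]: File in which to save browser cookies for future use. (default: cookies.json)
--client: Select the scraping client (chrome / firefox). (default: chrome)
--headful: Run in headful mode, with a browser window.
--verbose: Enable verbose output for debugging.
--incognito: Enable incognito mode for scraping.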

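Tip

After you enter the login command, you will be prompted for your Pinterest email and password. The tool then saves the browser cookies to the specified file for future use. (If no file is specified, they are saved to ./cookies.json.)

2. Scrape

Extract images from a specified Pinterest URL.

Syntax: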

pinterest-dl scrape [url] [output_dir] [options]

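Options:

-c, --cookies [file]: File containing browser cookies for private boards/pins. Run the login command to obtain cookies.
-l, --limit [number]: Maximum number of images to download (default: 100).
-r, --resolution [width]x[height]: Minimum image resolution to download (e.g. 512x512).
--timeout [second]: Request timeout in seconds (default: 3).
--json: Save the scraped URLs to a JSON file.
--dry-run: Run the scrape without downloading images.
--verbose: Enable verbose output for debugging.
--client: Select the scraping client (api / chrome / firefox). (default: api)
--incognito: Enable incognito mode for scraping. (chrome / firefox only)
--headful: Run in headful mode, with a browser window. (chrome / firefox only)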
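
The -r option takes a [width]x[height] string, whereas the Python API's min_resolution argument takes a (width, height) tuple. A small conversion between the two might look like the sketch below. `parse_resolution` is a hypothetical helper written for illustration; it is not how pinterest-dl itself parses the flag.

```python
def parse_resolution(text):
    """Convert a "[width]x[height]" string into a (width, height) tuple."""
    width_str, sep, height_str = text.lower().partition("x")
    if not sep:
        raise ValueError(f"expected [width]x[height], got {text!r}")
    # int() raises ValueError if either side is empty or not a number.
    width, height = int(width_str), int(height_str)
    if width <= 0 or height <= 0:
        raise ValueError(f"resolution must be positive, got {text!r}")
    return width, height


min_resolution = parse_resolution("512x512")
print(min_resolution)
```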

3. Download

Download images from a list of URLs provided in a file.

Syntax:

pinterest-dl download [url_list] [options]

Options:

-o, --output [directory]: Output directory (default: ./<json_filename>).
-r, --resolution [width]x[height]: Minimum resolution to download (e.g. 512x512).
--verbose: Enable verbose output.
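
To make the default for -o concrete, here is a rough sketch of how a ./<json_filename> directory could be derived from the input path. It assumes that <json_filename> means the file name without its .json extension and that the directory is relative to the current working directory. Neither detail is confirmed above, and `default_output_dir` is a hypothetical name.

```python
from pathlib import Path


def default_output_dir(json_path):
    """Derive ./<json_filename> from a JSON path (assumed: extension dropped)."""
    return Path(".") / Path(json_path).stem


print(default_output_dir("art.json"))
```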

Python API

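You can also use the PinterestDL class directly in your Python code to scrape and download images programmatically.

1. Quick scrape and download

The following example shows how to scrape and download images from a Pinterest URL in a single step.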

from pinterest_dl import PinterestDL

# Initialize and run the Pinterest image downloader with specified settings
images = PinterestDL.with_api(
    timeout=3,  # Timeout in seconds for each request (default: 3)
    verbose=False,  # Enable detailed logging for debugging (default: False)
).scrape_and_download(
    url="https://www.pinterest.com/pin/1234567",  # Pinterest URL to scrape
    output_dir="images/art",  # Directory to save downloaded images
    limit=30,  # Max number of images to download
    min_resolution=(512, 512),  # Minimum resolution for images (width, height) (default: None)
    json_output="art.json",  # File to save URLs of scraped images (default: None)
    dry_run=False,  # If True, performs a scrape without downloading images (default: False)
    add_captions=True,  # Adds image `alt` text as metadata to images (default: False)
)

2. Scrape private boards with cookies

2a. Obtain cookies

You first need to log in to Pinterest to obtain browser cookies for scraping private boards and pins.

import os
import json

from pinterest_dl import PinterestDL

# Make sure you don't expose your password in the code.
email = input("Enter Pinterest email: ")
password = os.getenv("PINTEREST_PASSWORD")

# Initialize browser and login to Pinterest
cookies = PinterestDL.with_browser(
    browser_type="chrome",
    headless=True,
).login(email, password).get_cookies(
    after_sec=7,  # Time to wait before capturing cookies. Login may take time.
)

# Save cookies to a file
with open("cookies.json", "w") as f:
    json.dump(cookies, f, indent=4)
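
Note that os.getenv returns None when PINTEREST_PASSWORD is not set, so the login above would receive no password. One way to guard against that is to fall back to a hidden prompt, as in this minimal sketch. `read_password` is a hypothetical helper and is not part of pinterest-dl.

```python
import getpass
import os


def read_password(env_var="PINTEREST_PASSWORD"):
    """Return the password from the environment, or prompt without echoing."""
    password = os.getenv(env_var)
    if not password:
        # getpass hides the typed characters, unlike input().
        password = getpass.getpass("Enter Pinterest password: ")
    return password
```

The result can be passed to login(email, password) in place of the bare os.getenv call.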

2b. Scrape with cookies

After obtaining cookies, you can use them to scrape private boards and pins.

from pinterest_dl import PinterestDL

# Initialize and run the Pinterest image downloader with specified settings
images = (
    PinterestDL.with_api()
    .with_cookies(
        "cookies.json",  # Path to cookies file
    )
    .scrape_and_download(
        url="https://www.pinterest.com/pin/1234567",  # Assume this is a private board URL
        output_dir="images/art",  # Directory to save downloaded images
        limit=30,  # Max number of images to download
    )
)

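3. Detailed scraping with lower-level control

Use these examples if you need finer-grained control over scraping and downloading images.

3a. With the API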

import json

from pinterest_dl import PinterestDL

# 1. Initialize PinterestDL with API.
scraped_images = PinterestDL.with_api().scrape(
    url="https://www.pinterest.com/pin/1234567",  # URL of the Pinterest page
    limit=30,  # Maximum number of images to scrape
    min_resolution=(512, 512),  # Only the API client can set this. In browser mode, prune images after download instead.
)

# 2. Save Scraped Data to JSON
# Convert scraped data into a dictionary and save it to a JSON file for future access
images_data = [img.to_dict() for img in scraped_images]
with open("art.json", "w") as f:
    json.dump(images_data, f, indent=4)

# 3. Download Images
# Download images to a specified directory
downloaded_imgs = PinterestDL.download_images(images=scraped_images, output_dir="images/art")

valid_indices = list(range(len(downloaded_imgs)))  # All images are valid to add captions

# 4. Add Alt Text as Metadata
# Extract `alt` text from images and set it as metadata in the downloaded files
PinterestDL.add_captions(images=downloaded_imgs, indices=valid_indices)

3b. With a browser

import json

from pinterest_dl import PinterestDL

# 1. Initialize PinterestDL with a browser.
scraped_images = PinterestDL.with_browser(
    browser_type="chrome",  # Browser type to use ('chrome' or 'firefox')
    headless=True,  # Run browser in headless mode
).scrape(
    url="https://www.pinterest.com/pin/1234567",  # URL of the Pinterest page
    limit=30,  # Maximum number of images to scrape
)

# 2. Save Scraped Data to JSON
# Convert scraped data into a dictionary and save it to a JSON file for future access
images_data = [img.to_dict() for img in scraped_images]
with open("art.json", "w") as f:
    json.dump(images_data, f, indent=4)

# 3. Download Images
# Download images to a specified directory
downloaded_imgs = PinterestDL.download_images(images=scraped_images, output_dir="images/art")

# 4. Prune Images by Resolution
# Remove images that do not meet the minimum resolution criteria
valid_indices = PinterestDL.prune_images(images=downloaded_imgs, min_resolution=(200, 200))

# 5. Add Alt Text as Metadata
# Extract `alt` text from images and set it as metadata in the downloaded files
PinterestDL.add_captions(images=downloaded_imgs, indices=valid_indices)
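
The pruning step returns indices rather than a new list, so the same downloaded_imgs list can be passed on to add_captions. The sketch below illustrates that index-based pattern with plain (width, height) tuples. It is not the library's implementation: `indices_meeting_resolution` is a hypothetical function, and treating an image as valid only when both dimensions meet the minimum is an assumption.

```python
def indices_meeting_resolution(sizes, min_resolution):
    """Return the indices of sizes whose width and height both meet the minimum."""
    min_width, min_height = min_resolution
    return [
        index
        for index, (width, height) in enumerate(sizes)
        if width >= min_width and height >= min_height
    ]


# Example sizes standing in for downloaded images (made-up values).
sizes = [(100, 100), (512, 512), (300, 150)]
valid = indices_meeting_resolution(sizes, (200, 200))
print(valid)  # [1]
```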