
Pinterest Image Downloader (pinterest-dl)

This library scrapes and downloads images from Pinterest. Using Selenium for automation, it lets users extract images from a specified Pinterest URL and save them to a chosen directory.

It includes a CLI for direct use and a Python API for programmatic access. The tool supports scraping images from public boards and pins and, using browser cookies, from private ones. It can also save scraped URLs to a JSON file for future access.

Disclaimer:
This project is independent and is not affiliated with Pinterest. It is intended for educational purposes only. Note that automated scraping of websites may conflict with their terms of service. The repository owner is not responsible for any misuse of this tool. Use it responsibly and at your own legal risk.

Note:
This project is inspired by pinterest-image-scraper.

Features

✅ Scrape images directly from a Pinterest URL
✅ Download images asynchronously from a list of URLs (see pull request)
✅ Save scraped URLs to a JSON file for future access
✅ Incognito mode to keep your scraping discreet
✅ Verbose output for effective debugging
✅ Support for the Firefox browser
✅ Embed image alt text as comment metadata in downloaded images so they are searchable
✅ Scrape private boards and pins with browser cookies (see pull request)
✅ Scrape images using a reverse-engineered Pinterest API. This is the default behavior; to use a webdriver instead, specify --client chrome or --client firefox (see pull request)

Known Issues

- Not compatible with Pinterest URLs that contain search queries.
- Not extensively tested on Linux and Mac. Please open an issue to report any bugs.

Requirements

Python 3.10 or newer
Chrome or Firefox browser

Installation

Using pip (recommended)

pip install pinterest-dl

Cloning from GitHub

git clone https://github.com/sean1832/pinterest-dl.git
cd pinterest-dl
pip install .

CLI Usage

General command structure

pinterest-dl [command] [options]

Examples

Scraping images in anonymous mode:

Scrape images anonymously, without logging in, from the Pinterest URL https://www.pinterest.com/pin/1234567 into the ./images/art directory, limited to 30 images with a minimum resolution of 512x512. Save the scraped URLs to a JSON file.

pinterest-dl scrape "https://www.pinterest.com/pin/1234567" "images/art" -l 30 -r 512x512 --json
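
The -r value is a [width]x[height] string, while the Python API described later takes min_resolution as a (width, height) tuple. As a rough illustration of how one maps to the other, here is a minimal sketch. The parse_resolution helper is hypothetical, written only for this example, and is not part of pinterest-dl.

```python
def parse_resolution(text: str) -> tuple[int, int]:
    """Parse a 'WIDTHxHEIGHT' string such as '512x512' into (width, height)."""
    width_str, sep, height_str = text.lower().partition("x")
    if not sep:
        raise ValueError(f"expected WIDTHxHEIGHT, got {text!r}")
    # int() raises ValueError on empty or non-numeric parts
    width, height = int(width_str), int(height_str)
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width, height


print(parse_resolution("512x512"))  # (512, 512)
```

The resulting tuple could then be passed as min_resolution to the Python API calls shown in this document.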

Getting browser cookies:

Log in to Pinterest, obtain the browser cookies, and save them to cookies.json, running in headful mode (with a browser window).

pinterest-dl login -o cookies.json --headful

Tip

You will be prompted for your Pinterest email and password. The tool saves the browser cookies to the specified file for future use.

Scraping private boards:

Scrape images from a private Pinterest board using the cookies saved in cookies.json.

pinterest-dl scrape "https://www.pinterest.com/pin/1234567" "images/art" -l 30 -c cookies.json

Tip

You can use the --client option to scrape with the chrome or firefox webdriver. This is slower but more reliable. It opens the browser in headless mode to scrape images. You can also pass the --headful flag to run the browser in windowed mode.

Downloading images:

Download the images listed in art.json to the ./downloaded_imgs directory, keeping only those with a minimum resolution of 1024x1024.

pinterest-dl download art.json -o downloaded_imgs -r 1024x1024

Commands

1. Login

Log in to Pinterest with your credentials to obtain browser cookies for scraping private boards and pins.

Syntax:

pinterest-dl login [options]

Options:

-o, --output [file]: File to save browser cookies to for future use (default: cookies.json)
--client: Select the scraping client (chrome / firefox) (default: chrome)
--headful: Run in headful mode with a browser window
--verbose: Enable verbose output for debugging
--incognito: Enable incognito mode for scraping

Tip

After you enter the login command, you will be prompted for your Pinterest email and password. The tool saves the browser cookies to the specified file for future use (if no file is given, they are saved to ./cookies.json).
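
Before passing a cookies file to a later scrape, it can help to confirm that the file exists and is non-empty JSON. The following is a minimal sketch of such a check. The load_cookies helper is hypothetical and not part of pinterest-dl, and the cookie fields in the demo are placeholders, since the exact structure of the saved file is not described here.

```python
import json
import tempfile
from pathlib import Path


def load_cookies(path):
    """Load a cookies JSON file, raising a clear error if it is missing or empty."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(f"{file} not found; run `pinterest-dl login -o {file}` first")
    cookies = json.loads(file.read_text(encoding="utf-8"))
    if not cookies:
        raise ValueError(f"{file} contains no cookies")
    return cookies


# Demo with a throwaway file; the cookie fields below are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "cookies.json"
    demo_path.write_text(json.dumps([{"name": "example", "value": "123"}]), encoding="utf-8")
    loaded = load_cookies(demo_path)
    print(len(loaded))  # 1
```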

2. Scrape

Extract images from a specified Pinterest URL.

Syntax:

pinterest-dl scrape [url] [output_dir] [options]

Options:

-c, --cookies [file]: File containing browser cookies for private boards/pins. Run the login command to obtain cookies.
-l, --limit [number]: Maximum number of images to download (default: 100)
-r, --resolution [width]x[height]: Minimum image resolution for download (e.g. 512x512)
--timeout [second]: Timeout in seconds for requests (default: 3)
--json: Save scraped URLs to a JSON file
--dry-run: Run the scrape without downloading images
--verbose: Enable verbose output for debugging
--client: Select the scraping client (api / chrome / firefox) (default: api)
--incognito: Enable incognito mode for scraping (chrome/firefox only)
--headful: Run in headful mode with a browser window (chrome/firefox only)
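
When the CLI is driven from a script, the options above can be assembled into an argument list rather than a hand-written shell string. The sketch below shows one way to do that using only flags listed in this section. The build_scrape_command helper is hypothetical, not something pinterest-dl provides, and the sketch only builds and prints the command without running it.

```python
import shlex


def build_scrape_command(url, output_dir, limit=None, resolution=None,
                         cookies=None, save_json=False, client=None):
    """Assemble an argv list for `pinterest-dl scrape` from the documented flags."""
    cmd = ["pinterest-dl", "scrape", url, output_dir]
    if limit is not None:
        cmd += ["-l", str(limit)]
    if resolution is not None:
        width, height = resolution
        cmd += ["-r", f"{width}x{height}"]
    if cookies is not None:
        cmd += ["-c", cookies]
    if save_json:
        cmd.append("--json")
    if client is not None:
        cmd += ["--client", client]
    return cmd


cmd = build_scrape_command(
    "https://www.pinterest.com/pin/1234567", "images/art",
    limit=30, resolution=(512, 512), save_json=True,
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with URLs and paths that contain spaces.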

3. Download

Download images from a list of URLs provided in a file.

Syntax:

pinterest-dl download [url_list] [options]

Options:

-o, --output [directory]: Output directory (default: ./<json_filename>)
-r, --resolution [width]x[height]: Minimum resolution for download (e.g. 512x512)
--verbose: Enable verbose output
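
The default output directory is given as ./<json_filename>. As a small illustration of what that could resolve to, the sketch below derives a directory name from the URL list path. It assumes <json_filename> means the file name without its .json extension, which is not confirmed here, and default_output_dir is a hypothetical helper rather than part of pinterest-dl.

```python
from pathlib import Path


def default_output_dir(url_list_path: str) -> Path:
    """Return ./<json_filename>, taking <json_filename> to be the name without its extension."""
    return Path(".") / Path(url_list_path).stem


print(default_output_dir("art.json"))  # art
```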

Python API

You can also use the PinterestDL class directly in your Python code to scrape and download images programmatically.

1. Quick scrape and download

The following example shows how to scrape and download images from a Pinterest URL in a single step.

from pinterest_dl import PinterestDL

# Initialize and run the Pinterest image downloader with specified settings
images = PinterestDL.with_api(
    timeout=3,  # Timeout in seconds for each request (default: 3)
    verbose=False,  # Enable detailed logging for debugging (default: False)
).scrape_and_download(
    url="https://www.pinterest.com/pin/1234567",  # Pinterest URL to scrape
    output_dir="images/art",  # Directory to save downloaded images
    limit=30,  # Max number of images to download
    min_resolution=(512, 512),  # Minimum resolution for images (width, height) (default: None)
    json_output="art.json",  # File to save URLs of scraped images (default: None)
    dry_run=False,  # If True, performs a scrape without downloading images (default: False)
    add_captions=True,  # Adds image `alt` text as metadata to images (default: False)
)

2. Scrape with cookies for private boards

2a. Obtain cookies. You must first log in to Pinterest to obtain the browser cookies needed to scrape private boards and pins.

import os
import json

from pinterest_dl import PinterestDL

# Make sure you don't expose your password in the code.
email = input("Enter Pinterest email: ")
password = os.getenv("PINTEREST_PASSWORD")  # None if the variable is not set

# Initialize browser and login to Pinterest
cookies = PinterestDL.with_browser(
    browser_type="chrome",
    headless=True,
).login(email, password).get_cookies(
    after_sec=7,  # Time to wait before capturing cookies. Login may take time.
)

# Save cookies to a file
with open("cookies.json", "w") as f:
    json.dump(cookies, f, indent=4)
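
Because os.getenv returns None when PINTEREST_PASSWORD is not set, the login above could be attempted without a password. One way to fail early with a clearer message is sketched below. The require_env helper is hypothetical and not part of pinterest-dl, and the demo passes a stand-in mapping instead of the real environment.

```python
import os


def require_env(name: str, environ=os.environ) -> str:
    """Return the value of an environment variable, failing early if it is unset or empty."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before logging in")
    return value


# Demo with a stand-in mapping; real code would call require_env("PINTEREST_PASSWORD").
demo_env = {"PINTEREST_PASSWORD": "example-password"}
print(len(require_env("PINTEREST_PASSWORD", demo_env)))  # 16
```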

2b. Scrape with cookies. Once you have the cookies, you can use them to scrape private boards and pins.

from pinterest_dl import PinterestDL

# Initialize and run the Pinterest image downloader with specified settings
images = (
    PinterestDL.with_api()
    .with_cookies(
        "cookies.json",  # Path to cookies file
    )
    .scrape_and_download(
        url="https://www.pinterest.com/pin/1234567",  # Assume this is a private board URL
        output_dir="images/art",  # Directory to save downloaded images
        limit=30,  # Max number of images to download
    )
)

3. Detailed scraping with lower-level control

Use this example if you need finer control over scraping and downloading images.

3a. With the API

import json

from pinterest_dl import PinterestDL

# 1. Initialize PinterestDL with API.
scraped_images = PinterestDL.with_api().scrape(
    url="https://www.pinterest.com/pin/1234567",  # URL of the Pinterest page
    limit=30,  # Maximum number of images to scrape
    min_resolution=(512, 512),  # <- Only available in API mode. In browser mode, images have to be pruned after download.
)

# 2. Save Scraped Data to JSON
# Convert scraped data into a dictionary and save it to a JSON file for future access
images_data = [img.to_dict() for img in scraped_images]
with open("art.json", "w") as f:
    json.dump(images_data, f, indent=4)

# 3. Download Images
# Download images to a specified directory
downloaded_imgs = PinterestDL.download_images(images=scraped_images, output_dir="images/art")

valid_indices = list(range(len(downloaded_imgs)))  # All images are valid to add captions

# 4. Add Alt Text as Metadata
# Extract `alt` text from images and set it as metadata in the downloaded files
PinterestDL.add_captions(images=downloaded_imgs, indices=valid_indices)

3b. With a browser

import json

from pinterest_dl import PinterestDL

# 1. Initialize PinterestDL with a browser.
scraped_images = PinterestDL.with_browser(
    browser_type="chrome",  # Browser type to use ('chrome' or 'firefox')
    headless=True,  # Run browser in headless mode
).scrape(
    url="https://www.pinterest.com/pin/1234567",  # URL of the Pinterest page
    limit=30,  # Maximum number of images to scrape
)

# 2. Save Scraped Data to JSON
# Convert scraped data into a dictionary and save it to a JSON file for future access
images_data = [img.to_dict() for img in scraped_images]
with open("art.json", "w") as f:
    json.dump(images_data, f, indent=4)

# 3. Download Images
# Download images to a specified directory
downloaded_imgs = PinterestDL.download_images(images=scraped_images, output_dir="images/art")

# 4. Prune Images by Resolution
# Remove images that do not meet the minimum resolution criteria
valid_indices = PinterestDL.prune_images(images=downloaded_imgs, min_resolution=(200, 200))

# 5. Add Alt Text as Metadata
# Extract `alt` text from images and set it as metadata in the downloaded files
PinterestDL.add_captions(images=downloaded_imgs, indices=valid_indices)
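
To make the role of valid_indices more concrete, the sketch below shows the general idea of filtering by minimum resolution on plain (width, height) pairs. It is not the library's prune_images implementation. It assumes an image qualifies only when both dimensions meet the minimum, and indices_meeting_min_resolution is a hypothetical helper.

```python
def indices_meeting_min_resolution(sizes, min_resolution):
    """Return indices of (width, height) pairs that meet min_resolution in both dimensions."""
    min_w, min_h = min_resolution
    return [i for i, (w, h) in enumerate(sizes) if w >= min_w and h >= min_h]


sizes = [(640, 480), (150, 300), (200, 200), (1024, 100)]
print(indices_meeting_min_resolution(sizes, (200, 200)))  # [0, 2]
```

Returning indices rather than a filtered list keeps the positions aligned with the original download list, which is how valid_indices is passed to add_captions above.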