Pinterest Image Downloader (pinterest-dl)

This library lets you scrape and download images from Pinterest. Using Selenium for automation, it extracts images from a specified Pinterest URL and saves them to a directory of your choice.

It includes a CLI for direct use and a Python API for programmatic access. The tool supports scraping images from public and private boards and pins using browser cookies. Scraped URLs can also be saved to a JSON file for later access.

Disclaimer:
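This project is independent and not affiliated with Pinterest. It is intended for educational purposes only. Be aware that automating the scraping of websites may conflict with their terms of service. The repository owner assumes no liability for misuse of this tool. Use it responsibly and at your own legal risk.

Note:
This project was inspired by pinterest-image-scraper.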

Features

✅ Scrape images directly from a Pinterest URL.
✅ Asynchronously download images from a list of URLs. (see pull request)
✅ Save scraped URLs to a JSON file for later access.
✅ Use incognito mode to keep your scraping discreet.
✅ Access detailed output for effective debugging.
✅ Supports the Firefox browser.
✅ Embed each image's alt text as a metadata comment in the downloaded file for searchability.
✅ Scrape private boards and pins with browser cookies. (see pull request)
✅ Scrape images using a reverse-engineered Pinterest API. (This is the default behavior. Specify --client chrome or --client firefox to use a webdriver instead.) (see pull request)
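
To illustrate what downloading a list of URLs asynchronously means in practice, here is a minimal sketch using asyncio. It is not pinterest-dl's implementation: `download_all`, `fake_fetch`, and `max_concurrent` are hypothetical names, and the fetch step is a stand-in that performs no network access.

```python
import asyncio


async def download_all(urls, fetch, max_concurrent=4):
    """Run `fetch(url)` for every URL with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_fetch(url):
    # Stand-in for a real HTTP request; it only simulates network latency.
    await asyncio.sleep(0.01)
    return f"downloaded {url}"


results = asyncio.run(download_all(["a.jpg", "b.jpg", "c.jpg"], fake_fetch))
print(results)
```

Passing the fetch function in as a parameter keeps the concurrency logic separate from the HTTP client, so the same pattern can be tried offline.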

Known Issues

Not compatible with Pinterest URLs that contain search queries.
Not rigorously tested on Linux and Mac. Please create an issue to report any bugs.
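
Since search URLs are not supported, it may help to check a URL before handing it to the tool. The sketch below is one possible check; `is_search_url` is a hypothetical helper, not part of pinterest-dl, and it assumes that search pages use a `/search` path or carry a `q` query parameter.

```python
from urllib.parse import parse_qs, urlparse


def is_search_url(url: str) -> bool:
    """Return True if the URL looks like a Pinterest search page.

    Assumption: search pages live under /search or carry a `q` parameter.
    """
    parsed = urlparse(url)
    path_is_search = parsed.path.startswith("/search")
    has_query_term = "q" in parse_qs(parsed.query)
    return path_is_search or has_query_term


print(is_search_url("https://www.pinterest.com/search/pins/?q=art"))
print(is_search_url("https://www.pinterest.com/pin/1234567"))
```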

Requirements

Python 3.10 or later
Chrome or Firefox browser

Installation

Using pip (recommended)

pip install pinterest-dl

Clone from GitHub

git clone https://github.com/sean1832/pinterest-dl.git
cd pinterest-dl
pip install .
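
After either install method, one way to confirm the package is visible to your interpreter is to ask the standard library for its installed version. This is a small sketch; `installed_version` is a hypothetical helper name, and the distribution name `pinterest-dl` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("pinterest-dl"))
```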

CLI Usage

General Command Structure

pinterest-dl [command] [options]

Examples

Scrape images in anonymous mode:

Scrape images from the Pinterest URL https://www.pinterest.com/pin/1234567 into the ./images/art directory in anonymous mode, without logging in. The number of images is limited to 30 and the minimum resolution is 512x512. The scraped URLs are saved to a JSON file.

pinterest-dl scrape "https://www.pinterest.com/pin/1234567" "images/art" -l 30 -r 512x512 --json

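Get browser cookies:

Get browser cookies for a Pinterest login and save them to the cookies.json file, running in headful mode (with a browser window).

pinterest-dl login -o cookies.json --headful

Tip

You will be prompted to enter your Pinterest email and password. The tool saves the browser cookies to the specified file for later use.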

Scrape private boards:

Scrape images from a private Pinterest board using the cookies stored in the cookies.json file.

pinterest-dl scrape "https://www.pinterest.com/pin/1234567" "images/art" -l 30 -c cookies.json

Tip

You can use the --client option to scrape with the chrome or firefox webdriver. This is slower but more reliable. The browser opens in headless mode to scrape images. You can also use the --headful flag to run the browser in windowed mode.

Download images:

Download the images listed in the art.json file to the ./downloaded_imgs directory, with a minimum resolution of 1024x1024.

pinterest-dl download art.json -o downloaded_imgs -r 1024x1024

Commands

1. Login

Log in to Pinterest with your credentials to obtain browser cookies for scraping private boards and pins.

Syntax:

pinterest-dl login [options]

Options:

-o, --output [file]: File to save browser cookies to for later use. (default: cookies.json)
--client: Choose the scraping client (chrome / firefox). (default: chrome)
--headful: Run in headful mode with a browser window.
--verbose: Enable detailed output for debugging.
--incognito: Enable incognito mode for scraping.

Tip

After you enter the login command, you will be prompted for your Pinterest email and password. The tool then saves the browser cookies to the specified file for later use. (If no file is specified, they are saved to ./cookies.json.)
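
Before scraping a private board, a script can check that the cookie file from the login step actually exists. The following is a minimal sketch, assuming only that the file is JSON (as its name suggests); `load_cookies` is a hypothetical helper, not a pinterest-dl function, and it makes no assumption about what each cookie entry contains.

```python
import json
from pathlib import Path


def load_cookies(path: str = "cookies.json"):
    """Load saved cookies, raising a clear error if the file is missing or empty."""
    cookie_file = Path(path)
    if not cookie_file.is_file():
        raise FileNotFoundError(f"{path} not found; run `pinterest-dl login` first")
    cookies = json.loads(cookie_file.read_text(encoding="utf-8"))
    if not cookies:
        raise ValueError(f"{path} contains no cookies")
    return cookies
```

Calling `load_cookies("cookies.json")` at the start of a script fails fast with a readable message instead of a later scraping error.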

2. Scrape

Extract images from a specified Pinterest URL.

Syntax:

pinterest-dl scrape [url] [output_dir] [options]

Options:

-c, --cookies [file]: File containing browser cookies for private boards/pins. Run the login command to obtain cookies.
-l, --limit [number]: Maximum number of images to download. (default: 100)
-r, --resolution [width]x[height]: Minimum image resolution for download. (e.g. 512x512)
--timeout [second]: Timeout in seconds for requests. (default: 3)
--json: Save scraped URLs to a JSON file.
--dry-run: Run the scrape without downloading images.
--verbose: Enable detailed output for debugging.
--client: Choose the scraping client (api / chrome / firefox). (default: api)
--incognito: Enable incognito mode for scraping. (chrome/firefox only)
--headful: Run in headful mode with a browser window. (chrome/firefox only)
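
If you drive the CLI from another script, the options above can be assembled into an argument list. This is a sketch only: `build_scrape_command` is a hypothetical helper, and it uses just the flags documented in this section (-l, -r, -c, --json).

```python
import shlex


def build_scrape_command(url, output_dir, limit=100, resolution=None,
                         cookies=None, save_json=False):
    """Assemble the argument list for `pinterest-dl scrape` from documented flags."""
    args = ["pinterest-dl", "scrape", url, output_dir, "-l", str(limit)]
    if resolution is not None:
        width, height = resolution
        args += ["-r", f"{width}x{height}"]
    if cookies is not None:
        args += ["-c", cookies]
    if save_json:
        args.append("--json")
    return args


command = build_scrape_command(
    "https://www.pinterest.com/pin/1234567", "images/art",
    limit=30, resolution=(512, 512), save_json=True,
)
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Building a list rather than a single string avoids shell-quoting problems with URLs and paths that contain spaces.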

3. Download

Download images from a list of URLs provided in a file.

Syntax:

pinterest-dl download [url_list] [options]

Options:

-o, --output [directory]: Output directory. (default: ./<json_filename>)
-r, --resolution [width]x[height]: Minimum resolution for download. (e.g. 512x512)
--verbose: Enable detailed output.
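
The CLI takes the resolution as a [width]x[height] string, while the Python API examples pass min_resolution as a (width, height) tuple. The sketch below shows one way to convert between the two; `parse_resolution` is a hypothetical helper and does not claim to mirror how the tool parses the flag internally.

```python
import re
from typing import Tuple


def parse_resolution(text: str) -> Tuple[int, int]:
    """Convert a 'WIDTHxHEIGHT' string such as '512x512' into (width, height)."""
    match = re.fullmatch(r"(\d+)x(\d+)", text.strip())
    if match is None:
        raise ValueError(f"expected WIDTHxHEIGHT, got {text!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_resolution("1024x1024"))
```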

Python API

You can also use the PinterestDL class directly in Python code to scrape and download images programmatically.

1. Quick Scrape and Download

The following example shows how to scrape and download images from a Pinterest URL in one step.

from pinterest_dl import PinterestDL

# Initialize and run the Pinterest image downloader with specified settings
images = PinterestDL.with_api(
    timeout=3,  # Timeout in seconds for each request (default: 3)
    verbose=False,  # Enable detailed logging for debugging (default: False)
).scrape_and_download(
    url="https://www.pinterest.com/pin/1234567",  # Pinterest URL to scrape
    output_dir="images/art",  # Directory to save downloaded images
    limit=30,  # Max number of images to download
    min_resolution=(512, 512),  # Minimum resolution for images (width, height) (default: None)
    json_output="art.json",  # File to save URLs of scraped images (default: None)
    dry_run=False,  # If True, performs a scrape without downloading images (default: False)
    add_captions=True,  # Adds image `alt` text as metadata to images (default: False)
)

2. Scrape with Cookies for Private Boards

2a. Obtain cookies

You need to log in to Pinterest first to obtain browser cookies for scraping private boards and pins.

import os
import json

from pinterest_dl import PinterestDL

# Make sure you don't expose your password in the code.
email = input("Enter Pinterest email: ")
password = os.getenv("PINTEREST_PASSWORD")

# Initialize browser and login to Pinterest
cookies = PinterestDL.with_browser(
    browser_type="chrome",
    headless=True,
).login(email, password).get_cookies(
    after_sec=7,  # Time to wait before capturing cookies. Login may take time.
)

# Save cookies to a file
with open("cookies.json", "w") as f:
    json.dump(cookies, f, indent=4)
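
Note that os.getenv returns None when PINTEREST_PASSWORD is not set. A small fallback such as the one sketched here prompts for the password without echoing it; `read_password` is a hypothetical helper built only on the standard library, not part of pinterest-dl.

```python
import getpass
import os


def read_password(env_var: str = "PINTEREST_PASSWORD") -> str:
    """Return the password from the environment, prompting only if it is unset."""
    password = os.getenv(env_var)
    if not password:
        # getpass hides the typed characters, unlike input().
        password = getpass.getpass("Enter Pinterest password: ")
    return password
```

In the example above, `password = read_password()` could replace the direct os.getenv call.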

2b. Scrape with cookies

Once you have the cookies, you can use them to scrape private boards and pins.

from pinterest_dl import PinterestDL

# Initialize and run the Pinterest image downloader with specified settings
images = (
    PinterestDL.with_api()
    .with_cookies(
        "cookies.json",  # Path to cookies file
    )
    .scrape_and_download(
        url="https://www.pinterest.com/pin/1234567",  # Assume this is a private board URL
        output_dir="images/art",  # Directory to save downloaded images
        limit=30,  # Max number of images to download
    )
)

3. Detailed Scraping with Lower-Level Control

Use these examples if you need more granular control over scraping and downloading images.

3a. With API

import json

from pinterest_dl import PinterestDL

# 1. Initialize PinterestDL with API.
scraped_images = PinterestDL.with_api().scrape(
    url="https://www.pinterest.com/pin/1234567",  # URL of the Pinterest page
    limit=30,  # Maximum number of images to scrape
    min_resolution=(512, 512),  # Only available in API mode; in browser mode, prune after download.
)

# 2. Save Scraped Data to JSON
# Convert scraped data into a dictionary and save it to a JSON file for future access
images_data = [img.to_dict() for img in scraped_images]
with open("art.json", "w") as f:
    json.dump(images_data, f, indent=4)

# 3. Download Images
# Download images to a specified directory
downloaded_imgs = PinterestDL.download_images(images=scraped_images, output_dir="images/art")

valid_indices = list(range(len(downloaded_imgs)))  # All images are valid to add captions

# 4. Add Alt Text as Metadata
# Extract `alt` text from images and set it as metadata in the downloaded files
PinterestDL.add_captions(images=downloaded_imgs, indices=valid_indices)

3b. With browser

import json

from pinterest_dl import PinterestDL

# 1. Initialize PinterestDL with a browser.
scraped_images = PinterestDL.with_browser(
    browser_type="chrome",  # Browser type to use ('chrome' or 'firefox')
    headless=True,  # Run browser in headless mode
).scrape(
    url="https://www.pinterest.com/pin/1234567",  # URL of the Pinterest page
    limit=30,  # Maximum number of images to scrape
)

# 2. Save Scraped Data to JSON
# Convert scraped data into a dictionary and save it to a JSON file for future access
images_data = [img.to_dict() for img in scraped_images]
with open("art.json", "w") as f:
    json.dump(images_data, f, indent=4)

# 3. Download Images
# Download images to a specified directory
downloaded_imgs = PinterestDL.download_images(images=scraped_images, output_dir="images/art")

# 4. Prune Images by Resolution
# Remove images that do not meet the minimum resolution criteria
valid_indices = PinterestDL.prune_images(images=downloaded_imgs, min_resolution=(200, 200))

# 5. Add Alt Text as Metadata
# Extract `alt` text from images and set it as metadata in the downloaded files
PinterestDL.add_captions(images=downloaded_imgs, indices=valid_indices)
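
To make the idea of pruning by resolution and keeping the valid indices concrete, here is a standalone sketch that works on plain (width, height) pairs. It is not the implementation of prune_images: `indices_meeting_resolution` is a hypothetical function, and the rule that both dimensions must meet the minimum is an assumption.

```python
def indices_meeting_resolution(sizes, min_resolution):
    """Return indices of (width, height) pairs that meet the minimum in both dimensions."""
    min_width, min_height = min_resolution
    return [
        index
        for index, (width, height) in enumerate(sizes)
        if width >= min_width and height >= min_height
    ]


# Example sizes are made up for illustration.
sizes = [(640, 480), (150, 300), (200, 200), (1024, 100)]
print(indices_meeting_resolution(sizes, (200, 200)))
```

Returning indices rather than a filtered list keeps the positions aligned with the original image list, which is how valid_indices is passed to add_captions in the example above.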