ScrapingBee Python SDK

ScrapingBee is a web scraping API that handles headless browsers and rotates proxies for you. The Python SDK makes it easier to interact with ScrapingBee's API.

Installation

You can install the ScrapingBee Python SDK with pip.

pip install scrapingbee

Usage

The ScrapingBee Python SDK is a wrapper around the requests library. ScrapingBee supports GET and POST requests.

Sign up for ScrapingBee to get your API key and some free credits to get started.
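One way to keep the key out of your source code is to read it from the environment. The sketch below assumes a hypothetical environment variable name, SCRAPINGBEE_API_KEY, chosen only for this example; the SDK does not read it on its own. The helper names load_api_key and make_client are also illustrative. Only ScrapingBeeClient(api_key=...) comes from the SDK.

```python
import os


def load_api_key(env=None, var_name="SCRAPINGBEE_API_KEY"):
    """Return the API key stored in an environment mapping.

    `var_name` is a hypothetical name chosen for this example; the SDK
    does not read any environment variable by itself.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} to your ScrapingBee API key")
    return key


def make_client(env=None):
    """Build a ScrapingBeeClient from the key found in the environment."""
    # Imported lazily so load_api_key() works without the SDK installed.
    from scrapingbee import ScrapingBeeClient

    return ScrapingBeeClient(api_key=load_api_key(env))
```

Calling make_client() then replaces the hard-coded 'REPLACE-WITH-YOUR-API-KEY' string used in the examples.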

Making a request

>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        # Block ads on the page you want to scrape
        'block_ads': False,
        # Block images and CSS on the page you want to scrape
        'block_resources': True,
        # Premium proxy geolocation
        'country_code': '',
        # Control the device the request will be sent from
        'device': 'desktop',
        # Use some data extraction rules
        'extract_rules': {'title': 'h1'},
        # Wrap response in JSON
        'json_response': False,
        # Interact with the webpage you want to scrape
        'js_scenario': {
            "instructions": [
                {"wait_for": "#slow_button"},
                {"click": "#slow_button"},
                {"scroll_x": 1000},
                {"wait": 1000},
                {"scroll_x": 1000},
                {"wait": 1000},
            ]
        },
        # Use premium proxies to bypass difficult-to-scrape websites (10-25 credits/request)
        'premium_proxy': False,
        # Execute JavaScript code with a headless browser (5 credits/request)
        'render_js': True,
        # Return the original HTML before the JavaScript rendering
        'return_page_source': False,
        # Return page screenshot as a PNG image
        'screenshot': False,
        # Take a full-page screenshot without the window limitation
        'screenshot_full_page': False,
        # Transparently return the same HTTP code as the page requested
        'transparent_status_code': False,
        # Wait, in milliseconds, before returning the response
        'wait': 0,
        # Wait for a CSS selector before returning the response, e.g. ".title"
        'wait_for': '',
        # Set the browser window width in pixels
        'window_width': 1920,
        # Set the browser window height in pixels
        'window_height': 1080
    },
    headers={
        # Forward custom headers to the target website
        "key": "value"
    },
    cookies={
        # Forward custom cookies to the target website
        "name": "value"
    }
)
>>> response.text
'<!DOCTYPE html><html lang="en"><head>...'

ScrapingBee takes various parameters to render JavaScript, execute a custom JavaScript script, use a premium proxy from a specific geolocation, and more.

You can find all the supported parameters in the ScrapingBee documentation.
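As one illustration of working with these parameters: when extract_rules is set, the API is expected to return the extracted fields as a JSON object keyed by rule name rather than raw HTML. The sketch below reads such a body. parse_extracted is a hypothetical helper, not part of the SDK, and the sample payload is made up for the example.

```python
import json


def parse_extracted(body, expected_keys):
    """Decode an extract_rules response body and check its keys.

    `body` is the text of the response (for example `response.text`);
    `expected_keys` are the rule names that were sent in `extract_rules`.
    """
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise KeyError(f"missing extracted fields: {missing}")
    return data


# Made-up payload standing in for response.text
sample = '{"title": "The ScrapingBee Blog"}'
print(parse_extracted(sample, ["title"])["title"])
```

Checking the keys up front makes a changed selector or an unexpected response fail loudly instead of producing silent gaps further down the pipeline.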

You can send custom cookies and headers as you normally would with the requests library.

Screenshot

Here is a small example of how to retrieve and store a screenshot of the ScrapingBee blog at its mobile resolution.

>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        # Take a screenshot
        'screenshot': True,
        # Specify that we need the full height
        'screenshot_full_page': True,
        # Specify a mobile width in pixels
        'window_width': 375
    }
)

>>> if response.ok:
    with open("./scrapingbee_mobile.png", "wb") as f:
        f.write(response.content)
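The response.ok check above only looks at the HTTP status. As an optional extra safeguard, the body can be checked for the standard 8-byte PNG signature before it is written to disk. This is a minimal sketch; save_png is a hypothetical helper and not part of the SDK.

```python
from pathlib import Path

# First 8 bytes of every PNG file, as defined by the PNG specification.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_png(content, path):
    """Write `content` to `path` only if it starts with the PNG signature."""
    if not content.startswith(PNG_SIGNATURE):
        raise ValueError("response body is not a PNG image")
    target = Path(path)
    target.write_bytes(content)
    return target
```

With the screenshot example, this would be called as save_png(response.content, "./scrapingbee_mobile.png").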

Using ScrapingBee with Scrapy

Scrapy is the most popular Python web scraping framework. You can easily integrate ScrapingBee's API with the Scrapy middleware.

Retries

The client includes a retry mechanism for 5xx responses.

>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        'render_js': True,
    },
    retries=5
)
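To make the idea concrete, here is a generic sketch of retrying on 5xx status codes. It is not the SDK's implementation: how the SDK counts attempts and whether it waits between them is not stated here, so get_with_retries, its delay argument, and the stopping rule are all assumptions made for illustration.

```python
import time


def get_with_retries(fetch, retries=3, delay=0.0):
    """Call `fetch()` until it returns a non-5xx status or attempts run out.

    `fetch` is any zero-argument callable returning an object with a
    `status_code` attribute, such as a requests.Response.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    response = None
    for attempt in range(retries):
        response = fetch()
        if response.status_code < 500:
            break  # success or a client error: retrying would not help
        if attempt < retries - 1:
            time.sleep(delay)  # pause before the next attempt
    return response
```

The last response is returned even when every attempt fails, so the caller can still inspect its status code and body.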

Additional information

Version: v2.0.1
Type: Other source code
Last updated: 2025-02-15
Size: 11.38 KB
Source: GitHub
