In Python, network requests can typically be captured in the following ways:
1. Using the `requests` library (captures the response of a request you make yourself):

```python
import requests

url = 'https://www.example.com'
response = requests.get(url)
print(response.text)
```
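Beyond reading the response, `requests` also lets you inspect the outgoing request itself. A minimal sketch using the `PreparedRequest` API; the endpoint URL and header values here are illustrative:

```python
import requests

# Build the request without sending it, so its final form can be inspected.
req = requests.Request(
    'POST',
    'https://www.example.com/api',           # hypothetical endpoint
    headers={'User-Agent': 'my-client/1.0'},
    json={'key': 'value'},
)
prepared = req.prepare()

print(prepared.method)                        # POST
print(prepared.url)
print(prepared.headers['Content-Type'])       # application/json
print(prepared.body)
```

To actually send it, pass `prepared` to `requests.Session().send(prepared)`.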
2. Using the `urllib` library (standard library, no third-party dependency):

```python
import urllib.request

url = 'https://www.example.com'
response = urllib.request.urlopen(url)
html = response.read().decode('utf-8')
print(html)
```
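With `urllib` you can likewise build a `Request` object first and inspect it before anything goes over the network. A minimal sketch; the User-Agent string is just an example value:

```python
import urllib.request

# Attach custom headers and inspect the request object offline.
req = urllib.request.Request(
    'https://www.example.com',
    headers={'User-Agent': 'Mozilla/5.0'},
)

print(req.full_url)
print(req.get_header('User-agent'))  # urllib stores header keys capitalized this way

# To actually send it:
# response = urllib.request.urlopen(req)
```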
3. Using the `selenium-wire` library (extends Selenium so the requests made by the browser itself can be inspected):

```python
from seleniumwire import webdriver

driver = webdriver.Chrome()
driver.get('https://my.test.url.com')

for request in driver.requests:
    print(request.url)
    print(request.headers)
    if request.response:  # response can be None if it never arrived
        print(request.response.headers)
```
4. Using the `Playwright` library:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    context = browser.new_context()
    page = context.new_page()
    # Register the listener before navigating, so every request is seen
    page.on('request', lambda request: print(request.url))
    page.goto('https://example.com')
    browser.close()
```
5. Using the `Flask` framework (captures incoming requests on the server side):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'GET':
        return 'Hello, GET request!'
    elif request.method == 'POST':
        data = request.get_json()
        return 'Hello, POST request! Data: {}'.format(data)
    else:
        return 'Unsupported request method'

if __name__ == '__main__':
    app.run()
```
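The same server-side capture can be sketched with only the standard library, in case Flask is not installed. A minimal example (names like `CaptureHandler` and the `/demo` path are illustrative):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Each incoming request's method, path, and headers are recorded here.
captured = []

class CaptureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        captured.append((self.command, self.path, dict(self.headers)))
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(b'Hello, GET request!')

    def log_message(self, *args):  # silence default console logging
        pass

server = HTTPServer(('127.0.0.1', 0), CaptureHandler)  # port 0 = auto-pick
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f'http://127.0.0.1:{server.server_port}/demo'
body = urllib.request.urlopen(url).read()
server.shutdown()

print(body)             # b'Hello, GET request!'
print(captured[0][:2])  # ('GET', '/demo')
```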
The methods above cover the common ways to capture network requests. Note, however, that sites with anti-scraping measures may require extra setup, such as setting a browser-like User-Agent or carrying cookies.
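For example, with only the standard library you can build an opener that carries a browser-like User-Agent and persists cookies across requests. A minimal sketch; the header values are illustrative and not guaranteed to satisfy any particular site's checks:

```python
import http.cookiejar
import urllib.request

# Cookies set by the server are stored in `jar` and sent back automatically.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(jar)
)
opener.addheaders = [
    ('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'),
    ('Accept-Language', 'en-US,en;q=0.9'),
]

# To use it:
# response = opener.open('https://www.example.com')
print(dict(opener.addheaders)['User-Agent'])
```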