In Python, the `requests` library is the usual way to fetch a web page's source code. The basic steps are:
1. Install the `requests` library:
pip install requests
2. Import `requests` and send an HTTP GET request to fetch the page:
import requests
url = 'https://example.com'  # replace with the URL you want to fetch
response = requests.get(url)
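3. Set the response encoding, then read the body as a string:
response.encoding = 'utf-8'  # decode the body as UTF-8; use the page's actual encoding if it differs
html_content = response.text  # the response body decoded to a str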
4. Print the page source or save it to a file:
print(html_content)
with open('output.html', 'w', encoding='utf-8') as file:
    file.write(html_content)
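The steps above can be combined into a slightly more defensive sketch. The function names `fetch_html` and `save_html`, the 10-second timeout, and the encoding fallback are illustrative assumptions and not part of the steps above. The sketch raises an error on 4xx/5xx responses and, when the server declares no charset, uses the encoding that `requests` guesses from the body instead of forcing UTF-8.

```python
from pathlib import Path

import requests


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its source as text (hypothetical helper)."""
    # Without a timeout, requests.get can wait indefinitely on a stalled server.
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of silently saving an error page.
    response.raise_for_status()
    # For text/* responses with no declared charset, requests assumes ISO-8859-1.
    # In that case, fall back to the encoding detected from the body itself.
    if response.encoding is None or response.encoding.lower() == "iso-8859-1":
        response.encoding = response.apparent_encoding
    return response.text


def save_html(html: str, path: str = "output.html") -> Path:
    """Write the page source to a UTF-8 file and return its path (hypothetical helper)."""
    target = Path(path)
    target.write_text(html, encoding="utf-8")
    return target
```

With these helpers, `save_html(fetch_html('https://example.com'))` would download the page and write it to `output.html`. Note that the ISO-8859-1 check also overrides a server that explicitly declares that charset, which is a trade-off this sketch accepts for simplicity.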