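In Python, you can track file download or upload progress in several ways:
1. Use the `tqdm` library:
`tqdm` is a fast, extensible progress bar library that can be wrapped around a loop to display its progress.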
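    import requests
    from tqdm import tqdm

    url = 'http://example.com/file.zip'
    filename = url.split('/')[-1]

    response = requests.get(url, stream=True)
    response.raise_for_status()
    # Content-Length may be missing; total=None makes tqdm show a plain counter
    file_size = int(response.headers.get('Content-Length', 0)) or None

    with open(filename, 'wb') as f, tqdm(total=file_size, unit='B',
                                         unit_scale=True, unit_divisor=1024) as bar:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                # Advance by bytes written; wrapping the iterator would count chunks
                bar.update(len(chunk))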
2. Use the `reporthook` parameter of `urllib.request.urlretrieve`:
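    import sys
    from urllib.request import urlretrieve

    def reporthook(count, block_size, total_size):
        if total_size <= 0:
            # The server did not report a size, so show bytes instead of a percentage
            sys.stdout.write('\r%d bytes' % (count * block_size))
        else:
            # The last block is usually partial, so cap the value at 100
            percent = min(100, count * block_size * 100 // total_size)
            sys.stdout.write('\r%d%%' % percent)
        sys.stdout.flush()

    url = 'http://example.com/file.zip'
    filename = 'file.zip'
    urlretrieve(url, filename, reporthook=reporthook)
    sys.stdout.write('\rDownload complete, saved as %s\n' % filename)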

3. Use the `logging` module to record progress:
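    import logging
    import requests

    # Without a configured handler, INFO messages are silently dropped
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger('my_crawler')

    def log_progress(message):
        logger.info(message)

    urls = ['http://example.com/']  # placeholder; replace with the pages to crawl

    # Crawler code
    for url in urls:
        # Fetch the page
        body = requests.get(url).content
        # Record progress
        log_progress(f'Fetched {len(body)} bytes from {url}')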
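
Logging once per chunk quickly floods the log on a large download. The following is a minimal sketch of one way to throttle it, assuming the chunks arrive as an iterable of `bytes` and the total size is known in advance. The names `log_download_progress` and `step` are made up for this illustration, and the demonstration uses fake in-memory chunks rather than a real download.

```python
import logging

logger = logging.getLogger('my_crawler')


def log_download_progress(chunks, total_size, step=10):
    """Consume `chunks` and log roughly once per `step` percent.

    Returns the number of bytes seen. Hypothetical helper for illustration.
    """
    received = 0
    next_mark = step
    for chunk in chunks:
        received += len(chunk)
        if total_size <= 0:
            continue  # size unknown: there is no percentage to report
        percent = received * 100 // total_size
        if percent >= next_mark:
            logger.info('Downloaded %d%% (%d/%d bytes)',
                        percent, received, total_size)
            # Skip past any marks that a large chunk jumped over
            next_mark = (percent // step + 1) * step
    return received


# Offline demonstration: 100 fake chunks of 10 bytes each
logging.basicConfig(level=logging.INFO)
log_download_progress((b'x' * 10 for _ in range(100)), total_size=1000)
```

In a real download, the `chunks` argument could be something like `response.iter_content(chunk_size=1024)` from the `requests` example, with the file write added inside the loop.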
4. Write a custom callback function:
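    import sys
    import urllib.request

    def progress_callback(block_num, block_size, total_size):
        if total_size <= 0:
            return  # size unknown, so no percentage can be computed
        # The last block is usually partial, so cap the value at 100
        percent = min(100, block_num * block_size * 100 // total_size)
        sys.stdout.write('\r%d%%' % percent)
        sys.stdout.flush()

    # Placeholder values; replace with the real URL and target file name
    DATA_URL = 'http://example.com/file.zip'
    name = 'file.zip'

    # Download the file
    urllib.request.urlretrieve(DATA_URL, name, reporthook=progress_callback)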
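
A custom callback can also carry its own state, for example to report average speed. The sketch below is one possible shape for that, not a standard API: `make_reporthook`, `stream`, and `clock` are hypothetical names, and the clock is injectable only so the behaviour can be checked offline with a fake timer.

```python
import io
import sys
import time


def make_reporthook(stream=sys.stdout, clock=time.monotonic):
    """Build a callback with the (block_num, block_size, total_size) signature.

    Hypothetical factory for illustration; not part of urllib.
    """
    start = clock()

    def hook(block_num, block_size, total_size):
        done = block_num * block_size
        elapsed = max(clock() - start, 1e-9)  # guard against division by zero
        speed_kib = done / 1024 / elapsed
        if total_size > 0:
            percent = min(100, done * 100 // total_size)
            stream.write(f'\r{percent}% ({speed_kib:.1f} KiB/s)')
        else:
            # Size unknown: show the running byte count instead
            stream.write(f'\r{done} bytes ({speed_kib:.1f} KiB/s)')
        stream.flush()

    return hook


# Offline demonstration with a fake clock and an in-memory stream
ticks = iter([0.0, 1.0, 2.0])
out = io.StringIO()
hook = make_reporthook(stream=out, clock=lambda: next(ticks))
hook(1, 1024, 4096)  # 1 KiB received after 1 s
hook(4, 1024, 4096)  # 4 KiB received after 2 s
```

The returned function can be passed as `reporthook=make_reporthook()` to `urllib.request.urlretrieve`.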
These methods let you track file download or upload progress in Python. Choose whichever best fits your needs.
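
The examples above all cover downloads. For uploads, one common approach is to wrap the file object so that every `read` call made by the HTTP client reports how much has been sent. This is a minimal sketch under that assumption; `ProgressReader` and the `callback(sent, total)` signature are hypothetical names, and the demonstration reads from an in-memory buffer instead of performing a real upload.

```python
import io


class ProgressReader:
    """Wrap a binary file object and report bytes read through `callback`.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, fileobj, total, callback):
        self._fileobj = fileobj
        self._total = total
        self._sent = 0
        self._callback = callback

    def __len__(self):
        # Lets a client that checks len() learn the body size
        return self._total

    def read(self, size=-1):
        data = self._fileobj.read(size)
        self._sent += len(data)
        if data:
            self._callback(self._sent, self._total)
        return data


# Offline demonstration: read an in-memory "file" in 4-byte pieces
payload = b'0123456789'
seen = []
reader = ProgressReader(io.BytesIO(payload), len(payload),
                        lambda sent, total: seen.append((sent, total)))
while reader.read(4):
    pass
# seen is now [(4, 10), (8, 10), (10, 10)]
```

With `requests`, a wrapper like this could be passed as the `data=` argument of `requests.put` or `requests.post`, which stream file-like bodies. How the client determines the body size varies, so check that behaviour against the client you use.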
