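To implement a simple multi-threaded download, pay attention to the following points:
1. File size: this can be read from the response header; for example, "Content-Length: 911" means the size is 911 bytes.
2. Task splitting: specify which block of the file each thread downloads by adding a header such as "Range: bytes=300-400" to the request (this requests bytes 300 through 400 inclusive, i.e. 101 bytes). Note that the valid byte range for a file is [0, size-1].
3. Merging the downloaded pieces: each thread saves the block it downloaded as a temporary file; once all threads have finished, the temporary files are written in order into a single final file.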
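The splitting and merging steps above can be sketched as follows. This is a minimal Python 3 illustration rather than the code from the listing: the helper names `split_ranges`, `range_header`, and `merge_parts` are hypothetical, and the sketch assumes the file size is already known (for example from Content-Length) and that each temporary file holds exactly the bytes of its range.

```python
def split_ranges(size, parts):
    """Split [0, size-1] into contiguous, inclusive (start, end) byte ranges."""
    if size <= 0 or parts <= 0:
        raise ValueError("size and parts must be positive")
    parts = min(parts, size)  # never create an empty range
    block = size // parts
    ranges = []
    for i in range(parts):
        start = i * block
        # The last range absorbs the remainder and ends at size - 1.
        end = size - 1 if i == parts - 1 else start + block - 1
        ranges.append((start, end))
    return ranges


def range_header(start, end):
    """Build the Range header value for one inclusive byte range."""
    return "bytes=%d-%d" % (start, end)


def merge_parts(part_paths, target_path, chunk_size=64 * 1024):
    """Concatenate the temporary part files, in the given order, into target_path."""
    with open(target_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                while True:
                    chunk = part.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
```

For the 911-byte example, `split_ranges(911, 4)` yields (0, 226), (227, 453), (454, 680), (681, 910); the last range carries the remainder so that it ends at size-1.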
Implementation code:
#!/usr/bin/python
# -*- coding: utf-8 -*-
# filename: paxel.py
# FROM: http://jb51.net/code/view/58/full/
# Jay modified it a little and save for further potential usage.
'''It is a multi-thread downloading tool
It was developed following axel.
Author: volans
E-mail: volansw [at] gmail.com
'''
import sys
import os
import time
import urllib
from threading import Thread
# in case you want to use http_proxy
local_proxies = {'http': 'http://131.139.58.200:8080'}
class AxelPython(Thread, urllib.FancyURLopener):
    '''Multi-thread downloading class.
    run() is a virtual method of Thread.
    '''

    def __init__(self, threadname, url, filename, ranges=0, proxies={}):
        Thread.__init__(self, name=threadname)
        urllib.FancyURLopener.__init__(self, proxies)
        self.name = threadname
        self.url = url
        self.filename = filename
        # (start, end) byte range assigned to this thread
        self.ranges = ranges
        self.downloaded = 0

    def run(self):
        '''virtual function in Thread'''
        # Resume support: bytes already saved in the temporary file
        try:
            self.downloaded = os.path.getsize(self.filename)
        except OSError:
            #print 'never downloaded'
            self.downloaded = 0
        # recompute the start point
        self.startpoint = self.ranges[0] + self.downloaded
        # This part is already complete
        if self.startpoint >= self.ranges[1]:
            print 'Part %s has already been downloaded.' % self.filename
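The resume check at the start of run() can be illustrated in isolation. The following is a hedged Python 3 sketch, not the author's code: `resume_point` is a hypothetical helper, and it assumes that (start, end) is an inclusive byte range and that the temporary file contains, in order, the bytes already fetched for that range. Under that assumption a part is treated as complete only once the next offset is past `end`, which is why the sketch compares with `>` where the listing uses `>=`.

```python
import os


def resume_point(part_path, start, end):
    """Return the next byte offset to request, or None if the part is complete.

    Assumes (start, end) is inclusive and the temporary file at part_path
    holds the bytes already downloaded for this range.
    """
    try:
        downloaded = os.path.getsize(part_path)
    except OSError:
        downloaded = 0  # no temporary file yet, so start from the beginning
    next_byte = start + downloaded
    if next_byte > end:  # every byte up to `end` is already on disk
        return None
    return next_byte
```

A caller would then request "bytes=<next_byte>-<end>" and append the response to the same temporary file.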