How to Download Files in Python

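Python provides several ways to download files from the Internet. This can be done over HTTP using either the urllib package or the requests library. This tutorial discusses how to use these libraries to download files from a URL with Python.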

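The requests Library

The requests library is one of the most popular libraries in Python. It lets you send HTTP/1.1 requests without manually adding query strings to your URLs or form-encoding your POST data.

With the requests library you can perform many tasks, including:

adding form data,
adding multipart files,
and accessing the response data from Python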
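
To see what the library handles for you, the sketch below builds requests without sending them and inspects the result. It is only an illustration: the example.com URLs and the field values are placeholders, and requests.Request(...).prepare() is used here so that the encoding can be examined offline.

```python
import requests

# Build (but do not send) a GET request; requests encodes the query string.
# The URL and parameters are placeholders chosen for illustration.
get_req = requests.Request(
    "GET", "https://example.com/search", params={"q": "python", "page": "2"}
).prepare()
print(get_req.url)  # https://example.com/search?q=python&page=2

# Build a POST request with form data; requests form-encodes the body.
post_req = requests.Request(
    "POST", "https://example.com/signup",
    data={"email": "user@example.com", "password": "secret"},
).prepare()
print(post_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(post_req.body)  # email=user%40example.com&password=secret

# Attach a file; requests switches to a multipart body with a generated boundary.
upload_req = requests.Request(
    "POST", "https://example.com/upload",
    files={"upload": ("notes.txt", b"hello")},
).prepare()
print(upload_req.headers["Content-Type"].split(";")[0])  # multipart/form-data
```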

Making a Request

The first thing you need to do is install the library, which is very simple:

pip install requests

To check that the installation succeeded, you can run a very simple test in the Python interpreter by typing:

import requests

If the installation succeeded, no error appears.

HTTP request methods include:

GET
POST
PUT
DELETE
OPTIONS
HEAD
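
Each of these methods has a helper function of the same name in requests. As a rough sketch (the example.com URL is a placeholder, and the requests are only prepared, never sent), the loop below builds one request per method and looks up the matching helper:

```python
import requests

METHODS = ["GET", "POST", "PUT", "DELETE", "OPTIONS", "HEAD"]

# Prepare one request per method without sending anything over the network.
prepared = [
    requests.Request(method, "https://example.com/resource").prepare()
    for method in METHODS
]
for req in prepared:
    # requests exposes a helper per method: requests.get, requests.post, ...
    helper = getattr(requests, req.method.lower())
    print(req.method, "->", f"requests.{helper.__name__}()")
```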

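Making a GET Request

Making a request is very simple, as shown below.

import requests
req = requests.get("https://www.google.com")

The code above fetches the Google home page and stores the response in the req variable. We can then go on to read other attributes of the response.

For example, to find out whether fetching the Google page succeeded, we query status_code.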

import requests
req = requests.get("https://www.google.com")
print(req.status_code)
# 200
# 200 means a successful request
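
Rather than comparing status_code by hand, you can let requests decide whether a response indicates an error. The sketch below shows one possible approach: describe_status is a hypothetical helper name, and the Response objects are built by hand purely so that the example runs without a network connection.

```python
import requests

def describe_status(response):
    """Return a short description of whether the response succeeded."""
    try:
        # raise_for_status() raises HTTPError for 4xx and 5xx status codes
        response.raise_for_status()
    except requests.HTTPError as exc:
        return f"failed: {exc}"
    return f"ok: {response.status_code}"

# Hand-built responses stand in for real ones (illustration only).
ok_response = requests.Response()
ok_response.status_code = 200
missing_response = requests.Response()
missing_response.status_code = 404

print(describe_status(ok_response))
print(describe_status(missing_response))
```

With a real request, you would call describe_status(requests.get(url)) instead.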

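What if we want to know the encoding of the Google page?

print(req.encoding)
# ISO-8859-1

You may also want to see the content of the response.

req.text

The following is only a truncated portion of the response content.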

'<!doctype html><html itemscope="" itemtype="http://schema.org/webpage" lang="en 
"><head><meta content="Search the world\'s information, including webpages, imag
es, videos and more. Google has many special features to help you find exactly w
hat you\'re looking for." name="description"><meta content="noodp" name="robots" 
><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta conten 
t="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image
"><title>Google</title><script>(function(){window.google={kEI:\'_Oq7WZT-LIf28QWv

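Making a POST Request

Simply put, a POST request is used to create or update data. It is most commonly used to submit forms.

Suppose you have a registration form that takes an email address and a password as input. When you click the submit button to register, the POST request looks like this.

import requests
data = {"email": "info@weixiaolive.com",
        "password": "12345"}
# form fields belong in the request body, so pass them with data= rather than params=
req = requests.post("http://www.google.com", data=data)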
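
Many APIs expect JSON rather than form-encoded data. Whether a given endpoint accepts JSON is an assumption you would need to verify; if it does, requests can serialize a dictionary through the json= argument. The sketch below prepares such a request against a placeholder URL without sending it.

```python
import json
import requests

payload = {"email": "user@example.com", "password": "secret"}  # placeholder values

# json= serializes the dict and sets the Content-Type header accordingly.
json_req = requests.Request(
    "POST", "https://example.com/api/signup", json=payload
).prepare()
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))
```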

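Making a PUT Request

A PUT request is similar to a POST request. It is used to update data. For example, the API call below shows how to perform a PUT request.

import requests
data = {"name": "tutsplus",
        "telephone": "12345"}
# send the updated fields in the request body
r = requests.put("http://www.contact.com", data=data)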

Making a DELETE Request

As the name suggests, a DELETE request is used to delete data. Here is an example of a DELETE request.

import requests
data = {'name': 'Tutsplus'}
url = "https://www.contact.com/api/"
response = requests.delete(url, params=data)

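The urllib Package

urllib is a package that collects several modules for working with URLs, namely:

urllib.request for opening and reading URLs.
urllib.error, which contains the exceptions raised by urllib.request.
urllib.parse for parsing URLs.
urllib.robotparser for parsing robots.txt files.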
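
As a small illustration of urllib.parse, the sketch below derives a local filename from a URL, which is handy when saving downloads. filename_from_url and its default value are hypothetical names introduced for this example.

```python
import posixpath
from urllib.parse import urlparse, unquote

def filename_from_url(url, default="download"):
    """Return the last path segment of a URL, or a default if there is none."""
    path = urlparse(url).path
    name = posixpath.basename(unquote(path))
    return name or default

print(filename_from_url("https://www.python.org/static/opengraph-icon-200x200.png"))
# opengraph-icon-200x200.png
print(filename_from_url("https://www.python.org/"))
# download
```

Because only the path component is used, any query string after the filename is ignored.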

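urllib.request offers a very simple interface in the form of the urlopen function, which can fetch URLs using a variety of protocols. It also offers a slightly more complex interface for handling basic authentication, cookies, proxies, and so on.

How to Fetch a URL with urllib

The simplest way to use urllib.request is as follows: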

import urllib.request
with urllib.request.urlopen('http://python.org/') as response:
   html = response.read()
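
Network requests can fail, and urllib.request reports failures through the exceptions in urllib.error. The following is a minimal sketch of handling them; fetch is a hypothetical helper name, and the 10-second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request

def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404
        print(f"HTTP error {exc.code} for {url}")
    except urllib.error.URLError as exc:
        # The resource could not be reached (DNS failure, refused connection, ...)
        print(f"Could not reach {url}: {exc.reason}")
    return None
```

HTTPError is a subclass of URLError, so it has to be caught first. A call such as fetch('http://python.org/') returns the page bytes on success.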

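If you want to retrieve an Internet resource and store it, you can do so with the urlretrieve() function.

import urllib.request
# with no filename argument, urlretrieve saves to a temporary file and returns its path
filename, headers = urllib.request.urlretrieve('http://python.org/')
with open(filename, 'rb') as f:
    html = f.read()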
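
urlretrieve() also accepts a target filename and an optional reporthook callback, which it calls with the block number, block size, and total size as the transfer proceeds. The sketch below uses that hook to print progress; show_progress and download are illustrative names. Note that the Python documentation lists urlretrieve() under its legacy interface.

```python
import urllib.request

def show_progress(block_num, block_size, total_size):
    """reporthook callback: print how much of the file has been read so far."""
    downloaded = block_num * block_size
    if total_size > 0:
        # The last block is usually partial, so cap the figure at the total
        downloaded = min(downloaded, total_size)
        print(f"{downloaded} of {total_size} bytes")
    else:
        # total_size is -1 when no Content-Length header is available
        print(f"{downloaded} bytes")

def download(url, filename):
    """Save url to filename, reporting progress, and return the local path."""
    path, _headers = urllib.request.urlretrieve(
        url, filename, reporthook=show_progress
    )
    return path
```

For example, download('http://python.org/', 'python.html') would save the page under the given name.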

Downloading an Image with Python

In this example, we download this sample image using both the requests library and the urllib module.

url = 'https://www.python.org/static/opengraph-icon-200x200.png'
# downloading with urllib 
# imported the urllib library 
import urllib.request
# Copy a network object to a local file 
urllib.request.urlretrieve(url, "python.png")
# downloading with requests 
# import the requests library 
import requests
# download the url contents in binary format 
r = requests.get(url)
# open method to open a file on your system and write the contents 
with open("python1.png", "wb") as code:
    code.write(r.content)
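
r.content holds the whole response in memory, which is fine for a small image but wasteful for large files. One common alternative, sketched here, is to pass stream=True and iterate over the body with iter_content(). The function names, chunk size, and timeout are illustrative choices.

```python
import requests

def write_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    written = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            written += len(chunk)
    return written

def download_streamed(url, path, chunk_size=8192):
    """Download url to path without loading the whole body into memory."""
    with requests.get(url, stream=True, timeout=30) as r:
        # Stop early on 4xx/5xx instead of saving an error page
        r.raise_for_status()
        return write_chunks(r.iter_content(chunk_size=chunk_size), path)
```

Calling download_streamed(url, "python2.png") would then save the image chunk by chunk.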

Downloading a PDF File with Python

In this example, we will download a PDF file about Google Trends.

url = 'https://static.googleusercontent.com/media/www.google.com/en//googleblogs/pdfs/google_predicting_the_present.pdf'
# downloading with urllib 
# import the urllib package 
import urllib.request
# Copy a network object to a local file 
urllib.request.urlretrieve(url, "tutorial.pdf")
# downloading with requests 
# import the requests library 
import requests
# download the file contents in binary format 
r = requests.get(url)
# open method to open a file on your system and write the contents 
with open("tutorial1.pdf", "wb") as code:
    code.write(r.content)
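
A server can answer with an HTML error page even when you asked for a PDF, and the code above would save it under a .pdf name regardless. A simple sanity check, offered as a sketch, is to look for the %PDF- signature that PDF files start with. looks_like_pdf is a hypothetical helper name.

```python
def looks_like_pdf(data):
    """Return True if the bytes begin with the PDF file signature."""
    return data.startswith(b"%PDF-")

print(looks_like_pdf(b"%PDF-1.4 ..."))        # True
print(looks_like_pdf(b"<!doctype html>..."))  # False
```

With the requests example, you could check looks_like_pdf(r.content) before writing the file.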

Downloading a Zip File with Python

In this example, we will download the contents of a GitHub repository and store the file locally.

url = 'https://codeload.github.com/fogleman/Minecraft/zip/master'
# downloading with requests 
# import the requests library 
import requests
# download the file contents in binary format 
r = requests.get(url)
# open method to open a file on your system and write the contents 
with open("minemaster1.zip", "wb") as code:
    code.write(r.content)
# downloading with urllib 
# import the urllib library 
import urllib.request
# Copy a network object to a local file 
urllib.request.urlretrieve(url, "minemaster.zip")
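
Once the archive is downloaded, the standard zipfile module can confirm that it really is a ZIP file and list what it contains. The following sketch works on bytes held in memory, such as r.content; list_zip_members is a hypothetical helper, and the tiny archive is built on the spot with made-up contents so that the example runs offline.

```python
import io
import zipfile

def list_zip_members(data):
    """Return the member names of a ZIP archive held in memory as bytes."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("data is not a ZIP archive")
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()

# Build a tiny archive in memory so the example runs offline (made-up contents).
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("README.md", "hello")
    archive.writestr("src/main.py", "print('hi')")

print(list_zip_members(sample.getvalue()))  # ['README.md', 'src/main.py']
```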

Downloading a Video with Python

In this example, we want to download a video lecture.

url = 'https://www.youtube.com/watch?v=aDwCCUfNFug'
# note: this is a watch-page URL, so the response body is HTML; substitute a direct link to a video file
video_name = url.split('/')[-1]
# downloading with requests
# import the requests library
import requests
print("Downloading file:%s" % video_name)
# download the file contents in binary format
r = requests.get(url)
# open method to open a file on your system and write the contents
with open("tutorial.mp4", "wb") as code:
    code.write(r.content)

# downloading with urllib
# import the urllib.request module (urlretrieve lives there in Python 3)
import urllib.request
print("Downloading file:%s" % video_name)
# Copy a network object to a local file
urllib.request.urlretrieve(url, "tutorial2.mp4")
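
Before saving a response as .mp4, it can help to confirm that the server actually returned video data. As a rough sketch, you can inspect the Content-Type header, which requests exposes as r.headers.get('Content-Type'). is_video is a hypothetical helper, and the header values shown are examples.

```python
def is_video(content_type):
    """Return True if a Content-Type header value names a video media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and compare the media type
    media_type = content_type.split(";")[0].strip().lower()
    return media_type.startswith("video/")

print(is_video("video/mp4"))                 # True
print(is_video("text/html; charset=utf-8"))  # False
print(is_video(None))                        # False
```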

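Downloading a CSV File with Python

You can also use the requests and urllib libraries to download a CSV file and process the response with the csv module. Let's use some sample CSV address data.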

import requests
url = "https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv"
# get the file response 
req = requests.get(url)
print(type(req))
# get the contents of the response 
url_content = req.content
# write the contents to a csv file; the with block closes the file automatically
with open('sample2.csv', 'wb') as csv_file:
    csv_file.write(url_content)
# Using Urllib 
#import necessary modules 
import urllib.request
import csv
import codecs
url = "https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv"
# download file from url 
res = urllib.request.urlopen(url)
# open a file 
data = csv.reader(codecs.iterdecode(res, "utf-8"))
for row in data:
    print(row)
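
With requests, the same parsing can be done on the decoded response text. As a sketch, parse_csv_text (a hypothetical helper) wraps the text in io.StringIO so that csv.reader can iterate over it. The sample rows are made up for the illustration and are not taken from addresses.csv.

```python
import csv
import io

def parse_csv_text(text):
    """Parse CSV text into a list of rows, each a list of strings."""
    return list(csv.reader(io.StringIO(text, newline="")))

sample = 'Ada,Lovelace,"12 Example St, Apt 4"\nAlan,Turing,34 Sample Rd\n'
for row in parse_csv_text(sample):
    print(row)
```

For a real download, you would pass req.text to parse_csv_text.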