Web Scraping Basics: Fetching a Page's Source Code
Code
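from urllib.request import urlopen

url = "https://baidu.com"
resp = urlopen(url)
# resp.read() returns bytes; .decode("utf-8") turns them into text
# encoding="utf-8" keeps the write independent of the platform's default encoding
with open("mybaidu.html", mode="w", encoding="utf-8") as f:
    f.write(resp.read().decode("utf-8"))  # write the page source to the file
print("over!")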
Result: running the script saves the page source to mybaidu.html and prints "over!".
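The same fetch-decode-write steps can also be wrapped in small functions. The following is only a minimal sketch, not part of the original script: the names `decode_body`, `fetch_source`, and `save_source`, the 10-second timeout, and the browser-like `User-Agent` header are all assumptions made for illustration. It reads the charset from the response headers instead of hard-coding UTF-8, and falls back to UTF-8 when the server does not declare one.

```python
from urllib.request import Request, urlopen


def decode_body(raw, charset=None, fallback="utf-8"):
    """Decode response bytes; use `fallback` when no charset was declared."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    return raw.decode(charset or fallback, errors="replace")


def fetch_source(url, timeout=10):
    """Return the page source of `url` as text (hypothetical helper)."""
    # Assumption: a browser-like User-Agent, since some sites treat the
    # default urllib one differently.
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        # Charset from the Content-Type header, or None if absent.
        charset = resp.headers.get_content_charset()
        return decode_body(resp.read(), charset)


def save_source(url, path):
    """Fetch `url` and write its source to `path`; return the character count."""
    html = fetch_source(url)
    with open(path, mode="w", encoding="utf-8") as f:
        f.write(html)
    return len(html)
```

Calling `save_source("https://baidu.com", "mybaidu.html")` should then behave like the script above, with the response closed automatically by the `with` block.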
Q.E.D.
A rookie who can't keep up with the Java grind
张培根同学