Matching artist and song names: scraping artist names, song information, and playback links from QQ Music
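Approach: open the page in Chrome, press F12 to open the developer tools, and refresh the page to check whether the information you need is present in the HTML source. If it is, fetch the page with requests and parse it with BeautifulSoup. If it is not, switch to the XHR filter in the Network tab, refresh the page again, find the request that returns the data as JSON, call that URL with requests, and extract the fields from the JSON response.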
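The first decision in that workflow (is the data already in the static HTML?) can be sketched as a small check. This is only an illustration and not part of the original script: the helper name `is_in_static_html` and both sample pages are hypothetical, and it uses the standard-library `html.parser` rather than BeautifulSoup so that it runs without extra installs.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def is_in_static_html(html_text, needle):
    """Return True if `needle` appears in the visible text of the HTML."""
    parser = _TextCollector()
    parser.feed(html_text)
    parser.close()
    return needle in "".join(parser.chunks)


# Hypothetical pages: one rendered on the server, one filled in by JavaScript.
static_page = "<html><body><ul><li>Song A</li></ul></body></html>"
dynamic_page = (
    '<html><body><div id="app"></div>'
    '<script>var s = "Song A";</script></body></html>'
)

print(is_in_static_html(static_page, "Song A"))
print(is_in_static_html(dynamic_page, "Song A"))
```

If the check fails for the HTML that requests returns, the data is most likely loaded by a separate request, which is the case the script below handles.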
Modules required: requests, csv
Code:
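import csv

import requests


def crawl():
    name = input("Enter the artist to search for: ")
    # Number of result pages to fetch
    try:
        pages = int(input("Enter the number of pages to fetch: "))
    except ValueError:
        print("Please enter a number")
        return
    # Number of songs to fetch per page
    number = input("Enter the number of songs per page: ")
    # The request URL is left blank in the original post. Set it to the JSON
    # search request found under XHR in the developer tools before running.
    url = ""
    headers = {
        "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36"
    }
    # Create the CSV file and write the header row; `with` closes the file on exit
    with open("./qq_music.csv", "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow(["Artist", "Song", "Duration (minutes)", "Play URL"])
        # Pages are 1-based and range() excludes its end, hence pages + 1
        for i in range(1, pages + 1):
            params = {
                "ct": "24",
                "qqmusic_ver": "1298",
                "new_json": "1",
                "remoteplace": "txt.yqq.song",
                "searchid": "67814518620895005",
                "t": "0",
                "aggr": "1",
                "cr": "1",
                "catZhida": "1",
                "lossless": "0",
                "flag_qc": "0",
                "p": i,       # page number
                "n": number,  # results per page
                "w": name,    # search keyword
                "g_tk": "5381",
                "loginUin": "0",
                "hostUin": "0",
                "format": "json",
                "inCharset": "utf8",
                "outCharset": "utf-8",
                "notice": "0",
                "platform": "yqq.json",
                "needNewCode": "0"
            }
            res = requests.get(url, headers=headers, params=params, timeout=10)
            res.raise_for_status()
            items = res.json()
            # Locate the list of songs in the response
            songs = items["data"]["song"]["list"]
            # Search keyword echoed back by the server, used as the artist name
            keyword = items["data"]["keyword"]
            for song in songs:
                # The original read song["album"]["name"], which is the album
                # title; song["name"] is assumed to hold the track title.
                music_name = song["name"]
                # "interval" is the duration in seconds; keep whole minutes
                minutes = int(song["interval"]) // 60
                play_url = song["url"]
                writer.writerow([keyword, music_name, minutes, play_url])


if __name__ == "__main__":
    crawl()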
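Separating the JSON extraction from the network call makes that step easy to check offline. The following is a minimal sketch under stated assumptions: `extract_rows` is a hypothetical helper, the sample payload is made up and mirrors only the keys the script touches (`data.keyword`, `data.song.list`, `interval`, `url`), and the track title is assumed to live under `name` next to the `album` object. The real response may be shaped differently.

```python
def extract_rows(payload):
    """Turn one decoded search response into a list of CSV rows."""
    data = payload["data"]
    keyword = data["keyword"]
    rows = []
    for song in data["song"]["list"]:
        # "interval" is treated as a duration in seconds; keep whole minutes
        minutes = int(song["interval"]) // 60
        rows.append([keyword, song["name"], minutes, song["url"]])
    return rows


# Made-up payload for illustration only; not captured from the real service.
sample = {
    "data": {
        "keyword": "Example Artist",
        "song": {
            "list": [
                {
                    "name": "Example Song",
                    "album": {"name": "Example Album"},
                    "interval": 225,
                    "url": "https://example.com/song/1",
                }
            ]
        },
    }
}

rows = extract_rows(sample)
print(rows)
```

With a helper like this, the loop in `crawl()` would only need to fetch each page and pass `res.json()` through it before writing the rows.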