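Today I read a Zhihu Q&A about how to scrape comments from NetEase Cloud Music.

The method in the first answer was, well, beyond me (not that this stops me from borrowing it for free), so I tried it myself and simply copied params and encSecKey out of the browser's F12 developer tools with Ctrl+C/Ctrl+V 😂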

"""
Without the encryption method from the first answer, only the first page
of comments can be fetched /(ㄒoㄒ)/~~
"""

import json

import requests

def get_song_html(url):
    """Request the comment JSON for the given NetEase Cloud Music page URL."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36",
        "Host": "music.163.com",
        "Upgrade-Insecure-Requests": '1'
    }
    # Form data copied straight from the browser's developer tools.
    data = {
        "params": "wFDYU3SUWyoCUekCRy6S6oPwnDHv/2Cvd8zMLZ1TCZexhtvcOGdiZdCw+UC7Y5QKC+7KdMMOZqc2eHjTDFfqVEPwuajKbFwywKKBuxe2gfkYBTNiC02rbZM5OxMM22qhVrZRPZMzAxWZz3t213Ts8A==",
        "encSecKey": "b6c67b2c848ef79a5cc0bc0261b6b88b75209276f9f1050091d64731398809b2e0d03081618d9c1a3d442ae1367e7e1a1f54224a6e94fed8eddc3bb337017d0b9f3bb8a274fbf58e8142020b7cbd909f9addf68c674f0232811fa18bf7a1dd90030a5f607ff2c488f20e2aab37dbab1bedff5cfa6684f6e49b69bfc727e943c1"
    }
    # Assumes the URL has the form "...?id=<number>" with nothing after the id.
    song_id = url.split("=")[1]
    # Known NetEase Cloud Music comment endpoints
    # Song
    url_so = "http://music.163.com/weapi/v1/resource/comments/R_SO_4_{}?csrf_token=".format(song_id)
    # Album
    url_al = "https://music.163.com/weapi/v1/resource/comments/R_AL_3_{}?csrf_token=".format(song_id)
    # Radio program
    url_dj = "https://music.163.com/weapi/v1/resource/comments/A_DJ_1_{}?csrf_token=".format(song_id)
    urls = [url_so, url_al, url_dj]  # the three commonly used URLs
    answer = input("Hot comments: 1\nAll comments: 2\nYour choice: ")
    if answer == '1':
        for url in urls:
            try:
                html = requests.post(url, headers=headers, data=data)
                html.raise_for_status()
            except requests.RequestException:
                return "Scraping failed!"
            else:
                get_song_hot_comments(html)
    elif answer == '2':
        for url in urls:
            try:
                html = requests.post(url, headers=headers, data=data)
                html.raise_for_status()
            except requests.RequestException:
                return "Scraping failed!"
            else:
                get_song_comments(html)
    else:
        print("Invalid input!")

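def get_song_hot_comments(html):
    """Write the hot comments from the response to a text file."""
    comments = json.loads(html.text)
    # .get() avoids a KeyError when the response has no 'hotComments' field.
    hot_comments = comments.get('hotComments')
    if hot_comments:
        try:
            with open('netease_hot_comments.txt', 'w', encoding='utf-8') as f:
                for C_list in hot_comments:
                    f.write("User: " + C_list['user']['nickname'] + '\n')
                    f.write("Comment: \n" + C_list['content'] + '\n')
                    f.write("\n" + "-" * 30 + "\n\n")
        except Exception:
            print("Failed to save hot comments!")
        else:
            print("Hot comments saved!")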

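def get_song_comments(html):
    """Write the regular (first-page) comments from the response to a text file."""
    comments = json.loads(html.text)
    # .get() avoids a KeyError when the response has no 'comments' field.
    comments = comments.get('comments')
    if comments:
        try:
            with open('netease_comments.txt', 'w', encoding='utf-8') as f:
                for C_list in comments:
                    f.write("User: " + C_list['user']['nickname'] + '\n')
                    f.write("Comment: \n" + C_list['content'] + '\n')
                    f.write("\n" + "-" * 30 + "\n\n")
        except Exception:
            print("Failed to save all comments!")
        else:
            print("All comments saved!")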

def main():
    url = input("Enter the music page URL (NetEase Cloud Music only): ")
    get_song_html(url)


if __name__ == "__main__":
    main()
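
The script above takes the id with url.split("=")[1], which breaks as soon as the URL carries a second query parameter. A slightly more forgiving way to pull the id out might look like the sketch below. The function name extract_id and the example URLs are my own placeholders, and I am assuming the pages use either a normal query string or the hash-routed form (/#/song?id=...).

```python
from urllib.parse import urlparse, parse_qs


def extract_id(page_url):
    """Return the value of the 'id' query parameter, or None if it is absent.

    Handles a plain query string (.../song?id=1) as well as the hash-routed
    form (.../#/song?id=1), where the query string sits inside the fragment.
    """
    parsed = urlparse(page_url)
    query = parsed.query
    if not query and "?" in parsed.fragment:
        # Hash-routed pages keep "?id=..." after the '#'.
        query = parsed.fragment.split("?", 1)[1]
    values = parse_qs(query).get("id")
    return values[0] if values else None


# Placeholder URLs, not real songs.
print(extract_id("https://music.163.com/#/song?id=12345"))
print(extract_id("https://music.163.com/song?id=12345&userid=9"))
```

With this, song_id = extract_id(url) could stand in for the split call, with a None check before building the request URLs.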

This also seems to fetch comments, but only the first page of them.
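
To make the shape of the response easier to see, here is a minimal sketch that flattens the decoded JSON into (nickname, content) pairs. Only the field names (hotComments, comments, user, nickname, content) come from the script above. The helper name extract_comments and the sample dictionary are made up for illustration and are not real API output.

```python
def extract_comments(payload, key="comments"):
    """Return a list of (nickname, content) pairs from a decoded response dict."""
    pairs = []
    # A missing or empty list is treated as "no comments".
    for item in payload.get(key) or []:
        user = item.get("user") or {}
        pairs.append((user.get("nickname", ""), item.get("content", "")))
    return pairs


# Hand-made sample for illustration only, not real data.
sample = {
    "hotComments": [{"user": {"nickname": "alice"}, "content": "great song"}],
    "comments": [
        {"user": {"nickname": "bob"}, "content": "first"},
        {"user": {"nickname": "carol"}, "content": "second"},
    ],
}

print(extract_comments(sample, "hotComments"))
print(extract_comments(sample))
```

In the real script the payload would be json.loads(html.text), and the pairs could then be written to a file in one place instead of in two nearly identical functions.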

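Later I read the second answer and found that there is an unencrypted API, so after trying it against several kinds of comment pages I found the following 👇: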

API

# Song: {id} {limit} {offset}
url_so = "http://music.163.com/api/v1/resource/comments/R_SO_4_{}?limit={}&offset={}"
# Album
url_al = "http://music.163.com/api/v1/resource/comments/R_AL_3_{}?limit={}&offset={}"
# Radio program
url_dj = "http://music.163.com/api/v1/resource/comments/A_DJ_1_{}?limit={}&offset={}"
# Video
url_vi = "http://music.163.com/api/v1/resource/comments/R_VI_62_{}?limit={}&offset={}"
# MV
url_mv = "http://music.163.com/api/v1/resource/comments/R_MV_5_{}?limit={}&offset={}"
# Playlist
url_pl = "http://music.163.com/api/v1/resource/comments/A_PL_0_{}?limit={}&offset={}"
# Event (this one seems special: it takes both an id and a uid)
url_ev = "http://music.163.com/api/v1/resource/comments/A_EV_2_{}_{}?limit={}&offset={}"
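
Since the URLs above differ only in the resource prefix, they can be folded into a single lookup table. This is just a sketch: the dictionary keys ("song", "album", and so on), the function name comment_url and the default limit of 20 are my own choices, and the event URL is left out because it needs two ids.

```python
# Resource prefixes copied from the URL list above.
PREFIXES = {
    "song": "R_SO_4_",
    "album": "R_AL_3_",
    "radio": "A_DJ_1_",
    "video": "R_VI_62_",
    "mv": "R_MV_5_",
    "playlist": "A_PL_0_",
}
BASE = "http://music.163.com/api/v1/resource/comments/"


def comment_url(kind, resource_id, limit=20, offset=0):
    """Build the comment URL for one resource; raises KeyError for an unknown kind."""
    return "{}{}{}?limit={}&offset={}".format(
        BASE, PREFIXES[kind], resource_id, limit, offset
    )


# 12345 is a placeholder id.
print(comment_url("song", 12345))
print(comment_url("playlist", 12345, limit=50, offset=100))
```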

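Each of these URLs returns comments. limit is the number of comments per page, and offset is the starting position, computed as (page number - 1) * limit.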
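
The offset rule can be written out as a small sketch. The function names page_offset and page_urls, the id 12345 and the page size of 20 are placeholders of mine. Only the song URL template and the formula come from the notes above.

```python
SONG_TEMPLATE = (
    "http://music.163.com/api/v1/resource/comments/R_SO_4_{}?limit={}&offset={}"
)


def page_offset(page, limit):
    """offset = (page - 1) * limit, with pages counted from 1."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must both be at least 1")
    return (page - 1) * limit


def page_urls(song_id, pages, limit=20):
    """Return the comment URLs for pages 1..pages of one song."""
    return [
        SONG_TEMPLATE.format(song_id, limit, page_offset(page, limit))
        for page in range(1, pages + 1)
    ]


# Placeholder id; prints three URLs with offsets 0, 20 and 40.
for url in page_urls(12345, 3):
    print(url)
```

Each URL could then be requested in turn (presumably with a plain requests.get, which I have not verified against every resource type), ideally with a short pause between requests.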

How to scrape the comment count of NetEase Cloud Music? - Zhihu
https://www.zhihu.com/question/36081767