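Today I read a Zhihu Q&A about how to scrape comments from NetEase Cloud Music.
The method in the first answer is, well, honestly beyond me (not that this stops me from using it for free), so I tried it myself, with params and encSecKey copied straight out of the F12 panel via Ctrl+C/V 😂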
"""Without the encryption method from that answer, only the first page of comments can be fetched /(ㄒoㄒ)/~~"""
import json

import requests


def get_song_html(url):
    """Request the comment endpoints for the given page URL and save the result."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36",
        "Host": "music.163.com",
        "Upgrade-Insecure-Requests": "1",
    }
    # params and encSecKey were copied as-is from the browser's F12 panel
    data = {
        "params": "wFDYU3SUWyoCUekCRy6S6oPwnDHv/2Cvd8zMLZ1TCZexhtvcOGdiZdCw+UC7Y5QKC+7KdMMOZqc2eHjTDFfqVEPwuajKbFwywKKBuxe2gfkYBTNiC02rbZM5OxMM22qhVrZRPZMzAxWZz3t213Ts8A==",
        "encSecKey": "b6c67b2c848ef79a5cc0bc0261b6b88b75209276f9f1050091d64731398809b2e0d03081618d9c1a3d442ae1367e7e1a1f54224a6e94fed8eddc3bb337017d0b9f3bb8a274fbf58e8142020b7cbd909f9addf68c674f0232811fa18bf7a1dd90030a5f607ff2c488f20e2aab37dbab1bedff5cfa6684f6e49b69bfc727e943c1",
    }
    song_id = url.split("=")[1]
    # Known NetEase Cloud Music comment endpoints
    url_so = "http://music.163.com/weapi/v1/resource/comments/R_SO_4_{}?csrf_token=".format(song_id)
    url_al = "https://music.163.com/weapi/v1/resource/comments/R_AL_3_{}?csrf_token=".format(song_id)
    url_dj = "https://music.163.com/weapi/v1/resource/comments/A_DJ_1_{}?csrf_token=".format(song_id)
    urls = [url_so, url_al, url_dj]

    answer = input("Hot comments: 1\nAll comments: 2\nEnter a choice: ")
    if answer == '1':
        save = get_song_hot_comments
    elif answer == '2':
        save = get_song_comments
    else:
        print("Invalid input!")
        return

    for url in urls:
        try:
            html = requests.post(url, headers=headers, data=data)
            html.raise_for_status()
        except requests.RequestException:
            # Stop at the first failed request, as before
            print("Request failed!")
            return
        save(html)


def get_song_hot_comments(html):
    """Write the hot comments from the response to a text file."""
    comments = json.loads(html.text)
    hot_comments = comments.get('hotComments')
    if hot_comments:
        try:
            with open('netease_hot_comments.txt', 'w', encoding='utf-8') as f:
                for c in hot_comments:
                    f.write("User: " + c['user']['nickname'] + '\n')
                    f.write("Comment: \n" + c['content'] + '\n')
                    f.write("\n" + "-" * 30 + "\n\n")
        except (OSError, KeyError):
            print("Failed to save hot comments!")
        else:
            print("Hot comments saved!")


def get_song_comments(html):
    """Write the first page of regular comments from the response to a text file."""
    comments = json.loads(html.text).get('comments')
    if comments:
        try:
            with open('netease_comments.txt', 'w', encoding='utf-8') as f:
                for c in comments:
                    f.write("User: " + c['user']['nickname'] + '\n')
                    f.write("Comment: \n" + c['content'] + '\n')
                    f.write("\n" + "-" * 30 + "\n\n")
        except (OSError, KeyError):
            print("Failed to save comments!")
        else:
            print("Comments saved!")


def main():
    url = input("Enter the URL of the music page (NetEase Cloud Music only): ")
    get_song_html(url)


if __name__ == "__main__":
    main()
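This also seems to fetch comments, but only the first page.
Then I read the second answer and found that there is an unencrypted API, so after trying it on several different kinds of comments I found the following 👇: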
API
url_so="http://music.163.com/api/v1/resource/comments/R_SO_4_{}?limit={}&offset={}"
url_al="http://music.163.com/api/v1/resource/comments/R_AL_3_{}?limit={}&offset={}"
url_dj="http://music.163.com/api/v1/resource/comments/A_DJ_1_{}?limit={}&offset={}"
url_vi="http://music.163.com/api/v1/resource/comments/R_VI_62_{}?limit={}&offset={}"
url_mv="http://music.163.com/api/v1/resource/comments/R_MV_5_{}?limit={}&offset={}"
url_pl="http://music.163.com/api/v1/resource/comments/A_PL_0_{}?limit={}&offset={}"
url_ev="http://music.163.com/api/v1/resource/comments/A_EV_2_{}_{}?limit={}&offset={}"
Each of these URLs returns comments. limit is the number of comments per page, and offset is the starting position: offset = (page number - 1) * limit
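To make the paging rule concrete, here is a rough sketch of how the URLs could be built page by page. The helper names (`page_offset`, `build_comment_url`, `iter_comments`), the short keys in `PREFIXES`, and the `fetch_json` callable are all made up for illustration; only the URL pattern, the prefixes, and the offset formula come from the list above. It also assumes the unencrypted API returns a JSON object with a `comments` list, like the response handled in the earlier script.

```python
from urllib.parse import urlencode

# Resource prefixes copied from the URL list; the short keys just mirror the
# url_* variable names. A_EV_2 is left out because it takes two IDs.
PREFIXES = {
    "so": "R_SO_4",
    "al": "R_AL_3",
    "dj": "A_DJ_1",
    "vi": "R_VI_62",
    "mv": "R_MV_5",
    "pl": "A_PL_0",
}

BASE = "http://music.163.com/api/v1/resource/comments/{}_{}?{}"


def page_offset(page, limit):
    """offset = (page - 1) * limit, with pages counted from 1."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    return (page - 1) * limit


def build_comment_url(kind, resource_id, page=1, limit=20):
    """Build the comment URL for one page of one resource."""
    query = urlencode({"limit": limit, "offset": page_offset(page, limit)})
    return BASE.format(PREFIXES[kind], resource_id, query)


def iter_comments(fetch_json, kind, resource_id, limit=20, max_pages=5):
    """Yield comments page by page until a page is empty or max_pages is hit.

    fetch_json is any callable that maps a URL to the decoded JSON dict
    (assumed to contain a 'comments' list).
    """
    for page in range(1, max_pages + 1):
        payload = fetch_json(build_comment_url(kind, resource_id, page, limit))
        comments = payload.get("comments") or []
        if not comments:
            break
        yield from comments
```

With requests, `fetch_json` could be something like `lambda u: requests.get(u, headers=headers).json()`; whether the endpoint still answers without extra headers or cookies is not something this sketch checks.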
How do I scrape the comment count of NetEase Cloud Music? - Zhihu
https://www.zhihu.com/question/36081767