Published: 2023-07-21 09:00
Hi, everyone! This is 魔王~
import requests
import re
import json
import os
headers = {
    'Host': 'mp.weixin.qq.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 NetType/WIFI MicroMessenger/7.0.20.1781(0x6700143B) WindowsWechat(0x63060012)',
    'Cookie': 'wxuin=2408215323; lang=zh_CN; pass_ticket=TsrY5cXMvTN01ghVFxFxT9k4jdPONJBt8mdl0ta20qxjUHNsnkkWLjib4gXCXSQM; devicetype=android-29; version=2800153f; wap_sid2=CJvmqfwIEooBeV9IQVVCUVAzdVBlWEo5NTlySFpON1Ffek5zTE9qRi1jdWZjVFMyOFYyM0FyVE9RSTRNZ3VuUXFTcU94Q3lKY1VyQlJ2RkEtTWFyRWFLeHhJUTRrWmp0N0VDZ05zOFV4d0kzZ1p5cXBIbTVBbEZGRWJteEt4Q0oxSjY4ZHFhODlaZnMyY1NBQUF+MOXS6ZIGOA1AlU4=',
}
for page in range(0, 3):
    # Fetch one page of the account's message history (10 entries per page)
    url = f'https://mp.weixin.qq.com/mp/profile_ext?action=getmsg&__biz=MzU0MzU4OTY2NQ==&f=json&offset={page * 10}&count=10&is_ok=1&scene=&uin=777&key=777&pass_ticket=&wxtoken=&appmsg_token=1161_7%252BO7mVaQbImKSRrYWqKBnNggweX4WNZaqjadeg~~&x5=0&f=json'
    json_data = requests.get(url=url, headers=headers).json()
    # general_msg_list is itself a JSON string, so it needs a second decode
    general_msg_list = json.loads(json_data['general_msg_list'])['list']
    # print(general_msg_list)
    title_list = []
    content_url_list = []
    for general_msg in general_msg_list:
        # Skip entries that carry no article info instead of raising KeyError
        app_msg_ext_info = general_msg.get('app_msg_ext_info')
        if not app_msg_ext_info:
            continue
        title_list.append(app_msg_ext_info['title'])
        content_url_list.append(app_msg_ext_info['content_url'])
        # One push can bundle several articles
        for multi_app_msg_item in app_msg_ext_info.get('multi_app_msg_item_list', []):
            title_list.append(multi_app_msg_item['title'])
            content_url_list.append(multi_app_msg_item['content_url'])
    # print(title_list)
    # print(content_url_list)
    for detail_title, detail_url in zip(title_list, content_url_list):
        # Replace characters that are not allowed in directory names,
        # and create img/ together with the article folder if missing
        folder = os.path.join('img', re.sub(r'[\\/:*?"<>|]', '_', detail_title))
        os.makedirs(folder, exist_ok=True)
        # 1. Send the request
        response = requests.get(url=detail_url, headers=headers)
        # 2. Get the data
        html_data = response.text
        # 3. Parse the data
        # re.findall: the first argument is the pattern to match,
        # the second argument is the string to search in
        img_list = re.findall(r'data-src="(https://mmbiz\.qpic\.cn/.*?)"', html_data)
        print(detail_title)
        # print(img_list)
        for img in img_list:
            # Skip GIFs; every other image is saved with a .jpeg extension
            if 'gif' not in img:
                img_data = requests.get(img).content
                img_name = img.split('/')[-2]
                print(img_name)
                with open(os.path.join(folder, f'{img_name}.jpeg'), mode='wb') as f:
                    f.write(img_data)
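The parsing steps in the script can be tried offline, without a valid cookie or token. The following is only a minimal sketch: the helper names `extract_articles`, `extract_image_urls`, and `safe_name`, as well as `sample_response` and `sample_html`, are hypothetical and made up for illustration. The sample data only mimics the fields the script reads, and the entry without `app_msg_ext_info` is an assumption about what a response might contain.

```python
import json
import re

# Hypothetical sample shaped like the fields the script reads from the
# profile_ext response; all values are made up for illustration.
sample_response = {
    "general_msg_list": json.dumps({
        "list": [
            {
                "app_msg_ext_info": {
                    "title": "Post A",
                    "content_url": "https://mp.weixin.qq.com/s/aaa",
                    "multi_app_msg_item_list": [
                        {"title": "Post B",
                         "content_url": "https://mp.weixin.qq.com/s/bbb"},
                    ],
                }
            },
            # Assumed case: an entry with no article attached
            {"comm_msg_info": {"id": 1}},
        ]
    })
}

# Hypothetical article HTML with one JPEG and one GIF
sample_html = (
    '<img data-src="https://mmbiz.qpic.cn/mmbiz_jpg/abc123/640?wx_fmt=jpeg">'
    '<img data-src="https://mmbiz.qpic.cn/mmbiz_gif/def456/640?wx_fmt=gif">'
)


def extract_articles(response):
    """Flatten one page of the message list into (title, url) pairs."""
    # general_msg_list is a JSON string nested inside the JSON response
    messages = json.loads(response["general_msg_list"])["list"]
    articles = []
    for message in messages:
        info = message.get("app_msg_ext_info")
        if not info:
            continue  # nothing to download for this entry
        articles.append((info["title"], info["content_url"]))
        for item in info.get("multi_app_msg_item_list", []):
            articles.append((item["title"], item["content_url"]))
    return articles


def extract_image_urls(html):
    """Return the non-GIF image URLs found in data-src attributes."""
    urls = re.findall(r'data-src="(https://mmbiz\.qpic\.cn/.*?)"', html)
    return [u for u in urls if "gif" not in u]


def safe_name(title):
    """Replace characters that are invalid in Windows file names."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()


if __name__ == "__main__":
    print(extract_articles(sample_response))
    print(extract_image_urls(sample_html))
    print(safe_name("a/b: c?"))
```

Keeping the parsing in small functions like these makes it possible to check the JSON handling and the regular expression against saved responses before sending any real requests.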
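Python: that was not so hard after all. The official account's posts have been scraped~
Well, that is the end of this article!
If you have more suggestions or questions, leave a comment or send me a private message! Let's keep working hard together (ง •_•)ง
If you enjoyed it, follow the blogger, or like, bookmark, and comment on the article!!!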