Scraping 3D Printing Pen Listings from Vipshop (唯品会) with Python

2022-08-07

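This chapter reviews how to use json, how to locate the data IDs, and how to analyze the request URLs. The final code is short, but the preparation that leads up to it takes a long time.

It also walks through how to fetch data when a site uses pagination and asynchronous loading.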
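As a rough illustration of the pagination analysis: the search endpoint used in this chapter carries `pageOffset` and `batchSize` query parameters, so each page is the same URL with a different offset. The sketch below rebuilds only a small subset of that query string with `urllib.parse.urlencode`. The helper names `build_search_url` and `page_offsets` are made up for this example, and whether the server accepts such a reduced parameter set is an assumption that has not been verified.

```python
from urllib.parse import urlencode

SEARCH_API = "https://mapi.vip.com/vips-mobile/rest/shopping/pc/search/product/rank"


def build_search_url(keyword, page_offset, batch_size=120):
    """Build the search URL for one page (hypothetical helper, reduced parameter set)."""
    params = {
        "callback": "getMerchandiseIds",
        "keyword": keyword,          # urlencode percent-encodes the Chinese characters as UTF-8
        "pageOffset": page_offset,   # index of the first product on this page
        "batchSize": batch_size,     # number of products requested per page
    }
    return SEARCH_API + "?" + urlencode(params)


def page_offsets(pages, batch_size=120):
    """Return the pageOffset value for each of the first `pages` pages."""
    return [n * batch_size for n in range(pages)]


for offset in page_offsets(2):
    print(build_search_url("3d打印笔", offset))
```

Building the query from a dict keeps the keyword readable in the source and makes the offset the only value that changes between requests.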

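import json
import requests

# pageOffset is the index of the first product in a batch and batchSize is 120.
# range(0, 120, 120) yields a single offset (0); raise the stop value to fetch further batches.
for i in range(0, 120, 120):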
    url = "https://mapi.vip.com/vips-mobile/rest/shopping/pc/search/product/rank?callback=getMerchandiseIds&app_name=shop_pc&app_version=4.0&warehouse=VIP_NH&fdc_area_id=104104103&client=pc&mobile_platform=1&province_id=104104&api_key=70f71280d5d547b2a7bb370a529aeea1&user_id=&mars_cid=1659756575760_55e439d221d0198bbd69abb4ca77e084&wap_consumer=a&standby_id=nature&keyword=3d%E6%89%93%E5%8D%B0%E7%AC%94&lv3CatIds=&lv2CatIds=&lv1CatIds=&brandStoreSns=&props=&priceMin=&priceMax=&vipService=&sort=0&pageOffset={}&channelId=1&gPlatform=PC&batchSize=120&_=1659773984353".format(i)

    headers={
        'cookie': 'vip_first_visitor=1; vip_address=%257B%2522pid%2522%253A%2522104104%2522%252C%2522cid%2522%253A%2522104104103%2522%252C%2522pname%2522%253A%2522%255Cu5e7f%255Cu4e1c%255Cu7701%2522%252C%2522cname%2522%253A%2522%255Cu6df1%255Cu5733%255Cu5e02%2522%257D; vip_province=104104; vip_province_name=%E5%B9%BF%E4%B8%9C%E7%9C%81; vip_city_name=%E6%B7%B1%E5%9C%B3%E5%B8%82; vip_city_code=104104103; vip_wh=VIP_NH; vip_ipver=31; user_class=a; mars_cid=1659756575760_55e439d221d0198bbd69abb4ca77e084; mars_sid=06cbc3095d3705cb8f275bd2373004ea; VIP_QR_FIRST=1; mars_pid=0; vip_tracker_source_from=; VipUINFO=luc%3Aa%7Csuc%3Aa%7Cbct%3Ac_new%7Chct%3Ac_new%7Cbdts%3A0%7Cbcts%3A0%7Ckfts%3A0%7Cc10%3A0%7Crcabt%3A0%7Cp2%3A0%7Cp3%3A1%7Cp4%3A0%7Cp5%3A0%7Cul%3A3105; VipDFT=4; visit_id=8A0E295EE88490770D831D53E077072B; vip_access_times=%7B%22list%22%3A6%2C%22detail%22%3A2%7D; pg_session_no=24',
        'referer': 'https://category.vip.com/',
        'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
    }
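    # First request: the search endpoint returns the product IDs wrapped in a JSONP callback.
    html = requests.get(url, headers=headers).text

    # Slice out the JSON object between '{"code"' and the closing ']}}' so json.loads can parse it.
    start = html.find('{"code"')
    ends = html.find(']}}') + len(']}}')
    for p in json.loads(html[start:ends])['data']['products']: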

        url="https://mapi.vip.com/vips-mobile/rest/shopping/pc/product/module/list/v2?callback=getMerchandiseDroplets1&app_name=shop_pc&app_version=4.0&warehouse=VIP_NH&fdc_area_id=104104103&client=pc&mobile_platform=1&province_id=104104&api_key=70f71280d5d547b2a7bb370a529aeea1&user_id=&mars_cid=1659756575760_55e439d221d0198bbd69abb4ca77e084&wap_consumer=a&productIds={}&scene=search&standby_id=nature&extParams=%7B%22stdSizeVids%22%3A%22%22%2C%22preheatTipsVer%22%3A%223%22%2C%22couponVer%22%3A%22v2%22%2C%22exclusivePrice%22%3A%221%22%2C%22iconSpec%22%3A%222x%22%2C%22ic2label%22%3A1%2C%22superHot%22%3A1%2C%22bigBrand%22%3A%221%22%7D&context=&_=1659773984355".format(p['pid'])
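        # Second request: fetch the details of this product by its pid.
        html = requests.get(url, headers=headers).text
        # Strip the JSONP wrapper again before parsing.
        start = html.find('{"code":')
        ends = html.find('"}}') + len('"}}')
        result = json.loads(html[start:ends])['data']['products'][0]

        print(result['title'])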
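Both requests above cut the JSON out of the response by searching for fixed marker strings. A slightly more general way to do the same thing is sketched below, assuming the response has the usual JSONP shape `callback({...})`. The function name `parse_jsonp` and the sample string are made up for illustration and are not real responses from the site.

```python
import json


def parse_jsonp(text):
    """Return the JSON payload of a JSONP response such as callback({...})."""
    start = text.find("(")    # first opening parenthesis: end of the callback name
    end = text.rfind(")")     # last closing parenthesis: end of the payload
    if start == -1 or end <= start:
        raise ValueError("not a JSONP response")
    return json.loads(text[start + 1:end])


# Made-up sample in the same shape as the search response.
sample = 'getMerchandiseIds({"code": 1, "data": {"products": [{"pid": "123"}]}})'
data = parse_jsonp(sample)
print(data["data"]["products"][0]["pid"])  # prints 123
```

Slicing between the outermost parentheses does not depend on how the payload happens to end, so the same helper would work for both the search response and the detail response.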

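At this point, result already contains the product's title, price, discount, thumbnail, and other product information.

Finally, output whichever fields the analysis needs. As a demonstration, only the product titles from the first two pages were printed here.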
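For later analysis it is usually handier to save the records than to print them. The following is a minimal sketch, assuming each record is a dict that has at least the `pid` and `title` keys used in the code above. The helper name `write_titles` and the sample rows are invented for illustration, and the rows are written to an in-memory buffer rather than a real file.

```python
import csv
import io


def write_titles(rows, fileobj):
    """Write the pid and title of each record as CSV; other keys are ignored."""
    writer = csv.DictWriter(fileobj, fieldnames=["pid", "title"], extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Invented sample records standing in for the `result` dicts collected in the loop.
rows = [
    {"pid": "1001", "title": "sample 3D printing pen A", "other": "ignored"},
    {"pid": "1002", "title": "sample 3D printing pen B", "other": "ignored"},
]

buffer = io.StringIO()
write_titles(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("titles.csv", "w", newline="", encoding="utf-8")` in place of the buffer.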
