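Over the past couple of days Baidu has finally started indexing my inner pages. It's only one or two a day, but I'm already pretty pleased (I'm easily satisfied). Sometimes I want to check how many of my pages Baidu has indexed, but my computer isn't at hand and doing it on a phone is awkward, so I wrote this Python script.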
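Features:
Sends an email to a specified address whenever new pages are indexed
Checks once every hour
The email lists the newly indexed pages and all indexed pages, along with their counts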
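The core of these features is a comparison between the previous snapshot and the current one. Here is a minimal sketch of that idea, separate from the real script: `find_new_pages` and `build_report` are names I made up for illustration, and the sketch compares titles directly, whereas the script below first checks whether the total count reported by the lookup page has grown.

```python
def find_new_pages(old_titles, titles, urls):
    """Return (title, url) pairs whose title was not in the previous snapshot."""
    seen = set(old_titles)
    return [(t, u) for t, u in zip(titles, urls) if t not in seen]


def build_report(old_titles, titles, urls):
    """Build the plain-text report, or return None when nothing new was indexed."""
    new_pages = find_new_pages(old_titles, titles, urls)
    if not new_pages:
        return None
    lines = [
        f"Newly indexed: {len(new_pages)}",
        f"Total indexed: {len(titles)}",
        "",
        "New pages:",
    ]
    lines += [f"{t} {u}" for t, u in new_pages]
    lines += ["", "All pages:"]
    lines += [f"{t} {u}" for t, u in zip(titles, urls)]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical data, only to show the shape of the inputs.
    previous = ["Home"]
    titles = ["Home", "About"]
    urls = ["https://example.com/", "https://example.com/about"]
    print(build_report(previous, titles, urls))
```

Returning None when nothing changed keeps the "should I send an email?" decision in one place, so the hourly loop only has to check for None.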
Code:
import smtplib
import time
from email.mime.text import MIMEText

import parsel
import requests


def get_info(domain):
    """Query chinaz's Baidu index lookup; return titles, URLs and the total count, or None on failure."""
    url = f'http://tool.chinaz.com/baidu/?lm=0&wd={domain}&rn=50'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
    }
    try:
        html = parsel.Selector(requests.get(url=url, headers=headers, timeout=10).text)
        titles = html.xpath("//a[@class='col-blue02']/text()").extract()
        urls = html.xpath("//a[@class='col-blue02']/@href").extract()
        pages = html.xpath("//a[@class='item'][last()]/text()").extract()
        nums = int(html.xpath("//span[@class='col-blue02'][2]/a/text()").extract()[0])
        # The last pager link holds the page count; each result page lists 10 entries.
        pages = int(pages[0]) * 10 if pages else 0
        if pages == 0:
            return {'titles': titles, 'urls': urls, 'nums': nums}
        all_title = []
        all_urls = []
        for i in range(0, pages, 10):
            url = f'http://tool.chinaz.com/baidu/?pn={i}&wd={domain}&rn=10'
            html = parsel.Selector(requests.get(url=url, headers=headers, timeout=10).text)
            print(i)
            all_title.extend(html.xpath("//a[@class='col-blue02']/text()").extract())
            all_urls.extend(html.xpath("//a[@class='col-blue02']/@href").extract())
        return {'titles': all_title, 'urls': all_urls, 'nums': nums}
    except (requests.RequestException, IndexError, ValueError):
        # A missing count usually means the lookup page returned no results for this domain.
        print("This site is blocked!!!")
        return None


def send_mail(infomation, old_nums, old_titles, recever):
    """Email the new and full listings if the indexed count grew; return the current snapshot."""
    newnums = infomation['nums']
    titles = infomation['titles']
    urls = infomation['urls']
    new_contents = ''
    if newnums > old_nums:
        for title, url in zip(titles, urls):
            if title not in old_titles:
                new_contents += f'{title}\n{url}\n'
        now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
        mail_content = (f'Boss, as of {now}\n'
                        f'Newly indexed by Baidu: {newnums - old_nums}\n'
                        f'Total indexed: {newnums}\n'
                        'Newly indexed pages:\n')
        mail_content += new_contents
        mail_content += 'Currently indexed pages:\n'
        # By:www.lanol.cn
        # author:Lan
        for title, url in zip(titles, urls):
            mail_content += f'{title}\n{url}\n'
        try:
            content = MIMEText(mail_content, 'plain', 'utf-8')
            content['To'] = recever  # recipients; separate multiple recipients with commas
            content['From'] = str("admin@lanol.cn")  # sender; best wrapped in str(...), otherwise the header may come out garbled
            content['Subject'] = f"Boss, Baidu has indexed more of your site!!! {now}"  # subject line
            smtp_server = smtplib.SMTP_SSL("smtp.exmail.qq.com", 465)
            smtp_server.login("{sender email}", "{email password}")
            smtp_server.sendmail("admin@lanol.cn", [r.strip() for r in recever.split(',')], content.as_string())
            smtp_server.quit()
        except smtplib.SMTPException:
            # Must come before the generic handler, otherwise it is never reached.
            print("Error: unable to send the email")
        except Exception as e:
            print(str(e))
    return {'newnums': newnums, 'titles': titles, 'urls': urls}


if __name__ == '__main__':
    domain = input("Enter the domain to monitor (e.g. www.lanol.cn, without the https:// prefix): ")
    reveiver = input("Enter the recipient email (e.g. 78013994@qq.com): ")
    nums = 0
    titles = []
    urls = []
    while True:
        infomation = get_info(domain)
        if infomation is not None:  # skip this round if the lookup failed
            new_infomation = send_mail(infomation, nums, titles, reveiver)
            nums = new_infomation['newnums']
            titles = new_infomation['titles']
            urls = new_infomation['urls']
            print(f'{time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())} check succeeded')
        time.sleep(3600)
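The script keeps the sender login as placeholders inside the source. If you would rather not edit the file, one option is to read the credentials from environment variables. This is only a sketch of that idea, not how the script above works: the `MONITOR_SMTP_*` variable names and the three helper functions are made up for illustration, while `EmailMessage`, `SMTP_SSL`, `login` and `send_message` are standard-library calls.

```python
import os
import smtplib
from email.message import EmailMessage


def load_smtp_settings(env=os.environ):
    """Read SMTP settings from the environment (hypothetical variable names)."""
    return {
        "host": env.get("MONITOR_SMTP_HOST", "smtp.exmail.qq.com"),
        "port": int(env.get("MONITOR_SMTP_PORT", "465")),
        "user": env["MONITOR_SMTP_USER"],          # raises KeyError if unset
        "password": env["MONITOR_SMTP_PASSWORD"],  # raises KeyError if unset
    }


def build_message(sender, receiver, subject, body):
    """Build a UTF-8 plain-text message; non-ASCII headers are encoded automatically."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, settings):
    """Send the message over SMTP with implicit SSL, closing the connection afterwards."""
    with smtplib.SMTP_SSL(settings["host"], settings["port"]) as server:
        server.login(settings["user"], settings["password"])
        server.send_message(msg)
```

Before starting the monitor you would then set `MONITOR_SMTP_USER` and `MONITOR_SMTP_PASSWORD` in your shell, so the password never appears in the script you share.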
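Haha, I just found a bug: after the first hour no further emails were sent. It has been fixed, so please download it again.
Download the finished build:
百度收录查询发送邮箱.zip (size: 11.4 MB)
It has been scanned by security software and found clean, so feel free to download it.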
Slick trick. Python really can do anything.
This is a powerful feature; I'll try it when I have time.
Impressive.