Found 142 matching posts.
2020-03-08
python3 + flask + sqlalchemy
python3 + flask + sqlalchemy + ORM (1): connecting to a MySQL database

1. Create a new Flask project in PyCharm.
2. Install flask, PyMySQL and flask-sqlalchemy.
3. Add a config.py file to the project:

DEBUG = True
# dialect+driver://root:1q2w3e4r5t@127.0.0.1:3306/
DIALECT = 'mysql'
DRIVER = 'pymysql'
USERNAME = 'root'
PASSWORD = '1q2w3e4r5t'
HOST = '127.0.0.1'
PORT = 3306
DATABASE = 'db_demo1'
SQLALCHEMY_DATABASE_URI = "{}+{}://{}:{}@{}:{}/{}?charset=utf8".format(DIALECT, DRIVER, USERNAME, PASSWORD, HOST, PORT, DATABASE)
SQLALCHEMY_TRACK_MODIFICATIONS = False
print(SQLALCHEMY_DATABASE_URI)

4. app.py:

from flask import Flask
import config
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.from_object(config)
db = SQLAlchemy(app)
db.create_all()

@app.route('/')
def index():
    return 'index'

if __name__ == '__main__':
    app.run()

Run app.py; output like the following shows the connection succeeded:

FLASK_APP = test_sqlalchemy.py
FLASK_ENV = development
FLASK_DEBUG = 1
In folder /Users/autotest/PycharmProjects/python3_flask
/Users/autotest/PycharmProjects/python3_flask/venv/bin/python /Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py --module --multiproc --qt-support=auto --client 127.0.0.1 --port 55365 --file flask run
pydev debugger: process 3089 is connecting
Connected to pydev debugger (build 182.4505.26)
* Serving Flask app "test_sqlalchemy.py" (lazy loading)
* Environment: development
* Debug mode: on
mysql+pymysql://root:1q2w3e4r5t@127.0.0.1:3306/db_demo1?charset=utf8
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
* Restarting with stat
pydev debugger: process 3090 is connecting
mysql+pymysql://root:1q2w3e4r5t@127.0.0.1:3306/db_demo1?charset=utf8
* Debugger is active!
* Debugger PIN: 216-502-598

python3 + flask + sqlalchemy + ORM (2): creating a table

Add a table for storing articles, with the fields id, title and content. config.py is the same as above.

In the Flask app, define a class Blog with id, title and content. When execution reaches db.create_all(), a table named blog is created in the database automatically:

from flask import Flask
import config
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.ext.declarative import declarative_base

app = Flask(__name__)
app.config.from_object(config)
db = SQLAlchemy(app)
Base = declarative_base()

class Blog(db.Model):
    __tablename__ = 'blog'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    title = db.Column(db.String(100), nullable=False)
    content = db.Column(db.Text, nullable=True)

db.create_all()

@app.route('/')
def index():
    return 'index'

if __name__ == '__main__':
    app.run(debug=True)

Start the Flask app and inspect the database: the new table and its columns are there, so the table was created successfully.

Inserting, querying, updating and deleting rows:

# insert
blog = Blog(title="first blog", content="this is my first blog")
db.session.add(blog)
db.session.commit()

# query
# res = Blog.query.filter(Blog.title == "first blog")[0]
res = Blog.query.filter(Blog.title == "first blog").first()
print(res.title)

# update
blog_edit = Blog.query.filter(Blog.title == "first blog").first()
blog_edit.title = "new first blog"
db.session.commit()

# delete
blog_delete = Blog.query.filter(Blog.title == "first blog").first()
db.session.delete(blog_delete)
db.session.commit()

Full code:
from flask import Flask
import config
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.ext.declarative import declarative_base

app = Flask(__name__)
app.config.from_object(config)
db = SQLAlchemy(app)
Base = declarative_base()

class Blog(db.Model):
    __tablename__ = 'blog'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    title = db.Column(db.String(100), nullable=False)
    content = db.Column(db.Text, nullable=True)

db.create_all()

@app.route('/')
def index():
    # insert
    blog = Blog(title="first blog", content="this is my first blog")
    db.session.add(blog)
    db.session.commit()
    # query
    # res = Blog.query.filter(Blog.title == "first blog")[0]
    res = Blog.query.filter(Blog.title == "first blog").first()
    print(res.title)
    # update
    blog_edit = Blog.query.filter(Blog.title == "first blog").first()
    blog_edit.title = "new first blog"
    db.session.commit()
    # delete
    blog_delete = Blog.query.filter(Blog.title == "first blog").first()
    db.session.delete(blog_delete)
    db.session.commit()
    return 'index'

if __name__ == '__main__':
    app.run(debug=True)

python3 + flask + sqlalchemy + ORM (3): many-to-many relationships

An article can have several tags and a tag can belong to several articles, so articles and tags form a many-to-many relationship.

config.py:

DEBUG = True
# dialect+driver://root:1q2w3e4r5t@127.0.0.1:3306/
DIALECT = 'mysql'
DRIVER = 'pymysql'
USERNAME = 'demo_user'
PASSWORD = 'demo_123'
HOST = '172.16.10.6'
PORT = 3306
DATABASE = 'db_demo1'
SQLALCHEMY_DATABASE_URI = "{}+{}://{}:{}@{}:{}/{}?charset=utf8".format(DIALECT, DRIVER, USERNAME, PASSWORD, HOST, PORT, DATABASE)
SQLALCHEMY_TRACK_MODIFICATIONS = False
print(SQLALCHEMY_DATABASE_URI)

app.py:

from flask import Flask
import config
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.ext.declarative import declarative_base

app = Flask(__name__)
app.config.from_object(config)
db = SQLAlchemy(app)
Base = declarative_base()

article_tag = db.Table('article_tag',
    db.Column('article_id', db.Integer, db.ForeignKey("article.id"), primary_key=True),
    db.Column('tag_id', db.Integer, db.ForeignKey("tag.id"), primary_key=True)
)

class Article(db.Model):
    __tablename__ = 'article'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    title = db.Column(db.String(100), nullable=True)
    tags = db.relationship('Tag', secondary=article_tag, backref=db.backref('articles'))

class Tag(db.Model):
    __tablename__ = 'tag'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(100), nullable=True)

db.create_all()

@app.route('/')
def index():
    article1 = Article(title="aaa")
    article2 = Article(title="bbb")
    tag1 = Tag(name='1111')
    tag2 = Tag(name='2222')
    article1.tags.append(tag1)
    article1.tags.append(tag2)
    article2.tags.append(tag1)
    article2.tags.append(tag2)
    db.session.add(article1)
    db.session.add(article2)
    db.session.add(tag1)
    db.session.add(tag2)
    db.session.commit()
    return 'index'

if __name__ == '__main__':
    app.run(debug=True)
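With the backref in place, the relationship can be read from either side. A minimal sketch of querying it, assuming the many-to-many app above is saved as app.py and its index route has been hit once so the sample rows exist:

from app import app, Article, Tag

with app.app_context():
    article = Article.query.filter(Article.title == "aaa").first()
    print([tag.name for tag in article.tags])      # tags attached to this article
    tag = Tag.query.filter(Tag.name == '1111').first()
    print([a.title for a in tag.articles])         # articles reached through the backref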
864 reads · 0 comments · 0 likes
2020-03-06
Crawling HD wallpapers from the LOL skin site with the Scrapy framework
Packaged build: click to enter.

Code: the spider file

# -*- coding: utf-8 -*-
import scrapy
from practice.items import PracticeItem
from urllib import parse


class LolskinSpider(scrapy.Spider):
    name = 'lolskin'
    allowed_domains = ['lolskin.cn']
    start_urls = ['https://lolskin.cn/champions.html']

    # collect the link to every champion page
    def parse(self, response):
        item = PracticeItem()
        item['urls'] = response.xpath('//div[2]/div[1]/div/ul/li/a/@href').extract()
        for url in item['urls']:
            self.csurl = 'https://lolskin.cn'
            yield scrapy.Request(url=parse.urljoin(self.csurl, url), dont_filter=True, callback=self.bizhi)
        return item

    # collect the link to every skin of a champion
    def bizhi(self, response):
        skins = response.xpath('//td/a/@href').extract()
        for skin in skins:
            yield scrapy.Request(url=parse.urljoin(self.csurl, skin), dont_filter=True, callback=self.get_bzurl)

    # collect the wallpaper URLs of each skin
    def get_bzurl(self, response):
        item = PracticeItem()
        image_urls = response.xpath('//body/div[1]/div/a/@href').extract()
        image_name = response.xpath('//h1/text()').extract()
        yield {
            'image_urls': image_urls,
            'image_name': image_name
        }
        return item

items.py

# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html

import scrapy


class PracticeItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    # titles = scrapy.Field()
    # yxpngs = scrapy.Field()
    urls = scrapy.Field()
    skin_name = scrapy.Field()   # skin name
    image_urls = scrapy.Field()  # skin wallpaper urls
    images = scrapy.Field()

pipelines.py

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html

import os
import re
from scrapy.pipelines.images import ImagesPipeline
import scrapy

# class PracticePipeline(object):
#     def __init__(self):
#         self.file = open('text.csv', 'a+')
#
#     def process_item(self, item, spider):
#         # os.chdir('lolskin')
#         # for title in item['titles']:
#         #     os.makedirs(title)
#         skin_name = item['skin_name']
#         skin_jpg = item['skin_jpg']
#         for i in range(len(skin_name)):
#             self.file.write(f'{skin_name[i]},{skin_jpg} ')
#         self.file.flush()
#         return item
#
#     def down_bizhi(self, item, spider):
#         self.file.close()


class LoLPipeline(ImagesPipeline):
    def get_media_requests(self, item, info):
        for image_url in item['image_urls']:
            yield scrapy.Request(image_url, meta={'image_name': item['image_name']})

    # customise the download path and file name
    def file_path(self, request, response=None, info=None):
        image_name = re.findall('/skin/(.*?)/', request.url)[0] + "/" + request.meta['image_name'][0] + '.jpg'
        return image_name

settings.py

# -*- coding: utf-8 -*-

# Scrapy settings for practice project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
# https://docs.scrapy.org/en/latest/topics/settings.html
# https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html

import os

BOT_NAME = 'practice'

SPIDER_MODULES = ['practice.spiders']
NEWSPIDER_MODULE = 'practice.spiders'

# Crawl responsibly by identifying yourself (and your website) on the user-agent
# USER_AGENT = 'practice (+http://www.yourdomain.com)'

# Obey robots.txt rules
ROBOTSTXT_OBEY = False

# Configure maximum concurrent requests performed by Scrapy (default: 16)
# CONCURRENT_REQUESTS = 32

# Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
# set a download delay
DOWNLOAD_DELAY = 1
# The download delay setting will honor only one of:
# CONCURRENT_REQUESTS_PER_DOMAIN = 16
# CONCURRENT_REQUESTS_PER_IP = 16

# Disable cookies (enabled by default)
# COOKIES_ENABLED = False

# Disable Telnet Console (enabled by default)
# TELNETCONSOLE_ENABLED = False

# Override the default request headers:
# DEFAULT_REQUEST_HEADERS = {
#     'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
#     'Accept-Language': 'en',
# }

# Enable or disable spider middlewares
# See https://docs.scrapy.org/en/latest/topics/spider-middleware.html
# SPIDER_MIDDLEWARES = {
#     'practice.middlewares.PracticeSpiderMiddleware': 543,
# }

# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# DOWNLOADER_MIDDLEWARES = {
#     'practice.middlewares.PracticeDownloaderMiddleware': 543,
# }

# Enable or disable extensions
# See https://docs.scrapy.org/en/latest/topics/extensions.html
# EXTENSIONS = {
#     'scrapy.extensions.telnet.TelnetConsole': None,
# }

# Configure item pipelines
# See https://docs.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
    # 'practice.pipelines.PracticePipeline': 300,
    # 'scrapy.pipelines.images.ImagesPipeline': 1,
    'practice.pipelines.LoLPipeline': 1
}

# folder where the downloaded images are stored
IMAGES_STORE = r'E:\Python\scrapy\practice\practice\LOLskin'

# Enable and configure the AutoThrottle extension (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/autothrottle.html
# AUTOTHROTTLE_ENABLED = True
# The initial download delay
# AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
# AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
# AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
# AUTOTHROTTLE_DEBUG = False

# Enable and configure HTTP caching (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
# HTTPCACHE_ENABLED = True
# HTTPCACHE_EXPIRATION_SECS = 0
# HTTPCACHE_DIR = 'httpcache'
# HTTPCACHE_IGNORE_HTTP_CODES = []
# HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'

main.py

from scrapy.cmdline import execute

execute(['scrapy', 'crawl', 'lolskin'])
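Before running the full crawl, the spider's XPath expressions can be sanity-checked against the live page. A minimal sketch, assuming network access and that the site's markup still matches the selectors used above:

from scrapy import Selector
import requests

# fetch the champions page once and try the spider's first selector on it
html = requests.get('https://lolskin.cn/champions.html').text
sel = Selector(text=html)
print(sel.xpath('//div[2]/div[1]/div/ul/li/a/@href').extract()[:5])  # first few champion links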
636 reads · 0 comments · 0 likes
2020-03-01
Completing partial URLs quickly with urljoin
When the scraped URLs only contain the tail, for example

./target/000001.html
./target/000002.html
./target/000003.html
./target/000004.html

urljoin can complete them against the base URL:

from urllib import parse

houzhui = './target/000004.html'
url = 'http://pycs.greedyai.com/'
url = parse.urljoin(url, houzhui)
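The same call covers every tail in one loop; a quick sketch with the example URLs above:

from urllib import parse

base = 'http://pycs.greedyai.com/'
tails = ['./target/000001.html', './target/000002.html',
         './target/000003.html', './target/000004.html']
for tail in tails:
    print(parse.urljoin(base, tail))  # prints http://pycs.greedyai.com/target/000001.html and so on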
821 reads · 0 comments · 0 likes
2020-03-01
Quickly quoting request headers
Paste the headers copied from the browser into headers_str, and the loop prints them as quoted dict entries:

import re

headers_str = '''
origin: https://sou.zhaopin.com
referer: https://sou.zhaopin.com/?p=3&jl=765&kw=python&kt=3
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-site
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36
'''

pattern = '^(.*?): (.*)$'
for line in headers_str.splitlines():
    if line.strip():  # skip the blank first and last lines
        print(re.sub(pattern, r"'\1': '\2',", line))
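If the goal is a ready-to-use headers dict rather than printable text, the same line split can build it directly; a minimal sketch reusing a shortened headers_str:

headers_str = '''
origin: https://sou.zhaopin.com
referer: https://sou.zhaopin.com/?p=3&jl=765&kw=python&kt=3
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36
'''

# split each "name: value" line once and collect the pairs into a dict
headers = dict(
    line.split(': ', 1)
    for line in headers_str.splitlines()
    if line.strip()
)
print(headers['user-agent'])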
659 reads · 0 comments · 0 likes
2020-03-01
Scrapy notes
Quick-launch script (快捷启动.py):

from scrapy.cmdline import execute

execute(['scrapy', 'crawl', 'ZhiLianZhaoPin'])

Create a Scrapy project. Open a command prompt, type the following and press Enter:

scrapy startproject stockstar

scrapy startproject is the fixed command and stockstar is the project name.

Turn off robots.txt compliance in settings.py:

ROBOTSTXT_OBEY = False

Create a spider (see the generated skeleton below):

scrapy genspider stock quote.stockstar.com

where stock is the spider name and quote.stockstar.com is the domain it is allowed to crawl.
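For reference, the spider file that scrapy genspider creates is roughly the following skeleton (the exact template output depends on the Scrapy version):

import scrapy


class StockSpider(scrapy.Spider):
    name = 'stock'
    allowed_domains = ['quote.stockstar.com']
    start_urls = ['http://quote.stockstar.com/']

    def parse(self, response):
        # parsing logic goes here
        pass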
468 reads · 0 comments · 0 likes
2020-02-26
Python crawling: Selenium notes
ChromeDriver download: http://npm.taobao.org/mirrors/chromedriver/

Basic usage:

browser = webdriver.Firefox()                            # pick a browser
browser.find_element_by_id('kw').send_keys('selenium')   # locate an element by id and type into it
browser.find_element_by_id('su').click()                 # locate the search button (id "su") and click it
browser.quit()                                           # quit and close every window and driver process
browser.close()                                          # close the current window
browser.implicitly_wait(10)                              # implicit wait

Headless mode:

# selenium: 3.12.0
# webdriver: 2.38
# chrome.exe: 65.0.3325.181 (official build, 32-bit)
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument('--no-sandbox')                        # fixes the "DevToolsActivePort file doesn't exist" error
chrome_options.add_argument('window-size=1920x3000')               # browser resolution
chrome_options.add_argument('--disable-gpu')                       # recommended in the Chrome docs to work around a bug
chrome_options.add_argument('--hide-scrollbars')                   # hide scrollbars on some special pages
chrome_options.add_argument('blink-settings=imagesEnabled=false')  # skip images for speed
chrome_options.add_argument('--headless')                          # no visible window; required on Linux without a display
chrome_options.binary_location = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"  # point to the browser binary manually
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get('https://www.baidu.com')
print('hao123' in driver.page_source)
driver.close()  # always close the browser to release resources

Keyboard actions:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
browser.find_element_by_id('kw').send_keys('selenium')         # type a query
browser.find_element_by_id('kw').send_keys(Keys.SPACE)         # press space
browser.find_element_by_id('kw').send_keys(Keys.CONTROL, 'a')  # Ctrl+A, select all
browser.find_element_by_id('kw').send_keys(Keys.CONTROL, 'x')  # Ctrl+X, cut
browser.find_element_by_id('kw').send_keys(Keys.CONTROL, 'v')  # Ctrl+V, paste
browser.find_element_by_id('kw').send_keys(Keys.ENTER)         # press enter

Mouse actions:

from selenium import webdriver
from selenium.webdriver import ActionChains

driver = webdriver.Chrome()
driver.get('https://www.baidu.com')
driver.find_element_by_id('kw').send_keys('selenium')
driver.find_element_by_id('su').click()
element = driver.find_element_by_name('some_name')           # element to hover over, located by its name attribute
ActionChains(driver).move_to_element(element).perform()      # move the mouse onto it
driver.find_element_by_link_text('some link text').click()   # click a link by its visible text

Locating an element for a screenshot:

location = img.location   # top-left corner of the element
print(location)
size = img.size           # element width and height
left = location['x']
top = location['y']
right = left + size['width']
bottom = top + size['height']

Saving and reusing cookies:

# save cookies
import json

cookies = driver.get_cookies()
with open("cookies.txt", "w") as fp:
    json.dump(cookies, fp)

# load cookies back into selenium
def read_cookies():
    # a page on the target domain must be opened before cookies can be set
    driver.get("http://www.baidu.com")
    with open("cookies.txt", "r") as fp:
        cookies = json.load(fp)
        for cookie in cookies:
            # cookie.pop('domain')  # uncomment if a "domain invalid" error is raised
            driver.add_cookie(cookie)

# turn the saved cookies into a plain dict
cookies_dict = dict()
with open('cookies.txt', 'r') as f:
    cookies = json.load(f)
    for cookie in cookies:
        cookies_dict[cookie['name']] = cookie['value']
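The location/size arithmetic above is typically used to cut one element (a captcha image, for example) out of a full-page screenshot. A minimal sketch with Pillow, which is not part of the original notes; the element id is only a placeholder:

from selenium import webdriver
from PIL import Image

driver = webdriver.Chrome()
driver.get('https://www.example.com')
driver.save_screenshot('page.png')          # full-page screenshot

img = driver.find_element_by_id('captcha')  # placeholder id, replace with the real element
location, size = img.location, img.size
box = (location['x'], location['y'],
       location['x'] + size['width'], location['y'] + size['height'])
Image.open('page.png').crop(box).save('element.png')  # crop just that element
driver.quit()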
631 reads · 0 comments · 0 likes
2020-02-24
Python crawling: XPath syntax notes
1. Selecting nodes

Common path expressions:

nodename: selects all child nodes of the named node, e.g. xpath('//div') selects all child nodes of the div nodes
/: selects from the root node, e.g. xpath('/div') selects the div node starting from the root
//: selects matching nodes anywhere in the document, regardless of position, e.g. xpath('//div') selects all div nodes
.: selects the current node, e.g. xpath('./div') selects the div nodes under the current node
..: selects the parent of the current node, e.g. xpath('..') goes back up one node
@: selects attributes, e.g. xpath('//@class') selects all class attributes

2. Predicates

A predicate is written inside square brackets and finds a specific node, or a node containing a specific value:

xpath('/body/div[1]'): the first div under body
xpath('/body/div[last()]'): the last div under body
xpath('/body/div[last()-1]'): the second-to-last div under body
xpath('/body/div[position()<3]'): the first two div nodes under body
xpath('/body/div[@class]'): div nodes under body that have a class attribute
xpath('/body/div[@class="main"]'): div nodes under body whose class is "main"
xpath('/body/div[price>35.00]'): div nodes under body whose price element value is greater than 35

3. Wildcards

XPath uses wildcards to select unknown elements:

xpath('/div/*'): all child nodes of div
xpath('/div[@*]'): all div nodes that have any attribute

4. Selecting multiple paths

The "|" operator selects several paths at once:

xpath('//div|//table'): all div and table nodes

5. XPath axes

An axis defines a node set relative to the current node:

ancestor, xpath('./ancestor::*'): all ancestors of the current node (parents, grandparents, ...)
ancestor-or-self, xpath('./ancestor-or-self::*'): all ancestors plus the current node itself
attribute, xpath('./attribute::*'): all attributes of the current node
child, xpath('./child::*'): all children of the current node
descendant, xpath('./descendant::*'): all descendants of the current node (children, grandchildren, ...)
following, xpath('./following::*'): all nodes after the closing tag of the current node
following-sibling, xpath('./following-sibling::*'): siblings after the current node
parent, xpath('./parent::*'): the parent of the current node
preceding, xpath('./preceding::*'): all nodes before the opening tag of the current node
preceding-sibling, xpath('./preceding-sibling::*'): siblings before the current node
self, xpath('./self::*'): the current node itself

6. Functions

Functions allow fuzzier matching:

starts-with: xpath('//div[starts-with(@id,"ma")]') selects div nodes whose id starts with "ma"
contains: xpath('//div[contains(@id,"ma")]') selects div nodes whose id contains "ma"
and: xpath('//div[contains(@id,"ma") and contains(@id,"in")]') selects div nodes whose id contains both "ma" and "in"
text(): xpath('//div[contains(text(),"ma")]') selects div nodes whose text contains "ma"

Scrapy XPath documentation: http://doc.scrapy.org/en/0.14/topics/selectors.html

Selecting unknown nodes. XPath wildcards can select unknown XML elements:

*: matches any element node
@*: matches any attribute node
node(): matches any node of any kind

Some path expressions and their results:

/bookstore/*: all child elements of the bookstore element
//*: all elements in the document
//title[@*]: all title elements that have any attribute

Selecting several paths. With the "|" operator, several paths can be selected at once:

//book/title | //book/price: all title and price elements of all book elements
//title | //price: all title and price elements in the document
/bookstore/book/title | //price: all title elements of book elements under bookstore, plus every price element in the document
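A few of the expressions above, tried against a small hand-written snippet with lxml (the HTML is made up purely for illustration):

from lxml import etree

html = etree.HTML('''
<body>
  <div id="main" class="main"><p>first</p><p>second</p></div>
  <div id="other"><span>no class here</span></div>
</body>
''')

print(html.xpath('//div'))                              # every div node
print(html.xpath('/html/body/div[1]/@id'))              # id of the first div under body
print(html.xpath('//div[@class="main"]/p/text()'))      # text of the p nodes inside the "main" div
print(html.xpath('//div[starts-with(@id, "ma")]/@id'))  # div ids that start with "ma"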
811 reads · 0 comments · 0 likes