Crawler Engineering - Comparing Generic Web Content Parsers
Sample article: http://www.qdaily.com/articles/65298.html
Sample list page: http://www.qdaily.com/
newspaper (released Jun 14, 2017); install with: pip3 install newspaper3k
GitHub: https://github.com/codelucas/newspaper
```python
# Passing the language explicitly gives noticeably better results
from newspaper import Article

In [10]: article = Article(url, language='zh')
In [11]: article.download()
In [12]: article.parse()
In [21]: article.html
Out[21]: '<!DOCTYPE html><html><head> <meta charset="UT ...
```
Python Frameworks - Scrapy Notes
Initial configuration

```python
download_delay = 20  # download delay

custom_settings = {
    "HTTPERROR_ALLOWED_CODES": [404],  # treat 404 responses as allowed
    "COOKIES_ENABLED": True,
    "DOWNLOAD_DELAY": 5,
    "DOWNLOAD_TIMEOUT": 5,
    "REFERER_ENABLED": False,  # disable the automatic Referer header
    "REDIRECT_ENABLED": False,  # disable redirects
    "RETRY_HTTP_CODES": [429, 401, 403, 408, 414, 500, 502, 503, 504],  # HTTP codes to retry
    "DEFAULT_REQUEST_HEADERS": {
        ...
```
Python Programming - On Multithreading and Multiprocessing
Thread pool

```python
from concurrent.futures import ThreadPoolExecutor

def main(url):
    pass

Pool = ThreadPoolExecutor(max_workers=10)
list(Pool.map(main, links))

# Passing multiple arguments
def add(x, y):
    return x + y

nums = [(1, 2), (3, 4), (5, 6)]
with ThreadPoolExecutor(max_workers=3) as executor:
    results = executor.map(add, *zip(*nums))

# No arguments: the target must then be a zero-argument callable
import threading

def worker():
    pass

for i in range(10):
    threading.Thread(target=worker).start()
```
Process pool

```python
import multiprocessing as mp

pool = mp.Pool(processes=10)
...
```
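The snippet above is cut off; a minimal runnable sketch of the same mp.Pool pattern might look like this (the square task and the fork start method are assumptions for illustration, not from the original post):

```python
import multiprocessing as mp

def square(x):
    return x * x

# The "fork" start method (Linux/macOS) lets this run at module top level;
# on Windows use "spawn" together with an `if __name__ == "__main__"` guard.
ctx = mp.get_context("fork")
with ctx.Pool(processes=4) as pool:
    results = pool.map(square, range(5))
print(results)  # [0, 1, 4, 9, 16]
```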
FastAPI and aiohttp
Basic FastAPI usage

```python
from fastapi import FastAPI, Form, File
import uvicorn

app = FastAPI()
app = FastAPI(docs_url=None, redoc_url=None)  # disable the interactive docs

# Root path
@app.get("/")
async def root():
    return "Hello World"

# Path parameter
@app.get("/items/{item_id}")
async def read_item(item_id: int):
    return fake_items_db[item_id]

# Query parameters: GET /items?skip=1&limit=2
@app.get("/items")
async def re ...
```
Web Reverse Engineering - Google Translate's Old API
Cleaning up a reverse of Google's translation API so it is ready whenever it is needed.
There is a third-party package, googletrans, but it has not been updated in two years and no longer works, so reversing the API ourselves is the practical option.
First, capture the traffic.
Comparing several requests shows that only two parameters, f.sid and _reqid, need to be worked out; everything else is fixed.
Let's see where those two come from: open an incognito window and fire up Fiddler.
f.sid sits in the HTML returned by the first visit to https://translate.google.cn/.
f.sid only takes one request and does not change afterwards. Code to fetch it:
```python
import re
import requests

def get_sid():
    url = "https://translate.google.cn/"
    resp = requests.get(url)
    sid = re.search(r'"FdrFJe":"(.+?)",', resp.text).group(1)
    return sid
```
_reqid is different: it changes on every request, so its generation logic has to be reversed.
Start by setting a breakpoint ...
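The derivation is truncated above, so as a hedge: on Google's batchexecute-style endpoints, _reqid is commonly observed to be a random four-digit seed plus 100000 for each subsequent request in the same session. The sketch below encodes that assumption; it is not taken from the post itself:

```python
import random

class ReqidGenerator:
    """Emits _reqid values: random 4-digit seed, then +100000 per request.

    This scheme is an assumption based on typical batchexecute traffic,
    not a confirmed reconstruction of the post's analysis.
    """

    def __init__(self):
        self.seed = random.randint(1000, 9999)
        self.count = 0

    def next(self):
        reqid = self.seed + self.count * 100000
        self.count += 1
        return reqid

gen = ReqidGenerator()
first, second = gen.next(), gen.next()
```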
Assorted Scripts for Eastmoney (东方财富)
Web Reverse Engineering - JSL (加速乐) Anti-Bot on a Government Ministry Site
https://www.mps.gov.cn/n2254098/n4904352/index.html
A direct visit returns a block of JavaScript whose main job is to assign cookie values.
Copy it out and run it in the console.
Never mind the result for now; the target of this reversal is how two of the cookie values are generated.
__jsluid_s is returned by the server on the first request.
Could __jsl_clearance_s be the result of executing that first JS snippet? Compare:
```
# Result of executing the JS
__jsl_clearance_s=1654493484.607|-1|eU3CZr0IIIZOWUJfz3vNntOeewo%3D
# Value actually set in cookies
__jsl_clearance_s=1654493484.979|0|70p4p0lk5rF59kkzj6oslUuf8Qk%3D
```
Apparently not; if it were that simple this post would not exist 😬
Back to the capture: the second request also returns a JS snippet.
It opens with a large array, a telltale sign of obfuscator.io (ob) obfuscation, so deobfuscate it first:
Paste the result into the browser and jump to the end of the code:
It is now clear that the final go function is what generates the cookie ...
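The walkthrough is truncated here, but on typical JSL (加速乐) pages the deobfuscated script ends with a go({...}) call whose JSON argument drives the cookie computation. A stdlib-only sketch of pulling that argument out; the sample_js string is fabricated for illustration, and real payloads carry more fields:

```python
import json
import re

# Fabricated stand-in for the deobfuscated second response
sample_js = ';go({"bts":["1654493484.979|0|","70p4p0lk5rF59kkzj6oslUuf8Qk%3D"],"chars":"abcdef","ct":"xxxx","ha":"sha1","tn":"__jsl_clearance_s","vt":"3600"})'

def extract_go_payload(js_text):
    """Grab the JSON object passed to the final go(...) call."""
    m = re.search(r'go\((\{.*?\})\)', js_text)
    return json.loads(m.group(1))

payload = extract_go_payload(sample_js)
print(payload["tn"])  # __jsl_clearance_s
```

The brute-force over `chars` that produces the final cookie value would follow from this payload; that part is beyond what the truncated text shows.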
Activation and Cracking
Web Reverse Engineering - A Look at Bypassing Cloudflare's 5-Second Shield
cloudscraper (released Mar 15, 2022), documentation on PyPI
```python
import cloudscraper

url = "https://www.curseforge.com/"
scraper = cloudscraper.create_scraper(interpreter="nodejs")
print(scraper.get(url).text)
```

cloudflare-scrape (23 Feb 2020): too old, abandoned; https://github.com/Anorov/cloudflare-scrape
Python Programming - A Round-up of Magic Methods
iter() creates an iterator, used together with next():
```python
def dtime_iterator():
    dtime_lst = [1, 2, 3]
    return iter(dtime_lst)

it = dtime_iterator()

In [2]: next(it)
Out[2]: 1
In [3]: next(it)
Out[3]: 2
In [4]: next(it)
Out[4]: 3
```
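As a supplement (not in the original post), the same protocol can be implemented by hand with the __iter__ and __next__ magic methods; the Countdown class is a made-up example:

```python
class Countdown:
    """Counts n, n-1, ..., 1 via the iterator protocol."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

print(list(Countdown(3)))  # [3, 2, 1]
```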
__call__
Calling a function abc(x1, x2, ...) is really equivalent to type(abc).__call__(abc, x1, x2, ...)
```python
class Person:
    def __call__(self, other):
        return f'Hi {other}'

In [13]: Person()("cxs")
Out[13]: 'Hi cxs'
```
__str__ and __repr__

```python
class C ...
```
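The original snippet is cut off after class C, so here is a minimal substitute illustrating the two methods (the Point class is hypothetical, not the post's example):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        # Informal, user-facing text: used by print() and str()
        return f"({self.x}, {self.y})"

    def __repr__(self):
        # Unambiguous, developer-facing text: used in the REPL and by repr()
        return f"Point(x={self.x}, y={self.y})"

p = Point(1, 2)
print(str(p))   # (1, 2)
print(repr(p))  # Point(x=1, y=2)
```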