Python Programming Tips (Part 1)

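Making package imports easier

import os
import sys

# Add the grandparent, parent, and current directories (relative to the working directory) to sys.path
sys.path.append(os.path.abspath("../.."))
sys.path.append(os.path.abspath(".."))
sys.path.append(os.path.abspath("."))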

String padding

In [1]: "cxs".ljust(16, "0")
Out[1]: 'cxs0000000000000'

In [2]: "cxs".center(16, "0")
Out[2]: '000000cxs0000000'

In [3]: "cxs".rjust(16, "0")
Out[3]: '0000000000000cxs'

Tuple unpacking

t = ('cxs', 16, 2, 3)
name, *num = t
# name is 'cxs'
# num is [16, 2, 3]

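Using the `timeit` module

from timeit import timeit

# main is assumed to be a function defined elsewhere
t1 = timeit(lambda: main(), number=1)
# The comprehension must sit inside the lambda body; a bare generator expression next to number=1 is a syntax error
t2 = timeit(lambda: [chr(n) for n in range(100)], number=1)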

startswith with multiple prefixes (endswith works the same way)

"abc".startswith(("a", "b", "c"))
Out[2]: True
"bc".startswith(("a", "b", "c"))
Out[3]: True

Testing several characters with in

my_string = "Hello, world!"
chars_to_check = ['w', 'x', 'y', 'z']

# a in c or b in c
if any(char in my_string for char in chars_to_check):
    pass

# a in c and b in c
if all(char in my_string for char in chars_to_check):
    pass

The random module

choices and sample

In [5]: lst
Out[5]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [9]: random.choices(lst, k=5)
Out[9]: [10, 2, 10, 2, 2]

In [10]: random.sample(lst, k=5)
Out[10]: [5, 2, 8, 10, 7]

# Generate a random string
In [13]: "".join(random.sample(string.ascii_letters + string.digits, 16))
Out[13]: 'YEeaLtmwoFTsK2yx'

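Both draw random samples, but sample draws without replacement (no element is picked twice), while choices draws with replacement.

shuffle: randomize the order in place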

In [11]: random.shuffle(lst)
In [12]: lst
Out[12]: [4, 5, 8, 3, 9, 10, 1, 7, 6, 2]

Random numbers

In [1]: from random import randint
In [2]: randint(1, 250)
Out[2]: 26

# Works like randint, but excludes the upper bound
In [5]: from random import randrange
In [6]: randrange(0, 11)
Out[6]: 0

In [3]: from random import uniform
In [4]: uniform(0, 1)
Out[4]: 0.5936927386128741

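Generating a random key

Use the standard-library modules os and binascii to generate a random private_key of n*2 hexadecimal characters (n random bytes, two hex digits per byte).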

import os
import binascii

n = 16
os.urandom(n)
>>> b'_5#y2L"F4Q8z\n\xec]/'

binascii.hexlify(os.urandom(n)).decode()
>>> '450a016f5e5a306f0c6805e3813e9db6'

str.splitlines()

s = "a \n b\n c"
s.splitlines()
Out[19]: ['a ', ' b', ' c']

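Merging two dicts, with the latter overriding the former

a = {'x': 1, 'z': 3 }
b = {'y': 2, 'z': 4 }
{**a, **b}  # returns a new dict; a and b are left unchanged
Out[18]: {'x': 1, 'z': 4, 'y': 2}

a.update(b)  # updates a in place and returns None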
a
Out[3]: {'x': 1, 'z': 4, 'y': 2}

a = {'x': 1, 'z': 3 }
b = {'y': 2, 'z': 4 }
{**b, **a}
Out[19]: {'y': 2, 'z': 3, 'x': 1}

b.update(a)
b
Out[6]: {'y': 2, 'z': 3, 'x': 1}
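
On Python 3.9 or later, the same two behaviours are also available as operators. A minimal sketch, assuming a 3.9+ interpreter (the variable names merged and a_before are only for illustration):

```python
a = {'x': 1, 'z': 3}
b = {'y': 2, 'z': 4}

# Requires Python 3.9+: builds a new dict; on duplicate keys the right operand wins
merged = a | b
a_before = dict(a)  # a is still unchanged at this point

# In-place variant, same effect as a.update(b)
a |= b
```

On older interpreters the {**a, **b} and update() forms above remain the way to go.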

Creating a dict from keyword arguments

In [14]: dict(name=123, age=345)
Out[14]: {'name': 123, 'age': 345}

About *args and **kwargs

def anyargs(*args, **kwargs):
    print(args)  # a tuple of the positional arguments
    print(kwargs)  # a dict of the keyword arguments

Keyword-only arguments

def cxs(*args, name):
    ...

Parameters declared after *args are keyword-only, so name can only be supplied by keyword, as in cxs(1, 2, 3, name="cxs").
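
A small sketch of what happens in both cases. The function body here is hypothetical (the original shows only the signature); it just returns what it received so the call can be inspected:

```python
def cxs(*args, name):
    # Hypothetical body: hand back the arguments for inspection
    return args, name


# name supplied by keyword: accepted
result = cxs(1, 2, 3, name="cxs")

# Everything positional: "cxs" is swallowed by *args and name stays unfilled,
# so Python raises TypeError
try:
    cxs(1, 2, 3, "cxs")
except TypeError as exc:
    error = str(exc)
```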

A tuple is actually created by the commas, not by the parentheses

In [5]: a = 1,2,3,4,5
In [6]: a
Out[6]: (1, 2, 3, 4, 5)
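
The one-element case makes the role of the comma most visible. A short sketch (variable names are illustrative only):

```python
single = 1,      # the trailing comma makes a one-element tuple
grouped = (1)    # parentheses alone only group the expression; this is the int 1
empty = ()       # the empty tuple is the one case written with parentheses alone
```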

Converting a float to a fraction

from fractions import Fraction

In [19]: s = 3.75
In [20]: y = Fraction(*s.as_integer_ratio())
In [21]: y
Out[21]: Fraction(15, 4)

Lists and NumPy arrays

# Plain lists
>>> x = [1, 2, 3, 4]
>>> y = [5, 6, 7, 8]
>>> x * 2
[1, 2, 3, 4, 1, 2, 3, 4]
>>> x + 10
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "int") to list
>>> x + y
[1, 2, 3, 4, 5, 6, 7, 8]

# Numpy arrays
>>> import numpy as np
>>> ax = np.array([1, 2, 3, 4])
>>> ay = np.array([5, 6, 7, 8])
>>> ax * 2
array([2, 4, 6, 8])
>>> ax + 10
array([11, 12, 13, 14])
>>> ax + ay
array([ 6, 8, 10, 12])
>>> ax * ay
array([ 5, 12, 21, 32])

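Reading files of unknown encoding with `latin-1`

Decoding text of unknown encoding as latin-1 never raises a decoding error, because every possible byte value maps to a character. The decoded text may not be entirely correct, but enough useful data can usually be extracted from it.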
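
A minimal sketch of the difference, using an in-memory byte string instead of a real file. The sample bytes are made up for illustration: valid UTF-8 followed by one byte that is invalid in UTF-8.

```python
# UTF-8 bytes for "café", plus 0xff, which can never appear in valid UTF-8
raw = "café".encode("utf-8") + b"\xff"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# latin-1 maps each of the 256 byte values to exactly one character,
# so decoding cannot fail and is reversible
text = raw.decode("latin-1")
round_trip = text.encode("latin-1")
```

The ASCII part ("caf") comes through intact, while the non-ASCII bytes decode to the wrong characters. For a file, the same idea applies by passing encoding="latin-1" to open().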

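Prefer generator expressions `( )` over list comprehensions `[ ]`

When the result is only consumed once (for example by sum(), min(), or join()), this is very memory-efficient, because the data does not have to be read into a temporary list first.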
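
A rough sketch of the memory difference. The input size of 100,000 is an arbitrary choice, and sys.getsizeof only measures the container object itself, so treat the numbers as indicative:

```python
import sys

numbers = range(100_000)

# List comprehension: the whole list exists in memory before sum() sees it
total_list = sum([n * n for n in numbers])

# Generator expression: sum() pulls one value at a time, no temporary list
total_gen = sum(n * n for n in numbers)

# The generator object stays small no matter how many items it will yield
list_size = sys.getsizeof([n * n for n in numbers])
gen_size = sys.getsizeof(n * n for n in numbers)
```

Both forms compute the same total; only the peak memory differs.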

Advanced regular expression techniques

Case-insensitive matching

# Matches both 2000iu and 1000IU

ret = re.findall(r"\d+iu", line, flags=re.IGNORECASE)

Difference between + and *

+ is equivalent to {1,} (one or more)

* is equivalent to {0,} (zero or more)
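
A quick check of the two quantifiers against their counted forms, where * means zero or more ({0,}) and + means one or more ({1,}). The sample string is made up for illustration:

```python
import re

sample = "a ab abb"

# "ab*" allows zero b's, so the bare "a" matches too
star = re.findall(r"ab*", sample)
# "ab+" needs at least one b, so the bare "a" is skipped
plus = re.findall(r"ab+", sample)

# The explicit counted forms behave identically
star_braces = re.findall(r"ab{0,}", sample)
plus_braces = re.findall(r"ab{1,}", sample)
```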

Non-capturing groups

s = "jfkdjfkjd,udkjfkdfj.jfkdjfk"

# Capturing group: the matched delimiters are kept in the result
re.split(r'(,|\.)', s)
Out[7]: ['jfkdjfkjd', ',', 'udkjfkdfj', '.', 'jfkdjfk']

# ?: non-capturing group: the delimiters are dropped
re.split(r'(?:,|\.)', s)
Out[8]: ['jfkdjfkjd', 'udkjfkdfj', 'jfkdjfk']

# Equivalent to
re.split(r'[,\.]', s)
Out[9]: ['jfkdjfkjd', 'udkjfkdfj', 'jfkdjfk']

Named groups with search

s = "my name is cxs, today 24 years old"
re.search(r"is (?P<name>.+?), today (?P<age>\d+?) ", s).group(1)
Out[16]: 'cxs'
re.search(r"is (?P<name>.+?), today (?P<age>\d+?) ", s).group("name")
Out[17]: 'cxs'
re.search(r"is (?P<name>.+?), today (?P<age>\d+?) ", s).group("age")
Out[18]: '24'

Reordering captured groups with sub, similar to the $1, $2 backreferences in editors such as Notepad++

>>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
>>> re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)
'Today is 2012-11-27. PyCon starts 2013-3-13.'

# Using named groups
>>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
>>> re.sub(r'(?P<month>\d+)/(?P<day>\d+)/(?P<year>\d+)', r'\g<year>-\g<month>-\g<day>', text)
'Today is 2012-11-27. PyCon starts 2013-3-13.'

Converting Unicode escape sequences to Chinese text

s = "\\u4ece\\u96f6\\u5f00\\u59cb\\u7684\\u5f02\\u4e16\\u754c\\u751f"
text = s.encode("utf-8").decode("unicode_escape")
In [2]: text
Out[2]: '从零开始的异世界生'

Generating ASCII art text

pyfiglet -f [font] [text]

Example output:

pyfiglet -f big Chen Xs
  _____ _                 __   __
 / ____| |                \ \ / /
| |    | |__   ___ _ __    \ V / ___
| |    | '_ \ / _ \ '_ \    > < / __|
| |____| | | |  __/ | | |  / . \\__ \
 \_____|_| |_|\___|_| |_| /_/ \_\___/

Sending print output to a file

with open("...", "a") as f:  # append mode
	for _ in range(10):
		print("。。。", file=f)

Starting a local HTTP server

# cd into the directory to serve first; bind to 0.0.0.0 so that other machines can connect
python -m http.server 1111 --bind 0.0.0.0

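Generating unique identifiers with uuid

import uuid

uuid.uuid1()  # based on the host ID and the current timestamp
uuid.uuid4()  # based on random numbers; collisions are very unlikely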

Quickly downloading and saving an image

from urllib.request import urlretrieve

urlretrieve(img_url, filepath)

Class variables and instance variables

class Student:

    gender = "man"  # class variable, shared by all instances

    def __init__(self):
        self.name = "cxs"  # instance variable, unique to each instance
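
A short sketch of how sharing and shadowing play out in practice. The class is repeated so the example runs on its own; the instance names a and b and the reassigned values are illustrative only:

```python
class Student:
    gender = "man"  # class variable, shared by all instances

    def __init__(self):
        self.name = "cxs"  # instance variable, unique to each instance


a = Student()
b = Student()

# Rebinding the class variable on the class is visible through every instance
Student.gender = "woman"
shared = (a.gender, b.gender)

# Assigning through an instance creates an instance attribute that shadows
# the class variable for that instance only
a.gender = "man"
shadowed = (a.gender, b.gender, Student.gender)
```

The second half is the usual source of surprises: writing through an instance never changes the class variable itself.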

Rounding to a fixed number of decimal places

In [6]: round(9843.8493, 2)
Out[6]: 9843.85

In [4]: "%.2f" % 89.34938493
Out[4]: '89.35'

Converting between number bases

# a is 78473 in this session
In [25]: format(a, "o")  # decimal to octal
Out[25]: '231211'
In [26]: format(a, "x")  # decimal to hexadecimal
Out[26]: '13289'
In [28]: int('13289', 16)  # hexadecimal to decimal
Out[28]: 78473

Working on both ends of a deque

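from collections import deque

# item is a placeholder for the value to add
q = deque(maxlen=...)
q.append(item); q.pop()  # operate on the right end by default
q.appendleft(item); q.popleft()  # operate on the left end (head of the queue)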
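
A runnable sketch of the four operations on a bounded deque. The maxlen of 3 and the values are arbitrary choices for illustration:

```python
from collections import deque

q = deque(maxlen=3)
for i in range(5):
    q.append(i)  # once full, appending on the right drops the leftmost item

after_append = list(q)

q.appendleft(99)  # appending on the left drops the rightmost item instead
after_appendleft = list(q)

right = q.pop()      # removes and returns the rightmost item
left = q.popleft()   # removes and returns the leftmost item
remaining = list(q)
```

With maxlen set, the deque works as a fixed-size window over the most recent items, which is handy for keeping a short history.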

Getting the N largest or smallest items from a collection

from heapq import nlargest, nsmallest

nlargest() and nsmallest() are a good fit when N is small relative to the size of the collection. If you only need the single smallest or largest item (N=1), min() and max() are faster. Conversely, when N is close to the size of the collection, it is usually faster to sort first and then slice ( sorted(items)[:N] or sorted(items)[-N:] ).
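
A brief usage sketch. The numbers and the portfolio records are made-up sample data:

```python
from heapq import nlargest, nsmallest

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

top3 = nlargest(3, nums)      # three largest, in descending order
bottom3 = nsmallest(3, nums)  # three smallest, in ascending order

# key works the same way as in sorted()
portfolio = [
    {"name": "IBM", "price": 91.1},
    {"name": "AAPL", "price": 543.22},
    {"name": "FB", "price": 21.09},
]
cheapest = nsmallest(1, portfolio, key=lambda s: s["price"])
```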

Tracing execution with pysnooper

import pysnooper


@pysnooper.snoop()
def fib(n):
	assert n > 0, 'n必须大于0'
	if n == 1 or n == 2:
		return 1
	return fib(n-1) + fib(n-2)

In [63]: fib(3)
'''
Starting var:.. n = 3
18:00:31.254635 call         2 def fib(n):
18:00:31.255630 line         3     assert n > 0, 'n必须大于0'
18:00:31.256613 line         4     if n == 1 or n == 2:
18:00:31.257614 line         6     return fib(n-1) + fib(n-2)
    Starting var:.. n = 2
    18:00:31.258615 call         2 def fib(n):
    18:00:31.264042 line         3     assert n > 0, 'n必须大于0'
    18:00:31.264042 line         4     if n == 1 or n == 2:
    18:00:31.265047 line         5         return 1
    18:00:31.266045 return       5         return 1
    Return value:.. 1
    Elapsed time: 00:00:00.008450
    Starting var:.. n = 1
    18:00:31.268045 call         2 def fib(n):
    18:00:31.272063 line         3     assert n > 0, 'n必须大于0'
    18:00:31.274828 line         4     if n == 1 or n == 2:
    18:00:31.275838 line         5         return 1
    18:00:31.276838 return       5         return 1
    Return value:.. 1
    Elapsed time: 00:00:00.009790
18:00:31.278834 return       6     return fib(n-1) + fib(n-2)
Return value:.. 2
Elapsed time: 00:00:00.025208
'''

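If the debugger never steps into a function, check whether a yield has slipped into it. A yield anywhere in the body turns the function into a generator function, so calling it only creates a generator object and none of the body runs until the generator is iterated.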
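
A small sketch that makes the effect visible. The function process and the calls list are hypothetical names used only to record when the body actually runs:

```python
import inspect

calls = []


def process(items):
    # The yield below makes this a generator function
    calls.append("entered")
    for item in items:
        yield item * 2


gen = process([1, 2, 3])          # only creates a generator object
entered_after_call = list(calls)  # the body has not started yet
is_generator = inspect.isgenerator(gen)

values = list(gen)                # iterating is what runs the body
entered_after_iteration = list(calls)
```

A breakpoint inside process is therefore only hit once something iterates over gen.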

Converting between camelCase and snake_case: https://blog.csdn.net/mouday/article/details/90079956

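import re


def get_lower_case_name(text):
    """
    Convert camelCase to snake_case
    """
    # Put an underscore before every uppercase letter, then lowercase everything
    snake_case = re.sub(r"(?P<key>[A-Z])", r"_\g<key>", text)
    return snake_case.lower().strip("_")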
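
For the opposite direction, one possible sketch is shown below. The function name get_camel_case_name is hypothetical and not taken from the linked article, and it assumes the input uses only lowercase ASCII letters after each underscore:

```python
import re


def get_camel_case_name(text):
    """Convert snake_case to camelCase (illustrative counterpart)."""
    # Replace each "_x" with "X"; underscores not followed by a-z are left alone
    return re.sub(r"_([a-z])", lambda m: m.group(1).upper(), text)


camel = get_camel_case_name("user_name_id")
unchanged = get_camel_case_name("name")
```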

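Splitting a list into chunks / slicing

# yield must live inside a function; the name chunks is illustrative
def chunks(date_list, step=3):
    for i in range(0, len(date_list), step):
        # Each slice holds at most step items; the last one may be shorter
        yield date_list[i : i + step]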

Printing the exception traceback

import traceback

try:
    ...
except Exception as e:
    print(traceback.format_exc(), e)