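Python Basics: Study Notes Summary (Highlights)

 Updated: 2017-11-14 11:58:05   Author: 李小小小伟
These study notes summarize the fundamentals of Python, collecting the tricky points and practical tips you meet when learning the core language.

What follows is the full set of notes. They are fairly detailed, so if you are interested in Python and want to work through the basics systematically, the material below should serve you well. Bookmark it and study at your own pace.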

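I. Variable Assignment and Naming Rules

① Declaring and assigning a variable

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# _author_soloLi
name1="solo"
name2=name1
print(name1,name2)   # solo solo
name1 = "hehe"
print(name1,name2)   # hehe solo

# name1 is now "hehe" and name2 is still "solo": name2 was bound to the string
# object "solo", and rebinding name1 does not affect it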
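
The rebinding behavior above can be checked with the `is` operator. This is a minimal sketch assuming Python 3; the names `same_before` and `same_after` are illustrative and do not come from the original notes.

```python
# Names are labels bound to objects; assignment rebinds a label.
name1 = "solo"
name2 = name1                   # both names now refer to the same str object
same_before = name1 is name2

name1 = "hehe"                  # rebinds name1 only; name2 is untouched
same_after = name1 is name2

print(same_before, same_after)  # True False
print(name1, name2)             # hehe solo
```

Because strings are immutable, there is no way to change "solo" in place, so name2 can only change if it is assigned again.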

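② Variable naming rules

1. A variable name may contain only letters, digits, and underscores.
2. The first character of a variable name cannot be a digit.
3. The following keywords cannot be used as variable names (this is the Python 2 list; in Python 3, 'exec' and 'print' are no longer keywords, while 'False', 'None', 'True', 'nonlocal', 'async', and 'await' have been added): ['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for', 'from', 'global','if', 'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'with', 'yield']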
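
Rather than memorizing the keyword list, the rules can be checked programmatically. The sketch below assumes Python 3 and uses the standard `keyword` module and `str.isidentifier()`; the helper name `is_valid_name` is hypothetical.

```python
import keyword

def is_valid_name(candidate):
    """Return True if candidate can be used as a variable name."""
    # isidentifier() covers the character rules; iskeyword() covers rule 3.
    return candidate.isidentifier() and not keyword.iskeyword(candidate)

for candidate in ["name1", "_tmp", "2fast", "class", "my-var", "print"]:
    print(candidate, is_valid_name(candidate))

# The keyword list of the interpreter actually running this code:
print(keyword.kwlist)
```

Under Python 3, `print` is reported as a valid name because it is a built-in function there and no longer a keyword. Shadowing it is still a bad idea.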

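II. Character Encoding

When the Python interpreter loads the code in a .py file, it decodes the contents using the source encoding (ASCII by default in Python 2, UTF-8 by default in Python 3).

ASCII: a 7-bit code, so it defines only 2**7 = 128 characters. Each character is usually stored in one 8-bit byte, and the 8-bit "extended ASCII" code pages built on top of it can hold at most 2**8 = 256 symbols. Either way, ASCII clearly cannot represent all of the world's scripts and symbols.

Unicode: assigns every character of every language a single, unique code point. It was originally designed around 16 bits (2 bytes), i.e. 2**16 = 65536 code points. Today the code space runs from U+0000 to U+10FFFF, so a character may need more than 2 bytes depending on the encoding used.

UTF-8: a variable-length encoding of Unicode. Instead of using at least 2 bytes for everything, it stores each character in 1 to 4 bytes: ASCII characters take 1 byte, most European characters take 2 bytes, East Asian characters usually take 3 bytes, and so on.
Note: Python 2.x uses ASCII as its default source encoding, so a file containing Chinese needs an encoding declaration. Python 3.x uses UTF-8 as its default source encoding and its str type is always Unicode, so Chinese text can be displayed without declaring an encoding.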
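
The variable-length nature of UTF-8 is easy to observe. This is a small sketch assuming Python 3; the sample characters and the name `utf8_lengths` are chosen for illustration only.

```python
# One character from each UTF-8 length class, written as escapes so the
# example does not depend on the source file's encoding.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, size in utf8_lengths.items():
    # ascii() keeps the output printable on any terminal encoding.
    print("U+%04X %s -> %d byte(s) in UTF-8" % (ord(ch), ascii(ch), size))
```

The four samples take 1, 2, 3, and 4 bytes respectively, while `len()` of each str is 1 because str counts code points, not bytes.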

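 Extension: encoding and transcoding, and the difference between bytes and str

    Probably the most important new feature of Python 3 is its much clearer separation of text and binary data. Text is always Unicode and is represented by the str type; binary data is represented by the bytes type. Python 3 never mixes str and bytes implicitly (unlike Python 2, which converted between str and unicode automatically, much as it did between int and long), and that is what makes the distinction so clear. You cannot concatenate a string with a bytes object, search for a string inside a bytes object (or vice versa), or pass a string to a function that expects bytes (or vice versa). This is a good thing. Either way, the boundary between strings and bytes is unavoidable, and the relationship below is important enough to memorize:

A string can be encoded into bytes, and bytes can be decoded into a string: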

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
msg = "里约奥运"
print(msg.encode("utf-8"))           # if no encoding is given, the default is utf-8
#b'\xe9\x87\x8c\xe7\xba\xa6\xe5\xa5\xa5\xe8\xbf\x90'
print(b'\xe9\x87\x8c\xe7\xba\xa6\xe5\xa5\xa5\xe8\xbf\x90'.decode("utf-8"))
#里约奥运
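
The strict separation between str and bytes can be demonstrated directly. The sketch below assumes Python 3; `mix_error` and `round_trip` are illustrative names.

```python
text = "里约奥运"
data = text.encode("utf-8")

try:
    text + data                      # mixing str and bytes is rejected
    mix_error = None
except TypeError as exc:
    mix_error = type(exc).__name__

round_trip = data.decode("utf-8")    # bytes -> str restores the original text

print(mix_error)                     # TypeError
print(round_trip == text)            # True
print(len(text), len(data))          # 4 12
```

The two lengths differ because `len()` counts characters for str and bytes for bytes, and each of these characters takes 3 bytes in UTF-8.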

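Why encode and transcode?

  Computers in different countries have historically used different character encodings (for example GBK in China), so the same software could display garbled text when moved to a computer in another country. How can this be solved? Since all of these systems support Unicode, we can use Unicode as a bridge: first decode the text from its original encoding into Unicode, then encode the Unicode into the other country's encoding (Korea's, for example). The text is then no longer garbled, provided the target encoding contains those characters. Note that this only converts the character set; it does not translate the text into Korean.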
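
To make the "Unicode as a bridge" idea concrete, here is a short sketch assuming Python 3. The sample text and variable names are illustrative. It shows that reading GBK bytes with the wrong codec fails, while decoding with the right codec and re-encoding works.

```python
original = "你好"
gbk_bytes = original.encode("gbk")        # bytes as a GBK system would store them

# Interpreting GBK bytes as UTF-8 fails (other inputs may decode to garbage).
try:
    gbk_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

# Bridge through Unicode: GBK bytes -> str -> UTF-8 bytes.
utf8_bytes = gbk_bytes.decode("gbk").encode("utf-8")

print(wrong_codec_failed)                 # True
print(gbk_bytes, utf8_bytes)
```

The key point is that bytes carry no record of their encoding, so the program has to know which codec produced them.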

① Encoding conversion in Python 3 (str is Unicode by default)

name = "脚本之家"           # name is a Unicode str
name1 = name.encode("utf-8")   # Unicode -> UTF-8 bytes
name2 = name1.decode("utf-8")   # UTF-8 bytes -> Unicode
name3 = name.encode("gbk")    # Unicode -> GBK bytes
name4 = name3.decode("gbk")   # GBK bytes -> Unicode

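② Encoding conversion in Python 2 (ASCII by default)

① With a declared source encoding (utf-8)
# -*- coding:utf-8 -*-
name = "李伟"          # ASCII has no Chinese characters, so name is a UTF-8 byte string here
name1 = name.decode("utf-8")  # UTF-8 -> Unicode
name2 = name1.encode("gbk")   # Unicode -> GBK
② With the default source encoding (ascii)
name = "nihao"       # English letters only, with the coding declaration on line 2 removed, so name is an ASCII byte string
name1 = name.decode("ascii")   # ASCII -> Unicode
name2 = name1.encode("utf-8") # Unicode -> UTF-8
name3 =name1.encode("gbk")   # Unicode -> GBK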

III. User Input and String Formatting

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# _author_soloLi
# Python 2.x vs Python 3.x: raw_input() in Python 2.x corresponds to input() in Python 3.x
# Prompt the user for name, age, job, and salary, then print them as an info card
name = input("Please input your name:")
age = int(input("Please input your age:")) # convert the str to int
job = input("Please input your job:")
salary = input("Please input your salary:")
info1 = '''
------------ Info of %s ---------
Name:%s
Age:%d
Job:%s
Salary:%s
''' %(name,name,age,job,salary)   # %s formats a value as a string, %d requires a number and formats it as an integer, %f formats a number as a float
print(info1)
# info2 = '''
# ------------ Info of {_Name} ---------
# Name:{_Name}
# Age:{_Age}
# Job:{_Job}
# Salary:{_Salary}
# ''' .format(_Name=name,
#       _Age=age,
#       _Job=job,
#       _Salary=salary)
# print(info2)
# info3 = '''
# ------------ Info of {0} ---------
# Name:{0}
# Age:{1}
# Job:{2}
# Salary:{3}
# ''' .format(name,age,job,salary)
# print(info3)

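Comparison:
1. % : a tuple cannot be passed directly as a single value. It has to be wrapped in an extra () as a one-element tuple to avoid a TypeError.
2. + : each additional + creates a new intermediate string, allocating a new block of memory.
3. .format : avoids the problems above. The % style is still sometimes used for compatibility with Python 2, for example in the logging library.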
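
The tuple pitfall from point 1 looks like this in practice. This is a minimal sketch assuming Python 3; `point` and the result names are illustrative.

```python
point = (3, 4)

try:
    "point: %s" % point                   # the tuple is unpacked as two arguments
    percent_failed = False
except TypeError:
    percent_failed = True

with_percent = "point: %s" % (point,)     # wrap it in a one-element tuple
with_format = "point: {}".format(point)   # no wrapping needed

print(percent_failed)                     # True
print(with_percent)                       # point: (3, 4)
print(with_format)                        # point: (3, 4)
```

With `.format` the argument is always treated as a single value, which is why the wrapping is unnecessary.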

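IV. Conditionals and Loops (if, while, for, conditional expression)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# _author_soloLi
################## if statement ######################
# A = 66
#
# B = int(input("Enter a lucky number from 0 to 100: "))
#
# if B == A:           # the parent statement starts at the left margin
#   print ("Congratulations, you guessed it!")  # the child block must be indented
# elif B > A :
#   print ("Too high")
# else:
#   print ("Too low")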


################## while statement ######################
# A = 66
# count = 0          # initialize count to 0
#
# while count < 3 :
#
#   B = int(input("Enter a number from 0 to 100: "))
#
#   if B == A:
#     print ("Congratulations, you guessed it!")
#     break
#   elif B > A :
#     print ("Too high")
#   else:
#     print ("Too low")
#   count += 1
# else:
#   print ("You have used up all your guesses!")


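################## for statement ######################
A = 66
for i in range(3):   # range(3) yields 0, 1, 2, so the user gets three guesses
  print("i=",i)
  B = int(input("Enter a number from 0 to 100: "))
  if B == A:
    print ("Congratulations, you guessed it!")
    break
  elif B > A :
    print ("Too high")
  else:
    print ("Too low")
else:                # runs only when the loop ends without hitting break
  print ("You have used up all your guesses!")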


################## conditional expression ######################
 # result = value1 if condition else value2
 # If the condition is true, value1 is assigned to result; otherwise value2 is assigned to result
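
A runnable illustration of the conditional expression, assuming Python 3. The function `describe_guess` and the variable `parity` are hypothetical names introduced for this sketch.

```python
# result = value1 if condition else value2
parity = "even" if 10 % 2 == 0 else "odd"

def describe_guess(guess, answer=66):
    """Classify a guess using chained conditional expressions."""
    return ("correct" if guess == answer
            else "too high" if guess > answer
            else "too low")

print(parity)                                                 # even
print(describe_guess(66), describe_guess(80), describe_guess(5))
```

Chaining more than two branches this way gets hard to read quickly, so a plain if/elif/else is usually the better choice beyond simple cases.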


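V. Basic Data Types

1. Integers (int)
Examples: 18, 73, 84
Common operations:

abs(x)   # absolute value
x+y,x-y,x*y,x/y # addition, subtraction, multiplication, division
x/y     # division: Python 3 always returns a float; Python 2 truncates when both operands are ints
x//y    # floor division: the quotient rounded down
x%y     # remainder
x**y     # exponentiation
cmp(x,y)  # Python 2 only: returns -1, 0, or 1 for x < y, x == y, x > y
coerce(x,y) # Python 2 only: returns the two numbers converted to a common type, as a tuple
divmod(x,y) # returns the tuple (quotient, remainder)
float(x)  # convert to float
str(x)   # convert to a string
hex(x)   # convert to a hexadecimal string
oct(x)   # convert to an octal string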
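
The operations above behave as follows under Python 3. Since `cmp()` and `coerce()` were removed in Python 3, the sketch defines a small stand-in named `cmp`. That helper is an assumption of this example and not a built-in.

```python
x, y = 7, 2

print(x / y)          # 3.5    true division always returns a float
print(x // y)         # 3      floor division
print(-7 // 2)        # -4     floor rounds toward negative infinity
print(x % y)          # 1
print(x ** y)         # 49
print(divmod(x, y))   # (3, 1)
print(hex(255), oct(8))   # 0xff 0o10

def cmp(a, b):
    """Python 3 stand-in for the removed built-in cmp()."""
    return (a > b) - (a < b)

print(cmp(1, 2), cmp(2, 2), cmp(3, 2))   # -1 0 1
```

The `(a > b) - (a < b)` idiom works because True and False behave as 1 and 0 in arithmetic.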

More methods (an IDE-generated stub of the Python 2 int class):

class int(object):
  """
  int(x=0) -> int or long
  int(x, base=10) -> int or long
  
  Convert a number or string to an integer, or return 0 if no arguments
  are given. If x is floating point, the conversion truncates towards zero.
  If x is outside the integer range, the function returns a long instead.
  
  If x is not a number or if base is given, then x must be a string or
  Unicode object representing an integer literal in the given base. The
  literal can be preceded by '+' or '-' and be surrounded by whitespace.
  The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to
  interpret the base from the string as an integer literal.
  >>> int('0b100', base=0)
  """
  def bit_length(self):
    """ Return the minimum number of bits needed to represent the number """
    """
    int.bit_length() -> int
    
    Number of bits necessary to represent self in binary.
    >>> bin(37)
    '0b100101'
    >>> (37).bit_length()
    """
    return 0

  def conjugate(self, *args, **kwargs): # real signature unknown
    """ Return the complex conjugate (for an int, the number itself) """
    """ Returns self, the complex conjugate of any int. """
    pass

  def __abs__(self):
    """ Return the absolute value """
    """ x.__abs__() <==> abs(x) """
    pass

  def __add__(self, y):
    """ x.__add__(y) <==> x+y """
    pass

  def __and__(self, y):
    """ x.__and__(y) <==> x&y """
    pass

  def __cmp__(self, y):
    """ Compare two numbers """
    """ x.__cmp__(y) <==> cmp(x,y) """
    pass

  def __coerce__(self, y):
    """ Coerce the two values into a tuple of a common type """
    """ x.__coerce__(y) <==> coerce(x, y) """
    pass

  def __divmod__(self, y):
    """ Divide; return a tuple of the quotient and remainder """
    """ x.__divmod__(y) <==> divmod(x, y) """
    pass

  def __div__(self, y): 
    """ x.__div__(y) <==> x/y """
    pass

  def __float__(self):
    """ Convert to float """
    """ x.__float__() <==> float(x) """
    pass

  def __floordiv__(self, y): 
    """ x.__floordiv__(y) <==> x//y """
    pass

  def __format__(self, *args, **kwargs): # real signature unknown
    pass

  def __getattribute__(self, name): 
    """ x.__getattribute__('name') <==> x.name """
    pass

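  def __getnewargs__(self, *args, **kwargs): # real signature unknown
    """ Used internally to supply the arguments passed to __new__ when the object is created """
    pass

  def __hash__(self):
    """If the object is hashable, return its hash value, which is an integer. Hash values are used to compare dictionary keys quickly during lookup. Two numbers that compare equal have the same hash value."""
    """ x.__hash__() <==> hash(x) """
    pass

  def __hex__(self):
    """ Return the hexadecimal representation of the number """
    """ x.__hex__() <==> hex(x) """
    pass

  def __index__(self):
    """ Used for slicing; not meaningful for plain numbers """
    """ x[y:z] <==> x[y.__index__():z.__index__()] """
    pass

  def __init__(self, x, base=10): # known special case of int.__init__
    """ Constructor, invoked automatically when an int is created, e.g. x = 123 or x = int(10); ignore for now """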
    """
    int(x=0) -> int or long
    int(x, base=10) -> int or long
    
    Convert a number or string to an integer, or return 0 if no arguments
    are given. If x is floating point, the conversion truncates towards zero.
    If x is outside the integer range, the function returns a long instead.
    
    If x is not a number or if base is given, then x must be a string or
    Unicode object representing an integer literal in the given base. The
    literal can be preceded by '+' or '-' and be surrounded by whitespace.
    The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to
    interpret the base from the string as an integer literal.
    >>> int('0b100', base=0)
    # (copied from class doc)
    """
    pass

  def __int__(self):
    """ Convert to int """
    """ x.__int__() <==> int(x) """
    pass

  def __invert__(self): 
    """ x.__invert__() <==> ~x """
    pass

  def __long__(self):
    """ Convert to long """
    """ x.__long__() <==> long(x) """
    pass

  def __lshift__(self, y): 
    """ x.__lshift__(y) <==> x<<y """
    pass

  def __mod__(self, y): 
    """ x.__mod__(y) <==> x%y """
    pass

  def __mul__(self, y): 
    """ x.__mul__(y) <==> x*y """
    pass

  def __neg__(self): 
    """ x.__neg__() <==> -x """
    pass

  @staticmethod # known case of __new__
  def __new__(S, *more): 
    """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
    pass

  def __nonzero__(self): 
    """ x.__nonzero__() <==> x != 0 """
    pass

  def __oct__(self):
    """ Return the octal representation of the value """
    """ x.__oct__() <==> oct(x) """
    pass

  def __or__(self, y): 
    """ x.__or__(y) <==> x|y """
    pass

  def __pos__(self): 
    """ x.__pos__() <==> +x """
    pass

  def __pow__(self, y, z=None):
    """ Power (exponentiation) """
    """ x.__pow__(y[, z]) <==> pow(x, y[, z]) """
    pass

  def __radd__(self, y): 
    """ x.__radd__(y) <==> y+x """
    pass

  def __rand__(self, y): 
    """ x.__rand__(y) <==> y&x """
    pass

  def __rdivmod__(self, y): 
    """ x.__rdivmod__(y) <==> divmod(y, x) """
    pass

  def __rdiv__(self, y): 
    """ x.__rdiv__(y) <==> y/x """
    pass

  def __repr__(self):
    """Convert to a form the interpreter can read """
    """ x.__repr__() <==> repr(x) """
    pass

  def __str__(self):
    """Convert to a human-readable form; if there is none, return the interpreter-readable form"""
    """ x.__str__() <==> str(x) """
    pass

  def __rfloordiv__(self, y): 
    """ x.__rfloordiv__(y) <==> y//x """
    pass

  def __rlshift__(self, y): 
    """ x.__rlshift__(y) <==> y<<x """
    pass

  def __rmod__(self, y): 
    """ x.__rmod__(y) <==> y%x """
    pass

  def __rmul__(self, y): 
    """ x.__rmul__(y) <==> y*x """
    pass

  def __ror__(self, y): 
    """ x.__ror__(y) <==> y|x """
    pass

  def __rpow__(self, x, z=None): 
    """ y.__rpow__(x[, z]) <==> pow(x, y[, z]) """
    pass

  def __rrshift__(self, y): 
    """ x.__rrshift__(y) <==> y>>x """
    pass

  def __rshift__(self, y): 
    """ x.__rshift__(y) <==> x>>y """
    pass

  def __rsub__(self, y): 
    """ x.__rsub__(y) <==> y-x """
    pass

  def __rtruediv__(self, y): 
    """ x.__rtruediv__(y) <==> y/x """
    pass

  def __rxor__(self, y): 
    """ x.__rxor__(y) <==> y^x """
    pass

  def __sub__(self, y): 
    """ x.__sub__(y) <==> x-y """
    pass

  def __truediv__(self, y): 
    """ x.__truediv__(y) <==> x/y """
    pass

  def __trunc__(self, *args, **kwargs):
    """ Return the value truncated to an integer; has no effect on an int """
    pass

  def __xor__(self, y): 
    """ x.__xor__(y) <==> x^y """
    pass

  denominator = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """ Denominator = 1 """
  """the denominator of a rational number in lowest terms"""

  imag = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """ Imaginary part; not meaningful for an int """
  """the imaginary part of a complex number"""

  numerator = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """ Numerator = the number itself """
  """the numerator of a rational number in lowest terms"""

  real = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """ Real part; not meaningful for an int """
  """the real part of a complex number"""


2. Long Integers (long, Python 2 only)
Examples: 2147483649, 9223372036854775807
Common operations:

# long supports essentially the same operations as int

class long(object):
  """
  long(x=0) -> long
  long(x, base=10) -> long
  
  Convert a number or string to a long integer, or return 0L if no arguments
  are given. If x is floating point, the conversion truncates towards zero.
  
  If x is not a number or if base is given, then x must be a string or
  Unicode object representing an integer literal in the given base. The
  literal can be preceded by '+' or '-' and be surrounded by whitespace.
  The base defaults to 10. Valid bases are 0 and 2-36. Base 0 means to
  interpret the base from the string as an integer literal.
  >>> int('0b100', base=0)
  4L
  """
  def bit_length(self): # real signature unknown; restored from __doc__
    """
    long.bit_length() -> int or long
    
    Number of bits necessary to represent self in binary.
    >>> bin(37L)
    '0b100101'
    >>> (37L).bit_length()
    """
    return 0

  def conjugate(self, *args, **kwargs): # real signature unknown
    """ Returns self, the complex conjugate of any long. """
    pass

  def __abs__(self): # real signature unknown; restored from __doc__
    """ x.__abs__() <==> abs(x) """
    pass

  def __add__(self, y): # real signature unknown; restored from __doc__
    """ x.__add__(y) <==> x+y """
    pass

  def __and__(self, y): # real signature unknown; restored from __doc__
    """ x.__and__(y) <==> x&y """
    pass

  def __cmp__(self, y): # real signature unknown; restored from __doc__
    """ x.__cmp__(y) <==> cmp(x,y) """
    pass

  def __coerce__(self, y): # real signature unknown; restored from __doc__
    """ x.__coerce__(y) <==> coerce(x, y) """
    pass

  def __divmod__(self, y): # real signature unknown; restored from __doc__
    """ x.__divmod__(y) <==> divmod(x, y) """
    pass

  def __div__(self, y): # real signature unknown; restored from __doc__
    """ x.__div__(y) <==> x/y """
    pass

  def __float__(self): # real signature unknown; restored from __doc__
    """ x.__float__() <==> float(x) """
    pass

  def __floordiv__(self, y): # real signature unknown; restored from __doc__
    """ x.__floordiv__(y) <==> x//y """
    pass

  def __format__(self, *args, **kwargs): # real signature unknown
    pass

  def __getattribute__(self, name): # real signature unknown; restored from __doc__
    """ x.__getattribute__('name') <==> x.name """
    pass

  def __getnewargs__(self, *args, **kwargs): # real signature unknown
    pass

  def __hash__(self): # real signature unknown; restored from __doc__
    """ x.__hash__() <==> hash(x) """
    pass

  def __hex__(self): # real signature unknown; restored from __doc__
    """ x.__hex__() <==> hex(x) """
    pass

  def __index__(self): # real signature unknown; restored from __doc__
    """ x[y:z] <==> x[y.__index__():z.__index__()] """
    pass

  def __init__(self, x=0): # real signature unknown; restored from __doc__
    pass

  def __int__(self): # real signature unknown; restored from __doc__
    """ x.__int__() <==> int(x) """
    pass

  def __invert__(self): # real signature unknown; restored from __doc__
    """ x.__invert__() <==> ~x """
    pass

  def __long__(self): # real signature unknown; restored from __doc__
    """ x.__long__() <==> long(x) """
    pass

  def __lshift__(self, y): # real signature unknown; restored from __doc__
    """ x.__lshift__(y) <==> x<<y """
    pass

  def __mod__(self, y): # real signature unknown; restored from __doc__
    """ x.__mod__(y) <==> x%y """
    pass

  def __mul__(self, y): # real signature unknown; restored from __doc__
    """ x.__mul__(y) <==> x*y """
    pass

  def __neg__(self): # real signature unknown; restored from __doc__
    """ x.__neg__() <==> -x """
    pass

  @staticmethod # known case of __new__
  def __new__(S, *more): # real signature unknown; restored from __doc__
    """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
    pass

  def __nonzero__(self): # real signature unknown; restored from __doc__
    """ x.__nonzero__() <==> x != 0 """
    pass

  def __oct__(self): # real signature unknown; restored from __doc__
    """ x.__oct__() <==> oct(x) """
    pass

  def __or__(self, y): # real signature unknown; restored from __doc__
    """ x.__or__(y) <==> x|y """
    pass

  def __pos__(self): # real signature unknown; restored from __doc__
    """ x.__pos__() <==> +x """
    pass

  def __pow__(self, y, z=None): # real signature unknown; restored from __doc__
    """ x.__pow__(y[, z]) <==> pow(x, y[, z]) """
    pass

  def __radd__(self, y): # real signature unknown; restored from __doc__
    """ x.__radd__(y) <==> y+x """
    pass

  def __rand__(self, y): # real signature unknown; restored from __doc__
    """ x.__rand__(y) <==> y&x """
    pass

  def __rdivmod__(self, y): # real signature unknown; restored from __doc__
    """ x.__rdivmod__(y) <==> divmod(y, x) """
    pass

  def __rdiv__(self, y): # real signature unknown; restored from __doc__
    """ x.__rdiv__(y) <==> y/x """
    pass

  def __repr__(self): # real signature unknown; restored from __doc__
    """ x.__repr__() <==> repr(x) """
    pass

  def __rfloordiv__(self, y): # real signature unknown; restored from __doc__
    """ x.__rfloordiv__(y) <==> y//x """
    pass

  def __rlshift__(self, y): # real signature unknown; restored from __doc__
    """ x.__rlshift__(y) <==> y<<x """
    pass

  def __rmod__(self, y): # real signature unknown; restored from __doc__
    """ x.__rmod__(y) <==> y%x """
    pass

  def __rmul__(self, y): # real signature unknown; restored from __doc__
    """ x.__rmul__(y) <==> y*x """
    pass

  def __ror__(self, y): # real signature unknown; restored from __doc__
    """ x.__ror__(y) <==> y|x """
    pass

  def __rpow__(self, x, z=None): # real signature unknown; restored from __doc__
    """ y.__rpow__(x[, z]) <==> pow(x, y[, z]) """
    pass

  def __rrshift__(self, y): # real signature unknown; restored from __doc__
    """ x.__rrshift__(y) <==> y>>x """
    pass

  def __rshift__(self, y): # real signature unknown; restored from __doc__
    """ x.__rshift__(y) <==> x>>y """
    pass

  def __rsub__(self, y): # real signature unknown; restored from __doc__
    """ x.__rsub__(y) <==> y-x """
    pass

  def __rtruediv__(self, y): # real signature unknown; restored from __doc__
    """ x.__rtruediv__(y) <==> y/x """
    pass

  def __rxor__(self, y): # real signature unknown; restored from __doc__
    """ x.__rxor__(y) <==> y^x """
    pass

  def __sizeof__(self, *args, **kwargs): # real signature unknown
    """ Returns size in memory, in bytes """
    pass

  def __str__(self): # real signature unknown; restored from __doc__
    """ x.__str__() <==> str(x) """
    pass

  def __sub__(self, y): # real signature unknown; restored from __doc__
    """ x.__sub__(y) <==> x-y """
    pass

  def __truediv__(self, y): # real signature unknown; restored from __doc__
    """ x.__truediv__(y) <==> x/y """
    pass

  def __trunc__(self, *args, **kwargs): # real signature unknown
    """ Truncating an Integral returns itself. """
    pass

  def __xor__(self, y): # real signature unknown; restored from __doc__
    """ x.__xor__(y) <==> x^y """
    pass

  denominator = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """the denominator of a rational number in lowest terms"""

  imag = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """the imaginary part of a complex number"""

  numerator = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """the numerator of a rational number in lowest terms"""

  real = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """the real part of a complex number"""


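Note: unlike C, Python's long integers have no fixed bit width. Python places no limit on their size, although in practice available memory does. Since Python 2.2, an integer operation that overflows automatically produces a long, so omitting the trailing L on a long literal no longer has serious consequences. Python 3 has a single int type of unlimited size and no L suffix.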
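
In Python 3 the separate long type is gone and int itself has arbitrary precision. A short sketch, assuming Python 3 (the name `big` is illustrative):

```python
big = 2 ** 100                  # far beyond the 64-bit range

print(big)                      # 1267650600228229401496703205376
print(type(big).__name__)       # int
print(big.bit_length())         # 101
print(big + 1 > big)            # True, with no overflow or wrap-around
```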

3. Floating-Point Numbers (float)
Examples: 3.14, 2.88

Common operations:

# float supports essentially the same operations as int

class float(object):
  """
  float(x) -> floating point number
  
  Convert a string or number to a floating point number, if possible.
  """
  def as_integer_ratio(self):
    """ Return the value as an exact ratio of two integers in lowest terms """
    """
    float.as_integer_ratio() -> (int, int)

    Return a pair of integers, whose ratio is exactly equal to the original
    float and with a positive denominator.
    Raise OverflowError on infinities and a ValueError on NaNs.

    >>> (10.0).as_integer_ratio()
    (10, 1)
    >>> (0.0).as_integer_ratio()
    (0, 1)
    >>> (-.25).as_integer_ratio()
    (-1, 4)
    """
    pass

  def conjugate(self, *args, **kwargs): # real signature unknown
    """ Return self, the complex conjugate of any float. """
    pass

  def fromhex(self, string):
    """ Convert a hexadecimal string to a float """
    """
    float.fromhex(string) -> float
    
    Create a floating-point number from a hexadecimal string.
    >>> float.fromhex('0x1.ffffp10')
    2047.984375
    >>> float.fromhex('-0x1p-1074')
    -4.9406564584124654e-324
    """
    return 0.0

  def hex(self):
    """ Return the hexadecimal representation of the value """
    """
    float.hex() -> string
    
    Return a hexadecimal representation of a floating-point number.
    >>> (-0.1).hex()
    '-0x1.999999999999ap-4'
    >>> 3.14159.hex()
    '0x1.921f9f01b866ep+1'
    """
    return ""

  def is_integer(self, *args, **kwargs): # real signature unknown
    """ Return True if the float is an integer. """
    pass

  def __abs__(self):  
    """ x.__abs__() <==> abs(x) """
    pass

  def __add__(self, y):  
    """ x.__add__(y) <==> x+y """
    pass

  def __coerce__(self, y):  
    """ x.__coerce__(y) <==> coerce(x, y) """
    pass

  def __divmod__(self, y):  
    """ x.__divmod__(y) <==> divmod(x, y) """
    pass

  def __div__(self, y):  
    """ x.__div__(y) <==> x/y """
    pass

  def __eq__(self, y):  
    """ x.__eq__(y) <==> x==y """
    pass

  def __float__(self):  
    """ x.__float__() <==> float(x) """
    pass

  def __floordiv__(self, y):  
    """ x.__floordiv__(y) <==> x//y """
    pass

  def __format__(self, format_spec):  
    """
    float.__format__(format_spec) -> string
    
    Formats the float according to format_spec.
    """
    return ""

  def __getattribute__(self, name):  
    """ x.__getattribute__('name') <==> x.name """
    pass

  def __getformat__(self, typestr):  
    """
    float.__getformat__(typestr) -> string
    
    You probably don't want to use this function. It exists mainly to be
    used in Python's test suite.
    
    typestr must be 'double' or 'float'. This function returns whichever of
    'unknown', 'IEEE, big-endian' or 'IEEE, little-endian' best describes the
    format of floating point numbers used by the C type named by typestr.
    """
    return ""

  def __getnewargs__(self, *args, **kwargs): # real signature unknown
    pass

  def __ge__(self, y):  
    """ x.__ge__(y) <==> x>=y """
    pass

  def __gt__(self, y):  
    """ x.__gt__(y) <==> x>y """
    pass

  def __hash__(self):  
    """ x.__hash__() <==> hash(x) """
    pass

  def __init__(self, x):  
    pass

  def __int__(self):  
    """ x.__int__() <==> int(x) """
    pass

  def __le__(self, y):  
    """ x.__le__(y) <==> x<=y """
    pass

  def __long__(self):  
    """ x.__long__() <==> long(x) """
    pass

  def __lt__(self, y):  
    """ x.__lt__(y) <==> x<y """
    pass

  def __mod__(self, y):  
    """ x.__mod__(y) <==> x%y """
    pass

  def __mul__(self, y):  
    """ x.__mul__(y) <==> x*y """
    pass

  def __neg__(self):  
    """ x.__neg__() <==> -x """
    pass

  @staticmethod # known case of __new__
  def __new__(S, *more):  
    """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
    pass

  def __ne__(self, y):  
    """ x.__ne__(y) <==> x!=y """
    pass

  def __nonzero__(self):  
    """ x.__nonzero__() <==> x != 0 """
    pass

  def __pos__(self):  
    """ x.__pos__() <==> +x """
    pass

  def __pow__(self, y, z=None):  
    """ x.__pow__(y[, z]) <==> pow(x, y[, z]) """
    pass

  def __radd__(self, y):  
    """ x.__radd__(y) <==> y+x """
    pass

  def __rdivmod__(self, y):  
    """ x.__rdivmod__(y) <==> divmod(y, x) """
    pass

  def __rdiv__(self, y):  
    """ x.__rdiv__(y) <==> y/x """
    pass

  def __repr__(self):  
    """ x.__repr__() <==> repr(x) """
    pass

  def __rfloordiv__(self, y):  
    """ x.__rfloordiv__(y) <==> y//x """
    pass

  def __rmod__(self, y):  
    """ x.__rmod__(y) <==> y%x """
    pass

  def __rmul__(self, y):  
    """ x.__rmul__(y) <==> y*x """
    pass

  def __rpow__(self, x, z=None):  
    """ y.__rpow__(x[, z]) <==> pow(x, y[, z]) """
    pass

  def __rsub__(self, y):  
    """ x.__rsub__(y) <==> y-x """
    pass

  def __rtruediv__(self, y):  
    """ x.__rtruediv__(y) <==> y/x """
    pass

  def __setformat__(self, typestr, fmt):  
    """
    float.__setformat__(typestr, fmt) -> None
    
    You probably don't want to use this function. It exists mainly to be
    used in Python's test suite.
    
    typestr must be 'double' or 'float'. fmt must be one of 'unknown',
    'IEEE, big-endian' or 'IEEE, little-endian', and in addition can only be
    one of the latter two if it appears to match the underlying C reality.
    
    Override the automatic determination of C-level floating point type.
    This affects how floats are converted to and from binary strings.
    """
    pass

  def __str__(self):  
    """ x.__str__() <==> str(x) """
    pass

  def __sub__(self, y):  
    """ x.__sub__(y) <==> x-y """
    pass

  def __truediv__(self, y):  
    """ x.__truediv__(y) <==> x/y """
    pass

  def __trunc__(self, *args, **kwargs): # real signature unknown
    """ Return the Integral closest to x between 0 and x. """
    pass

  imag = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """the imaginary part of a complex number"""

  real = property(lambda self: object(), lambda self, v: None, lambda self: None) # default
  """the real part of a complex number"""

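
A few of the float-specific methods listed in the stub, shown in use. This is a minimal sketch assuming Python 3; the variable names are illustrative.

```python
ratio = (0.25).as_integer_ratio()     # exact fraction behind the float
whole = (3.0).is_integer()            # True when there is no fractional part
hex_text = (1.5).hex()                # lossless hexadecimal text form
restored = float.fromhex(hex_text)    # parses that text back into a float

print(ratio)      # (1, 4)
print(whole)      # True
print(hex_text)   # 0x1.8000000000000p+0
print(restored)   # 1.5
```

The hex form is useful when a float has to survive a round trip through text without any rounding.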

4. Strings (str)
Examples: 'wupeiqi', 'alex', 'solo'

Common operations:

name = "my name is solo"
print(name.capitalize())      # capitalize the first letter
#My name is solo
print(name.count("l"))       # count the occurrences of a substring
#1
print(name.center(30,"-"))     # center in a field of 30 characters, padding with -
#-------my name is solo--------
print(name.ljust(30,"-"))      # left-justify in a field of 30 characters, padding with - on the right
#my name is solo---------------
print(name.endswith("solo"))     # check whether the string ends with "solo"
#True
print(name[name.find("na"):])    # find returns the index of "na"; strings can be sliced too
#name is solo
print("5.3".isdigit())       # check whether every character is a digit ("." is not)
#False
print("a_1A".isidentifier())    # check whether the string is a valid identifier (variable name)
#True
print("+".join(["1","2","3"]))   # join the list items into one string, separated by +
#1+2+3
print("\nsolo".strip())       # strip leading and trailing whitespace, including newlines
#solo
print("1+2+3+4".split("+"))    # split on + into a new list; the default separator is whitespace
#['1', '2', '3', '4']
name = "my name is {name} and i am {year} old"
print(name.format(name="solo",year=20))
#my name is solo and i am 20 old
print(name.format_map({"name":"solo","year":20}))      # rarely used
#my name is solo and i am 20 old
p = str.maketrans("abcdefli","12345678")     # translation table mapping characters one to one
print("lianzhilei".translate(p))
#781nzh8758
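
Two details worth checking by hand are how `find` and `index` differ on a missing substring, and what `partition` returns. This is a minimal sketch assuming Python 3; the variable names are illustrative.

```python
name = "my name is solo"

missing = name.find("xyz")            # find signals "not found" with -1
try:
    name.index("xyz")                 # index raises instead
    index_raised = False
except ValueError:
    index_raised = True

parts = name.partition(" is ")        # (before, separator, after)
swapped = name.replace("solo", "SOLO")

print(missing, index_raised)          # -1 True
print(parts)                          # ('my name', ' is ', 'solo')
print(swapped)                        # my name is SOLO
```

Use `find` when a missing substring is an expected case, and `index` when it should be treated as an error.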
class str(basestring):
  """
  str(object='') -> string
  
  Return a nice string representation of the object.
  If the argument is a string, the return value is the same object.
  """
  def capitalize(self):
    """ Capitalize the first letter """
    """
    S.capitalize() -> string
    
    Return a copy of the string S with only its first character
    capitalized.
    """
    return ""

  def center(self, width, fillchar=None):
    """ Center the content; width: total length; fillchar: padding character (a space by default) """
    """
    S.center(width[, fillchar]) -> string
    
    Return S centered in a string of length width. Padding is
    done using the specified fill character (default is a space)
    """
    return ""

  def count(self, sub, start=None, end=None):
    """ Number of occurrences of the substring """
    """
    S.count(sub[, start[, end]]) -> int
    
    Return the number of non-overlapping occurrences of substring sub in
    string S[start:end]. Optional arguments start and end are interpreted
    as in slice notation.
    """
    return 0

  def decode(self, encoding=None, errors=None):
    """ Decode """
    """
    S.decode([encoding[,errors]]) -> object
    
    Decodes S using the codec registered for encoding. encoding defaults
    to the default encoding. errors may be given to set a different error
    handling scheme. Default is 'strict' meaning that encoding errors raise
    a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
    as well as any other name registered with codecs.register_error that is
    able to handle UnicodeDecodeErrors.
    """
    return object()

  def encode(self, encoding=None, errors=None):
    """ Encode (applies to unicode) """
    """
    S.encode([encoding[,errors]]) -> object
    
    Encodes S using the codec registered for encoding. encoding defaults
    to the default encoding. errors may be given to set a different error
    handling scheme. Default is 'strict' meaning that encoding errors raise
    a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
    'xmlcharrefreplace' as well as any other name registered with
    codecs.register_error that is able to handle UnicodeEncodeErrors.
    """
    return object()

  def endswith(self, suffix, start=None, end=None):
    """ Whether the string ends with xxx """
    """
    S.endswith(suffix[, start[, end]]) -> bool
    
    Return True if S ends with the specified suffix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    suffix can also be a tuple of strings to try.
    """
    return False

  def expandtabs(self, tabsize=None):
    """ Expand tabs to spaces, using a tab size of 8 by default """
    """
    S.expandtabs([tabsize]) -> string
    
    Return a copy of S where all tab characters are expanded using spaces.
    If tabsize is not given, a tab size of 8 characters is assumed.
    """
    return ""

  def find(self, sub, start=None, end=None):
    """ Find the position of the substring; return -1 if it is not found """
    """
    S.find(sub [,start [,end]]) -> int
    
    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end]. Optional
    arguments start and end are interpreted as in slice notation.
    
    Return -1 on failure.
    """
    return 0

  def format(*args, **kwargs): # known special case of str.format
    """ String formatting with variable arguments; covered in detail in the notes on functions """
    """
    S.format(*args, **kwargs) -> string
    
    Return a formatted version of S, using substitutions from args and kwargs.
    The substitutions are identified by braces ('{' and '}').
    """
    pass

  def index(self, sub, start=None, end=None):
    """ Position of the substring; raise an error if it is not found """
    """
    S.index(sub [,start [,end]]) -> int
    
    Like S.find() but raise ValueError when the substring is not found.
    """
    return 0

  def isalnum(self):
    """ Whether all characters are letters or digits """
    """
    S.isalnum() -> bool
    
    Return True if all characters in S are alphanumeric
    and there is at least one character in S, False otherwise.
    """
    return False

  def isalpha(self):
    """ Whether all characters are letters """
    """
    S.isalpha() -> bool
    
    Return True if all characters in S are alphabetic
    and there is at least one character in S, False otherwise.
    """
    return False

  def isdigit(self):
    """ Whether all characters are digits """
    """
    S.isdigit() -> bool
    
    Return True if all characters in S are digits
    and there is at least one character in S, False otherwise.
    """
    return False

  def islower(self):
    """ Whether the string is lowercase """
    """
    S.islower() -> bool
    
    Return True if all cased characters in S are lowercase and there is
    at least one cased character in S, False otherwise.
    """
    return False

  def isspace(self): 
    """
    S.isspace() -> bool
    
    Return True if all characters in S are whitespace
    and there is at least one character in S, False otherwise.
    """
    return False

  def istitle(self): 
    """
    S.istitle() -> bool
    
    Return True if S is a titlecased string and there is at least one
    character in S, i.e. uppercase characters may only follow uncased
    characters and lowercase characters only cased ones. Return False
    otherwise.
    """
    return False

  def isupper(self): 
    """
    S.isupper() -> bool
    
    Return True if all cased characters in S are uppercase and there is
    at least one cased character in S, False otherwise.
    """
    return False

  def join(self, iterable): 
    """ Join the strings of an iterable, using S as the separator """
    """
    S.join(iterable) -> string
    
    Return a string which is the concatenation of the strings in the
    iterable. The separator between elements is S.
    """
    return ""

  def ljust(self, width, fillchar=None): 
    """ Left-justify the content and pad on the right """
    """
    S.ljust(width[, fillchar]) -> string
    
    Return S left-justified in a string of length width. Padding is
    done using the specified fill character (default is a space).
    """
    return ""

  def lower(self): 
    """ Convert to lowercase """
    """
    S.lower() -> string
    
    Return a copy of the string S converted to lowercase.
    """
    return ""

  def lstrip(self, chars=None): 
    """ Remove leading whitespace """
    """
    S.lstrip([chars]) -> string or unicode
    
    Return a copy of the string S with leading whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping
    """
    return ""

  def partition(self, sep): 
    """ Split into three parts: head, separator, tail """
    """
    S.partition(sep) -> (head, sep, tail)
    
    Search for the separator sep in S, and return the part before it,
    the separator itself, and the part after it. If the separator is not
    found, return S and two empty strings.
    """
    pass

  def replace(self, old, new, count=None): 
    """ Replace """
    """
    S.replace(old, new[, count]) -> string
    
    Return a copy of string S with all occurrences of substring
    old replaced by new. If the optional argument count is
    given, only the first count occurrences are replaced.
    """
    return ""

  def rfind(self, sub, start=None, end=None): 
    """
    S.rfind(sub [,start [,end]]) -> int
    
    Return the highest index in S where substring sub is found,
    such that sub is contained within S[start:end]. Optional
    arguments start and end are interpreted as in slice notation.
    
    Return -1 on failure.
    """
    return 0

  def rindex(self, sub, start=None, end=None): 
    """
    S.rindex(sub [,start [,end]]) -> int
    
    Like S.rfind() but raise ValueError when the substring is not found.
    """
    return 0

  def rjust(self, width, fillchar=None): 
    """
    S.rjust(width[, fillchar]) -> string
    
    Return S right-justified in a string of length width. Padding is
    done using the specified fill character (default is a space)
    """
    return ""

  def rpartition(self, sep): 
    """
    S.rpartition(sep) -> (head, sep, tail)
    
    Search for the separator sep in S, starting at the end of S, and return
    the part before it, the separator itself, and the part after it. If the
    separator is not found, return two empty strings and S.
    """
    pass

  def rsplit(self, sep=None, maxsplit=None): 
    """
    S.rsplit([sep [,maxsplit]]) -> list of strings
    
    Return a list of the words in the string S, using sep as the
    delimiter string, starting at the end of the string and working
    to the front. If maxsplit is given, at most maxsplit splits are
    done. If sep is not specified or is None, any whitespace string
    is a separator.
    """
    return []

  def rstrip(self, chars=None): 
    """
    S.rstrip([chars]) -> string or unicode
    
    Return a copy of the string S with trailing whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping
    """
    return ""

  def split(self, sep=None, maxsplit=None): 
    """ Split; maxsplit limits the number of splits """
    """
    S.split([sep [,maxsplit]]) -> list of strings
    
    Return a list of the words in the string S, using sep as the
    delimiter string. If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are removed
    from the result.
    """
    return []

  def splitlines(self, keepends=False): 
    """ Split at line breaks """
    """
    S.splitlines(keepends=False) -> list of strings
    
    Return a list of the lines in S, breaking at line boundaries.
    Line breaks are not included in the resulting list unless keepends
    is given and true.
    """
    return []

  def startswith(self, prefix, start=None, end=None): 
    """ Whether the string starts with the given prefix """
    """
    S.startswith(prefix[, start[, end]]) -> bool
    
    Return True if S starts with the specified prefix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    prefix can also be a tuple of strings to try.
    """
    return False

  def strip(self, chars=None): 
    """ Remove whitespace at both ends """
    """
    S.strip([chars]) -> string or unicode
    
    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping
    """
    return ""

  def swapcase(self): 
    """ Uppercase becomes lowercase and lowercase becomes uppercase """
    """
    S.swapcase() -> string
    
    Return a copy of the string S with uppercase characters
    converted to lowercase and vice versa.
    """
    return ""

  def title(self): 
    """
    S.title() -> string
    
    Return a titlecased version of S, i.e. words start with uppercase
    characters, all remaining cased characters have lowercase.
    """
    return ""

  def translate(self, table, deletechars=None): 
    """
    Translate characters; build a mapping table first. The last argument is the set of characters to delete.
    intab = "aeiou"
    outtab = "12345"
    trantab = maketrans(intab, outtab)
    str = "this is string example....wow!!!"
    print str.translate(trantab, 'xm')
    """

    """
    S.translate(table [,deletechars]) -> string
    
    Return a copy of the string S, where all characters occurring
    in the optional argument deletechars are removed, and the
    remaining characters have been mapped through the given
    translation table, which must be a string of length 256 or None.
    If the table argument is None, no translation is applied and
    the operation simply removes the characters in deletechars.
    """
    return ""

  def upper(self): 
    """
    S.upper() -> string
    
    Return a copy of the string S converted to uppercase.
    """
    return ""

  def zfill(self, width): 
    """ Return a string of the given width: the original is right-aligned and padded with zeros on the left. """
    """
    S.zfill(width) -> string
    
    Pad a numeric string S with zeros on the left, to fill a field
    of the specified width. The string S is never truncated.
    """
    return ""

  def _formatter_field_name_split(self, *args, **kwargs): # real signature unknown
    pass

  def _formatter_parser(self, *args, **kwargs): # real signature unknown
    pass

  def __add__(self, y): 
    """ x.__add__(y) <==> x+y """
    pass

  def __contains__(self, y): 
    """ x.__contains__(y) <==> y in x """
    pass

  def __eq__(self, y): 
    """ x.__eq__(y) <==> x==y """
    pass

  def __format__(self, format_spec): 
    """
    S.__format__(format_spec) -> string
    
    Return a formatted version of S as described by format_spec.
    """
    return ""

  def __getattribute__(self, name): 
    """ x.__getattribute__('name') <==> x.name """
    pass

  def __getitem__(self, y): 
    """ x.__getitem__(y) <==> x[y] """
    pass

  def __getnewargs__(self, *args, **kwargs): # real signature unknown
    pass

  def __getslice__(self, i, j): 
    """
    x.__getslice__(i, j) <==> x[i:j]
          
          Use of negative indices is not supported.
    """
    pass

  def __ge__(self, y): 
    """ x.__ge__(y) <==> x>=y """
    pass

  def __gt__(self, y): 
    """ x.__gt__(y) <==> x>y """
    pass

  def __hash__(self): 
    """ x.__hash__() <==> hash(x) """
    pass

  def __init__(self, string=''): # known special case of str.__init__
    """
    str(object='') -> string
    
    Return a nice string representation of the object.
    If the argument is a string, the return value is the same object.
    # (copied from class doc)
    """
    pass

  def __len__(self): 
    """ x.__len__() <==> len(x) """
    pass

  def __le__(self, y): 
    """ x.__le__(y) <==> x<=y """
    pass

  def __lt__(self, y): 
    """ x.__lt__(y) <==> x<y """
    pass

  def __mod__(self, y): 
    """ x.__mod__(y) <==> x%y """
    pass

  def __mul__(self, n): 
    """ x.__mul__(n) <==> x*n """
    pass

  @staticmethod # known case of __new__
  def __new__(S, *more): 
    """ T.__new__(S, ...) -> a new object with type S, a subtype of T """
    pass

  def __ne__(self, y): 
    """ x.__ne__(y) <==> x!=y """
    pass

  def __repr__(self): 
    """ x.__repr__() <==> repr(x) """
    pass

  def __rmod__(self, y): 
    """ x.__rmod__(y) <==> y%x """
    pass

  def __rmul__(self, n): 
    """ x.__rmul__(n) <==> n*x """
    pass

  def __sizeof__(self): 
    """ S.__sizeof__() -> size of S in memory, in bytes """
    pass

  def __str__(self): 
    """ x.__str__() <==> str(x) """
    pass


V. Lists


Examples: [11,22,33,44,55], ['wupeiqi', 'alex','solo']
1. Creating a list:

# Two ways to create a list
name_list = ['alex', 'seven', 'eric']
name_list = list(['alex', 'seven', 'eric'])

2. Common list operations:
① Slicing

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
print(name_list[0:3])    # elements from index 0 up to index 3: includes 0, excludes 3
#['Alex', 'Tenglan', 'Eric']
print(name_list[:3])    # nothing before the colon means start at 0; same result as above
#['Alex', 'Tenglan', 'Eric']
print(name_list[3:])    # nothing after the colon means go to the end
#['Rain', 'Tom', 'Amy']
print(name_list[:])     # nothing on either side means take everything
#['Alex', 'Tenglan', 'Eric', 'Rain', 'Tom', 'Amy']
print(name_list[-3:-1])   # from -3 up to -1: includes -3, excludes -1
#['Rain', 'Tom']
print(name_list[1:-1])   # from 1 up to -1; a positive start can be paired with a negative stop
#['Tenglan', 'Eric', 'Rain', 'Tom']
print(name_list[::2])    # a step of 2 takes every second element
#['Alex', 'Eric', 'Tom']
# Note: for this list, [-1:0], [0:0] and [-1:2] are all empty
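The slicing rules above also cover negative steps, which the examples do not show. A minimal sketch, reusing the same name_list; the variable names reversed_copy, every_other_back and empty are only illustrative:

```python
name_list = ["Alex", "Tenglan", "Eric", "Rain", "Tom", "Amy"]

# A step of -1 walks from the last element to the first.
reversed_copy = name_list[::-1]

# Start at the last element and take every second one going backwards.
every_other_back = name_list[-1::-2]

# With the default step of +1, a start (index 5) beyond the stop (index 2) gives [].
empty = name_list[-1:2]

print(reversed_copy)     # ['Amy', 'Tom', 'Rain', 'Eric', 'Tenglan', 'Alex']
print(every_other_back)  # ['Amy', 'Rain', 'Tenglan']
print(empty)             # []
```

Slicing always returns a new list, so name_list itself is unchanged by any of these expressions.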

② Appending

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
name_list.append("new")     # append adds one element at the end
print(name_list)
#['Alex', 'Tenglan', 'Eric', 'Rain', 'Tom', 'Amy', 'new']

③ Inserting

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
name_list.insert(3,"new")     # insert places "new" at index 3
print(name_list)

④ Modifying

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
name_list[2] = "solo"        # replace the string at index 2 with solo
print(name_list)

⑤ Deleting

# Three ways to delete
name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
del name_list[3]           # del removes the element at the given index
print(name_list)
#['Alex', 'Tenglan', 'Eric', 'Tom', 'Amy']
name_list.remove("Tenglan")     # remove deletes the first element equal to the given value
print(name_list)
#['Alex', 'Eric', 'Tom', 'Amy']
name_list.pop()            # pop removes the last element of the list
print(name_list)
#['Alex', 'Eric', 'Tom']

⑥ Extending

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
age_list = [11,22,33]
name_list.extend(age_list)        # extend appends every element of age_list to name_list
print(name_list)

⑦ Copying

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
copy_list = name_list.copy()        # copy makes a (shallow) copy of the list
print(copy_list)
# Note: shallow and deep copies are compared in detail later

⑧ Counting

name_list = ["Alex","Tenglan","Eric","Amy","Tom","Amy"]
print(name_list.count("Amy"))        # count returns how many times Amy occurs in the list
#2

⑨ Sorting and reversing

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy","1","2","3"]
name_list.sort()               # sort orders the list in place
print(name_list)
#['1', '2', '3', 'Alex', 'Amy', 'Eric', 'Rain', 'Tenglan', 'Tom']
name_list.reverse()              # reverse flips the list in place
print(name_list)
#['Tom', 'Tenglan', 'Rain', 'Eric', 'Amy', 'Alex', '3', '2', '1']

⑩ Getting an index

name_list = ["Alex","Tenglan","Eric","Rain","Tom","Amy"]
print(name_list.index("Tenglan"))       # index returns the position of the first matching element
#1

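VI. Tuples

Examples: (11,22,33,44,55), ('wupeiqi', 'alex','lzl')

1. Creating a tuple:

# Five ways to create a tuple
age = 11,22,33,44,55      # comma-separated values without brackets form a tuple; strings need quotes, e.g. 'solo'
# Output: (11, 22, 33, 44, 55)  
age = (11,22,33,44,55)     # the usual form; () marks the tuple
# Output: (11, 22, 33, 44, 55)
age = tuple((11,22,33,44,55))  # built through the tuple class; the inner parentheses are required
# Output: (11, 22, 33, 44, 55)
age = tuple([11,22,33,44,55])  # same result from a list; rarely needed
# Output: (11, 22, 33, 44, 55)
age = tuple({11,22,33,44,55})  # built from a set, so the element order is not guaranteed
# Output: (33, 11, 44, 22, 55)

2. Common tuple operations:

##count        # how many times an element occurs in the tuple
name = ('wupeiqi', 'alex','solo')
print(name.count('alex'))       
# 1
##index       # the index at which an element first occurs
name = ('wupeiqi', 'alex','solo')
print(name.index('solo'))        
# 2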

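VII. Dictionaries (unordered before Python 3.7)


Examples: {'name': 'wupeiqi', 'age': 18}, {'host': 'solo.solo.solo.solo', 'port': 80}
Note: a dictionary is a key:value data type, also called key-value pairs. In Python 3.6 and earlier a dict does not guarantee any order (the sample outputs below come from such a version); since Python 3.7 it preserves insertion order. Keys must be unique. Iterating over a dict yields its keys by default.
 1. Creating a dictionary

# Two ways to create a dictionary:
info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
print(info_dic)
#{'stu1102': 'LongZe Luola', 'stu1101': 'TengLan Wu', 'stu1103': 'XiaoZe Maliya'}
info_dic = dict({'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",})
print(info_dic)
#{'stu1102': 'LongZe Luola', 'stu1101': 'TengLan Wu', 'stu1103': 'XiaoZe Maliya'}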

2. Common dictionary operations:

① Adding

info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
info_dic['stu1104'] = "JingKong Cang"      # add a new key
print(info_dic)

② Modifying

info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
info_dic["stu1101"] = "Jingkong Cang"     # an existing key is updated; a missing key is added
print(info_dic)



③ Deleting

# Three ways to delete
info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
info_dic.pop('stu1101')            # pop removes the given key
print(info_dic)
#{'stu1103': 'XiaoZe Maliya', 'stu1102': 'LongZe Luola'}
del info_dic['stu1102']           # del removes the given key
print(info_dic)
#{'stu1103': 'XiaoZe Maliya'}
info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
info_dic.popitem()               # removes one pair: an arbitrary one before Python 3.7, the most recently inserted one since 3.7
print(info_dic)
#{'stu1101': 'TengLan Wu', 'stu1103': 'XiaoZe Maliya'}   (sample output from an older interpreter)



④ Looking up a value

info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
print(info_dic.get('stu1102'))   # get looks up the value for a key
#LongZe Luola
print(info_dic['stu1102'])     # direct lookup by key; raises KeyError if the key is missing, whereas get returns None
#LongZe Luola
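The difference between the two lookups only shows up for a missing key. A short sketch, using a hypothetical key 'stu9999' that is not in the dictionary:

```python
info_dic = {'stu1101': "TengLan Wu", 'stu1102': "LongZe Luola"}

# get returns None for a missing key, or the fallback passed as second argument.
missing = info_dic.get('stu9999')
fallback = info_dic.get('stu9999', 'N/A')

# Square-bracket lookup raises KeyError instead.
try:
    info_dic['stu9999']
    raised = False
except KeyError:
    raised = True

print(missing, fallback, raised)  # None N/A True
```

get is convenient when a missing key is an expected case; the square-bracket form is the better choice when a missing key indicates a bug that should surface immediately.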

 

⑤ Multi-level nested dictionaries

av_catalog = {
  "欧美":{
    "www.youporn.com": ["很多免费的,世界最大的","质量一般"],
    "www.pornhub.com": ["很多免费的,也很大","质量比yourporn高点"],
    "letmedothistoyou.com": ["多是自拍,高质量图片很多","资源不多,更新慢"],
    "x-art.com":["质量很高,真的很高","全部收费,屌比请绕过"]
  },
  "日韩":{
    "tokyo-hot":["质量怎样不清楚,个人已经不喜欢日韩范了","听说是收费的"]
  },
  "大陆":{
    "1024":["全部免费,真好,好人一生平安","服务器在国外,慢"]
  }
}
 
av_catalog["大陆"]["1024"][1] += ",可以用爬虫爬下来"
print(av_catalog["大陆"]["1024"])
#['全部免费,真好,好人一生平安', '服务器在国外,慢,可以用爬虫爬下来']

⑥ Looping

info_dic = {'stu1101': "TengLan Wu",'stu1102': "LongZe Luola",'stu1103': "XiaoZe Maliya",}
for stu_nu in info_dic:
  print(stu_nu,info_dic[stu_nu])    # iteration yields the keys by default
#stu1103 XiaoZe Maliya
#stu1101 TengLan Wu
#stu1102 LongZe Luola
for k,v in info_dic.items():       # in Python 2 items() first builds a full list, which is slow for large dicts; in Python 3 it returns a lightweight view
  print(k,v)
#stu1103 XiaoZe Maliya
#stu1101 TengLan Wu
#stu1102 LongZe Luola

VIII. Sets
Example: {'solo', 33, 'alex', 22, 'eric', 'wupeiqi', 11}
Note: a set is an unordered collection of unique items. Deduplication: converting a list to a set removes duplicates automatically. Relationship tests: intersection, difference and union between two groups of data.
1. Creating a set



# Standard way to create a set
info_set = set(["alex","wupeiqi","eric","solo",11,22,33])
print(info_set,type(info_set))
#{33, 11, 'wupeiqi', 'solo', 'alex', 'eric', 22} <class 'set'>

2. Common set operations
① Adding

# Two ways to add
set_1 = set(["alex","wupeiqi","eric","solo"])
set_1.add(11)             # add inserts a single element
print(set_1)
#{'alex', 'solo', 'eric', 11, 'wupeiqi'}
set_1 = set(["alex","wupeiqi","eric","solo"])
set_1.update([11,22,33])         # update can add several elements
print(set_1)
#{33, 11, 'alex', 'wupeiqi', 'eric', 22, 'solo'}

② Deleting

# Three ways to delete
set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_1.remove("alex")          # remove deletes the given element
print(set_1)
#{'eric', 33, 'solo', 11, 22, 'wupeiqi'}
set_1.pop()               # pop removes an arbitrary element
print(set_1)
#{33, 'wupeiqi', 11, 22, 'solo'}
set_1.discard("solo")          # discard deletes the given element; unlike remove, it does not raise an error if the element is absent
set_1.discard(55)
print(set_1)
#{33, 'wupeiqi', 11, 22}

3. Set relationship tests
① Intersection

set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_2 = set([11,22,33,44,55,66])
print(set_1.intersection(set_2))      # intersection of the two sets; set_1 and set_2 can be swapped
#{33, 11, 22}

② Union

set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_2 = set([11,22,33,44,55,66])
print(set_1.union(set_2))           # union of the two sets; set_1 and set_2 can be swapped
#{33, 66, 11, 44, 'eric', 55, 'solo', 22, 'wupeiqi', 'alex'}

③ Difference

set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_2 = set([11,22,33,44,55,66])
print(set_1.difference(set_2))         # difference: elements in set_1 that are not in set_2
#{'solo', 'eric', 'wupeiqi', 'alex'}

④ Subset and superset

set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_2 = set([11,22,33,44,55,66])
set_3 = set([11,22,33])
print(set_1.issubset(set_2))           # issubset: is set_1 a subset of set_2?
#False
print(set_1.issuperset(set_3))          # issuperset: is set_1 a superset of set_3?
#True

⑤ Symmetric difference

set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_2 = set([11,22,33,44,55,66])
print(set_1.symmetric_difference(set_2))      # symmetric difference = union of the two sets minus their intersection
#{66, 'solo', 'eric', 'alex', 55, 'wupeiqi', 44}

⑥ Relationship tests with operators

set_1 = set(["alex","wupeiqi","eric","solo",11,22,33])
set_2 = set([11,22,33,44,55,66])
set_union = set_1 | set_2      # union
set_intersection = set_1 & set_2  # intersection
set_difference = set_1 - set_2   # difference
set_symmetric_difference = set_1 ^ set_2 # symmetric difference
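As a quick check that the operators and the named methods agree, and that the symmetric difference is the union minus the intersection, the following sketch compares them on the same two sets (the checks dictionary is only an illustrative container):

```python
set_1 = {"alex", "wupeiqi", "eric", "solo", 11, 22, 33}
set_2 = {11, 22, 33, 44, 55, 66}

# Each entry compares an operator with the equivalent method call.
checks = {
    "union": (set_1 | set_2) == set_1.union(set_2),
    "intersection": (set_1 & set_2) == set_1.intersection(set_2),
    "difference": (set_1 - set_2) == set_1.difference(set_2),
    "symmetric_difference": (set_1 ^ set_2) == set_1.symmetric_difference(set_2),
}

# Symmetric difference equals the union minus the intersection.
sym_from_parts = (set_1 | set_2) - (set_1 & set_2)

print(checks)
print(sym_from_parts == (set_1 ^ set_2))  # True
```

One practical difference: the methods accept any iterable (for example a list), while the operators require both operands to be sets.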

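VI. Introduction to modules

Python has a large number of modules, which keeps Python programs concise. Libraries fall into three kinds:
① Modules built into Python
② Open-source third-party modules
③ Modules written by the programmer; do not give a script the same name as a module it imports

1. sys module (built in)
① sys.argv captures the arguments passed when the Python script is run
② sys.stdin standard input
③ sys.stdout standard output
④ sys.stdout.flush forces the standard output buffer to be flushed

import time
import sys
for i in range(5):
  print(i, end=" ")   # Python 3 form of the Python 2 idiom "print i,"
  sys.stdout.flush()
  time.sleep(1)
# This is meant to print one number per second for five seconds. Without the flush call,
# and depending on the system's default buffering, you may see no output at all until the
# loop ends, when 0 1 2 3 4 appears all at once.
# Output is buffered; calling sys.stdout.flush() after each print pushes it to the screen immediately.

2. os module (interacting with the operating system)
① os.system and os.popen run commands of the current system

3. platform module (identifies the system the program is running on)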

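VII. Operators
1. Arithmetic operators:

2. Comparison operators:

3. Assignment operators:

4. Logical operators:

5. Membership operators:

6. Identity operators:

7. Bitwise operators:

8. Operator precedence: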

VIII. Shallow and deep copies in depth
1. Assignment (create a list variable Alex that contains a sub-list, assign Alex to solo, then modify elements through Alex. What happens to solo?)

import copy           # import the copy module

Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
solo = Alex            # plain assignment

#  print before modification
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 7316664
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [2775776, 1398430400, 7318024]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 7316664
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [2775776, 1398430400, 7318024]

#  modify through Alex
Alex[0]='Mr.Wu'
Alex[2].append('CSS')
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 7316664
#    ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#    [5170528, 1398430400, 7318024]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 7316664
#    ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#    [5170528, 1398430400, 7318024]

Initial state: Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Assignment: solo = Alex           # plain assignment
Result: solo = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Assignment passes the object reference (memory address). No new memory is allocated for the target; both names refer to the same object.

After modification: solo = ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
str is immutable, so replacing the element "Alex" with "Mr.Wu" binds index 0 to a new object and its address changes. list is mutable, so the sub-list ['Python', 'C#', 'JavaScript', 'CSS'] keeps its address after the append.

2. Shallow copy (create a list variable Alex that contains a sub-list, copy it with copy() from the copy module, then modify Alex. What happens to solo?)

import copy           # import the copy module
Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
solo = copy.copy(Alex)            # shallow copy via copy.copy()
#  print before modification
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 10462472
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [5462752, 1359960768, 10463232]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 10201848
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [5462752, 1359960768, 10463232]
#  modify through Alex
Alex[0]='Mr.Wu'
Alex[2].append('CSS')
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 10462472
#    ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#    [10151264, 1359960768, 10463232]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 10201848
#    ['Alex', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#    [5462752, 1359960768, 10463232]

Initial state: Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Shallow copy: solo = copy.copy(Alex)         # shallow copy via copy.copy()
Result: solo = ["Alex", 28, ["Python", "C#", "JavaScript"]]
A shallow copy gives solo a new outer list (10201848) that records the addresses of the elements; the elements themselves are not copied, so solo holds references to the original elements.

After modification: solo = ['Alex', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
str is immutable, so Alex[0] = 'Mr.Wu' binds index 0 of Alex to a new object; solo has its own outer list and still refers to 'Alex'. list is mutable and the sub-list is shared, so the appended 'CSS' is visible through solo as well and the sub-list keeps its address.

3. Deep copy (create a list variable Alex that contains a sub-list, copy it with deepcopy() from the copy module, then modify Alex. What happens to solo?)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-Lian
#  deep copy
import copy           # import the copy module

Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
solo = copy.deepcopy(Alex)            # deep copy via copy.deepcopy()

#  print before modification
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 6202712
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [4086496, 1363237568, 6203472]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 6203032
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [4086496, 1363237568, 6203512]

#  modify through Alex
Alex[0]='Mr.Wu'
Alex[2].append('CSS')
print(id(Alex))
print(Alex)
print([id(adr) for adr in Alex])
# Output: 6202712
#    ['Mr.Wu', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
#    [5236064, 1363237568, 6203472]
print(id(solo))
print(solo)
print([id(adr) for adr in solo])
# Output: 6203032
#    ['Alex', 28, ['Python', 'C#', 'JavaScript']]
#    [4086496, 1363237568, 6203512]

Initial state: Alex = ["Alex", 28, ["Python", "C#", "JavaScript"]]
Deep copy: solo = copy.deepcopy(Alex) # deep copy via copy.deepcopy()
Result: solo = ["Alex", 28, ["Python", "C#", "JavaScript"]]
A deep copy gives solo a new outer list (6203032) that records the addresses of the elements. In addition, the third element (['Python', 'C#', 'JavaScript']) is copied to a new address (6203512), so the two variables refer to different sub-lists.

After modification: solo = ['Alex', 28, ['Python', 'C#', 'JavaScript']]
str is immutable, so Alex[0] = 'Mr.Wu' binds index 0 of Alex to a new object. list is mutable, so the sub-list of Alex keeps its address after the append, but Alex and solo never shared that sub-list, so solo is unchanged.

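4. Special cases for copying
(1) For non-container types (numbers, strings and other "atomic" objects) copying is meaningless
(2) That is, for these types "obj is copy.copy(obj)" and "obj is copy.deepcopy(obj)" both hold
(3) If a tuple contains only atomic objects, deepcopy does not create a new object for it either
① Why copy?
A: To keep both the original data and the modified data when making changes
② How do numbers and strings differ from containers (list, dict, set) when modified? (the root cause of the shallow/deep distinction)
A: When data is modified:
               numbers and strings: a new object is created in memory
               containers: the same object in memory is modified in place
③ How can a container's state before and after a modification both be kept?
A: Make a copy in memory
④ How can all n nested levels of a container be copied at once?
A: Use a deep copy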
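The special cases above can be checked with the is operator. The sketch below reflects CPython behavior as I understand it (returning the same object for immutable values is an implementation detail, not a language guarantee); the variable names are illustrative:

```python
import copy

number = 28
text = "Alex"
flat_tuple = (1, "a", 2.5)                 # a tuple holding only atomic objects
nested = ["Alex", 28, ["Python", "C#"]]    # a container with a mutable sub-list

# Atomic objects and atom-only tuples come back as the very same object.
same_number = copy.copy(number) is number
same_text = copy.deepcopy(text) is text
same_tuple = copy.deepcopy(flat_tuple) is flat_tuple

# A deep copy of a nested list duplicates the outer list and the inner list.
deep = copy.deepcopy(nested)
outer_is_new = deep is not nested
inner_is_new = deep[2] is not nested[2]

print(same_number, same_text, same_tuple, outer_is_new, inner_is_new)
```

On CPython this prints True five times. Because the inner list is duplicated, appending to nested[2] afterwards leaves deep[2] untouched.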

 

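IX. File operations

(1) Opening a file: file_handle = open('file path', 'mode')
Python 2 offered two ways to open a file, open(...) and file(...), with the former calling the latter internally. file(...) no longer exists in Python 3, so use open.
1. Modes:
  r,  read-only [default]
  w,  write-only [not readable; creates the file if it does not exist; empties it if it does]
  a,  append [not readable; creates the file if it does not exist; only appends if it does]
2. "+" reads and writes the same file:
  r+, read and write [readable; writable; appendable]
  w+, write and read
  a+, append and read

Summary 1: in r+ mode, if the file has been read to the end (for example with print(f.read())) before .write(), the new content is written at the end of the file, which behaves like read-then-append. If seek() moves the cursor before .write(), writing starts at that position and overwrites the existing content instead of shifting it back; keep this in mind. If neither a read nor a seek() precedes .write(), the cursor is at the start of the file (the r+ default), so writing starts there, equivalent to seek(0). Compare this last point with a+ mode.
Summary 2: must read-write mode always write first, or can it read first? With w+, opening the file empties it, so an initial read returns nothing. w+ is rarely used in practice, and the same goes for a+. Also, in w+ mode the cursor follows the writes to the end of the file, so without seek() a subsequent read returns an empty result.
Note: in w+ mode, .write() interacts with seek() and reads exactly as in r+. The file is emptied on open and then written sequentially; if seek() precedes .write(), writing overwrites from the cursor position.
Summary 3: in a+ mode the cursor starts at the end of the file, so reading requires seek() first. Unlike r+ and w+, .write() ignores seek(): content is always appended at the end of the file.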
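The three summaries can be reproduced with a scratch file. This is a minimal sketch under the assumption of a standard Python 3 text-mode open on a common platform; the file name demo.txt and the temporary directory are hypothetical:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical scratch file

with open(path, "w", encoding="utf-8") as f:
    f.write("0123456789")

# r+: the cursor starts at offset 0, so a write overwrites in place.
with open(path, "r+", encoding="utf-8") as f:
    f.write("AB")
with open(path, encoding="utf-8") as f:
    after_r_plus = f.read()

# a+: writes land at the end of the file even after seek(0).
with open(path, "a+", encoding="utf-8") as f:
    f.seek(0)
    f.write("XY")
    f.seek(0)                 # move back to the start in order to read
    after_a_plus = f.read()

# w+: the file is emptied on open, so an immediate read returns "".
with open(path, "w+", encoding="utf-8") as f:
    first_read = f.read()
    f.write("new")
    f.seek(0)                 # without this seek the read below would return ""
    after_w_plus = f.read()

print(after_r_plus)   # AB23456789
print(after_a_plus)   # AB23456789XY
print(repr(first_read), after_w_plus)
```

The r+ result shows overwriting rather than insertion: the original "01" is replaced by "AB" and the rest of the content stays where it was.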


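3. "U" converts \r, \n and \r\n to \n when reading (used together with r or r+). It is a Python 2 era flag; Python 3 text mode already does this by default.
  rU
  r+U
4. "b" handles binary files (e.g. uploading an ISO image over FTP; under Python 2 it can be ignored on Linux but must be given on Windows for binary files; in Python 3 it makes the file return bytes instead of str)
  rb binary read
  wb binary write
  ab binary append
(2) Common file operations:
1. read(), readline() and readlines()
  print(info_file.read()) # read: reads the whole file
  print(info_file.readline()) # readline: reads a single line
  print(info_file.readlines()) # readlines: splits the content at line breaks into a list; avoid for large files
2. seek and tell (the cursor)
  data = info_file.read() # the cursor starts at the beginning; after .read() it sits at the end of the file
  print(info_file.tell()) # tell: current cursor position
  info_file.seek(0) # seek: move the cursor to the start of the file
3. Looping over a file
  for index,line in enumerate(info_file.readlines()): # builds a list of all lines first; unsuitable for large files
  for line in info_file: # recommended; reads one line at a time, so memory use stays small
4. flush
  info_file.flush() # flush: force buffered data to be written out to disk
5. truncate
  truncate(size) keeps the first size bytes counted from the start of the file, regardless of the cursor; truncate(0) empties the file; with no argument it cuts at the current cursor position
6. The with statement
  To avoid forgetting to close a file after opening it, use a context manager:
    with open('log','r') as f:
      ...
When the with block finishes, the file is closed and its resources are released automatically. Since Python 2.7, with can also manage several files at once:
    with open('log1') as obj1, open('log2') as obj2:
      pass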
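A small runnable sketch of the with statement combined with the recommended line-by-line loop; the scratch file path is hypothetical and created only for the demonstration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log")  # hypothetical scratch file

with open(path, "w", encoding="utf-8") as f:
    f.write("first\nsecond\n")

# Iterating over the file object reads one line at a time.
with open(path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

# The file object is closed as soon as the with block ends.
closed_after_block = f.closed

print(lines)               # ['first', 'second']
print(closed_after_block)  # True
```

The file is also closed if an exception is raised inside the block, which is the main advantage over calling close() by hand.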
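(3) Ways to modify a file:
1. Read the file into memory, modify it there, and write the result back to the original file (the old content is discarded)
2. Writing directly on disk overwrites: data cannot be inserted, and the existing content is not shifted back but overwritten
3. Read the file into memory, modify it there, and save the result as a new file (the old file is kept)
  ① save as a new file
  ② r+ mode
  ③ a+ mode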

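X. Functions

① Format

def function_name(parameters):
      ....
      function body
      ....
      return value
  function_name()

② Formal parameter:  def func(name):   # name is the formal parameter of func
③ Actual argument:  func("solo")   # 'solo' is the actual argument passed to func
④ Default parameter: def stu_register(name,age,course,country="CN")   # name, age and course are positional parameters; country has a default value
⑤ Keyword argument: stu_register(age=22,name='lzl',course="python")   # keyword arguments must come after positional arguments
⑥ Variable-length parameters (*args and **kwargs):

(1) *args: *args collects the extra positional arguments into a tuple. A list passed as a single argument is not unpacked; it becomes one element of that tuple. When a function has *args alongside other parameters, write *args after them, otherwise every positional argument is swallowed by *args. If no extra arguments are passed, *args is an empty tuple and no error is raised.
(2) **kwargs: **kwargs collects extra name=value arguments into a dictionary (unlike ordinary keyword arguments, which match a named parameter). A list or dict passed as a plain argument is treated as an extra positional argument and ends up in *args, leaving **kwargs empty. To feed them into *args and **kwargs, unpack them with *info_list and **info_dict: stu_register("lzl",*info_list,**info_dict)
Summary: *args must come before **kwargs; positional arguments must come before keyword arguments; parameters with default values can be combined with *args and **kwargs, but among ordinary positional parameters those with defaults must follow those without.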
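To make the packing and unpacking rules concrete, here is a minimal sketch. stu_register is redefined here with an illustrative signature (the notes do not show its full definition), and with_default is a hypothetical helper showing a default value next to *args and **kwargs:

```python
def stu_register(name, *args, **kwargs):
    # Return what each parameter received so the packing is visible.
    return name, args, kwargs

def with_default(name, country="CN", *args, **kwargs):
    return name, country, args, kwargs

info_list = ["python", "linux"]
info_dict = {"age": 22, "country": "CN"}

# Passed as plain arguments: both objects land in *args, **kwargs stays empty.
packed = stu_register("lzl", info_list, info_dict)

# Unpacked with * and **: list items go to *args, dict items go to **kwargs.
unpacked = stu_register("lzl", *info_list, **info_dict)

# A default parameter coexists with *args and **kwargs without error.
defaults = with_default("lzl")

print(packed)
print(unpacked)
print(defaults)
```

Comparing packed and unpacked shows why the * and ** prefixes matter at the call site, not only in the function definition.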

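⑦ return: if no return statement is executed, the function returns None; when execution reaches return, the function ends
⑧ Local variables:

name = "Alex Li" # define the variable name

def change_name(name):
  name = "Golden Horn King, a man with a Tesla"  # rebind the variable inside the function
A change made to a variable inside a function only takes effect inside that function and does not affect the outer variable; such a variable is called a local variable. A function can also make the change apply globally by declaring the name with the global keyword, but this is rarely used.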
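The difference between rebinding a local name and declaring it global can be seen side by side. A short sketch; the function names change_local and change_global are illustrative:

```python
name = "Alex Li"

def change_local():
    name = "local value"      # creates a new local variable; the global is untouched
    return name

def change_global():
    global name               # declare that assignments target the module-level name
    name = "changed globally"

local_result = change_local()
after_local = name            # still "Alex Li"

change_global()
after_global = name           # now "changed globally"

print(local_result, "|", after_local, "|", after_global)
```

Because global couples the function to module state, returning the new value and assigning it at the call site is usually easier to follow.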

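⑨ Recursive function: a function that calls itself inside its own body
Properties: it must have a termination condition; each deeper call should work on a smaller problem than the previous one; recursion is not efficient, and too many levels cause a stack overflow

A recursive example:
       def func(n1,n2):    # print the Fibonacci numbers up to 100
         if n1 > 100:
           return
         print(n1)
         n3 = n1 + n2
         func(n2,n3)
       func(0,1)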
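⑩ Anonymous function: a function defined without an explicit name (lambda)

# regular function
def calc(n):
  return n**n
print(calc(10))

# the same logic as an anonymous function
calc = lambda n:n**n
print(calc(10))

⑪ Higher-order function: a variable can refer to a function, and a function's parameters can receive variables, so one function can take another function as an argument. Such a function is called a higher-order function.

def add(x,y,f):
  return f(x) + f(y)
res = add(3,-6,abs)
print(res)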

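⑫ Built-in functions

⑬ Call order: a function must be defined before the call that uses it is executed

# incorrect call order
def func():           # define func()
  print("in the func")
  foo()            # call foo()
func()             # run func(); foo is not defined yet, so this raises NameError
def foo():           # define foo()
  print("in the foo")

# correct call order
def func():           # define func()
  print("in the func")
  foo()            # call foo()
def foo():           # define foo()
  print("in the foo")
func()             # run func()

⑭ Higher-order function: 1. a function is passed as an argument to another function. 2. a function's return value contains one or more functions

⑮ Nested function: a function created inside another function's body (a function defined this way cannot be called directly from the global scope)

⑯ Decorator: essentially a function that decorates other functions, adding extra behavior to them.
Principles: 1. do not modify the source code of the decorated function 2. do not change the way the decorated function is called
Composition: a decorator is built from a higher-order function plus a nested function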
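The notes describe decorators without showing one, so here is a minimal sketch of a timing decorator that follows both principles. The names timer, wrapper and last_elapsed are illustrative choices, not taken from the notes:

```python
import functools
import time

def timer(func):                      # higher-order: takes a function as its argument
    @functools.wraps(func)            # keep the original name and docstring
    def wrapper(*args, **kwargs):     # nested function that adds the extra behavior
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper                    # higher-order: returns a function

@timer
def add(x, y):
    return x + y

total = add(3, 4)                     # called exactly as before decoration
print(total, add.__name__, add.last_elapsed >= 0)
```

The body of add is untouched and callers still write add(3, 4), which is what the two principles require; the @timer line is shorthand for add = timer(add).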

⑰ Generator: a mechanism that produces each value only when it is requested is called a generator
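As a minimal illustration of "produced only when requested", the sketch below defines a hypothetical generator function countdown and a generator expression; neither name comes from the notes:

```python
def countdown(n):
    # Each value is computed only when the caller asks for the next one.
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first = next(gen)        # runs the body up to the first yield
rest = list(gen)         # consumes the remaining values

# A generator expression is the inline form; nothing is computed until iteration.
squares = (x * x for x in range(4))
square_list = list(squares)

print(first, rest, square_list)  # 3 [2, 1] [0, 1, 4, 9]
```

A generator can be consumed only once: after list(gen) the generator is exhausted and a further next(gen) raises StopIteration.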

Application: yield can give the effect of concurrency within a single thread (coroutines)

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
import time
def consumer(name):
  print("%s is ready to eat baozi!" %name)
  while True:
    baozi = yield      # yield: save the current state and hand control back

    print("Baozi [%s] arrived and was eaten by [%s]!" %(baozi,name))
def producer(name):
  c = consumer('A')      
  c2 = consumer('B')    
  c.__next__()        # c.__next__() is equivalent to next(c)
  c2.__next__()        # next(): resume up to yield without sending it a value
  print("Starting to make baozi!")
  for i in range(10):
    time.sleep(1)
    print("%s made 2 baozi!"%(name))
    c.send(i)        # send(): resume the generator and pass a value to yield
    c2.send(i)
producer("solo")

Coroutine

 

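Iterable: an object that can be used directly in a for loop
Data types that can be used directly in a for loop: 1. collection types such as list, tuple, dict, set and str; 2. generators, including generator expressions and generator functions that use yield
Use isinstance() to check whether an object is an Iterable

from collections.abc import Iterable   # importing it from collections directly stopped working in Python 3.10
print(isinstance([], Iterable))
# True

Iterator: an object that can be passed to next() and keeps returning the next value is called an iterator
Use isinstance() to check whether an object is an Iterator

from collections.abc import Iterator
print(isinstance([], Iterator))
# False
print(isinstance(iter([]), Iterator))
# True
Summary:
1. Every object that works in a for loop is an Iterable;
2. Every object that works with next() is an Iterator; iterators represent a lazily computed sequence;
3. Collection types such as list, dict and str are Iterable but not Iterator; an Iterator can be obtained from them with iter();
4. A Python for loop is essentially implemented by calling next() repeatedly
for x in [1, 2, 3, 4, 5]:
  pass
# is equivalent to first obtaining an Iterator:
it = iter([1, 2, 3, 4, 5])
# and then looping:
while True:
  try:
    # get the next value:
    x = next(it)
  except StopIteration:
    # leave the loop on StopIteration
    break

Equivalent loop (iterator)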
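The Iterable/Iterator distinction can be verified directly. A short sketch using collections.abc; the variable names are illustrative:

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
it = iter(numbers)                   # obtain an iterator from the list
gen = (n * 2 for n in numbers)       # a generator is both Iterable and Iterator

list_is_iterable = isinstance(numbers, Iterable)   # True
list_is_iterator = isinstance(numbers, Iterator)   # False: a list has no __next__
iter_is_iterator = isinstance(it, Iterator)        # True
gen_is_iterator = isinstance(gen, Iterator)        # True

first = next(it)                     # 1
remaining = list(it)                 # [2, 3]: the iterator remembers its position

print(list_is_iterable, list_is_iterator, iter_is_iterator, gen_is_iterator)
print(first, remaining)
```

The list can be iterated again from the start with a fresh iter(numbers), while the exhausted iterator it cannot be rewound.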

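XI. Common modules

(I) Importing modules: importing a module essentially runs the Python file once; importing a package essentially runs the __init__.py file inside the package

(II) Common modules:

(1) time and datetime modules
Time can be represented in three ways: 1. Timestamp: seconds since January 1, 1970, i.e. time.time()
2. Formatted string: 2014-11-11 11:11, i.e. time.strftime('%Y-%m-%d')
3. Structured time: a tuple containing year, day, weekday and so on (time.struct_time), i.e. time.localtime()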

import time
print(time.time())       # timestamp
#1472037866.0750718
print(time.localtime())    # structured time (local)
#time.struct_time(tm_year=2016, tm_mon=8, tm_mday=25, tm_hour=8, tm_min=44, tm_sec=46, tm_wday=3, tm_yday=238, tm_isdst=0)
print(time.strftime('%Y-%m-%d'))  # formatted string
#2016-08-25
print(time.strftime('%Y-%m-%d',time.localtime()))
#2016-08-25
print(time.gmtime())      # structured time (UTC)
#time.struct_time(tm_year=2016, tm_mon=8, tm_mday=25, tm_hour=3, tm_min=8, tm_sec=48, tm_wday=3, tm_yday=238, tm_isdst=0)
print(time.strptime('2014-11-11', '%Y-%m-%d')) # structured time parsed from a string
#time.struct_time(tm_year=2014, tm_mon=11, tm_mday=11, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=315, tm_isdst=-1)
print(time.asctime())
#Thu Aug 25 11:15:10 2016
print(time.asctime(time.localtime()))
#Thu Aug 25 11:15:10 2016
print(time.ctime(time.time()))
#Thu Aug 25 11:15:10 2016

time module
import datetime
print(datetime.date)  # class representing a date; common attributes: year, month, day
#<class 'datetime.date'>
print(datetime.time)  # class representing a time; common attributes: hour, minute, second, microsecond
#<class 'datetime.time'>
print(datetime.datetime)    # class representing a date and time
#<class 'datetime.datetime'>
print(datetime.timedelta)    # class representing a duration, i.e. the span between two points in time
#<class 'datetime.timedelta'>
print(datetime.datetime.now())
#2016-08-25 14:21:07.722285
print(datetime.datetime.now() - datetime.timedelta(days=5))
#2016-08-20 14:21:28.275460

datetime module
import time

start_str = '2017-03-26 3:12'     # renamed from str to avoid shadowing the built-in
end_str = '2017-05-26 13:12'
date1 = time.strptime(start_str, '%Y-%m-%d %H:%M')
date2 = time.strptime(end_str, '%Y-%m-%d %H:%M')
now = time.time()
if time.mktime(date1) <= now <= time.mktime(date2):   # is the current time inside the range?
  print('cccccccc')


import datetime

start_str = '2017-03-26 3:12'
end_str = '2017-05-26 13:12'
date1 = datetime.datetime.strptime(start_str,'%Y-%m-%d %H:%M')
date2 = datetime.datetime.strptime(end_str,'%Y-%m-%d %H:%M')
datenow = datetime.datetime.now()
if datenow < date1:               # datetime objects can be compared directly
  print('dddddd')

Time comparison

(2) random module: generating random numbers (verification codes)

# random number module
import random
print(random.random())   # random float between 0 and 1 (1 excluded)
#0.7308387398872364
print(random.randint(1,3)) # random integer from 1 to 3, both included
#3
print(random.randrange(1,3)) # random integer from 1 to 2; 3 is excluded
#2
print(random.choice("hello")) # pick one random character from the string
#e
print(random.sample("hello",2))   # pick the given number of characters at random
#['l', 'h']
items = [1,2,3,4,5,6,7]
random.shuffle(items)           # shuffle the list in place
print(items)
#[2, 3, 1, 6, 4, 7, 5]

Generating random numbers
import random
checkcode = ''
for i in range(4):
  current = random.randrange(0,4)
  if current != i:
    temp = chr(random.randint(65,90))
  else:
    temp = random.randint(0,9)
  checkcode += str(temp)
print(checkcode)
#51T6

Verification code

(3)os模块:用于提供系统级别的操作(比如目录、路径等的操作)

import os
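os.getcwd() # get the current working directory, i.e. the directory the script is running in
os.chdir("dirname") # change the current working directory; like cd in the shell
os.curdir # the string for the current directory: '.'
os.pardir # the string for the parent directory: '..'
os.makedirs('dirname1/dirname2')  # create nested directories recursively
os.removedirs('dirname1')  # remove the directory if it is empty, then move up to the parent and remove it too if empty, and so on
os.mkdir('dirname')  # create a single directory; like mkdir dirname in the shell
os.rmdir('dirname')  # remove a single empty directory; raises an error if it is not empty; like rmdir dirname in the shell
os.listdir('dirname')  # return a list of all files and subdirectories in the directory, hidden files included
os.remove('filename') # delete a file
os.rename("oldname","newname") # rename a file or directory
os.stat('path/filename') # get file or directory information
os.sep  # the OS-specific path separator: "\\" on Windows, "/" on Linux
os.linesep  # the line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep  # the separator between entries of a search path such as PATH: ";" on Windows, ":" on Linux
os.name  # a string naming the current platform: 'nt' on Windows, 'posix' on Linux
os.system("bash command") # run a shell command; its output goes straight to the terminal
os.environ # the system environment variables
os.path.abspath(path) # return the normalized absolute version of path
os.path.split(path) # split path into a (directory, filename) pair
os.path.dirname(path) # the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path) # the final file name in path, i.e. the second element of os.path.split(path); empty if path ends with a separator
os.path.exists(path) # True if path exists, False otherwise
os.path.isabs(path) # True if path is an absolute path
os.path.isfile(path) # True if path is an existing file, False otherwise
os.path.isdir(path) # True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]]) # join the parts into one path; parts before the last absolute path are discarded
os.path.getatime(path) # last access time of the file or directory at path
os.path.getmtime(path) # last modification time of the file or directory at path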

os module
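To see several of these calls working together, here is a small sketch that runs inside a temporary directory so nothing is left behind. The helper `describe` and the file name `demo.txt` are made up for illustration.

```python
import os
import tempfile

def describe(path):
    """Collect a few os.path facts about `path` (hypothetical helper)."""
    directory, filename = os.path.split(path)
    return {
        "directory": directory,
        "filename": filename,
        "exists": os.path.exists(path),
        "is_file": os.path.isfile(path),
        "is_absolute": os.path.isabs(path),
    }

with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "dirname1", "dirname2")
    os.makedirs(nested)                        # creates both levels at once
    file_path = os.path.join(nested, "demo.txt")
    with open(file_path, "w") as handle:
        handle.write("hello")
    info = describe(file_path)
    listing = os.listdir(nested)

print(info["filename"], info["is_file"], listing)
```

Building paths with `os.path.join` instead of string concatenation keeps the code independent of `os.sep`.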

(4) sys module: interpreter-related operations (exiting the program, version information, and so on)

import sys
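sys.argv      # list of command-line arguments; the first element is the path of the script itself
sys.exit(n)    # exit the program; use exit(0) for a normal exit
sys.version    # version information of the Python interpreter
sys.maxint     # largest int value (Python 2 only; Python 3 offers sys.maxsize instead)
sys.path      # the module search path, initialized from the PYTHONPATH environment variable plus installation defaults
sys.platform   # name of the operating system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]   # read one line from standard input and drop the trailing newline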

sys module
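A short sketch, assuming Python 3, that reads a few of these attributes. Because `sys.maxint` no longer exists there, `sys.maxsize` is used in its place. The helper name `interpreter_summary` is hypothetical.

```python
import sys

def interpreter_summary():
    """Gather a few values exposed by the sys module (hypothetical helper)."""
    return {
        "script": sys.argv[0] if sys.argv else "",
        "major_version": sys.version_info[0],
        "platform": sys.platform,
        "maxsize": sys.maxsize,              # Python 3 ints are unbounded; this is the largest container index
        "search_path_entries": len(sys.path),
    }

summary = interpreter_summary()
for key, value in summary.items():
    print(key, "=", value)
```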

(5) shutil module: high-level handling of files, folders, and archives (copying, compressing, and so on)

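① shutil.copyfileobj: copy the contents of one file object into another; copying starts at the source's current position, so part of a file can be copied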

def copyfileobj(fsrc, fdst, length=16*1024):
  """copy data from file-like object fsrc to file-like object fdst"""
  while 1:
    buf = fsrc.read(length)
    if not buf:
      break
    fdst.write(buf)

shutil.copyfileobj
import shutil
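f1 = open("fsrc",encoding="utf-8")
f2 = open("fdst","w",encoding="utf-8")   # the destination must be opened for writing
shutil.copyfileobj(f1,f2)                # copyfileobj takes file objects; copyfile takes path names
f1.close()
f2.close()
# copies the contents of file f1 into f2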
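The partial-copy behaviour can be sketched with in-memory file objects. `io.BytesIO` stands in for real files here, and the byte string is sample data. Note that `length` is only the chunk size of each read; what limits the copy is the position of the source before the call.

```python
import io
import shutil

source = io.BytesIO(b"0123456789")
target = io.BytesIO()

source.seek(4)                                # skip the first four bytes
shutil.copyfileobj(source, target, length=3)  # read and write 3 bytes at a time
copied = target.getvalue()
print(copied)
```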

② shutil.copyfile: copy a file

def copyfile(src, dst):
  """Copy data from src to dst"""
  if _samefile(src, dst):
    raise Error("`%s` and `%s` are the same file" % (src, dst))
  for fn in [src, dst]:
    try:
      st = os.stat(fn)
    except OSError:
      # File most likely does not exist
      pass
    else:
      # XXX What about other special files? (sockets, devices...)
      if stat.S_ISFIFO(st.st_mode):
        raise SpecialFileError("`%s` is a named pipe" % fn)
  with open(src, 'rb') as fsrc:
    with open(dst, 'wb') as fdst:
      copyfileobj(fsrc, fdst)

shutil.copyfile
import shutil
shutil.copyfile("f1","f2")
# copies the contents of file f1 into f2

③ shutil.copymode(src, dst): copy only the permission bits; the contents, group, and owner are unchanged

def copymode(src, dst):
  """Copy mode bits from src to dst"""
  if hasattr(os, 'chmod'):
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    os.chmod(dst, mode)

shutil.copymode

④ shutil.copystat(src, dst): copy the status information, including mode bits, atime, mtime, and flags

def copystat(src, dst):
  """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
  st = os.stat(src)
  mode = stat.S_IMODE(st.st_mode)
  if hasattr(os, 'utime'):
    os.utime(dst, (st.st_atime, st.st_mtime))
  if hasattr(os, 'chmod'):
    os.chmod(dst, mode)
  if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
    try:
      os.chflags(dst, st.st_flags)
    except OSError, why:
      for err in 'EOPNOTSUPP', 'ENOTSUP':
        if hasattr(errno, err) and why.errno == getattr(errno, err):
          break
      else:
        raise

shutil.copystat

⑤ shutil.copy(src, dst): copy the file contents and permission bits

def copy(src, dst):
  """Copy data and mode bits ("cp src dst").

  The destination may be a directory.

  """
  if os.path.isdir(dst):
    dst = os.path.join(dst, os.path.basename(src))
  copyfile(src, dst)
  copymode(src, dst)

shutil.copy

⑥ shutil.copy2(src, dst): copy the file contents and status information

def copy2(src, dst):
  """Copy data and all stat info ("cp -p src dst").

  The destination may be a directory.

  """
  if os.path.isdir(dst):
    dst = os.path.join(dst, os.path.basename(src))
  copyfile(src, dst)
  copystat(src, dst)

shutil.copy2
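The difference between copy and copy2 shows up in the modification time. The sketch below assumes Python 3.3 or later, where both functions return the destination path (the Python 2 source quoted above returns nothing). The file names and the fixed timestamp are sample values.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src.txt")
    with open(src, "w") as handle:
        handle.write("data")
    # Give the source an old, recognisable access and modification time.
    os.utime(src, (1000000000, 1000000000))

    plain = shutil.copy(src, os.path.join(root, "plain.txt"))    # data + mode bits
    exact = shutil.copy2(src, os.path.join(root, "exact.txt"))   # data + all stat info

    src_mtime = os.path.getmtime(src)
    plain_mtime = os.path.getmtime(plain)
    exact_mtime = os.path.getmtime(exact)

print("copy2 kept mtime:", exact_mtime == src_mtime)
print("copy kept mtime:", plain_mtime == src_mtime)
```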

⑦ shutil.copytree(src, dst, symlinks=False, ignore=None): recursively copy a directory tree, including nested directories

def ignore_patterns(*patterns):
  """Function that can be used as copytree() ignore parameter.

  Patterns is a sequence of glob-style patterns
  that are used to exclude files"""
  def _ignore_patterns(path, names):
    ignored_names = []
    for pattern in patterns:
      ignored_names.extend(fnmatch.filter(names, pattern))
    return set(ignored_names)
  return _ignore_patterns
def copytree(src, dst, symlinks=False, ignore=None):
  """Recursively copy a directory tree using copy2().

  The destination directory must not already exist.
  If exception(s) occur, an Error is raised with a list of reasons.

  If the optional symlinks flag is true, symbolic links in the
  source tree result in symbolic links in the destination tree; if
  it is false, the contents of the files pointed to by symbolic
  links are copied.

  The optional ignore argument is a callable. If given, it
  is called with the `src` parameter, which is the directory
  being visited by copytree(), and `names` which is the list of
  `src` contents, as returned by os.listdir():

    callable(src, names) -> ignored_names

  Since copytree() is called recursively, the callable will be
  called once for each directory that is copied. It returns a
  list of names relative to the `src` directory that should
  not be copied.

  XXX Consider this example code rather than the ultimate tool.

  """
  names = os.listdir(src)
  if ignore is not None:
    ignored_names = ignore(src, names)
  else:
    ignored_names = set()

  os.makedirs(dst)
  errors = []
  for name in names:
    if name in ignored_names:
      continue
    srcname = os.path.join(src, name)
    dstname = os.path.join(dst, name)
    try:
      if symlinks and os.path.islink(srcname):
        linkto = os.readlink(srcname)
        os.symlink(linkto, dstname)
      elif os.path.isdir(srcname):
        copytree(srcname, dstname, symlinks, ignore)
      else:
        # Will raise a SpecialFileError for unsupported file types
        copy2(srcname, dstname)
    # catch the Error from the recursive copytree so that we can
    # continue with other files
    except Error, err:
      errors.extend(err.args[0])
    except EnvironmentError, why:
      errors.append((srcname, dstname, str(why)))
  try:
    copystat(src, dst)
  except OSError, why:
    if WindowsError is not None and isinstance(why, WindowsError):
      # Copying file access times may fail on Windows
      pass
    else:
      errors.append((src, dst, str(why)))
  if errors:
    raise Error, errors

shutil.copytree
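A usage sketch for copytree together with ignore_patterns, run in a temporary directory. The folder layout (`project`, `pkg`, the `.pyc` file) is invented for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "project")
    os.makedirs(os.path.join(src, "pkg"))
    for relative in ("main.py", "main.pyc", os.path.join("pkg", "util.py")):
        with open(os.path.join(src, relative), "w") as handle:
            handle.write("# placeholder\n")

    dst = os.path.join(root, "backup")        # must not exist before the call
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc"))

    # Collect the copied files as paths relative to the destination.
    copied = sorted(
        os.path.relpath(os.path.join(folder, name), dst)
        for folder, _dirs, files in os.walk(dst)
        for name in files
    )

print(copied)
```

The callable returned by `ignore_patterns` is invoked once per visited directory, which is why the pattern also applies inside `pkg`.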

⑧ shutil.rmtree(path[, ignore_errors[, onerror]]): recursively delete a directory tree

def rmtree(path, ignore_errors=False, onerror=None):
  """Recursively delete a directory tree.

  If ignore_errors is set, errors are ignored; otherwise, if onerror
  is set, it is called to handle the error with arguments (func,
  path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
  path is the argument to that function that caused it to fail; and
  exc_info is a tuple returned by sys.exc_info(). If ignore_errors
  is false and onerror is None, an exception is raised.

  """
  if ignore_errors:
    def onerror(*args):
      pass
  elif onerror is None:
    def onerror(*args):
      raise
  try:
    if os.path.islink(path):
      # symlinks to directories are forbidden, see bug #1669
      raise OSError("Cannot call rmtree on a symbolic link")
  except OSError:
    onerror(os.path.islink, path, sys.exc_info())
    # can't continue even if onerror hook returns
    return
  names = []
  try:
    names = os.listdir(path)
  except os.error, err:
    onerror(os.listdir, path, sys.exc_info())
  for name in names:
    fullname = os.path.join(path, name)
    try:
      mode = os.lstat(fullname).st_mode
    except os.error:
      mode = 0
    if stat.S_ISDIR(mode):
      rmtree(fullname, ignore_errors, onerror)
    else:
      try:
        os.remove(fullname)
      except os.error, err:
        onerror(os.remove, fullname, sys.exc_info())
  try:
    os.rmdir(path)
  except os.error:
    onerror(os.rmdir, path, sys.exc_info())

shutil.rmtree
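A minimal sketch of rmtree on a throwaway directory created with `tempfile.mkdtemp`. The nested names `a`, `b`, and `file.txt` are sample values.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
with open(os.path.join(root, "a", "b", "file.txt"), "w") as handle:
    handle.write("x")

shutil.rmtree(root)                       # removes root and everything below it
removed = not os.path.exists(root)

# The directory is already gone; with ignore_errors=True the failure is swallowed.
shutil.rmtree(root, ignore_errors=True)
print(removed)
```

Unlike `os.rmdir`, which only removes empty directories, rmtree deletes the contents as well, so the path passed to it deserves a second look.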

⑨ shutil.move(src, dst): recursively move a file or directory

def move(src, dst):
  """Recursively move a file or directory to another location. This is
  similar to the Unix "mv" command.

  If the destination is a directory or a symlink to a directory, the source
  is moved inside the directory. The destination path must not already
  exist.

  If the destination already exists but is not a directory, it may be
  overwritten depending on os.rename() semantics.

  If the destination is on our current filesystem, then rename() is used.
  Otherwise, src is copied to the destination and then removed.
  A lot more could be done here... A look at a mv.c shows a lot of
  the issues this implementation glosses over.

  """
  real_dst = dst
  if os.path.isdir(dst):
    if _samefile(src, dst):
      # We might be on a case insensitive filesystem,
      # perform the rename anyway.
      os.rename(src, dst)
      return

    real_dst = os.path.join(dst, _basename(src))
    if os.path.exists(real_dst):
      raise Error, "Destination path '%s' already exists" % real_dst
  try:
    os.rename(src, real_dst)
  except OSError:
    if os.path.isdir(src):
      if _destinsrc(src, dst):
        raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
      copytree(src, real_dst, symlinks=True)
      rmtree(src)
    else:
      copy2(src, real_dst)
      os.unlink(src)

shutil.move
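A usage sketch for move, assuming Python 3.3 or later, where the function returns the final destination path. The names `report.txt` and `archive` are illustrative.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "report.txt")
    with open(src, "w") as handle:
        handle.write("q3")

    archive_dir = os.path.join(root, "archive")
    os.mkdir(archive_dir)

    # The destination is an existing directory, so the file is moved inside it.
    new_path = shutil.move(src, archive_dir)

    moved_name = os.path.basename(new_path)
    moved_ok = os.path.exists(new_path) and not os.path.exists(src)

print(moved_name, moved_ok)
```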

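⑩ shutil.make_archive(base_name, format,...): create an archive (for example zip or tar) and return its file path
base_name: the name of the archive file, optionally with a path. A bare file name is saved to the current directory; otherwise the archive is saved to the given path,
        e.g. www                        => saved to the current directory
        e.g. /Users/wupeiqi/www => saved to /Users/wupeiqi/
format: the archive type: "zip", "tar", "bztar", or "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: the owner, defaults to the current user
group: the group, defaults to the current group
logger: used for logging, usually a logging.Logger object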

def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
         dry_run=0, owner=None, group=None, logger=None):
  """Create an archive file (eg. zip or tar).

  'base_name' is the name of the file to create, minus any format-specific
  extension; 'format' is the archive format: one of "zip", "tar", "bztar"
  or "gztar".

  'root_dir' is a directory that will be the root directory of the
  archive; ie. we typically chdir into 'root_dir' before creating the
  archive. 'base_dir' is the directory where we start archiving from;
  ie. 'base_dir' will be the common prefix of all files and
  directories in the archive. 'root_dir' and 'base_dir' both default
  to the current directory. Returns the name of the archive file.

  'owner' and 'group' are used when creating a tar archive. By default,
  uses the current owner and group.
  """
  save_cwd = os.getcwd()
  if root_dir is not None:
    if logger is not None:
      logger.debug("changing into '%s'", root_dir)
    base_name = os.path.abspath(base_name)
    if not dry_run:
      os.chdir(root_dir)

  if base_dir is None:
    base_dir = os.curdir

  kwargs = {'dry_run': dry_run, 'logger': logger}

  try:
    format_info = _ARCHIVE_FORMATS[format]
  except KeyError:
    raise ValueError, "unknown archive format '%s'" % format

  func = format_info[0]
  for arg, val in format_info[1]:
    kwargs[arg] = val

  if format != 'zip':
    kwargs['owner'] = owner
    kwargs['group'] = group

  try:
    filename = func(base_name, base_dir, **kwargs)
  finally:
    if root_dir is not None:
      if logger is not None:
        logger.debug("changing back to '%s'", save_cwd)
      os.chdir(save_cwd)

  return filename

Source code
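A usage sketch for make_archive in a temporary directory. The folder `site` and the file `index.html` are made-up sample content. The extension is appended to base_name automatically.

```python
import os
import shutil
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as root:
    site = os.path.join(root, "site")
    os.mkdir(site)
    with open(os.path.join(site, "index.html"), "w") as handle:
        handle.write("<h1>hi</h1>")

    base_name = os.path.join(root, "www")     # no extension here
    archive_path = shutil.make_archive(base_name, "zip", root_dir=site)

    archive_name = os.path.basename(archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        members = archive.namelist()

print(archive_name, members)
```

Because root_dir becomes the root of the archive, member names are stored relative to `site` rather than as absolute paths.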


shutil handles archives by calling the zipfile and tarfile modules (the ZipFile and TarFile classes). In detail:

import zipfile

# create the archive (the default ZIP_STORED mode stores files without compressing them)
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

zipfile: creating and extracting archives
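As a variation on the listing above, the sketch below uses ZipFile as a context manager, so close() is called automatically, and passes ZIP_DEFLATED so the data is really compressed. It assumes the zlib module is available. The file contents are sample data.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as root:
    log_path = os.path.join(root, "a.log")
    with open(log_path, "w") as handle:
        handle.write("line\n" * 1000)         # highly repetitive, compresses well

    zip_path = os.path.join(root, "laxi.zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(log_path, arcname="a.log")

    with zipfile.ZipFile(zip_path) as archive:
        info = archive.getinfo("a.log")
        original_size, stored_size = info.file_size, info.compress_size
        archive.extractall(os.path.join(root, "out"))

    extracted = os.path.exists(os.path.join(root, "out", "a.log"))

print(original_size, stored_size, extracted)
```

Passing arcname stores the member under a short name instead of the full temporary path.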

import tarfile

# create the archive ('w' writes a plain tar without compression)
tar = tarfile.open('your.tar','w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# extract
tar = tarfile.open('your.tar','r')
tar.extractall() # a target directory can be given with the path argument
tar.close()

tarfile: creating and extracting archives
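A sketch of the gzip-compressed variant, using mode 'w:gz' and reading a member back with extractfile instead of unpacking to disk. The file name `bbs2.txt` and its contents are placeholders.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "bbs2.txt")
    with open(src, "w") as handle:
        handle.write("payload")

    tar_path = os.path.join(root, "your.tar.gz")
    with tarfile.open(tar_path, "w:gz") as tar:        # gzip-compressed tar
        tar.add(src, arcname="bbs2.txt")

    with tarfile.open(tar_path, "r:gz") as tar:
        names = tar.getnames()
        content = tar.extractfile("bbs2.txt").read()   # bytes of the member

print(names, content)
```

Reading members through extractfile is handy for inspecting an archive from an untrusted source without writing anything to the file system.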

class ZipFile(object):
  """ Class with methods to open, read, write, close, list zip files.

  z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)

  file: Either the path to the file, or a file-like object.
     If it is a path, the file will be opened and closed by ZipFile.
  mode: The mode can be either read "r", write "w" or append "a".
  compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
  allowZip64: if True ZipFile will create files with ZIP64 extensions when
        needed, otherwise it will raise an exception when this would
        be necessary.

  """

  fp = None          # Set here since __del__ checks it

  def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):
    """Open the ZIP file with mode read "r", write "w" or append "a"."""
    if mode not in ("r", "w", "a"):
      raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')

    if compression == ZIP_STORED:
      pass
    elif compression == ZIP_DEFLATED:
      if not zlib:
        raise RuntimeError,\
           "Compression requires the (missing) zlib module"
    else:
      raise RuntimeError, "That compression method is not supported"

    self._allowZip64 = allowZip64
    self._didModify = False
    self.debug = 0 # Level of printing: 0 through 3
    self.NameToInfo = {}  # Find file info given name
    self.filelist = []   # List of ZipInfo instances for archive
    self.compression = compression # Method of compression
    self.mode = key = mode.replace('b', '')[0]
    self.pwd = None
    self._comment = ''

    # Check if we were passed a file-like object
    if isinstance(file, basestring):
      self._filePassed = 0
      self.filename = file
      modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}
      try:
        self.fp = open(file, modeDict[mode])
      except IOError:
        if mode == 'a':
          mode = key = 'w'
          self.fp = open(file, modeDict[mode])
        else:
          raise
    else:
      self._filePassed = 1
      self.fp = file
      self.filename = getattr(file, 'name', None)

    try:
      if key == 'r':
        self._RealGetContents()
      elif key == 'w':
        # set the modified flag so central directory gets written
        # even if no files are added to the archive
        self._didModify = True
      elif key == 'a':
        try:
          # See if file is a zip file
          self._RealGetContents()
          # seek to start of directory and overwrite
          self.fp.seek(self.start_dir, 0)
        except BadZipfile:
          # file is not a zip file, just append
          self.fp.seek(0, 2)

          # set the modified flag so central directory gets written
          # even if no files are added to the archive
          self._didModify = True
      else:
        raise RuntimeError('Mode must be "r", "w" or "a"')
    except:
      fp = self.fp
      self.fp = None
      if not self._filePassed:
        fp.close()
      raise

  def __enter__(self):
    return self

  def __exit__(self, type, value, traceback):
    self.close()

  def _RealGetContents(self):
    """Read in the table of contents for the ZIP file."""
    fp = self.fp
    try:
      endrec = _EndRecData(fp)
    except IOError:
      raise BadZipfile("File is not a zip file")
    if not endrec:
      raise BadZipfile, "File is not a zip file"
    if self.debug > 1:
      print endrec
    size_cd = endrec[_ECD_SIZE]       # bytes in central directory
    offset_cd = endrec[_ECD_OFFSET]     # offset of central directory
    self._comment = endrec[_ECD_COMMENT]  # archive comment

    # "concat" is zero, unless zip was concatenated to another file
    concat = endrec[_ECD_LOCATION] - size_cd - offset_cd
    if endrec[_ECD_SIGNATURE] == stringEndArchive64:
      # If Zip64 extension structures are present, account for them
      concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)

    if self.debug > 2:
      inferred = concat + offset_cd
      print "given, inferred, offset", offset_cd, inferred, concat
    # self.start_dir: Position of start of central directory
    self.start_dir = offset_cd + concat
    fp.seek(self.start_dir, 0)
    data = fp.read(size_cd)
    fp = cStringIO.StringIO(data)
    total = 0
    while total < size_cd:
      centdir = fp.read(sizeCentralDir)
      if len(centdir) != sizeCentralDir:
        raise BadZipfile("Truncated central directory")
      centdir = struct.unpack(structCentralDir, centdir)
      if centdir[_CD_SIGNATURE] != stringCentralDir:
        raise BadZipfile("Bad magic number for central directory")
      if self.debug > 2:
        print centdir
      filename = fp.read(centdir[_CD_FILENAME_LENGTH])
      # Create ZipInfo instance to store file information
      x = ZipInfo(filename)
      x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])
      x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])
      x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]
      (x.create_version, x.create_system, x.extract_version, x.reserved,
        x.flag_bits, x.compress_type, t, d,
        x.CRC, x.compress_size, x.file_size) = centdir[1:12]
      x.volume, x.internal_attr, x.external_attr = centdir[15:18]
      # Convert date/time code to (year, month, day, hour, min, sec)
      x._raw_time = t
      x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,
                   t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )

      x._decodeExtra()
      x.header_offset = x.header_offset + concat
      x.filename = x._decodeFilename()
      self.filelist.append(x)
      self.NameToInfo[x.filename] = x

      # update total bytes read from central directory
      total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]
           + centdir[_CD_EXTRA_FIELD_LENGTH]
           + centdir[_CD_COMMENT_LENGTH])

      if self.debug > 2:
        print "total", total


  def namelist(self):
    """Return a list of file names in the archive."""
    l = []
    for data in self.filelist:
      l.append(data.filename)
    return l

  def infolist(self):
    """Return a list of class ZipInfo instances for files in the
    archive."""
    return self.filelist

  def printdir(self):
    """Print a table of contents for the zip file."""
    print "%-46s %19s %12s" % ("File Name", "Modified  ", "Size")
    for zinfo in self.filelist:
      date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]
      print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)

  def testzip(self):
    """Read all the files and check the CRC."""
    chunk_size = 2 ** 20
    for zinfo in self.filelist:
      try:
        # Read by chunks, to avoid an OverflowError or a
        # MemoryError with very large embedded files.
        with self.open(zinfo.filename, "r") as f:
          while f.read(chunk_size):   # Check CRC-32
            pass
      except BadZipfile:
        return zinfo.filename

  def getinfo(self, name):
    """Return the instance of ZipInfo given 'name'."""
    info = self.NameToInfo.get(name)
    if info is None:
      raise KeyError(
        'There is no item named %r in the archive' % name)

    return info

  def setpassword(self, pwd):
    """Set default password for encrypted files."""
    self.pwd = pwd

  @property
  def comment(self):
    """The comment text associated with the ZIP file."""
    return self._comment

  @comment.setter
  def comment(self, comment):
    # check for valid comment length
    if len(comment) > ZIP_MAX_COMMENT:
      import warnings
      warnings.warn('Archive comment is too long; truncating to %d bytes'
             % ZIP_MAX_COMMENT, stacklevel=2)
      comment = comment[:ZIP_MAX_COMMENT]
    self._comment = comment
    self._didModify = True

  def read(self, name, pwd=None):
    """Return file bytes (as a string) for name."""
    return self.open(name, "r", pwd).read()

  def open(self, name, mode="r", pwd=None):
    """Return file-like object for 'name'."""
    if mode not in ("r", "U", "rU"):
      raise RuntimeError, 'open() requires mode "r", "U", or "rU"'
    if not self.fp:
      raise RuntimeError, \
         "Attempt to read ZIP archive that was already closed"

    # Only open a new file for instances where we were not
    # given a file object in the constructor
    if self._filePassed:
      zef_file = self.fp
      should_close = False
    else:
      zef_file = open(self.filename, 'rb')
      should_close = True

    try:
      # Make sure we have an info object
      if isinstance(name, ZipInfo):
        # 'name' is already an info object
        zinfo = name
      else:
        # Get info object for name
        zinfo = self.getinfo(name)

      zef_file.seek(zinfo.header_offset, 0)

      # Skip the file header:
      fheader = zef_file.read(sizeFileHeader)
      if len(fheader) != sizeFileHeader:
        raise BadZipfile("Truncated file header")
      fheader = struct.unpack(structFileHeader, fheader)
      if fheader[_FH_SIGNATURE] != stringFileHeader:
        raise BadZipfile("Bad magic number for file header")

      fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
      if fheader[_FH_EXTRA_FIELD_LENGTH]:
        zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

      if fname != zinfo.orig_filename:
        raise BadZipfile, \
            'File name in directory "%s" and header "%s" differ.' % (
              zinfo.orig_filename, fname)

      # check for encrypted flag & handle password
      is_encrypted = zinfo.flag_bits & 0x1
      zd = None
      if is_encrypted:
        if not pwd:
          pwd = self.pwd
        if not pwd:
          raise RuntimeError, "File %s is encrypted, " \
            "password required for extraction" % name

        zd = _ZipDecrypter(pwd)
        # The first 12 bytes in the cypher stream is an encryption header
        # used to strengthen the algorithm. The first 11 bytes are
        # completely random, while the 12th contains the MSB of the CRC,
        # or the MSB of the file time depending on the header type
        # and is used to check the correctness of the password.
        bytes = zef_file.read(12)
        h = map(zd, bytes[0:12])
        if zinfo.flag_bits & 0x8:
          # compare against the file type from extended local headers
          check_byte = (zinfo._raw_time >> 8) & 0xff
        else:
          # compare against the CRC otherwise
          check_byte = (zinfo.CRC >> 24) & 0xff
        if ord(h[11]) != check_byte:
          raise RuntimeError("Bad password for file", name)

      return ZipExtFile(zef_file, mode, zinfo, zd,
          close_fileobj=should_close)
    except:
      if should_close:
        zef_file.close()
      raise

  def extract(self, member, path=None, pwd=None):
    """Extract a member from the archive to the current working directory,
      using its full name. Its file information is extracted as accurately
      as possible. `member' may be a filename or a ZipInfo object. You can
      specify a different directory using `path'.
    """
    if not isinstance(member, ZipInfo):
      member = self.getinfo(member)

    if path is None:
      path = os.getcwd()

    return self._extract_member(member, path, pwd)

  def extractall(self, path=None, members=None, pwd=None):
    """Extract all members from the archive to the current working
      directory. `path' specifies a different directory to extract to.
      `members' is optional and must be a subset of the list returned
      by namelist().
    """
    if members is None:
      members = self.namelist()

    for zipinfo in members:
      self.extract(zipinfo, path, pwd)

  def _extract_member(self, member, targetpath, pwd):
    """Extract the ZipInfo object 'member' to a physical
      file on the path targetpath.
    """
    # build the destination pathname, replacing
    # forward slashes to platform specific separators.
    arcname = member.filename.replace('/', os.path.sep)

    if os.path.altsep:
      arcname = arcname.replace(os.path.altsep, os.path.sep)
    # interpret absolute pathname as relative, remove drive letter or
    # UNC path, redundant separators, "." and ".." components.
    arcname = os.path.splitdrive(arcname)[1]
    arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)
          if x not in ('', os.path.curdir, os.path.pardir))
    if os.path.sep == '\\':
      # filter illegal characters on Windows
      illegal = ':<>|"?*'
      if isinstance(arcname, unicode):
        table = {ord(c): ord('_') for c in illegal}
      else:
        table = string.maketrans(illegal, '_' * len(illegal))
      arcname = arcname.translate(table)
      # remove trailing dots
      arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))
      arcname = os.path.sep.join(x for x in arcname if x)

    targetpath = os.path.join(targetpath, arcname)
    targetpath = os.path.normpath(targetpath)

    # Create all upper directories if necessary.
    upperdirs = os.path.dirname(targetpath)
    if upperdirs and not os.path.exists(upperdirs):
      os.makedirs(upperdirs)

    if member.filename[-1] == '/':
      if not os.path.isdir(targetpath):
        os.mkdir(targetpath)
      return targetpath

    with self.open(member, pwd=pwd) as source, \
       file(targetpath, "wb") as target:
      shutil.copyfileobj(source, target)

    return targetpath

  def _writecheck(self, zinfo):
    """Check for errors before writing a file to the archive."""
    if zinfo.filename in self.NameToInfo:
      import warnings
      warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
    if self.mode not in ("w", "a"):
      raise RuntimeError, 'write() requires mode "w" or "a"'
    if not self.fp:
      raise RuntimeError, \
         "Attempt to write ZIP archive that was already closed"
    if zinfo.compress_type == ZIP_DEFLATED and not zlib:
      raise RuntimeError, \
         "Compression requires the (missing) zlib module"
    if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):
      raise RuntimeError, \
         "That compression method is not supported"
    if not self._allowZip64:
      requires_zip64 = None
      if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:
        requires_zip64 = "Files count"
      elif zinfo.file_size > ZIP64_LIMIT:
        requires_zip64 = "Filesize"
      elif zinfo.header_offset > ZIP64_LIMIT:
        requires_zip64 = "Zipfile size"
      if requires_zip64:
        raise LargeZipFile(requires_zip64 +
                  " would require ZIP64 extensions")

  def write(self, filename, arcname=None, compress_type=None):
    """Put the bytes from filename into the archive under the name
    arcname."""
    if not self.fp:
      raise RuntimeError(
         "Attempt to write to ZIP archive that was already closed")

    st = os.stat(filename)
    isdir = stat.S_ISDIR(st.st_mode)
    mtime = time.localtime(st.st_mtime)
    date_time = mtime[0:6]
    # Create ZipInfo instance to store file information
    if arcname is None:
      arcname = filename
    arcname = os.path.normpath(os.path.splitdrive(arcname)[1])
    while arcname[0] in (os.sep, os.altsep):
      arcname = arcname[1:]
    if isdir:
      arcname += '/'
    zinfo = ZipInfo(arcname, date_time)
    zinfo.external_attr = (st[0] & 0xFFFF) << 16L   # Unix attributes
    if compress_type is None:
      zinfo.compress_type = self.compression
    else:
      zinfo.compress_type = compress_type

    zinfo.file_size = st.st_size
    zinfo.flag_bits = 0x00
    zinfo.header_offset = self.fp.tell()  # Start of header bytes

    self._writecheck(zinfo)
    self._didModify = True

    if isdir:
      zinfo.file_size = 0
      zinfo.compress_size = 0
      zinfo.CRC = 0
      zinfo.external_attr |= 0x10 # MS-DOS directory flag
      self.filelist.append(zinfo)
      self.NameToInfo[zinfo.filename] = zinfo
      self.fp.write(zinfo.FileHeader(False))
      return

    with open(filename, "rb") as fp:
      # Must overwrite CRC and sizes with correct data later
      zinfo.CRC = CRC = 0
      zinfo.compress_size = compress_size = 0
      # Compressed size can be larger than uncompressed size
      zip64 = self._allowZip64 and \
          zinfo.file_size * 1.05 > ZIP64_LIMIT
      self.fp.write(zinfo.FileHeader(zip64))
      if zinfo.compress_type == ZIP_DEFLATED:
        cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
           zlib.DEFLATED, -15)
      else:
        cmpr = None
      file_size = 0
      while 1:
        buf = fp.read(1024 * 8)
        if not buf:
          break
        file_size = file_size + len(buf)
        CRC = crc32(buf, CRC) & 0xffffffff
        if cmpr:
          buf = cmpr.compress(buf)
          compress_size = compress_size + len(buf)
        self.fp.write(buf)
    if cmpr:
      buf = cmpr.flush()
      compress_size = compress_size + len(buf)
      self.fp.write(buf)
      zinfo.compress_size = compress_size
    else:
      zinfo.compress_size = file_size
    zinfo.CRC = CRC
    zinfo.file_size = file_size
    if not zip64 and self._allowZip64:
      if file_size > ZIP64_LIMIT:
        raise RuntimeError('File size has increased during compressing')
      if compress_size > ZIP64_LIMIT:
        raise RuntimeError('Compressed size larger than uncompressed size')
    # Seek backwards and write file header (which will now include
    # correct CRC and file sizes)
    position = self.fp.tell()    # Preserve current position in file
    self.fp.seek(zinfo.header_offset, 0)
    self.fp.write(zinfo.FileHeader(zip64))
    self.fp.seek(position, 0)
    self.filelist.append(zinfo)
    self.NameToInfo[zinfo.filename] = zinfo

  def writestr(self, zinfo_or_arcname, bytes, compress_type=None):
    """Write a file into the archive. The contents is the string
    'bytes'. 'zinfo_or_arcname' is either a ZipInfo instance or
    the name of the file in the archive."""
    if not isinstance(zinfo_or_arcname, ZipInfo):
      zinfo = ZipInfo(filename=zinfo_or_arcname,
              date_time=time.localtime(time.time())[:6])

      zinfo.compress_type = self.compression
      if zinfo.filename[-1] == '/':
        zinfo.external_attr = 0o40775 << 16  # drwxrwxr-x
        zinfo.external_attr |= 0x10      # MS-DOS directory flag
      else:
        zinfo.external_attr = 0o600 << 16   # ?rw-------
    else:
      zinfo = zinfo_or_arcname

    if not self.fp:
      raise RuntimeError(
         "Attempt to write to ZIP archive that was already closed")

    if compress_type is not None:
      zinfo.compress_type = compress_type

    zinfo.file_size = len(bytes)      # Uncompressed size
    zinfo.header_offset = self.fp.tell()  # Start of header bytes
    self._writecheck(zinfo)
    self._didModify = True
    zinfo.CRC = crc32(bytes) & 0xffffffff    # CRC-32 checksum
    if zinfo.compress_type == ZIP_DEFLATED:
      co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
         zlib.DEFLATED, -15)
      bytes = co.compress(bytes) + co.flush()
      zinfo.compress_size = len(bytes)  # Compressed size
    else:
      zinfo.compress_size = zinfo.file_size
    zip64 = zinfo.file_size > ZIP64_LIMIT or \
        zinfo.compress_size > ZIP64_LIMIT
    if zip64 and not self._allowZip64:
      raise LargeZipFile("Filesize would require ZIP64 extensions")
    self.fp.write(zinfo.FileHeader(zip64))
    self.fp.write(bytes)
    if zinfo.flag_bits & 0x08:
      # Write CRC and file sizes after the file data
      fmt = '<LQQ' if zip64 else '<LLL'
      self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,
         zinfo.file_size))
    self.fp.flush()
    self.filelist.append(zinfo)
    self.NameToInfo[zinfo.filename] = zinfo

  def __del__(self):
    """Call the "close()" method in case the user forgot."""
    self.close()

  def close(self):
    """Close the file, and for mode "w" and "a" write the ending
    records."""
    if self.fp is None:
      return

    try:
      if self.mode in ("w", "a") and self._didModify: # write ending records
        pos1 = self.fp.tell()
        for zinfo in self.filelist:     # write central directory
          dt = zinfo.date_time
          dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]
          dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)
          extra = []
          if zinfo.file_size > ZIP64_LIMIT \
              or zinfo.compress_size > ZIP64_LIMIT:
            extra.append(zinfo.file_size)
            extra.append(zinfo.compress_size)
            file_size = 0xffffffff
            compress_size = 0xffffffff
          else:
            file_size = zinfo.file_size
            compress_size = zinfo.compress_size

          if zinfo.header_offset > ZIP64_LIMIT:
            extra.append(zinfo.header_offset)
            header_offset = 0xffffffffL
          else:
            header_offset = zinfo.header_offset

          extra_data = zinfo.extra
          if extra:
            # Append a ZIP64 field to the extra's
            extra_data = struct.pack(
                '<HH' + 'Q'*len(extra),
                1, 8*len(extra), *extra) + extra_data

            extract_version = max(45, zinfo.extract_version)
            create_version = max(45, zinfo.create_version)
          else:
            extract_version = zinfo.extract_version
            create_version = zinfo.create_version

          try:
            filename, flag_bits = zinfo._encodeFilenameFlags()
            centdir = struct.pack(structCentralDir,
            stringCentralDir, create_version,
            zinfo.create_system, extract_version, zinfo.reserved,
            flag_bits, zinfo.compress_type, dostime, dosdate,
            zinfo.CRC, compress_size, file_size,
            len(filename), len(extra_data), len(zinfo.comment),
            0, zinfo.internal_attr, zinfo.external_attr,
            header_offset)
          except DeprecationWarning:
            print >>sys.stderr, (structCentralDir,
            stringCentralDir, create_version,
            zinfo.create_system, extract_version, zinfo.reserved,
            zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,
            zinfo.CRC, compress_size, file_size,
            len(zinfo.filename), len(extra_data), len(zinfo.comment),
            0, zinfo.internal_attr, zinfo.external_attr,
            header_offset)
            raise
          self.fp.write(centdir)
          self.fp.write(filename)
          self.fp.write(extra_data)
          self.fp.write(zinfo.comment)

        pos2 = self.fp.tell()
        # Write end-of-zip-archive record
        centDirCount = len(self.filelist)
        centDirSize = pos2 - pos1
        centDirOffset = pos1
        requires_zip64 = None
        if centDirCount > ZIP_FILECOUNT_LIMIT:
          requires_zip64 = "Files count"
        elif centDirOffset > ZIP64_LIMIT:
          requires_zip64 = "Central directory offset"
        elif centDirSize > ZIP64_LIMIT:
          requires_zip64 = "Central directory size"
        if requires_zip64:
          # Need to write the ZIP64 end-of-archive records
          if not self._allowZip64:
            raise LargeZipFile(requires_zip64 +
                      " would require ZIP64 extensions")
          zip64endrec = struct.pack(
              structEndArchive64, stringEndArchive64,
              44, 45, 45, 0, 0, centDirCount, centDirCount,
              centDirSize, centDirOffset)
          self.fp.write(zip64endrec)

          zip64locrec = struct.pack(
              structEndArchive64Locator,
              stringEndArchive64Locator, 0, pos2, 1)
          self.fp.write(zip64locrec)
          centDirCount = min(centDirCount, 0xFFFF)
          centDirSize = min(centDirSize, 0xFFFFFFFF)
          centDirOffset = min(centDirOffset, 0xFFFFFFFF)

        endrec = struct.pack(structEndArchive, stringEndArchive,
                  0, 0, centDirCount, centDirCount,
                  centDirSize, centDirOffset, len(self._comment))
        self.fp.write(endrec)
        self.fp.write(self._comment)
        self.fp.flush()
    finally:
      fp = self.fp
      self.fp = None
      if not self._filePassed:
        fp.close()
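
The `close()` method above writes the central directory, packing each entry's `date_time` into the 16-bit DOS date and time fields. The sketch below repeats that packing and checks it against a real archive. It is a minimal illustration written for Python 3, whereas the listing is Python 2 source. The helper names `pack_dos_datetime` and `unpack_dos_datetime` and the file name `notes.txt` are made up for this example and are not part of `zipfile`.

```python
import io
import zipfile


def pack_dos_datetime(dt):
    """Pack (year, month, day, hour, minute, second) as close() does."""
    dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]
    dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)
    return dosdate, dostime


def unpack_dos_datetime(dosdate, dostime):
    """Reverse the packing; seconds come back rounded down to an even value."""
    return (
        (dosdate >> 9) + 1980, (dosdate >> 5) & 0xF, dosdate & 0x1F,
        dostime >> 11, (dostime >> 5) & 0x3F, (dostime & 0x1F) * 2,
    )


stamp = (2017, 11, 14, 11, 58, 5)

# Write an archive into memory. Leaving the with-block calls close(),
# which is the point where the central directory is written.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    info = zipfile.ZipInfo("notes.txt", date_time=stamp)
    zf.writestr(info, b"hello zip")

# Re-open the bytes and read the entry back from the central directory.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    stored = zf.getinfo("notes.txt")
    data = zf.read("notes.txt")

print(unpack_dos_datetime(*pack_dos_datetime(stamp)))
print(stored.date_time, data)
```

Because only `second // 2` is stored, an odd seconds value such as 5 is read back as 4. This is why timestamps in a zip archive have a two-second resolution.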

ZipFile source code

TarFile source code
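
Before reading the `TarFile` listing, a short usage sketch may help show how its public methods fit together: `open()` selects the constructor from the mode string, `addfile()` writes a header plus data, `close()` pads the archive, and `getnames()`, `getmember()` and `extractfile()` read it back. The sketch is written for Python 3 and works entirely in memory. The member name `docs/readme.txt` and the payload are assumptions made for illustration.

```python
import io
import tarfile

payload = b"hello tar"

# Write an uncompressed archive into an in-memory buffer.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="docs/readme.txt")
    info.size = len(payload)          # addfile() reads exactly info.size bytes
    tar.addfile(info, io.BytesIO(payload))

# close() appended two zero blocks and padded the archive to a whole record.
archive_size = len(buf.getvalue())

# Read it back; "r:*" lets open() detect the compression by itself.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:*") as tar:
    names = tar.getnames()
    member = tar.getmember("docs/readme.txt")
    content = tar.extractfile(member).read()

print(names, member.isreg(), content)
print(archive_size % tarfile.RECORDSIZE)
```

A file object passed in through `fileobj` is left open when the `TarFile` is closed, so `buf` can be rewound and reused for reading.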
class TarFile(object):
  """The TarFile Class provides an interface to tar archives.
  """

  debug = 0          # May be set from 0 (no msgs) to 3 (all msgs)

  dereference = False     # If true, add content of linked file to the
                # tar file, else the link.

  ignore_zeros = False    # If true, skips empty or invalid blocks and
                # continues processing.

  errorlevel = 1       # If 0, fatal errors only appear in debug
                # messages (if debug >= 0). If > 0, errors
                # are passed to the caller as exceptions.

  format = DEFAULT_FORMAT   # The format to use when creating an archive.

  encoding = ENCODING     # Encoding for 8-bit character strings.

  errors = None        # Error handler for unicode conversion.

  tarinfo = TarInfo      # The default TarInfo class to use.

  fileobject = ExFileObject  # The default ExFileObject class to use.

  def __init__(self, name=None, mode="r", fileobj=None, format=None,
      tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,
      errors=None, pax_headers=None, debug=None, errorlevel=None):
    """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to
      read from an existing archive, 'a' to append data to an existing
      file or 'w' to create a new file overwriting an existing one. `mode'
      defaults to 'r'.
      If `fileobj' is given, it is used for reading or writing data. If it
      can be determined, `mode' is overridden by `fileobj's mode.
      `fileobj' is not closed, when TarFile is closed.
    """
    modes = {"r": "rb", "a": "r+b", "w": "wb"}
    if mode not in modes:
      raise ValueError("mode must be 'r', 'a' or 'w'")
    self.mode = mode
    self._mode = modes[mode]

    if not fileobj:
      if self.mode == "a" and not os.path.exists(name):
        # Create nonexistent files in append mode.
        self.mode = "w"
        self._mode = "wb"
      fileobj = bltn_open(name, self._mode)
      self._extfileobj = False
    else:
      if name is None and hasattr(fileobj, "name"):
        name = fileobj.name
      if hasattr(fileobj, "mode"):
        self._mode = fileobj.mode
      self._extfileobj = True
    self.name = os.path.abspath(name) if name else None
    self.fileobj = fileobj

    # Init attributes.
    if format is not None:
      self.format = format
    if tarinfo is not None:
      self.tarinfo = tarinfo
    if dereference is not None:
      self.dereference = dereference
    if ignore_zeros is not None:
      self.ignore_zeros = ignore_zeros
    if encoding is not None:
      self.encoding = encoding

    if errors is not None:
      self.errors = errors
    elif mode == "r":
      self.errors = "utf-8"
    else:
      self.errors = "strict"

    if pax_headers is not None and self.format == PAX_FORMAT:
      self.pax_headers = pax_headers
    else:
      self.pax_headers = {}

    if debug is not None:
      self.debug = debug
    if errorlevel is not None:
      self.errorlevel = errorlevel

    # Init datastructures.
    self.closed = False
    self.members = []    # list of members as TarInfo objects
    self._loaded = False  # flag if all members have been read
    self.offset = self.fileobj.tell()
                # current position in the archive file
    self.inodes = {}    # dictionary caching the inodes of
                # archive members already added

    try:
      if self.mode == "r":
        self.firstmember = None
        self.firstmember = self.next()

      if self.mode == "a":
        # Move to the end of the archive,
        # before the first empty block.
        while True:
          self.fileobj.seek(self.offset)
          try:
            tarinfo = self.tarinfo.fromtarfile(self)
            self.members.append(tarinfo)
          except EOFHeaderError:
            self.fileobj.seek(self.offset)
            break
          except HeaderError, e:
            raise ReadError(str(e))

      if self.mode in "aw":
        self._loaded = True

        if self.pax_headers:
          buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())
          self.fileobj.write(buf)
          self.offset += len(buf)
    except:
      if not self._extfileobj:
        self.fileobj.close()
      self.closed = True
      raise

  def _getposix(self):
    return self.format == USTAR_FORMAT
  def _setposix(self, value):
    import warnings
    warnings.warn("use the format attribute instead", DeprecationWarning,
           2)
    if value:
      self.format = USTAR_FORMAT
    else:
      self.format = GNU_FORMAT
  posix = property(_getposix, _setposix)

  #--------------------------------------------------------------------------
  # Below are the classmethods which act as alternate constructors to the
  # TarFile class. The open() method is the only one that is needed for
  # public use; it is the "super"-constructor and is able to select an
  # adequate "sub"-constructor for a particular compression using the mapping
  # from OPEN_METH.
  #
  # This concept allows one to subclass TarFile without losing the comfort of
  # the super-constructor. A sub-constructor is registered and made available
  # by adding it to the mapping in OPEN_METH.

  @classmethod
  def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):
    """Open a tar archive for reading, writing or appending. Return
      an appropriate TarFile class.

      mode:
      'r' or 'r:*' open for reading with transparent compression
      'r:'     open for reading exclusively uncompressed
      'r:gz'    open for reading with gzip compression
      'r:bz2'   open for reading with bzip2 compression
      'a' or 'a:' open for appending, creating the file if necessary
      'w' or 'w:' open for writing without compression
      'w:gz'    open for writing with gzip compression
      'w:bz2'   open for writing with bzip2 compression

      'r|*'    open a stream of tar blocks with transparent compression
      'r|'     open an uncompressed stream of tar blocks for reading
      'r|gz'    open a gzip compressed stream of tar blocks
      'r|bz2'   open a bzip2 compressed stream of tar blocks
      'w|'     open an uncompressed stream for writing
      'w|gz'    open a gzip compressed stream for writing
      'w|bz2'   open a bzip2 compressed stream for writing
    """

    if not name and not fileobj:
      raise ValueError("nothing to open")

    if mode in ("r", "r:*"):
      # Find out which *open() is appropriate for opening the file.
      for comptype in cls.OPEN_METH:
        func = getattr(cls, cls.OPEN_METH[comptype])
        if fileobj is not None:
          saved_pos = fileobj.tell()
        try:
          return func(name, "r", fileobj, **kwargs)
        except (ReadError, CompressionError), e:
          if fileobj is not None:
            fileobj.seek(saved_pos)
          continue
      raise ReadError("file could not be opened successfully")

    elif ":" in mode:
      filemode, comptype = mode.split(":", 1)
      filemode = filemode or "r"
      comptype = comptype or "tar"

      # Select the *open() function according to
      # given compression.
      if comptype in cls.OPEN_METH:
        func = getattr(cls, cls.OPEN_METH[comptype])
      else:
        raise CompressionError("unknown compression type %r" % comptype)
      return func(name, filemode, fileobj, **kwargs)

    elif "|" in mode:
      filemode, comptype = mode.split("|", 1)
      filemode = filemode or "r"
      comptype = comptype or "tar"

      if filemode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")

      stream = _Stream(name, filemode, comptype, fileobj, bufsize)
      try:
        t = cls(name, filemode, stream, **kwargs)
      except:
        stream.close()
        raise
      t._extfileobj = False
      return t

    elif mode in ("a", "w"):
      return cls.taropen(name, mode, fileobj, **kwargs)

    raise ValueError("undiscernible mode")

  @classmethod
  def taropen(cls, name, mode="r", fileobj=None, **kwargs):
    """Open uncompressed tar archive name for reading or writing.
    """
    if mode not in ("r", "a", "w"):
      raise ValueError("mode must be 'r', 'a' or 'w'")
    return cls(name, mode, fileobj, **kwargs)

  @classmethod
  def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
    """Open gzip compressed tar archive name for reading or writing.
      Appending is not allowed.
    """
    if mode not in ("r", "w"):
      raise ValueError("mode must be 'r' or 'w'")

    try:
      import gzip
      gzip.GzipFile
    except (ImportError, AttributeError):
      raise CompressionError("gzip module is not available")

    try:
      fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
    except OSError:
      if fileobj is not None and mode == 'r':
        raise ReadError("not a gzip file")
      raise

    try:
      t = cls.taropen(name, mode, fileobj, **kwargs)
    except IOError:
      fileobj.close()
      if mode == 'r':
        raise ReadError("not a gzip file")
      raise
    except:
      fileobj.close()
      raise
    t._extfileobj = False
    return t

  @classmethod
  def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
    """Open bzip2 compressed tar archive name for reading or writing.
      Appending is not allowed.
    """
    if mode not in ("r", "w"):
      raise ValueError("mode must be 'r' or 'w'.")

    try:
      import bz2
    except ImportError:
      raise CompressionError("bz2 module is not available")

    if fileobj is not None:
      fileobj = _BZ2Proxy(fileobj, mode)
    else:
      fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)

    try:
      t = cls.taropen(name, mode, fileobj, **kwargs)
    except (IOError, EOFError):
      fileobj.close()
      if mode == 'r':
        raise ReadError("not a bzip2 file")
      raise
    except:
      fileobj.close()
      raise
    t._extfileobj = False
    return t

  # All *open() methods are registered here.
  OPEN_METH = {
    "tar": "taropen",  # uncompressed tar
    "gz": "gzopen",  # gzip compressed tar
    "bz2": "bz2open"  # bzip2 compressed tar
  }

  #--------------------------------------------------------------------------
  # The public methods which TarFile provides:

  def close(self):
    """Close the TarFile. In write-mode, two finishing zero blocks are
      appended to the archive.
    """
    if self.closed:
      return

    if self.mode in "aw":
      self.fileobj.write(NUL * (BLOCKSIZE * 2))
      self.offset += (BLOCKSIZE * 2)
      # fill up the end with zero-blocks
      # (like option -b20 for tar does)
      blocks, remainder = divmod(self.offset, RECORDSIZE)
      if remainder > 0:
        self.fileobj.write(NUL * (RECORDSIZE - remainder))

    if not self._extfileobj:
      self.fileobj.close()
    self.closed = True

  def getmember(self, name):
    """Return a TarInfo object for member `name'. If `name' can not be
      found in the archive, KeyError is raised. If a member occurs more
      than once in the archive, its last occurrence is assumed to be the
      most up-to-date version.
    """
    tarinfo = self._getmember(name)
    if tarinfo is None:
      raise KeyError("filename %r not found" % name)
    return tarinfo

  def getmembers(self):
    """Return the members of the archive as a list of TarInfo objects. The
      list has the same order as the members in the archive.
    """
    self._check()
    if not self._loaded:  # if we want to obtain a list of
      self._load()    # all members, we first have to
                # scan the whole archive.
    return self.members

  def getnames(self):
    """Return the members of the archive as a list of their names. It has
      the same order as the list returned by getmembers().
    """
    return [tarinfo.name for tarinfo in self.getmembers()]

  def gettarinfo(self, name=None, arcname=None, fileobj=None):
    """Create a TarInfo object for either the file `name' or the file
      object `fileobj' (using os.fstat on its file descriptor). You can
      modify some of the TarInfo's attributes before you add it using
      addfile(). If given, `arcname' specifies an alternative name for the
      file in the archive.
    """
    self._check("aw")

    # When fileobj is given, replace name by
    # fileobj's real name.
    if fileobj is not None:
      name = fileobj.name

    # Building the name of the member in the archive.
    # Backward slashes are converted to forward slashes,
    # Absolute paths are turned to relative paths.
    if arcname is None:
      arcname = name
    drv, arcname = os.path.splitdrive(arcname)
    arcname = arcname.replace(os.sep, "/")
    arcname = arcname.lstrip("/")

    # Now, fill the TarInfo object with
    # information specific for the file.
    tarinfo = self.tarinfo()
    tarinfo.tarfile = self

    # Use os.stat or os.lstat, depending on platform
    # and if symlinks shall be resolved.
    if fileobj is None:
      if hasattr(os, "lstat") and not self.dereference:
        statres = os.lstat(name)
      else:
        statres = os.stat(name)
    else:
      statres = os.fstat(fileobj.fileno())
    linkname = ""

    stmd = statres.st_mode
    if stat.S_ISREG(stmd):
      inode = (statres.st_ino, statres.st_dev)
      if not self.dereference and statres.st_nlink > 1 and \
          inode in self.inodes and arcname != self.inodes[inode]:
        # Is it a hardlink to an already
        # archived file?
        type = LNKTYPE
        linkname = self.inodes[inode]
      else:
        # The inode is added only if it's valid.
        # For win32 it is always 0.
        type = REGTYPE
        if inode[0]:
          self.inodes[inode] = arcname
    elif stat.S_ISDIR(stmd):
      type = DIRTYPE
    elif stat.S_ISFIFO(stmd):
      type = FIFOTYPE
    elif stat.S_ISLNK(stmd):
      type = SYMTYPE
      linkname = os.readlink(name)
    elif stat.S_ISCHR(stmd):
      type = CHRTYPE
    elif stat.S_ISBLK(stmd):
      type = BLKTYPE
    else:
      return None

    # Fill the TarInfo object with all
    # information we can get.
    tarinfo.name = arcname
    tarinfo.mode = stmd
    tarinfo.uid = statres.st_uid
    tarinfo.gid = statres.st_gid
    if type == REGTYPE:
      tarinfo.size = statres.st_size
    else:
      tarinfo.size = 0L
    tarinfo.mtime = statres.st_mtime
    tarinfo.type = type
    tarinfo.linkname = linkname
    if pwd:
      try:
        tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]
      except KeyError:
        pass
    if grp:
      try:
        tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]
      except KeyError:
        pass

    if type in (CHRTYPE, BLKTYPE):
      if hasattr(os, "major") and hasattr(os, "minor"):
        tarinfo.devmajor = os.major(statres.st_rdev)
        tarinfo.devminor = os.minor(statres.st_rdev)
    return tarinfo

  def list(self, verbose=True):
    """Print a table of contents to sys.stdout. If `verbose' is False, only
      the names of the members are printed. If it is True, an `ls -l'-like
      output is produced.
    """
    self._check()

    for tarinfo in self:
      if verbose:
        print filemode(tarinfo.mode),
        print "%s/%s" % (tarinfo.uname or tarinfo.uid,
                 tarinfo.gname or tarinfo.gid),
        if tarinfo.ischr() or tarinfo.isblk():
          print "%10s" % ("%d,%d" \
                  % (tarinfo.devmajor, tarinfo.devminor)),
        else:
          print "%10d" % tarinfo.size,
        print "%d-%02d-%02d %02d:%02d:%02d" \
           % time.localtime(tarinfo.mtime)[:6],

      print tarinfo.name + ("/" if tarinfo.isdir() else ""),

      if verbose:
        if tarinfo.issym():
          print "->", tarinfo.linkname,
        if tarinfo.islnk():
          print "link to", tarinfo.linkname,
      print

  def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):
    """Add the file `name' to the archive. `name' may be any type of file
      (directory, fifo, symbolic link, etc.). If given, `arcname'
      specifies an alternative name for the file in the archive.
      Directories are added recursively by default. This can be avoided by
      setting `recursive' to False. `exclude' is a function that should
      return True for each filename to be excluded. `filter' is a function
      that expects a TarInfo object argument and returns the changed
      TarInfo object, if it returns None the TarInfo object will be
      excluded from the archive.
    """
    self._check("aw")

    if arcname is None:
      arcname = name

    # Exclude pathnames.
    if exclude is not None:
      import warnings
      warnings.warn("use the filter argument instead",
          DeprecationWarning, 2)
      if exclude(name):
        self._dbg(2, "tarfile: Excluded %r" % name)
        return

    # Skip if somebody tries to archive the archive...
    if self.name is not None and os.path.abspath(name) == self.name:
      self._dbg(2, "tarfile: Skipped %r" % name)
      return

    self._dbg(1, name)

    # Create a TarInfo object from the file.
    tarinfo = self.gettarinfo(name, arcname)

    if tarinfo is None:
      self._dbg(1, "tarfile: Unsupported type %r" % name)
      return

    # Change or exclude the TarInfo object.
    if filter is not None:
      tarinfo = filter(tarinfo)
      if tarinfo is None:
        self._dbg(2, "tarfile: Excluded %r" % name)
        return

    # Append the tar header and data to the archive.
    if tarinfo.isreg():
      with bltn_open(name, "rb") as f:
        self.addfile(tarinfo, f)

    elif tarinfo.isdir():
      self.addfile(tarinfo)
      if recursive:
        for f in os.listdir(name):
          self.add(os.path.join(name, f), os.path.join(arcname, f),
              recursive, exclude, filter)

    else:
      self.addfile(tarinfo)

  def addfile(self, tarinfo, fileobj=None):
    """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is
      given, tarinfo.size bytes are read from it and added to the archive.
      You can create TarInfo objects using gettarinfo().
      On Windows platforms, `fileobj' should always be opened with mode
      'rb' to avoid irritation about the file size.
    """
    self._check("aw")

    tarinfo = copy.copy(tarinfo)

    buf = tarinfo.tobuf(self.format, self.encoding, self.errors)
    self.fileobj.write(buf)
    self.offset += len(buf)

    # If there's data to follow, append it.
    if fileobj is not None:
      copyfileobj(fileobj, self.fileobj, tarinfo.size)
      blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)
      if remainder > 0:
        self.fileobj.write(NUL * (BLOCKSIZE - remainder))
        blocks += 1
      self.offset += blocks * BLOCKSIZE

    self.members.append(tarinfo)

  def extractall(self, path=".", members=None):
    """Extract all members from the archive to the current working
      directory and set owner, modification time and permissions on
      directories afterwards. `path' specifies a different directory
      to extract to. `members' is optional and must be a subset of the
      list returned by getmembers().
    """
    directories = []

    if members is None:
      members = self

    for tarinfo in members:
      if tarinfo.isdir():
        # Extract directories with a safe mode.
        directories.append(tarinfo)
        tarinfo = copy.copy(tarinfo)
        tarinfo.mode = 0700
      self.extract(tarinfo, path)

    # Reverse sort directories.
    directories.sort(key=operator.attrgetter('name'))
    directories.reverse()

    # Set correct owner, mtime and filemode on directories.
    for tarinfo in directories:
      dirpath = os.path.join(path, tarinfo.name)
      try:
        self.chown(tarinfo, dirpath)
        self.utime(tarinfo, dirpath)
        self.chmod(tarinfo, dirpath)
      except ExtractError, e:
        if self.errorlevel > 1:
          raise
        else:
          self._dbg(1, "tarfile: %s" % e)

  def extract(self, member, path=""):
    """Extract a member from the archive to the current working directory,
      using its full name. Its file information is extracted as accurately
      as possible. `member' may be a filename or a TarInfo object. You can
      specify a different directory using `path'.
    """
    self._check("r")

    if isinstance(member, basestring):
      tarinfo = self.getmember(member)
    else:
      tarinfo = member

    # Prepare the link target for makelink().
    if tarinfo.islnk():
      tarinfo._link_target = os.path.join(path, tarinfo.linkname)

    try:
      self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
    except EnvironmentError, e:
      if self.errorlevel > 0:
        raise
      else:
        if e.filename is None:
          self._dbg(1, "tarfile: %s" % e.strerror)
        else:
          self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))
    except ExtractError, e:
      if self.errorlevel > 1:
        raise
      else:
        self._dbg(1, "tarfile: %s" % e)

  def extractfile(self, member):
    """Extract a member from the archive as a file object. `member' may be
      a filename or a TarInfo object. If `member' is a regular file, a
      file-like object is returned. If `member' is a link, a file-like
      object is constructed from the link's target. If `member' is none of
      the above, None is returned.
      The file-like object is read-only and provides the following
      methods: read(), readline(), readlines(), seek() and tell()
    """
    self._check("r")

    if isinstance(member, basestring):
      tarinfo = self.getmember(member)
    else:
      tarinfo = member

    if tarinfo.isreg():
      return self.fileobject(self, tarinfo)

    elif tarinfo.type not in SUPPORTED_TYPES:
      # If a member's type is unknown, it is treated as a
      # regular file.
      return self.fileobject(self, tarinfo)

    elif tarinfo.islnk() or tarinfo.issym():
      if isinstance(self.fileobj, _Stream):
        # A small but ugly workaround for the case that someone tries
        # to extract a (sym)link as a file-object from a non-seekable
        # stream of tar blocks.
        raise StreamError("cannot extract (sym)link as file object")
      else:
        # A (sym)link's file object is its target's file object.
        return self.extractfile(self._find_link_target(tarinfo))
    else:
      # If there's no data associated with the member (directory, chrdev,
      # blkdev, etc.), return None instead of a file object.
      return None

  def _extract_member(self, tarinfo, targetpath):
    """Extract the TarInfo object tarinfo to a physical
      file called targetpath.
    """
    # Fetch the TarInfo object for the given name
    # and build the destination pathname, replacing
    # forward slashes to platform specific separators.
    targetpath = targetpath.rstrip("/")
    targetpath = targetpath.replace("/", os.sep)

    # Create all upper directories.
    upperdirs = os.path.dirname(targetpath)
    if upperdirs and not os.path.exists(upperdirs):
      # Create directories that are not part of the archive with
      # default permissions.
      os.makedirs(upperdirs)

    if tarinfo.islnk() or tarinfo.issym():
      self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))
    else:
      self._dbg(1, tarinfo.name)

    if tarinfo.isreg():
      self.makefile(tarinfo, targetpath)
    elif tarinfo.isdir():
      self.makedir(tarinfo, targetpath)
    elif tarinfo.isfifo():
      self.makefifo(tarinfo, targetpath)
    elif tarinfo.ischr() or tarinfo.isblk():
      self.makedev(tarinfo, targetpath)
    elif tarinfo.islnk() or tarinfo.issym():
      self.makelink(tarinfo, targetpath)
    elif tarinfo.type not in SUPPORTED_TYPES:
      self.makeunknown(tarinfo, targetpath)
    else:
      self.makefile(tarinfo, targetpath)

    self.chown(tarinfo, targetpath)
    if not tarinfo.issym():
      self.chmod(tarinfo, targetpath)
      self.utime(tarinfo, targetpath)

  #--------------------------------------------------------------------------
  # Below are the different file methods. They are called via
  # _extract_member() when extract() is called. They can be replaced in a
  # subclass to implement other functionality.

  def makedir(self, tarinfo, targetpath):
    """Make a directory called targetpath.
    """
    try:
      # Use a safe mode for the directory, the real mode is set
      # later in _extract_member().
      os.mkdir(targetpath, 0700)
    except EnvironmentError, e:
      if e.errno != errno.EEXIST:
        raise

  def makefile(self, tarinfo, targetpath):
    """Make a file called targetpath.
    """
    source = self.extractfile(tarinfo)
    try:
      with bltn_open(targetpath, "wb") as target:
        copyfileobj(source, target)
    finally:
      source.close()

  def makeunknown(self, tarinfo, targetpath):
    """Make a file from a TarInfo object with an unknown type
      at targetpath.
    """
    self.makefile(tarinfo, targetpath)
    self._dbg(1, "tarfile: Unknown file type %r, " \
           "extracted as regular file." % tarinfo.type)

  def makefifo(self, tarinfo, targetpath):
    """Make a fifo called targetpath.
    """
    if hasattr(os, "mkfifo"):
      os.mkfifo(targetpath)
    else:
      raise ExtractError("fifo not supported by system")

  def makedev(self, tarinfo, targetpath):
    """Make a character or block device called targetpath.
    """
    if not hasattr(os, "mknod") or not hasattr(os, "makedev"):
      raise ExtractError("special devices not supported by system")

    mode = tarinfo.mode
    if tarinfo.isblk():
      mode |= stat.S_IFBLK
    else:
      mode |= stat.S_IFCHR

    os.mknod(targetpath, mode,
         os.makedev(tarinfo.devmajor, tarinfo.devminor))

  def makelink(self, tarinfo, targetpath):
    """Make a (symbolic) link called targetpath. If it cannot be created
     (platform limitation), we try to make a copy of the referenced file
     instead of a link.
    """
    if hasattr(os, "symlink") and hasattr(os, "link"):
      # For systems that support symbolic and hard links.
      if tarinfo.issym():
        if os.path.lexists(targetpath):
          os.unlink(targetpath)
        os.symlink(tarinfo.linkname, targetpath)
      else:
        # See extract().
        if os.path.exists(tarinfo._link_target):
          if os.path.lexists(targetpath):
            os.unlink(targetpath)
          os.link(tarinfo._link_target, targetpath)
        else:
          self._extract_member(self._find_link_target(tarinfo), targetpath)
    else:
      try:
        self._extract_member(self._find_link_target(tarinfo), targetpath)
      except KeyError:
        raise ExtractError("unable to resolve link inside archive")

  def chown(self, tarinfo, targetpath):
    """Set owner of targetpath according to tarinfo.
    """
    if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:
      # We have to be root to do so.
      try:
        g = grp.getgrnam(tarinfo.gname)[2]
      except KeyError:
        g = tarinfo.gid
      try:
        u = pwd.getpwnam(tarinfo.uname)[2]
      except KeyError:
        u = tarinfo.uid
      try:
        if tarinfo.issym() and hasattr(os, "lchown"):
          os.lchown(targetpath, u, g)
        else:
          if sys.platform != "os2emx":
            os.chown(targetpath, u, g)
      except EnvironmentError, e:
        raise ExtractError("could not change owner")

  def chmod(self, tarinfo, targetpath):
    """Set file permissions of targetpath according to tarinfo.
    """
    if hasattr(os, 'chmod'):
      try:
        os.chmod(targetpath, tarinfo.mode)
      except EnvironmentError, e:
        raise ExtractError("could not change mode")

  def utime(self, tarinfo, targetpath):
    """Set modification time of targetpath according to tarinfo.
    """
    if not hasattr(os, 'utime'):
      return
    try:
      os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))
    except EnvironmentError, e:
      raise ExtractError("could not change modification time")

  #--------------------------------------------------------------------------
  def next(self):
    """Return the next member of the archive as a TarInfo object, when
      TarFile is opened for reading. Return None if there is no more
      available.
    """
    self._check("ra")
    if self.firstmember is not None:
      m = self.firstmember
      self.firstmember = None
      return m

    # Read the next block.
    self.fileobj.seek(self.offset)
    tarinfo = None
    while True:
      try:
        tarinfo = self.tarinfo.fromtarfile(self)
      except EOFHeaderError, e:
        if self.ignore_zeros:
          self._dbg(2, "0x%X: %s" % (self.offset, e))
          self.offset += BLOCKSIZE
          continue
      except InvalidHeaderError as e:
        if self.ignore_zeros:
          self._dbg(2, "0x%X: %s" % (self.offset, e))
          self.offset += BLOCKSIZE
          continue
        elif self.offset == 0:
          raise ReadError(str(e))
      except EmptyHeaderError:
        if self.offset == 0:
          raise ReadError("empty file")
      except TruncatedHeaderError as e:
        if self.offset == 0:
          raise ReadError(str(e))
      except SubsequentHeaderError as e:
        raise ReadError(str(e))
      break

    if tarinfo is not None:
      self.members.append(tarinfo)
    else:
      self._loaded = True

    return tarinfo

  #--------------------------------------------------------------------------
  # Little helper methods:

  def _getmember(self, name, tarinfo=None, normalize=False):
    """Find an archive member by name from bottom to top.
      If tarinfo is given, it is used as the starting point.
    """
    # Ensure that all members have been loaded.
    members = self.getmembers()

    # Limit the member search list up to tarinfo.
    if tarinfo is not None:
      members = members[:members.index(tarinfo)]

    if normalize:
      name = os.path.normpath(name)

    for member in reversed(members):
      if normalize:
        member_name = os.path.normpath(member.name)
      else:
        member_name = member.name

      if name == member_name:
        return member

  def _load(self):
    """Read through the entire archive file and look for readable
      members.
    """
    while True:
      tarinfo = self.next()
      if tarinfo is None:
        break
    self._loaded = True

  def _check(self, mode=None):
    """Check if TarFile is still open, and if the operation's mode
      corresponds to TarFile's mode.
    """
    if self.closed:
      raise IOError("%s is closed" % self.__class__.__name__)
    if mode is not None and self.mode not in mode:
      raise IOError("bad operation for mode %r" % self.mode)

  def _find_link_target(self, tarinfo):
    """Find the target member of a symlink or hardlink member in the
      archive.
    """
    if tarinfo.issym():
      # Always search the entire archive.
      linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))
      limit = None
    else:
      # Search the archive before the link, because a hard link is
      # just a reference to an already archived file.
      linkname = tarinfo.linkname
      limit = tarinfo

    member = self._getmember(linkname, tarinfo=limit, normalize=True)
    if member is None:
      raise KeyError("linkname %r not found" % linkname)
    return member

  def __iter__(self):
    """Provide an iterator object.
    """
    if self._loaded:
      return iter(self.members)
    else:
      return TarIter(self)

  def _dbg(self, level, msg):
    """Write debugging output to sys.stderr.
    """
    if level <= self.debug:
      print(msg, file=sys.stderr)

  def __enter__(self):
    self._check()
    return self

  def __exit__(self, type, value, traceback):
    if type is None:
      self.close()
    else:
      # An exception occurred. We must not call close() because
      # it would try to write end-of-archive blocks and padding.
      if not self._extfileobj:
        self.fileobj.close()
      self.closed = True
# class TarFile

TarFile source code

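(6) json and pickle modules: a file can only hold bytes or strings, not other Python types, so these two serialization modules are used to convert objects before they are written out.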
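The notes give no example for these two modules, so here is a minimal sketch of a round trip with each. The variable names and the sample dict are made up for illustration. json produces text that other languages can read but only covers basic types; pickle produces Python-only bytes and handles most Python objects.

```python
import json
import pickle

info = {"name": "solo", "age": 18, "tags": ["a", "b"]}  # sample data (hypothetical)

# json: dict -> str -> dict
json_text = json.dumps(info)
restored_from_json = json.loads(json_text)

# pickle: dict -> bytes -> dict
pickle_bytes = pickle.dumps(info)
restored_from_pickle = pickle.loads(pickle_bytes)

print(type(json_text), type(pickle_bytes))
print(restored_from_json == info, restored_from_pickle == info)
```

To write to a file, json.dump(obj, f) needs a file opened in text mode and pickle.dump(obj, f) needs one opened in binary mode ("wb"). Only unpickle data from a source you trust, because loading a pickle can run arbitrary code.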

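(7) shelve module: shelve wraps pickle internally. It is a simple key-value store that persists in-memory data to a file, and it can persist any Python object that pickle supports (you can store data, read it back, and reassign values).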

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
import shelve
# Store data as key/value pairs
s = shelve.open("shelve_test") # open (or create) the shelf file
tuple_data = (1, 2, 3, 4)      # renamed so the built-ins tuple and list are not shadowed
list_data = ['a', 'b', 'c', 'd']
info = {"name": "lzl", "age": 18}
s["tuple"] = tuple_data # persist the tuple
s["list"] = list_data
s["info"] = info
s.close()
# Look up values by key
d = shelve.open("shelve_test") # open the shelf file
print(d["tuple"]) # read
print(d.get("list"))
print(d.get("info"))
# (1, 2, 3, 4)
# ['a', 'b', 'c', 'd']
# {'name': 'lzl', 'age': 18}
d.close()
# Loop over the keys and print them
s = shelve.open("shelve_test") # open the shelf file
for k in s.keys():       # iterate over the keys
  print(k)
# list
# tuple
# info
# (the key order depends on the underlying dbm backend and may differ)
s.close()
# Update the value stored under a key
s = shelve.open("shelve_test") # open the shelf file
s.update({"list":[22,33]})   # reassign; equivalent to s["list"] = [22,33]
print(s["list"])
#[22, 33]
s.close()

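(8) xml module: XML is a format for exchanging data between different languages or programs. It serves much the same purpose as JSON, although JSON is simpler to use. XML marks up the data structure with <> nodes.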

<?xml version="1.0"?>
<data>
  <country name="Liechtenstein">
    <rank updated="yes">2</rank>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E"/>
    <neighbor name="Switzerland" direction="W"/>
  </country>
  <country name="Singapore">
    <rank updated="yes">5</rank>
    <year>2011</year>
    <gdppc>59900</gdppc>
    <neighbor name="Malaysia" direction="N"/>
  </country>
  <country name="Panama">
    <rank updated="yes">69</rank>
    <year>2011</year>
    <gdppc>13600</gdppc>
    <neighbor name="Costa Rica" direction="W"/>
    <neighbor name="Colombia" direction="E"/>
  </country>
</data>

File (xmltest.xml)
import xml.etree.ElementTree as ET
tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)
# Walk the whole XML document
for child in root:
  print(child.tag, child.attrib)
  for i in child:
    print(i.tag,i.text)
# Iterate over the year nodes only
for node in root.iter('year'):
  print(node.tag,node.text)
# Modify
for node in root.iter('year'):
  new_year = int(node.text) + 1
  node.text = str(new_year)
  node.set("updated","yes")
tree.write("xmltest.xml")
# Delete nodes
for country in root.findall('country'):
  rank = int(country.find('rank').text)
  if rank > 50:
    root.remove(country)
tree.write('output.xml')

########### Create an XML document from scratch
import xml.etree.ElementTree as ET
new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
age = ET.SubElement(name,"age",attrib={"checked":"no"})
sex = ET.SubElement(name,"sex")
sex.text = '33'
name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
age = ET.SubElement(name2,"age")
age.text = '19'
et = ET.ElementTree(new_xml) # build the document object
et.write("test.xml", encoding="utf-8",xml_declaration=True)
ET.dump(new_xml) # print the generated XML

Operations

(9) configparser module: used to generate and modify configuration files (programs rarely modify their configuration files at run time).
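The notes do not show configparser in use, so the following is a minimal sketch of generating, reading and modifying a configuration. The section names, keys and values are invented for the example, and an in-memory buffer stands in for a real file.

```python
import configparser
import io

# Build a configuration; DEFAULT supplies fallback values for every section
config = configparser.ConfigParser()
config["DEFAULT"] = {"compression": "yes", "timeout": "45"}
config["server"] = {"host": "127.0.0.1", "port": "8080"}  # hypothetical section

# Write it out (a StringIO buffer here; normally a file opened with "w")
buffer = io.StringIO()
config.write(buffer)
text = buffer.getvalue()

# Read it back and query values
parsed = configparser.ConfigParser()
parsed.read_string(text)
port = parsed["server"].getint("port")
timeout = parsed.getint("server", "timeout")  # not set in [server], so DEFAULT is used

# Modify a value; all stored values are strings
parsed.set("server", "port", "9090")
print(parsed.sections(), port, timeout, parsed["server"]["port"])
```

With a real file, config.write(open_file) and parsed.read("example.ini") take the place of the buffer calls; the file name is up to you.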

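(10) hashlib module: used for hashing (message digests). It replaces the older md5 and sha modules, which no longer exist in 3.x, and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms.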

import hashlib
m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")
print(m.digest()) # hash as raw bytes
print(len(m.hexdigest())) # length of the hash as a hexadecimal string
'''
def digest(self, *args, **kwargs): # real signature unknown
  """ Return the digest value as a string of binary data. """
  pass
def hexdigest(self, *args, **kwargs): # real signature unknown
  """ Return the digest value as a string of hexadecimal digits. """
  pass
'''
import hashlib
# ######## md5 ########
h = hashlib.md5()
h.update(b'admin')   # update() needs bytes in Python 3; passing a str raises TypeError
print(h.hexdigest())
# ######## sha1 ########
h = hashlib.sha1()
h.update(b'admin')
print(h.hexdigest())
# ######## sha256 ########
h = hashlib.sha256()
h.update(b'admin')
print(h.hexdigest())
# ######## sha384 ########
h = hashlib.sha384()
h.update(b'admin')
print(h.hexdigest())
# ######## sha512 ########
h = hashlib.sha512()
h.update(b'admin')
print(h.hexdigest())

hashlib
import hmac
h = hmac.new(b'wueiqi', digestmod='md5')  # the key must be bytes; digestmod is required since Python 3.8
h.update(b'hellowo')
print(h.hexdigest())

(11) re module: Python's regular-expression operations, used for matching (dynamic, fuzzy matching). The key is the pattern you match against.

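'.'   Matches any single character except \n by default; with the DOTALL flag it matches any character, including newline
'^'   Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'   Matches the end of the string; with MULTILINE it also matches at the end of each line, e.g. re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()
'*'   Matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'   Matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'   Matches the preceding character 0 or 1 times
'{m}'  Matches the preceding character exactly m times
'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'   Matches the pattern on the left or on the right of |; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)' Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives abcabca456c
'[a-z]' Matches any single character from a to z
'[^()]' Matches any single character other than ( and )
'\A'  Matches only at the start of the string; re.search(r"\Aabc","alexabc") finds nothing
'\Z'  Matches only at the end of the string (similar to $, except that $ also matches just before a trailing newline)
'\d'  Matches a digit 0-9
'\D'  Matches a non-digit
'\w'  Matches [A-Za-z0-9_] (for str patterns in Python 3, other Unicode word characters as well)
'\W'  Matches anything that \w does not match
'\s'  Matches whitespace such as space, \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() gives '\t'
'(?P<name>...)' Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict()
gives {'province': '3714', 'city': '81', 'birthday': '1993'}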

Regular expression syntax

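① match: matches only at the start of the string

#match
import re
obj = re.match(r'\d+', '123uua123sf')    # match one or more digits, starting at the first character
print(obj)
#<re.Match object; span=(0, 3), match='123'>   (shown as _sre.SRE_Match before Python 3.7)
if obj:                  # runs only when something matched; obj is None otherwise
  print(obj.group())          # print the matched text
#123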

② search: scans through the string and returns the first match, which does not have to be at the start

#search
import re
obj = re.search(r'\d+', 'a123uu234asf')   # find the first run of one or more digits
print(obj)
#<re.Match object; span=(1, 4), match='123'>
if obj:                  # runs only when something matched; obj is None otherwise
  print(obj.group())          # print the matched text
#123
import re
obj = re.search(r'\([^()]+\)', 'sdds(a1fwewe2(3uusfdsf2)34as)f')   # match the innermost pair of parentheses
print(obj)
#<re.Match object; span=(13, 24), match='(3uusfdsf2)'>
if obj:                  # runs only when something matched; obj is None otherwise
  print(obj.group())          # print the matched text
#(3uusfdsf2)

③ The difference between group and groups

# group vs. groups
import re
a = "123abc456"
b = re.search("([0-9]*)([a-z]*)([0-9]*)", a)
print(b)
#<re.Match object; span=(0, 9), match='123abc456'>
print(b.group())
#123abc456
print(b.group(0))
#123abc456
print(b.group(1))
#123
print(b.group(2))
#abc
print(b.group(3))
#456
print(b.groups())
#('123', 'abc', '456')

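④ findall: the two functions above return a single match, that is, only one occurrence in the string. To get every substring that matches, use findall. It returns a list, so there is no group() to call.

#findall
import re
obj = re.findall(r'\d+', 'a123uu234asf')   # find every match
if obj:                  # runs only when the list is non-empty
  print(obj)               # the result is a list
#['123', '234']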

⑤ sub: replaces every match with the given text

#sub
import re
content = "123abc456"
new_content = re.sub(r'\d+', 'ABC', content)
print(new_content)
#ABCabcABC

⑥ split: splits the string at each match of the pattern

#split
import re
content = "1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )"
new_content = re.split(r'\*', content)    # split on *, producing a list
print(new_content)
#['1 - 2 ', ' ((60-30+1', '(9-2', '5/3+7/3', '99/4', '2998+10', '568/14))-(-4', '3)/(16-3', '2) )']
content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
new_content = re.split(r'[\+\-\*\/]+', content)
# new_content = re.split(r'\*', content, maxsplit=1)
print(new_content)
#["'1 ", ' 2 ', ' ((60', '30', '1', '(9', '2', '5', '3', '7', '3', '99', '4', '2998', '10', '568', '14))',
# '(', '4', '3)', '(16', '3', "2) )'"]
inpp = '1-2*((60-30 +(-40-5)*(9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2))'
inpp = re.sub(r'\s+','',inpp)        # strip all whitespace
print(inpp)
new_content = re.split(r'\(([\+\-\*\/]?\d+[\+\-\*\/]?\d+){1}\)', inpp, maxsplit=1)
print(new_content)
#['1-2*((60-30+', '-40-5', '*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))']

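(12) urllib module: provides a set of functions for working with URLs, letting a program issue all kinds of HTTP requests. To imitate a browser, the request has to be disguised as one: first observe the requests the browser sends, then copy its request headers. The User-Agent header is the one that identifies the browser.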

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#-Author-solo
import urllib.request
def getdata():
  url = "http://www.baidu.com"
  data = urllib.request.urlopen(url).read()
  data = data.decode("utf-8")
  print(data)
getdata()

### The file-like object returned by urlopen supports the close, read, readline and readlines methods
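The paragraph above describes disguising a request with a User-Agent header, but the example does not set one. A minimal sketch of how that could look follows; the User-Agent string is a made-up placeholder, not one captured from a real browser, and the network call is left commented out so the snippet runs offline.

```python
import urllib.request

url = "http://www.baidu.com"
# Placeholder header value (an assumption); copy the real one from your browser's request
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}

# Attach the headers to a Request object instead of passing the bare URL
req = urllib.request.Request(url, headers=headers)

# Request stores header names capitalized, so the lookup key is "User-agent"
print(req.get_header("User-agent"))

# Sending the request (needs network access):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     html = resp.read().decode("utf-8")
```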

 

12. Object-Oriented Programming

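Procedural programming: functionality is built by piling up code line after line. Hard to iterate on and maintain.
Function-based programming: a piece of functionality is wrapped in a function, which is then simply called wherever it is needed.
Object-oriented programming: uses "classes" and "objects" to build models that describe the real world. One reason to use it is that it makes programs easier to maintain and extend and can greatly improve development efficiency; another is that object-oriented code makes your logic easier for others to follow, which makes team development smoother.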
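As a rough illustration of the three styles described above, here is the same small task (the area of a rectangle) written each way. The names are invented for the example.

```python
# Procedural: statements run top to bottom; the logic is repeated wherever it is needed
width, height = 3, 4
area_procedural = width * height

# Function-based: wrap the logic in a function and call it
def rect_area(width, height):
    return width * height

area_function = rect_area(3, 4)

# Object-oriented: a class bundles the data with the behavior that uses it
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

area_object = Rectangle(3, 4).area()

print(area_procedural, area_function, area_object)
```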

# Classic class (Python 2 only; attribute lookup is depth-first: D -> B -> A)
class A():
  def __init__(self):
    print("A")
class B(A):
  pass
class C(A):
  def __init__(self):
    print("C")
class D(B,C):
  pass
obj = D()
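# A   (under Python 2; in Python 3 every class is new-style, so this also prints C)
# New-style class (lookup follows the C3 order: D -> B -> C -> A)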
class A(object):
  def __init__(self):
    print("A")
class B(A):
  pass
class C(A):
  def __init__(self):
    print("C")
class D(B,C):
  pass
obj = D()
#C

Classic vs. new-style classes
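To see why the new-style version prints C, you can inspect the method resolution order that Python computes for D. This sketch rebuilds the same four classes and reads the standard `__mro__` attribute.

```python
class A(object):
    def __init__(self):
        print("A")

class B(A):
    pass

class C(A):
    def __init__(self):
        print("C")

class D(B, C):
    pass

# __mro__ lists the classes in the order they are searched for attributes
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)
# ['D', 'B', 'C', 'A', 'object']
```

B defines no __init__, so the lookup moves on to C before it reaches A, and C.__init__ is the one that runs.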

# Property methods
class Flight(object):
  def __init__(self, name):
    self.flight_name = name
  def checking_status(self):
    print("checking flight %s status " % self.flight_name)
    return 1
  @property
  def flight_status(self):
    status = self.checking_status()
    if status == 0:
      print("flight got canceled...")
    elif status == 1:
      print("flight is arrived...")
    elif status == 2:
      print("flight has departured already...")
    else:
      print("cannot confirm the flight status...,please check later")
  @flight_status.setter # setter: triggered by assignment
  def flight_status(self, status):
    status_dic = {
    0: "canceled",
    1:"arrived",
    2: "departured"
    }
    print("\033[31;1mHas changed the flight status to \033[0m", status_dic.get(status))
  @flight_status.deleter # deleter: triggered by del
  def flight_status(self):
    print("status got removed...")
f = Flight("CA980")
f.flight_status = 0 # triggers @flight_status.setter; only the setter body runs
del f.flight_status # triggers @flight_status.deleter; only the deleter body runs

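# Each operation triggers the matching decorator. The original getter is not run; only the code under the triggered decorator executes. Note that the setter and deleter here only print a message and do not store or remove anything; put the real modify or delete logic inside those methods.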

Flight status query
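The setter and deleter in the example only print messages. A minimal sketch of a variant that actually stores, validates and clears the value might look like this; the `_status` attribute and the STATUS table are assumptions added for illustration, not part of the original class.

```python
class Flight(object):
    STATUS = {0: "canceled", 1: "arrived", 2: "departed"}  # hypothetical status table

    def __init__(self, name):
        self.flight_name = name
        self._status = None  # backing field for the property (an assumption)

    @property
    def flight_status(self):
        # Getter: translate the stored code into text
        return self.STATUS.get(self._status, "unknown")

    @flight_status.setter
    def flight_status(self, status):
        # Setter: validate before storing
        if status not in self.STATUS:
            raise ValueError("invalid status code: %r" % (status,))
        self._status = status

    @flight_status.deleter
    def flight_status(self):
        # Deleter: reset the stored value
        self._status = None

f = Flight("CA980")
before = f.flight_status   # nothing stored yet
f.flight_status = 2        # goes through the setter
after = f.flight_status
del f.flight_status        # goes through the deleter
cleared = f.flight_status
print(before, after, cleared)
```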

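Special member methods of a class:
① __doc__: the class's description (its docstring)

#__doc__
class Foo:
  """ Describes the class: a magical tool for watching videos """
  def func(self):
    pass
print(Foo.__doc__)
#  Describes the class: a magical tool for watching videos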

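② __module__ and __class__
__module__ tells you which module the object's class was defined in
__class__ tells you which class the object belongs to

# __module__ and __class__
class Foo:
  """ Describes the class: a magical tool for watching videos """
  def func(self):
    pass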
A = Foo()
print(A.__module__)
print(A.__class__)
# __main__
# <class '__main__.Foo'>

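③ __init__: the constructor, run automatically when an object is created from the class

④ __del__: the destructor, run automatically when the object is released from memory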
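Neither of these two methods has an example in the notes, so here is a small sketch that records when each one runs. The Resource class and the events list are invented for the illustration. On CPython, __del__ runs as soon as the last reference disappears; other interpreters may delay it, which is why gc.collect() is called as well.

```python
import gc

events = []  # records the order in which the special methods run

class Resource:
    def __init__(self, name):
        # Runs automatically when Resource(...) is called
        self.name = name
        events.append("init " + name)

    def __del__(self):
        # Runs when the object is about to be destroyed
        events.append("del " + self.name)

r = Resource("db")
del r          # drops the only reference to the object
gc.collect()   # asks the garbage collector to run, for interpreters that defer cleanup
print(events)
```

Because the timing of __del__ is not guaranteed by the language, cleanup that must happen at a known point is usually better placed in a context manager or an explicit close() method.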

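⑤ __call__: run when the object is followed by parentheses
Note: __init__ is triggered by creating an object, i.e. obj = ClassName(); __call__ is triggered by putting parentheses after an object, i.e. obj() or ClassName()()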

# __call__
class Foo:
  def __init__(self):
    pass
  def __call__(self, *args, **kwargs):
    print('__call__')
obj = Foo() # runs __init__
obj() # runs __call__
#__call__

⑥ __dict__: shows all members of a class or object

class Province:
  country = 'China'
  def __init__(self, name, count):
    self.name = name
    self.count = count
  def func(self, *args, **kwargs):
    print('func')
# Members of the class: class attributes and methods
print(Province.__dict__)
# Output: {'__init__': <function Province.__init__ at 0x0054D588>, '__dict__': <attribute '__dict__' of 'Province' objects>,
# '__doc__': None, 'func': <function Province.func at 0x0054D4B0>, '__weakref__': <attribute '__weakref__' of 'Province' objects>,
# 'country': 'China', '__module__': '__main__'}
obj1 = Province('HeBei', 10000)
print(obj1.__dict__)
# Members of the object obj1
# Output: {'count': 10000, 'name': 'HeBei'}

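⑦ __str__: if a class defines __str__, printing an object shows that method's return value

#__str__
class Foo:
  def __str__(self):
    return 'solo'
obj = Foo()
print(obj)       # prints the return value of __str__ instead of the default repr with the memory address
# Output: solo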

⑧ __getitem__, __setitem__, __delitem__
Used for index-style access, as with a dict. They get, set and delete an item respectively.

# __getitem__, __setitem__, __delitem__
class Foo(object):
  def __getitem__(self, key):
    print('__getitem__', key)
  def __setitem__(self, key, value):
    print('__setitem__', key, value)
  def __delitem__(self, key):
    print('__delitem__', key)
obj = Foo()
result = obj['k1'] # triggers __getitem__
obj['k2'] = 'solo' # triggers __setitem__
del obj['k1']
# __getitem__ k1
# __setitem__ k2 solo
# __delitem__ k1

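⑨ __new__ and __metaclass__

print(type(f))    # Output: <class '__main__.Foo'>, i.e. the object f was created by the class Foo
print(type(Foo))  # Output: <class 'type'>, i.e. the class object Foo was created by type

The object f is an instance of the class Foo, and the class object Foo is an instance of type. In other words, the Foo class object is created by type's constructor.

So classes are, by default, produced by instantiating type. How does type create a class, and how does a class then create objects?
Answer: a class can name the metaclass that instantiates it. In Python 2 this is the class attribute __metaclass__; in Python 3 it is the metaclass keyword in the class statement. By setting it to a subclass of type, we can watch the class being created.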

class MyType(type):
  def __init__(self, what, bases=None, dict=None):
    print("--MyType init---")
    super(MyType, self).__init__(what, bases, dict)
  def __call__(self, *args, **kwargs):
    print("--MyType call---")
    obj = self.__new__(self, *args, **kwargs)
    self.__init__(obj, *args, **kwargs)
    return obj                        # without this, Foo("solo") would evaluate to None
class Foo(object, metaclass=MyType):  # Python 3 syntax; Python 2 used __metaclass__ = MyType in the class body
  def __init__(self, name):
    self.name = name
    print("Foo ---init__")
  def __new__(cls, *args, **kwargs):
    print("Foo --new--")
    return object.__new__(cls)
# Stage 1: the interpreter runs the code top to bottom and creates the class Foo
# Stage 2: the class Foo is used to create the object obj
obj = Foo("solo")

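Reflection: looking up or modifying a program's state, attributes and methods at run time by name, using strings. There are four built-in functions for this:
① hasattr(obj, name): checks whether obj has an attribute or method called name
② getattr(obj, name): fetches the attribute or method of obj called name (for a method, you get the bound method object)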

class Foo(object):
  def __init__(self,name):
    self.name = name
  def func(self):
    print("func",self.name)
obj = Foo("alex")
attr = "func"            # named attr rather than str, so the built-in str is not shadowed
print(hasattr(obj,attr))  # does obj have an attribute called "func"?
if hasattr(obj,attr):
  getattr(obj,attr)()   # same as obj.func()
# True
# func alex

③ setattr(obj, 'y', z): same as obj.y = z; adds or sets an attribute by name

def bulk(self):
  print("%s is yelling"%self.name)
class Foo(object):
  def __init__(self,name):
    self.name = name
  def func(self):
    print("func",self.name)
obj = Foo("alex")
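attr = "talk"            # named attr rather than str, so the built-in str is not shadowed
print(hasattr(obj,attr))  # does obj have an attribute called "talk"?
if hasattr(obj,attr):
  getattr(obj,attr)()   # same as obj.talk()
else:
  setattr(obj,attr,bulk)  # same as obj.talk = bulk
  getattr(obj,attr)(obj)  # a function stored on the instance is not bound, so obj is passed explicitly
# False
# alex is yelling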

④ delattr(obj, name): same as del obj.name; deletes an attribute by name

class Foo(object):
  def __init__(self,name):
    self.name = name
  def func(self):
    print("func",self.name)
obj = Foo("alex")
attr = "name"            # named attr rather than str, so the built-in str is not shadowed
if hasattr(obj,attr):
  delattr(obj,attr)   # same as del obj.name
print(obj.name)
# Traceback (most recent call last):
#  File "C:/Users/L/PycharmProjects/s14/preview/Day7/main.py", line 40, in <module>
#   print(obj.name)
# AttributeError: 'Foo' object has no attribute 'name'
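A common use of hasattr and getattr together is picking a method from a string, for example a command typed by a user. The sketch below is one possible way to do that; the Commands class and the dispatch helper are hypothetical names introduced here.

```python
class Commands(object):
    def start(self):
        return "starting"

    def stop(self):
        return "stopping"

def dispatch(obj, action):
    """Call the method of obj named by the string action, if there is one."""
    # Refuse private and special names so callers cannot reach internals
    if action.startswith("_"):
        return "unknown command: %s" % action
    handler = getattr(obj, action, None)  # the third argument is the default when missing
    if not callable(handler):
        return "unknown command: %s" % action
    return handler()

cmds = Commands()
print(dispatch(cmds, "start"))
print(dispatch(cmds, "reboot"))
```

Passing a default to getattr avoids the separate hasattr check and the AttributeError shown in the traceback above.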
