石头记

2021-03-25 | Python, Code

Handling Precision Issues in Python

In Python, arithmetic on floating-point values does not always give the results we expect. For example:

>>> 0.1+0.2
0.30000000000000004
>>> 0.1+0.1-0.2
0.0
>>> 0.1+0.1+0.1-0.3
5.551115123125783e-17
>>> 0.1+0.1+0.1-0.2
0.10000000000000003

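Why does a computer appear to make such a basic mistake on so simple a calculation? The real cause is the conversion between decimal and binary numbers. Take a floating-point number such as 0.1: converting it to binary by hand gives: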

0.1(10) = 0.00011001100110011...(2)

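As you can see, the result repeats forever, which means that once 0.1 is converted to binary it can never be exactly equal to decimal 0.1. In addition, a computer stores only a limited number of bits, so when a binary value needs more bits than the storage format provides, the remaining bits are rounded off (the rule is roughly "drop a 0, carry a 1", that is, round to the nearest representable value).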
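The value that is actually stored for 0.1 can be inspected directly. The short sketch below uses the standard decimal and fractions modules purely as inspection tools; the variable name stored is only illustrative.

```python
from decimal import Decimal
from fractions import Fraction

stored = Decimal(0.1)   # exact decimal expansion of the double nearest to 0.1
print(stored)           # 0.1000000000000000055511151231257827021181583404541015625
print(Fraction(0.1))    # 3602879701896397/36028797018963968 (the denominator is 2 ** 55)
print(stored > Decimal("0.1"))  # the stored value is slightly larger than 0.1
```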

This problem is not specific to Python: it shows up in every programming language that supports binary floating-point arithmetic, so it is not a Python bug.
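Since the error comes from the representation itself, one common way to live with it is to compare floats with a tolerance instead of with ==. A minimal sketch using math.isclose from the standard library (Python 3.5+); the abs_tol value is an illustrative choice, not something this article prescribes.

```python
import math

total = 0.1 + 0.2

print(total == 0.3)              # exact comparison is affected by the representation error
print(math.isclose(total, 0.3))  # comparison within the default relative tolerance (1e-09)

# Near zero a relative tolerance is not enough, so an absolute one is supplied.
residue = 0.1 + 0.1 + 0.1 - 0.3
print(math.isclose(residue, 0.0, abs_tol=1e-12))
```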

Floating-Point Calculations

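When doing numerical work in Python we usually keep a fixed number of decimal places, and that is where the round function comes in. You have probably run into surprises when using it, so this article explains how round handles precision and which rounding rule it applies. The conclusion first: round uses round-half-to-even ("round 4 down, round 6 up, round 5 to even", also known as banker's rounding), not ordinary round-half-up! (This is the behaviour of Python 3; in Python 2, round rounds exact halves away from zero.)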
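A quick way to see the rule in action, assuming Python 3, is to round values that sit exactly halfway between two candidates:

```python
# Ties (values exactly halfway between two integers) go to the even neighbour.
for value in (0.5, 1.5, 2.5, 3.5):
    print(value, "->", round(value))

# 0.125 is exactly representable in binary, so the same tie rule applies with ndigits.
print(round(0.125, 2))
```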

1. What is round-half-to-even?

Round-half-to-even is a comparatively rigorous way of rounding to a given number of digits. The rules are:

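1) if the digit to be dropped is 4 or less, round down;
2) if it is 6 or more, round up;
3) if it is 5, look at what follows it: if any non-zero digits follow, round up; if nothing follows, look at the digit before the 5 and round up if that digit is odd, or drop the 5 without rounding up if it is even.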

A few examples make this easier to follow: 1.15 -> 1.2, 1.25 -> 1.2, 1.250 -> 1.2, 1.25012 -> 1.3.
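The four examples above can be reproduced with the decimal module's ROUND_HALF_EVEN mode. This is only a sketch: the helper name half_even is hypothetical, and the inputs are passed as strings so that the decimal digits are taken exactly as written.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def half_even(text, places="0.1"):
    """Round the decimal literal `text` to the quantum `places`, ties to even."""
    return Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)

for text in ("1.15", "1.25", "1.250", "1.25012"):
    print(text, "->", half_even(text))
```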

2. Using the round function in Python

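When you first use round in Python you will find that round(1.15, 1) -> 1.1. That may look odd, but it is the expected result, and it comes from how Python handles decimal fractions. Internally, the number you type is stored as the quotient of two integers (for a float, an integer divided by a power of two). Most decimal fractions cannot be written exactly in that form, in the same way that 0.33 and 0.333 cannot exactly represent 1/3, whereas the fraction itself describes the stored value precisely.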
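The quotient behind a float can be read back with float.as_integer_ratio. A small sketch for 1.15 (the variable names are illustrative):

```python
from fractions import Fraction

numerator, denominator = (1.15).as_integer_ratio()
print(numerator, denominator)

# The denominator is a power of two, so 115/100 cannot be matched exactly.
print(denominator & (denominator - 1) == 0)
print(Fraction(numerator, denominator) < Fraction(115, 100))
```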

3. Why does round(1.15, 1) return 1.1 in Python?

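When we type 1.15, what Python actually stores is a value slightly below 1.15 (approximately 1.1499999999999999).

When this number is rounded to one decimal place, the part being dropped is less than half a unit, so the result is naturally 1.1.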

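By the same reasoning, 1.55 is rounded up (it is stored slightly above 1.55) and 1.25 is rounded down (it is stored exactly, so the tie goes to the even digit). By now this should make sense!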
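To check on which side of the literal each stored value falls, a float can be converted to Decimal, which shows the binary value exactly. A sketch, assuming Python 3 for the round results:

```python
from decimal import Decimal

for value in (1.15, 1.55, 1.25):
    stored = Decimal(value)          # exact value of the underlying binary double
    literal = Decimal(str(value))    # the decimal literal as typed
    if stored < literal:
        relation = "below"
    elif stored > literal:
        relation = "above"
    else:
        relation = "exactly at"
    print(value, "is stored", relation, "the literal; round(value, 1) =", round(value, 1))
```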

4. What if you really need round-half-up?

If you really must round half up, you need the decimal module, which works on exact decimal values. For example:

import decimal
a = decimal.Decimal("10.0")
b = decimal.Decimal("3")
print(10.0/3)
print(a/b)

The output is:

3.3333333333333335
3.333333333333333333333333333
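For the rounding itself, decimal offers quantize with the ROUND_HALF_UP mode. A minimal sketch follows; the helper name half_up is hypothetical, and the inputs are strings so that the digits are taken exactly as written rather than from the binary float.

```python
from decimal import Decimal, ROUND_HALF_UP

def half_up(text, places="0.1"):
    """Round the decimal literal `text` to the quantum `places`, ties away from zero."""
    return Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_UP)

print(half_up("1.15"))            # 1.2
print(half_up("1.25"))            # 1.3
print(half_up("2.675", "0.01"))   # 2.68
print(half_up("-1.25"))           # -1.3
```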

If the decimal module still does not meet your needs, you can use the fractions module instead. For example:

from fractions import Fraction
print(10/3)
print(Fraction(10,3))

The output is:

3.3333333333333335
10/3

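As you can see, the fractions module handles arithmetic that goes wrong with floats nicely, because it keeps every value as an exact rational number.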
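As a further illustration, the sums from the beginning of this article come out exact when done with Fraction. A short sketch; the values are built from strings so that 0.1 means exactly 1/10.

```python
from fractions import Fraction

a = Fraction("0.1")   # parsed from the decimal literal, so exactly 1/10
b = Fraction("0.2")

print(a + b)                         # 3/10
print(a + b == Fraction("0.3"))      # True
print(a + a + a - Fraction("0.3"))   # 0
print(float(a + b))                  # 0.3
```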

A Custom round Function

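To keep the parameters consistent, copy the signature of the original function, keep its parameters and give it a new name. The original looks like this: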

def round(number, ndigits=None): # real signature unknown; restored from __doc__
    """
    Original function
    round(number[, ndigits]) -> floating point number
    
    Round a number to a given precision in decimal digits (default 0 digits).
    This always returns a floating point number.  Precision may be negative.
    """
    return 0.0

Considering only floats and ordinary integers, the following handles essentially every case as long as the precision is positive:

import math  # math module, used for powers
from decimal import Decimal, ROUND_HALF_UP

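def my_round1(number, percision=None):
    """
    Custom rounding for floats.
    Round a number to a given precision in decimal digits (default 0 digits).
    Returns a Decimal rounded half up.
    :param number: the value to round
    :param percision: number of decimal places to keep
    :return: the rounded value as a Decimal
    """
    if not percision:
        percision = 0
    # Build the quantum (e.g. '0.01' for 2 places) from an integer power of ten.
    # 0.1 ** n is inexact (0.1 ** 3 is 0.0010000000000000002), and on Python 3
    # str() would turn that into a quantum with far too many digits.
    percision = str(1.0 / 10 ** percision)
    return Decimal(number).quantize(Decimal(percision), rounding=ROUND_HALF_UP)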

if __name__ == '__main__':
    # Each line prints the pair (built-in round, my_round1) as a tuple
    print((round(3.14, 2), my_round1(3.14, 2)))
    print((round(6.25, 3), my_round1(6.25, 3)))
    print((round(1.15, 1), my_round1(1.15, 1)))  # special value
    print((round(1.25, 1), my_round1(1.25, 1)))
    print((round(1.250, 2), my_round1(1.25, 2)))
    print((round(1.25012, 1), my_round1(1.25012, 1)))
    print((round(6.25, -1), my_round1(6.25, -1)))  # anomalous value
    print((round(6.25, -2), my_round1(6.25, -2)))  # anomalous value

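The output is as follows (recorded on Python 2, where round rounds exact halves away from zero, which is why round(1.25, 1) shows 1.3):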

(3.14, Decimal('3.14'))
(6.25, Decimal('6.250'))
(1.1, Decimal('1.1'))
(1.3, Decimal('1.3'))
(1.25, Decimal('1.25'))
(1.3, Decimal('1.3'))
(10.0, Decimal('6.3'))
(0.0, Decimal('6.3'))
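The 6.3 in the last two rows comes from the quantum handed to quantize: a string such as '10.0' has one digit after the decimal point, so the value is still rounded to one decimal place. The sketch below shows this, together with a quantum whose exponent is positive; the 1E+1 quantum is an assumption of this sketch and is not used by the code in this article.

```python
from decimal import Decimal, ROUND_HALF_UP

value = Decimal("6.25")

# quantize() only looks at the exponent of its argument, not at its magnitude.
print(Decimal("10.0").as_tuple().exponent)                      # -1
print(value.quantize(Decimal("10.0"), rounding=ROUND_HALF_UP))  # 6.3

# A quantum with exponent +1 rounds to the nearest ten instead.
tens = value.quantize(Decimal("1E+1"), rounding=ROUND_HALF_UP)
print(tens, int(tens))                                          # 1E+1 10
```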

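However, a negative precision is handled incorrectly. For example, round(6.25, -1) actually returns 10.0, whereas my_round1(6.25, -1) returns 6.3, so this case needs special handling, which we add with plain arithmetic: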

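def my_round2(number, percision=None):
    """
    Custom rounding for floats.
    Round a number to a given precision in decimal digits (default 0 digits).
    Precision may be negative.
    :param number: the value to round
    :param percision: number of decimal places to keep
    :return: a Decimal for positive precision, otherwise a plain number
    """
    if not percision:
        percision = 0
    if percision <= 0:
        # Zero or negative precision: scale, add 0.5, truncate, scale back
        return int(number * 10 ** percision + 0.5) / 10 ** percision
    # Build the quantum (e.g. '0.01' for 2 places) from an integer power of ten.
    # 0.1 ** n is inexact (0.1 ** 3 is 0.0010000000000000002), and on Python 3
    # str() would turn that into a quantum with far too many digits.
    percision = str(1.0 / 10 ** percision)
    return Decimal(number).quantize(Decimal(percision), rounding=ROUND_HALF_UP)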

The test code is as follows:

if __name__ == '__main__':
    cases = [
        (3.14, 2), (6.25, -1), (6.25, -2), (6.25, 3),
        (1.15, 1),      # special value
        (1.25, 1), (1.25, 2), (1.25012, 1),
        (-1.25012, 0),  # anomalous value
    ]
    for number, digits in cases:
        # One line per case: built-in round, then my_round2
        print("%s %s" % (round(number, digits), my_round2(number, digits)))

Output (recorded on Python 2):

3.14 3.14
10.0 10.0
0.0 0.0
6.25 6.250
1.1 1.1
1.3 1.3
1.25 1.25
1.3 1.3
-1.0 0

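Negative numbers turn out to be rounded differently from positive ones (int() truncates toward zero, so adding 0.5 does not work for them), so we extract the sign first: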

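def my_round3(number, percision=None):
    """
    Custom rounding for floats.
    Round a number to a given precision in decimal digits (default 0 digits).
    Precision may be negative.
    :param number: the value to round
    :param percision: number of decimal places to keep
    :return: a Decimal for positive precision, otherwise a plain number
    """
    negative = 1  # assume a positive number by default
    if not percision:
        percision = 0
    if number < 0:
        negative = -1
        number = -number  # work on the absolute value
    if percision <= 0:
        # Zero or negative precision: scale, add 0.5, truncate, scale back
        return int(number * 10 ** percision + 0.5) / 10 ** percision * negative
    # Build the quantum (e.g. '0.01' for 2 places) from an integer power of ten.
    # 0.1 ** n is inexact (0.1 ** 3 is 0.0010000000000000002), and on Python 3
    # str() would turn that into a quantum with far too many digits.
    percision = str(1.0 / 10 ** percision)
    return Decimal(number).quantize(Decimal(percision), rounding=ROUND_HALF_UP) * negative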


if __name__ == '__main__':
    cases = [
        (3.14, 2), (6.25, -1), (6.25, -2), (6.25, 3),
        (1.15, 1),  # special value
        (1.25, 1), (1.25, 2), (1.25012, 1),
        (-1.25012, 0), (-6.251, 1), (-6.151, 1), (-1.25012, 2), (-100.1, 2),
    ]
    for number, digits in cases:
        # One line per case: built-in round, then my_round3
        print("%s %s" % (round(number, digits), my_round3(number, digits)))

The output is as follows (again recorded on Python 2); in some cases the new function is actually more accurate:

3.14 3.14
10.0 10.0
0.0 0.0
6.25 6.250
1.1 1.1
1.3 1.3
1.25 1.25
1.3 1.3
-1.0 -1
-6.3 -6.3
-6.2 -6.2
-1.25 -1.25
-100.1 -100.10
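The reason the sign has to be split off can be seen in isolation: int() truncates toward zero, so the "add 0.5 and truncate" trick only rounds correctly for non-negative inputs. A small sketch with two hypothetical helpers, naive_half_up and signed_half_up:

```python
def naive_half_up(x):
    """Add 0.5 and truncate; int() truncates toward zero."""
    return int(x + 0.5)

def signed_half_up(x):
    """Round the absolute value, then restore the sign."""
    sign = -1 if x < 0 else 1
    return sign * int(abs(x) + 0.5)

for x in (1.25012, -1.25012, -1.75):
    print(x, naive_half_up(x), signed_half_up(x))
```

For -1.25012 the naive version yields 0 while the signed version yields -1, which matches the difference between the two earlier test runs.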

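Next, add a batch of unit-test cases. In the example below the input value, the number of integer digits and the sign of the precision are all chosen at random: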

import random


def test_my_round():
    # Check my_round against the built-in round on 100 random inputs
    for i in range(100):
        a = random.random()  # random float in [0, 1)
        bit = random.choice([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
        negative = random.choice([-1, 1])
        number = a * 10 ** bit   # scale up to vary the number of integer digits
        percision = negative * bit
        expected = round(number, percision)
        actual = my_round(number, percision)
        print(number, percision, expected, actual)
        # my_round may return a Decimal, and its negative-precision branch goes
        # through float division, so compare as floats with a small tolerance
        assert abs(float(actual) - expected) <= 1e-9 * max(1.0, abs(expected)), \
            "{} != {}".format(expected, actual)

Once this is confirmed to work, we arrive at the final function; in real code, simply replace calls to round with it.

Conclusion

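Because floating-point formats are based on scientific notation, they are convenient for representing very large and very small values, but precision loss is a frequent problem when working with them. In practice, handling floating-point data with a custom function is therefore more dependable: it takes a little more effort, but it avoids anomalies such as incorrectly displayed values.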

Finally, here is a complete, ready-to-use function that can stand in for the built-in round:

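from decimal import Decimal, ROUND_HALF_UP


def my_round(number, percision=None):
    """
    Custom rounding for floats.
    Round a number to a given precision in decimal digits (default 0 digits).
    Precision may be negative.
    :param number: the value to round
    :param percision: number of decimal places to keep
    :return: a Decimal for positive precision, otherwise a plain number
    """
    sign = 1  # sign handling; positive by default
    if number < 0:
        sign = -1  # remember the sign and work on the absolute value
        number = -number
    if not percision:
        percision = 0
    if percision <= 0:
        # Zero or negative precision: scale, add 0.5, truncate, scale back
        return sign * int(number * 10 ** percision + 0.5) / 10 ** percision
    # Build the quantum (e.g. '0.01' for 2 places) from an integer power of ten.
    # 0.1 ** n is inexact (0.1 ** 3 is 0.0010000000000000002), and on Python 3
    # str() would turn that into a quantum with far too many digits.
    percision = str(1.0 / 10 ** percision)
    return Decimal(number).quantize(Decimal(percision), rounding=ROUND_HALF_UP) * sign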
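One variant worth considering, as a sketch rather than a replacement for the function above: if the goal is to round the number as it is printed (so that 1.15 becomes 1.2), the float can be converted through str() before it reaches Decimal. The name round_half_up and the choice to return a float are assumptions of this sketch; Decimal.scaleb is used to build the quantum so that negative precision also works.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(number, ndigits=0):
    """Hypothetical helper: round the printed form of `number`, ties away from zero."""
    quantum = Decimal(1).scaleb(-ndigits)   # 10 ** -ndigits as an exact Decimal
    # str() gives the shortest decimal text of the float, e.g. '1.15'
    result = Decimal(str(number)).quantize(quantum, rounding=ROUND_HALF_UP)
    return float(result)

print(round_half_up(1.15, 1))    # 1.2
print(round_half_up(1.25, 1))    # 1.3
print(round_half_up(6.25, -1))   # 10.0
print(round_half_up(-1.25012))   # -1.0
```

Unlike my_round, this treats 1.15 as the literal 1.15 instead of the slightly smaller stored value, so the two functions disagree on inputs like that one.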