Python dictdiffer模块

Let’s start with an example on how to find the diff between two dictionaries using diff() method:

from dictdiffer import diff, patch, swap, revert

first = {
    "title": "hello",
    "fork_count": 20,
    "stargazers": ["/users/20", "/users/30"],
    "settings": {
        "assignees": [100, 101, 201],
    }
}

second = {
    "title": "hellooo",
    "fork_count": 20,
    "stargazers": ["/users/20", "/users/30", "/users/40"],
    "settings": {
        "assignees": [100, 101, 202],
    }
}

result = diff(first, second)

assert list(result) == [
    ('change', ['settings', 'assignees', 2], (201, 202)),
    ('add', 'stargazers', [(2, '/users/40')]),
    ('change', 'title', ('hello', 'hellooo'))]
Now we can apply the diff result with patch() method:

result = diff(first, second)
patched = patch(result, first)

assert patched == second
Also we can swap the diff result with swap() method:

result = diff(first, second)
swapped = swap(result)

assert list(swapped) == [
    ('change', ['settings', 'assignees', 2], (202, 201)),
    ('remove', 'stargazers', [(2, '/users/40')]),
    ('change', 'title', ('hellooo', 'hello'))]
Let’s revert the last changes:

result = diff(first, second)
reverted = revert(result, patched)
assert reverted == first
A tolerance can be used to consider closed values as equal. The tolerance parameter only applies for int and float.

Let’s try with a tolerance of 10% with the values 10 and 10.5:

first = {'a': 10.0}
second = {'a': 10.5}

result = diff(first, second, tolerance=0.1)

assert list(result) == []
Now with a tolerance of 1%:

result = diff(first, second, tolerance=0.01)

assert list(result) == ('change', 'a', (10.0, 10.5))
API
Dictdiffer is a helper module to diff and patch dictionaries.

dictdiffer.diff(first, second, node=None, ignore=None, path_limit=None, expand=False, tolerance=2.220446049250313e-16)
Compare two dictionary/list/set objects, and returns a diff result.

Return an iterator with differences between two objects. The diff items represent addition/deletion/change and the item value is a deep copy from the corresponding source or destination objects.

>>> from dictdiffer import diff
>>> result = diff({'a': 'b'}, {'a': 'c'})
>>> list(result)
[('change', 'a', ('b', 'c'))]
The keys can be skipped from difference calculation when they are included in ignore argument of type collections.Container.

>>> list(diff({'a': 1, 'b': 2}, {'a': 3, 'b': 4}, ignore=set(['a'])))
[('change', 'b', (2, 4))]
>>> class IgnoreCase(set):
...     def __contains__(self, key):
...         return set.__contains__(self, str(key).lower())
>>> list(diff({'a': 1, 'b': 2}, {'A': 3, 'b': 4}, ignore=IgnoreCase('a')))
[('change', 'b', (2, 4))]
The difference calculation can be limitted to certain path:

>>> list(diff({}, {'a': {'b': 'c'}}))
[('add', '', [('a', {'b': 'c'})])]
>>> from dictdiffer.utils import PathLimit
>>> list(diff({}, {'a': {'b': 'c'}}, path_limit=PathLimit()))
[('add', '', [('a', {})]), ('add', 'a', [('b', 'c')])]
>>> from dictdiffer.utils import PathLimit
>>> list(diff({}, {'a': {'b': 'c'}}, path_limit=PathLimit([('a',)])))
[('add', '', [('a', {'b': 'c'})])]
>>> from dictdiffer.utils import PathLimit
>>> list(diff({}, {'a': {'b': 'c'}},
...           path_limit=PathLimit([('a', 'b')])))
[('add', '', [('a', {})]), ('add', 'a', [('b', 'c')])]
The patch can be expanded to small units e.g. when adding multiple values:

>>> list(diff({'fruits': []}, {'fruits': ['apple', 'mango']}))
[('add', 'fruits', [(0, 'apple'), (1, 'mango')])]
>>> list(diff({'fruits': []}, {'fruits': ['apple', 'mango']}, expand=True))
[('add', 'fruits', [(0, 'apple')]), ('add', 'fruits', [(1, 'mango')])]
Parameters:    
first – The original dictionary, list or set.
second – New dictionary, list or set.
node – Key for comparison that can be used in dot_lookup().
ignore – List of keys that should not be checked.
path_limit – List of path limit tuples or dictdiffer.utils.Pathlimit object to limit the diff recursion depth.
expand – Expand the patches.
tolerance – Threshold to consider when comparing two float numbers.
Changed in version 0.3: Added ignore parameter.

Changed in version 0.4: Arguments first and second can now contain a set.

Changed in version 0.5: Added path_limit parameter. Added expand paramter. Added tolerance parameter.

Changed in version 0.7: Diff items are deep copies from its corresponding objects.

dictdiffer.patch(diff_result, destination)
Patch the diff result to the old dictionary.

dictdiffer.swap(diff_result)
Swap the diff result.

It uses following mapping:

remove -> add
add -> remove
In addition, swap the changed values for change flag.

>>> from dictdiffer import swap
>>> swapped = swap([('add', 'a.b.c', [('a', 'b'), ('c', 'd')])])
>>> next(swapped)
('remove', 'a.b.c', [('c', 'd'), ('a', 'b')])
>>> swapped = swap([('change', 'a.b.c', ('a', 'b'))])
>>> next(swapped)
('change', 'a.b.c', ('b', 'a'))
dictdiffer.revert(diff_result, destination)
Call swap function to revert patched dictionary object.

Usage example:

>>> from dictdiffer import diff, revert
>>> first = {'a': 'b'}
>>> second = {'a': 'c'}
>>> revert(diff(first, second), second)
{'a': 'b'}
dictdiffer.dot_lookup(source, lookup, parent=False)
Allow you to reach dictionary items with string or list lookup.

Recursively find value by lookup key split by ‘.’.

>>> from dictdiffer.utils import dot_lookup
>>> dot_lookup({'a': {'b': 'hello'}}, 'a.b')
'hello'
If parent argument is True, returns the parent node of matched object.

>>> dot_lookup({'a': {'b': 'hello'}}, 'a.b', parent=True)
{'b': 'hello'}
If node is empty value, returns the whole dictionary object.

>>> dot_lookup({'a': {'b': 'hello'}}, '')
{'a': {'b': 'hello'}}

 

posted on 2020-03-31 19:24  不要挡着我晒太阳  阅读(1310)  评论(0编辑  收藏  举报

导航