PEP 468: Preserving Keyword Argument Order

Python 3.6.0 optimized the dictionary: the new implementation is faster and uses less memory. Most of the material online points to [Python-Dev] More compact dictionaries with faster iteration, which creatively outlines the differences between the old and new dictionaries and the optimizations involved.

However, I was very curious how such a structure was implemented in C, so I read the source. I took the dictionary source code of versions 3.5.9 and 3.6.0 and compared them, and found the implementation full of clever tricks, which is exciting. So an idea arose: it would be good to discuss PEP 468's improvement to the dictionary from the source-code point of view, as a supplement to [Python-Dev] More compact dictionaries with faster iteration.

Simply diffing the code between 3.5.9 and 3.6.0 won't make things clear, so I'll be a bit more verbose and analyze the dictionary data structure completely first; then we can happily compare the differences :)

Unless otherwise noted, everything below refers to Python 3.6.0.

New features

On Python's changelog page, you can see that starting with version 3.6, dictionaries preserve insertion order and consume less memory.

Python 3.6.0 beta 1

Release date: 2016-09-12

Core and Builtins

...

bpo-27350: dict implementation is changed like PyPy. It is more compact and preserves insertion order. (Concept developed by Raymond Hettinger and patch by Inada Naoki.)

...
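Before diving into the internals, the order-preserving behavior is easy to see from Python itself (in 3.6 it was an implementation detail; from 3.7 onward it is a language guarantee):

```python
# Insertion order is preserved when iterating (CPython 3.6+;
# guaranteed by the language from 3.7 onward).
d = {}
d["timmy"] = "red"
d["barry"] = "green"
d["guido"] = "blue"

assert list(d) == ["timmy", "barry", "guido"]

# Deleting a key and re-inserting it moves it to the end.
del d["barry"]
d["barry"] = "yellow"
assert list(d) == ["timmy", "guido", "barry"]
```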

A brief description of the dict data structure

I originally wanted to also present the structure used in Python 3.5.9, but later decided it was unnecessary. The two differ little, and comparing them takes no great effort. I'll just introduce the Python 3.6.0 dictionary structure and then compare source details directly. Re-reading [Python-Dev] More compact dictionaries with faster iteration makes this even clearer.

Structure

There are three main structures related to dictionary objects:

PyDictObject (./Include/dictobject.h)

PyDictKeysObject (./Objects/dict-common.h), defined as struct _dictkeysobject inside that header file

PyDictKeyEntry (./Objects/dict-common.h)

The fields of each structure are described in turn below:

PyDictObject

The dictionary object: every dictionary in Python, whether created by ourselves with dict() or as the __dict__ attribute of a class, is a PyDictObject.

PyObject_HEAD

Everything in Python is an object, and from C's point of view these objects arrive typeless. Consider two questions: in C, types are fixed and there is no runtime type checking, so how do we implement dynamic calls? And after a dynamic call, how do we recognize the type?

The answer is: look at the header. Every object begins with a PyObject_HEAD that contains its type information, so it can be called dynamically through a pointer, and the type can then be identified at runtime from that header. Imagine many objects, each with its own fixed shape; even if you deal with a different one every day and the "type" keeps changing, that's fine: the header lets you "call dynamically" and "recognize the type". Ha-ha.

One more word: be sure to do proper type checks. If your object turns out not to be what you assumed, it rolls over and the program crashes!

ma_used

The number of items currently stored in the dictionary.

ma_version_tag

There is a 64-bit unsigned global variable, pydict_global_version, which is incremented on dictionary creation and on every modification; the new value is then assigned (copied, not referenced) to ma_version_tag. So to know whether a dictionary has changed between two points in time, just compare ma_version_tag; there is no need to inspect the contents. This feature enables program optimizations. See PEP 509 -- Add a private version to dict.
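ma_version_tag is not exposed at the Python level, but the idea behind PEP 509 can be sketched with a toy wrapper (the names VersionedDict and version below are invented purely for illustration, not part of CPython's API):

```python
class VersionedDict(dict):
    """Toy sketch of PEP 509: bump a version counter on every mutation,
    so "has this dict changed?" becomes a single integer comparison."""
    _global_version = 0  # plays the role of pydict_global_version

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._bump()

    def _bump(self):
        VersionedDict._global_version += 1
        self.version = VersionedDict._global_version  # like ma_version_tag

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._bump()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._bump()

d = VersionedDict(a=1)
v = d.version
d["b"] = 2
assert d.version != v   # mutation detected without comparing items
v = d.version
_ = d["b"]              # reads do not change the version
assert d.version == v
```

The real implementation does this in C with no Python-level overhead, which is why guard-based optimizers can rely on it cheaply.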

PyDictKeysObject *ma_keys

Pointer to the dictionary key object. Although this object is called KeysObject, it also holds values (me_value), which are valid in combined mode; in split mode me_value is unused, and PyObject **ma_values is used instead.

dk_refcnt

In split mode, one PyDictKeysObject is shared by many PyDictObjects, and this reference count tracks that sharing.

dk_size

The size of the dictionary's hash table, i.e. the actually allocated memory space, similar in meaning to the capacity attribute of C++ vectors.

This value is also the size of the dk_indices array; it must be an integer power of 2 and is dynamically expanded when more space is needed.

Curiosity killed the cat, but still: why must it be an integer power of 2? Let's pull out a few lines of code.

The point is that the hash value's type is size_t, which could index past the bounds of the hash table, so it must be reduced modulo the table length; dk_size is required to be an integer power of 2 precisely so that this modulo can be replaced by a fast bitwise operation.
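The trick: when the table length is a power of two, `hash % dk_size` equals `hash & (dk_size - 1)`, a single AND instruction. A quick check (illustrative Python, not the C source):

```python
dk_size = 8            # must be a power of two
mask = dk_size - 1     # 0b111

for h in [0, 1, 7, 8, 9, 12345, 2**63 - 1]:
    # For non-negative hashes, masking and modulo pick the same slot.
    assert h % dk_size == h & mask
```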

dk_lookup

The lookup function that finds a given element in the hash table. There are four such functions.

/* Function to lookup in the hash table (dk_indices):

- lookdict(): general-purpose, and may return DKIX_ERROR if (and only if) a comparison raises an exception.

- lookdict_unicode(): specialized to Unicode string keys, comparison of which can never raise an exception; that function can never return DKIX_ERROR.

- lookdict_unicode_nodummy(): similar to lookdict_unicode() but further specialized for Unicode string keys that cannot be the <dummy> value.

- lookdict_split(): Version of lookdict() for split tables. */

Python itself uses a huge number of dictionaries keyed by strings, so string keys are specifically optimized; use strings as keys whenever you can!

dk_usable

The number of usable entries (hash/key/value triples) in the dictionary. To reduce hash collisions it is only 2/3 of dk_size, as set by the USABLE_FRACTION macro.

This value is also the initial size of the dk_entries (or ma_values) array, and is dynamically expanded when more space is needed.
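In the 3.6.0 source the macro is `#define USABLE_FRACTION(n) (((n) << 1)/3)`; in Python terms:

```python
def usable_fraction(dk_size):
    # Mirrors CPython 3.6's USABLE_FRACTION macro: (n << 1) / 3
    return (dk_size << 1) // 3

# The minimum table (PyDict_MINSIZE == 8) therefore holds at most 5 entries,
# matching the "8 allows dicts with no more than 5 active entries" comment.
assert usable_fraction(8) == 5
assert usable_fraction(16) == 10
assert usable_fraction(32) == 21
```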

dk_nentries

The number of entries already used in the dk_entries (or ma_values) array.

dk_indices

The hash index table: this array is the actual hash table, but what it stores are indices of elements in dk_entries.

In Python, a hash/key/value triple is called an entry; the term comes up often. Note the difference between dk_entries and ma_values: dk_entries is an array whose storage immediately follows dk_indices, whereas ma_values is a pointer to a storage area that does not sit at the end of the PyDictObject. When we analyze the dictresize() function, we will see the impact of this layout.

me_hash

me_key

me_value

... (Next PyDictKeyEntry)

PyObject **ma_values

Python 3.3 introduced a new dictionary implementation: the split dict, a structure designed for class attributes. Imagine this scenario: a class whose attribute names do not change after definition (assuming no dynamic changes) has many different instances, each with different attribute values but the same attribute names. If those instances' __dict__s share one set of keys, memory is saved. See PEP 412 -- Key-Sharing Dictionary.

In this mode, the ma_keys pointers of several different PyDictObject objects point to the same PyDictKeysObject. Originally each entry (hash/key/value) was a single unit, which was very convenient; but merging the keys would mean merging the entries, and so the values would be forced to merge too. That cannot be allowed, because in this mode the keys of the shared PyDictKeysObject are identical while the values differ per instance. The only way out is to move the values out of the entry structure into a separate value array, replacing the entry->value that merging would have forced; this extra value array is attached to each individual PyDictObject. This separates key from value, hence the name "split".

/* If ma_values is NULL, the table is "combined": keys and values are stored in ma_keys.

If ma_values is not NULL, the table is splitted: keys are stored in ma_keys and values are stored in ma_values */

PyObject **ma_values;

In addition, split mode has two constraints, which match the class-attribute use case:

Only string (unicode) keys are allowed. All dicts sharing same key must have same insertion order.
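The key-sharing idea can be simulated with a few lines of Python (a toy sketch; the class names SharedKeys and SplitDict are invented here for illustration): one keys table is shared, while each "instance" keeps only its own dense values array.

```python
class SharedKeys:
    """One keys table shared by many split dicts (like PyDictKeysObject)."""
    def __init__(self):
        self.keys = []  # insertion-ordered attribute names

    def index(self, key):
        if key not in self.keys:
            self.keys.append(key)
        return self.keys.index(key)

class SplitDict:
    """Per-instance values array (like ma_values); keys live in SharedKeys."""
    def __init__(self, shared):
        self.shared = shared
        self.values = []

    def __setitem__(self, key, value):
        i = self.shared.index(key)
        while len(self.values) <= i:
            self.values.append(None)
        self.values[i] = value

    def __getitem__(self, key):
        return self.values[self.shared.keys.index(key)]

keys = SharedKeys()
a, b = SplitDict(keys), SplitDict(keys)
a["x"], a["y"] = 1, 2
b["x"], b["y"] = 10, 20       # same names, same insertion order
assert a["y"] == 2 and b["y"] == 20
assert a.shared is b.shared   # one keys object, two values arrays
```

With N instances, the keys (and hashes) are stored once instead of N times, which is exactly the saving PEP 412 targets for instance __dict__s.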

Source code

Comparing the structures of Python 3.5.9 and 3.6.0, Python 3.6.0 adds many comments that 3.5.9 lacks. Excellent!! However, only the comments that differ between the two are kept here.

DK_SIZE obtains dk_size, i.e. the number of elements in the dk_indices array

DK_IXSIZE computes the current number of bytes per dk_indices element, based on dk_size

DK_ENTRIES obtains the first address of the dk_entries array from the DK object (PyDictKeysObject object)

Thus, dk_indices.as_1[DK_SIZE(dk) * DK_IXSIZE(dk)] is located at the first address after dk_indices, just past its bounds. What? Out of bounds? Yes: because the storage for dk_entries immediately follows, the address obtained by the DK_ENTRIES macro is exactly the first address of the dk_entries array. What an interesting trick :)

Why bother? Wouldn't it be fine to define dk_entries directly, as Python 3.5.9 does? I think it is because dk_indices is itself dynamic: if dk_entries were declared directly, its address would be fixed relative to the structure, and when the length of the dk_indices array changes, expressions such as &dk->dk_entries[0] would yield the wrong address. See the new_keys_object() function for the concrete memory layout.

Why small

Next we analyze the new_keys_object() function, from which the memory layout of the PyDictKeysObject can be seen. I commented the key points and omitted some code that would distract from the main thread.

Comparing the amounts of memory requested by these two PyObject_MALLOC() calls shows why the new dictionary takes up less memory.
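The saving can be estimated with the figures from the Python-Dev post: on a 64-bit build an entry is 24 bytes (hash, key pointer, value pointer), and with dk_size <= 128 each index fits in a single int8. For the minimal 8-slot table:

```python
ENTRY_SIZE = 24  # hash + key pointer + value pointer on a 64-bit build
dk_size = 8

# 3.5.9: a sparse table of full 24-byte entries.
old_bytes = dk_size * ENTRY_SIZE

# 3.6.0: a sparse table of 1-byte indices (dk_size <= 128 fits in int8)
# plus a dense entries array sized at USABLE_FRACTION(dk_size).
usable = (dk_size << 1) // 3
new_bytes = dk_size * 1 + usable * ENTRY_SIZE

assert old_bytes == 192
assert new_bytes == 128   # 8 indices + 5 * 24-byte entries
```

The index width grows with the table (int16, int32, int64 as needed), but the entries array stays dense, so the saving persists at every size.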

After analyzing the memory layout, the improvement of PEP 468 is very clear. You can now check it against the information provided by PEP 468; if you feel a sudden sense of enlightenment, that's right. Follow the link in PEP 468 to [Python-Dev] More compact dictionaries with faster iteration: the first entries array described there corresponds to dk_entries in version 3.5.9; the indices and entries that follow correspond to the dk_indices and dk_entries arrays in version 3.6.0, matching the code above.

The current memory layout for dictionaries is unnecessarily inefficient. It has a sparse table of 24-byte entries containing the hash value, key pointer, and value pointer.

Instead, the 24-byte entries should be stored in a dense table referenced by a sparse table of indices.

Only the data layout needs to change. The hash table algorithms would stay the same. All of the current optimizations would be kept, including key-sharing dicts and custom lookup functions for string-only dicts. There is no change to the hash functions, the table search order, or collision statistics.

Having read the source code, I can understand the instructions more deeply. Hehey:)

However, this is not the end of the story. The new_keys_object() function simply creates a PyDictKeysObject object with the ultimate goal being PyDictObject. The function that creates the PyDictObject object is PyDict_New()

At this point a dict object is formally created, and we can start playing happily in Python. Note, however, that the dictionary created here is in combined mode. A split-mode dictionary additionally initializes ma_values on top of the combined mode; I'm too lazy to go into the details here.

Why Ordered

From the data structures analyzed earlier, we know that dictionary elements are stored in the dk_entries array. A data structure being "ordered" means its elements iterate in the same order they were inserted. The slot an element hashes to is computed by the hash function and is essentially random, which is why the old dictionary's elements were unordered. Python 3.6.0 introduced the dk_indices array to record the hash-table information separately, so the insertion order of elements is preserved in the dk_entries array. To satisfy curiosity, here is an analysis of the insertion function.

In the insert function, the first thing to focus on is the line ix = mp->ma_keys->dk_lookup(mp, key, hash, &value_addr, &hashpos). dk_lookup is a function pointer to one of the four lookup functions. The parameters and return value deserve explanation:

Parameters

PyDictObject *mp (known parameter)

Dictionary object, in which to look.

PyObject *key (known parameter)

The key of the entry. The key reference is compared first: if the references are identical, the element is found; if not, the hash is compared next.

Py_hash_t hash (known parameter)

The hash of the entry, used for the second comparison: if the hash values match, the element is found; if they differ, it is not.

PyObject ***value_addr (unknown parameter, return data with pointer)

If an element is found, value_addr returns a pointer to the corresponding me_value; if it is not found, *value_addr is NULL

Py_ssize_t *hashpos (unknown parameter, return data with pointer)

hashpos returns the position of the element in the hash table.

Return value

Py_ssize_t ix

Returns the index of the element in the dk_entries array. If there is no valid element, ix is one of DKIX_EMPTY, DKIX_DUMMY, or DKIX_ERROR, meaning respectively a fresh empty slot in the table, an empty slot left by a deleted value, and an error.

Knowing what each parameter does, you can continue enjoying the code. You then reach the line ep = &ep0[mp->ma_keys->dk_nentries], which, according to the code below, is where the new element is inserted; ep is a PyDictKeyEntry pointer, and mp->ma_keys->dk_nentries is the tail of the dk_entries array. That is, each new element inserted into the dictionary is appended to dk_entries, preserving insertion order. So where does the slot computed by the hash function go? The answer is in the call dk_set_index(mp->ma_keys, hashpos, mp->ma_keys->dk_nentries).

You can see that the slot calculated by the hash function lives in the dk_indices array, and what is stored at that slot is the index of the element in the dk_entries array. If this is hard to picture, look back at the description in [Python-Dev] More compact dictionaries with faster iteration.
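The whole scheme can be simulated in a few lines of Python (a sketch of the idea, not the C code; linear probing stands in for CPython's real perturb-based probing): the hash decides a slot in indices, but the entry itself is always appended to the dense entries list, so iteration order equals insertion order.

```python
class MiniDict:
    """Toy model of the 3.6 layout: sparse indices + dense entries."""
    def __init__(self, size=8):
        self.indices = [None] * size   # sparse hash table (dk_indices)
        self.entries = []              # dense, insertion-ordered (dk_entries)

    def _probe(self, key):
        mask = len(self.indices) - 1
        i = hash(key) & mask
        while self.indices[i] is not None:
            if self.entries[self.indices[i]][1] == key:
                break                  # found the same key
            i = (i + 1) & mask         # linear probing, for simplicity
        return i

    def __setitem__(self, key, value):
        i = self._probe(key)
        if self.indices[i] is None:    # new key: append at the dense tail
            self.indices[i] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:                          # existing key: overwrite in place
            n = self.indices[i]
            self.entries[n] = (hash(key), key, value)

    def keys(self):
        return [k for _, k, _ in self.entries]

d = MiniDict()
for k in ["timmy", "barry", "guido"]:
    d[k] = k.upper()
assert d.keys() == ["timmy", "barry", "guido"]  # insertion order preserved
```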

You can see the code *value_addr = value filling in dk_entries, but that alone is not informative enough: value_addr comes from the lookup function. So I found the general-purpose lookup function, lookdict, and examined the key code inside it that produces the insertion position.

It is clear that in 3.5.9 the position calculated by the hash function corresponds directly to the dk_entries array, and elements are placed directly into it, with no dk_indices array. Because hash values are not contiguous, elements inserted one after another end up discontiguous in the dk_entries array. If this is hard to picture, look back at the description in [Python-Dev] More compact dictionaries with faster iteration.

Why fast

The faster iteration is due to the densification of the dk_entries array, which means fewer loop iterations. Python 3.5.9 and 3.6.0 don't differ much in how the loops are written, so here we just compare the data-copying code from dictresize(). A detailed analysis of the dictresize() function is in the appendix.

Do you see which code saves time now? It is all the for (i = 0; i < oldkeys->dk_nentries; i++) {...} blocks. In Python 3.5.9 they correspond to for (i = 0; i < oldsize; i++) {...}, where oldsize equals oldkeys->dk_size. The way the code is written is no different, but per the USABLE_FRACTION setting, dk_nentries is at most 2/3 of dk_size, so the new dictionary does at least one third fewer iterations. The same reasoning explains the faster iteration in the dict_items() function.
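The saving is easy to picture: the old loop walks every slot of the sparse table and skips the holes, while the new loop walks only the dense entries. A sketch with the minimal 8-slot table holding its maximum of 5 items:

```python
dk_size = 8
entries = [(hash(k), k, v) for k, v in
           [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]]

# 3.5.9-style: iterate the whole sparse table, skipping empty slots.
sparse = entries + [None] * (dk_size - len(entries))
old_iterations = sum(1 for slot in sparse)            # 8 slots visited
live = [slot for slot in sparse if slot is not None]  # holes skipped

# 3.6.0-style: iterate only the dk_nentries dense entries.
new_iterations = sum(1 for e in entries)              # 5 slots visited

assert old_iterations == 8
assert new_iterations == 5
assert live == entries   # same items either way, fewer memory accesses
```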

In addition to space savings, the new memory layout makes iteration faster. Currently, keys(), values, and items() loop over the sparse table, skipping-over free slots in the hash table. Now, keys/values/items can loop directly over the dense table, using fewer memory accesses.

Another benefit is that resizing is faster and touches fewer pieces of memory. Currently, every hash/key/value entry is moved or copied during a resize. In the new layout, only the indices are updated. For the most part, the hash/key/value entries never move (except for an occasional swap to fill a hole left by a deletion).

With the reduced memory footprint, we can also expect better cache utilization.

Thoughts after reading the source

Python's dictionary implementation is an art of tradeoffs; there is a great deal to weigh:

The ratio of used space to allocated space

Selection of hash and probe functions

The minimum initial allocation

How much to grow the dictionary on each resize

Effect of Element Distribution on CPU Cache

Python's current parameters are extensively tested and cover comprehensive scenarios. But the art of tradeoffs also includes optimizing for specific applications; performance can improve if these parameters are tuned to the actual workload.

In addition, there are several performance optimization points:

The cache pool only caches small objects; creating or growing a large dictionary requires a fresh memory allocation each time. How small is small?

#define PyDict_MINSIZE 8

8 allows dicts with no more than 5 active entries.

Given the existence of the lookdict_unicode() function, prefer strings as keys

Refer to the comments in ./Objects/dictnotes.txt and ./Objects/dictobject.c.

Appendix

dict expansion source code analysis

Expansion happens during element insertion: when mp->ma_keys->dk_usable <= 0, the dictionary is resized, with the new capacity computed by the GROWTH_RATE macro. The dictresize() function handles both the combined and split cases, which need to be examined separately.

/* Python 3.6.0 */
/* ./Objects/dictobject.c */
/* GROWTH_RATE. Growth rate upon hitting maximum load.
* Currently set to used*2 + capacity/2.
* This means that dicts double in size when growing without deletions,
* but have more head room when the number of deletions is on a par with the
* number of insertions.
* Raising this to used*4 doubles memory consumption depending on the size of
* the dictionary, but results in half the number of resizes, less effort to
* resize.
* GROWTH_RATE was set to used*4 up to version 3.2.
* GROWTH_RATE was set to used*2 in version 3.3.0
*/
#define GROWTH_RATE(d) (((d)->ma_used*2)+((d)->ma_keys->dk_size>>1))
/*
Restructure the table by allocating a new table and reinserting all
items again. When entries have been deleted, the new table may
actually be smaller than the old one.
If a table is split (its keys and hashes are shared, its values are not),
then the values are temporarily copied into the table, it is resized as
a combined table, then the me_value slots in the old table are NULLed out.
After resizing a table is always combined,
but can be resplit by make_keys_shared().
*/
static int
dictresize(PyDictObject *mp, Py_ssize_t minsize)
{
Py_ssize_t i, newsize;
PyDictKeysObject *oldkeys;
PyObject **oldvalues;
PyDictKeyEntry *ep0;
/* Find the smallest table size > minused. */
/* 1. Calculate New Size */
for (newsize = PyDict_MINSIZE;
newsize < minsize && newsize > 0;
newsize <<= 1)
;
if (newsize <= 0) {
PyErr_NoMemory();
return -1;
}
/* 2. Request a new PyDictKeysObject object */
oldkeys = mp->ma_keys;
oldvalues = mp->ma_values;
/* Allocate a new table. */
mp->ma_keys = new_keys_object(newsize);
if (mp->ma_keys == NULL) {
mp->ma_keys = oldkeys;
return -1;
}
// New table must be large enough.
assert(mp->ma_keys->dk_usable >= mp->ma_used);
if (oldkeys->dk_lookup == lookdict)
mp->ma_keys->dk_lookup = lookdict;
/* 3. Element Removal */
mp->ma_values = NULL;
ep0 = DK_ENTRIES(oldkeys);
/* Main loop below assumes we can transfer refcount to new keys
* and that value is stored in me_value.
* Increment ref-counts and copy values here to compensate
* This (resizing a split table) should be relatively rare */
if (oldvalues != NULL) {
/* 3.1 splitted table Convert to combined table */
for (i = 0; i < oldkeys->dk_nentries; i++) {
if (oldvalues[i] != NULL) {
Py_INCREF(ep0[i].me_key); // To copy the key, the original key is used, so increase the reference count
ep0[i].me_value = oldvalues[i];
}
}
}
/* Main loop */
for (i = 0; i < oldkeys->dk_nentries; i++) {
PyDictKeyEntry *ep = &ep0[i];
if (ep->me_value != NULL) {
insertdict_clean(mp, ep->me_key, ep->me_hash, ep->me_value);
}
}
mp->ma_keys->dk_usable -= mp->ma_used;
/* 4. Clean up old values */
if (oldvalues != NULL) {
/* NULL out me_value slot in oldkeys, in case it was shared */
for (i = 0; i < oldkeys->dk_nentries; i++)
ep0[i].me_value = NULL;
DK_DECREF(oldkeys);
if (oldvalues != empty_values) {
free_values(oldvalues);
}
}
else {
assert(oldkeys->dk_lookup != lookdict_split);
assert(oldkeys->dk_refcnt == 1);
DK_DEBUG_DECREF PyObject_FREE(oldkeys);
}
return 0;
}

Before analyzing the function body, look at the comment preceding the function:

Restructure the table by allocating a new table and reinserting all items again. When entries have been deleted, the new table may actually be smaller than the old one. If a table is split (its keys and hashes are shared, its values are not), then the values are temporarily copied into the table, it is resized as a combined table, then the me_value slots in the old table are NULLed out. After resizing a table is always combined, but can be resplit by make_keys_shared().

This explanation tells us two important things:

The new dictionary may be smaller than the old one, because the old dictionary may contain deleted entries. (When an element is deleted, to keep the probe sequence intact its state becomes dummy; these dummy elements are dropped when the new table is built.)

Nevertheless, for convenience I will still call this "expansion".

A split-mode dictionary always resizes into combined mode, and can be re-split later by the make_keys_shared() function. The resize copies the previously separate values back into the entries.

I divide this function into four steps:

Calculate New Size

The program recalculates the dictionary size, but isn't the parameter Py_ssize_t minsize already the dictionary size? Why recalculate?

As the name implies, minsize is the size the caller expects, but the actual size must be an integer power of 2, hence the recalculation. Looking at the new_keys_object() function, you'll also find it starts with assert(IS_POWER_OF_2(size)); this is a hard requirement, already mentioned in the dk_size description in the data-structure introduction.
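The size loop at the top of dictresize() finds the smallest power of two >= minsize, starting at PyDict_MINSIZE. In Python:

```python
PY_DICT_MINSIZE = 8  # CPython's PyDict_MINSIZE

def new_size(minsize):
    # Mirrors: for (newsize = PyDict_MINSIZE; newsize < minsize; newsize <<= 1);
    newsize = PY_DICT_MINSIZE
    while newsize < minsize:
        newsize <<= 1
    return newsize

assert new_size(5) == 8      # never below the minimum
assert new_size(9) == 16
assert new_size(16) == 16    # already a power of two: kept as-is
assert new_size(100) == 128
```

(The C version also guards against overflow: if newsize wraps to <= 0, it raises MemoryError, as seen in the source above.)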

Request a new PyDictKeysObject object

In combined mode, the parts that need expanding are dk_indices and dk_entries inside the PyDictKeysObject. The program does not expand them in place, because dk_indices and dk_entries are not pointers: they occupy a contiguous region of memory behind the PyDictKeysObject structure. So it simply allocates a new PyDictKeysObject.

In split mode, an additional expansion of ma_values would normally be required; but since resizing converts the dictionary to combined mode, there is no need to actually grow ma_values. Instead, a new PyDictKeysObject is requested, ma_values is transferred into dk_entries, and ma_values is set to NULL.

Curiosity returns: why not make dk_indices and dk_entries pointers to separately allocated arrays? Couldn't the arrays then be grown with realloc()-style functions, with no need to reallocate the PyDictKeysObject or hand-copy elements?

I can't find the answer online or in the source, so I'll guess. If we switched to pointers and the two arrays were allocated separately outside the structure, there would be two problems:

Code fragmentation. Originally there is a single PyDictKeysObject; with two external arrays, besides the extra memory-management code, every function would also have to check whether the dk_indices and dk_entries pointers are NULL.

Frequent allocation and release hurt performance. The cache pool currently adds PyDictKeysObject objects to a cache on release rather than destroying them immediately. Since dk_indices and dk_entries are arrays inside the structure, they are cached along with it; arrays behind pointers would not be. Solving that would require a separate cache pool for the external arrays, which in turn fragments the code.

Neither approach is simply good or bad. This is a tradeoff problem: time, space, and maintainability cannot all be had at once.

You cannot have both the fish and the bear's paw. -- Mencius, "Fish Is What I Desire"

Newton's third law: you've got to leave something behind. -- Interstellar

Element Removal

Split table to combined table

This step transfers ma_values into the me_value slots of dk_entries, so that the split table can subsequently be handled in combined mode; afterwards the me_value slots in the old table are reset. Note the Py_INCREF(ep0[i].me_key) operation, which increments each key's reference count. Why? The answer lies in the insertdict_clean() function.

This function is faster than insertdict(); for its origin, consult issue1456209. Notice this sentence in the comment preceding the function:

Note that no refcounts are changed by this routine; if needed, the caller is responsible for incref'ing key and value.

This function only copies values and does not change any reference counts. From this we can see why the reference count must be incremented when a key is copied from the old PyDictKeysObject into the newly allocated one: the caller is responsible for it.

So the question arises again: why is the value's reference count not incremented? Don't forget we are handling a split table. The old PyDictKeysObject is shared by many PyDictObjects, so the keys are shared too; to avoid affecting the other PyDictObjects, the keys must be copied into the new PyDictKeysObject. But oldvalues = mp->ma_values is private to this PyDictObject, so the values can be moved into the new PyDictKeysObject without keeping the originals, and no reference count changes are needed.

To make it easier to grasp, an analogy: I copied Li Hua's homework. After copying, his homework goes back to him and he hands it in too, so the homework's "reference count" grows by one for my copy; that is the key case. Then I worried that copying so much would be noticed by the teacher, so I rewrote part of it and threw away the old draft; although I wrote the homework twice, only one copy is handed in in the end, and the reference count does not change. That is the value case.

Put simply: key is a copy, refcount +1; value is a move, refcount +0.

Clean up old values

In "combined" mode, the old PyDictKeysObject object is released directly;

In split mode, me_value in the old PyDictKeysObject's dk_entries must first be reset to NULL; for the reason, see the first sentence of step 3.1. Finally, the ma_values array is released.