# Using valgrind with Cython

## Identifying memory leaks in Cython with valgrind

How to use valgrind to track down memory leaks in Cython. This example walks through the process for a bug in spaCy reported in issue #3618 and fixed in PR #4486.
- Create a minimal script `minimal.py` that runs the code where you suspect a memory leak:

  ```python
  import spacy

  nlp = spacy.load('en')
  doc = nlp("This is a sentence.")
  ```
- Download the valgrind suppressions file from CPython and uncomment the lines related to `PyObject_Free` and `PyObject_Realloc` as instructed in the header: `valgrind-python.supp`
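  After uncommenting, the entries for these allocators look roughly like this (shown for illustration, following the format of CPython's `Misc/valgrind-python.supp`; the file contains several such blocks covering different Memcheck error kinds):

  ```
  {
     ADDRESS_IN_RANGE/Invalid read of size 4
     Memcheck:Addr4
     fun:PyObject_Free
  }

  {
     ADDRESS_IN_RANGE/Invalid read of size 4
     Memcheck:Addr4
     fun:PyObject_Realloc
  }
  ```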
- Run valgrind with `--leak-check=full` to get detailed logs about where the leaked memory was allocated:

  ```
  valgrind --tool=memcheck --leak-check=full \
      --suppressions=valgrind-python.supp --log-file=minimal.valgrind.log \
      python minimal.py
  ```

  (Side note: setting `PYTHONMALLOC=malloc` (for Python 3.6+) lets valgrind provide a more detailed analysis of Python's memory allocations, but I didn't need it to find this kind of Cython-specific memory leak.)
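  If you do want that extra detail, it's the same invocation with the environment variable prepended:

  ```
  PYTHONMALLOC=malloc valgrind --tool=memcheck --leak-check=full \
      --suppressions=valgrind-python.supp --log-file=minimal.valgrind.log \
      python minimal.py
  ```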
- Inspect the saved log file. The end of the file provides a summary:

  ```
  ==10207== LEAK SUMMARY:
  ==10207==    definitely lost: 3,936 bytes in 16 blocks
  ==10207==    indirectly lost: 0 bytes in 0 blocks
  ==10207==      possibly lost: 149,361 bytes in 94 blocks
  ==10207==    still reachable: 2,667,208 bytes in 1,535 blocks
  ==10207==         suppressed: 32 bytes in 1 blocks
  ```
  The `definitely lost` bytes indicate memory leaks. If a memory leak is small and happens once on initialization, it may not be a major problem. If you add a loop to the minimal Python script and notice that the amount of memory lost increases as you increase the number of iterations, then you clearly have a problematic memory leak.
- Modify `minimal.py` so that the minimal example is executed 10 times:

  ```python
  import spacy

  nlp = spacy.load('en')
  for i in range(10):
      doc = nlp("This is a sentence.")
  ```
  When `doc = nlp("This is a sentence.")` is executed 10 times, the summary looks like this:

  ```
  ==29544== LEAK SUMMARY:
  ==29544==    definitely lost: 31,000 bytes in 105 blocks
  ==29544==    indirectly lost: 0 bytes in 0 blocks
  ==29544==      possibly lost: 148,289 bytes in 92 blocks
  ==29544==    still reachable: 2,667,504 bytes in 1,536 blocks
  ==29544==         suppressed: 32 bytes in 1 blocks
  ```

  The `definitely lost` bytes grew from 3,936 to 31,000, roughly 3,000 bytes per additional iteration ((31,000 - 3,936) / 9 ≈ 3,007), so the leak scales with the number of calls and is worth tracking down.
- Search for `definitely lost` in the log file to find more information about where the allocations for the memory leaks occurred, e.g.:

  ```
  ==10207== 1,024 bytes in 2 blocks are definitely lost in loss record 667 of 878
  ==10207==    at 0x4837B65: calloc (vg_replace_malloc.c:752)
  ==10207==    by 0x20641C1A: __pyx_f_5spacy_6syntax_13_parser_model_resize_activations(__pyx_t_5spacy_6syntax_13_parser_model_ActivationsC*, __pyx_t_5spacy_6syntax_13_parser_model_SizesC) (_parser_model.cpp:6096)
  ==10207==    by 0x206450F7: __pyx_f_5spacy_6syntax_13_parser_model_predict_states(__pyx_t_5spacy_6syntax_13_parser_model_ActivationsC*, __pyx_t_5spacy_6syntax_6_state_StateC**, __pyx_t_5spacy_6syntax_13_parser_model_WeightsC const*, __pyx_t_5spacy_6syntax_13_parser_model_SizesC) (_parser_model.cpp:6254)
  ```
  The third line above indicates that the leaked memory was allocated on line 6096 of `_parser_model.cpp`:

  ```cpp
  /* "spacy/syntax/_parser_model.pyx":72
   *         A.token_ids = <int*>calloc(n.states * n.feats, sizeof(A.token_ids[0]))
   *         A.scores = <float*>calloc(n.states * n.classes, sizeof(A.scores[0]))
   *         A.unmaxed = <float*>calloc(n.states * n.hiddens * n.pieces, sizeof(A.unmaxed[0]))             # <<<<<<<<<<<<<<
   *         A.hiddens = <float*>calloc(n.states * n.hiddens, sizeof(A.hiddens[0]))
   *         A.is_valid = <int*>calloc(n.states * n.classes, sizeof(A.is_valid[0]))
   */
  __pyx_v_A->unmaxed = ((float *)calloc(((__pyx_v_n.states * __pyx_v_n.hiddens) * __pyx_v_n.pieces), (sizeof((__pyx_v_A->unmaxed[0])))));
  ```
  Line 72 of `_parser_model.pyx` is where the memory was allocated in Cython:

  ```cython
  if A._max_size == 0:
      A.token_ids = <int*>calloc(n.states * n.feats, sizeof(A.token_ids[0]))
      A.scores = <float*>calloc(n.states * n.classes, sizeof(A.scores[0]))
      A.unmaxed = <float*>calloc(n.states * n.hiddens * n.pieces, sizeof(A.unmaxed[0]))
      A.hiddens = <float*>calloc(n.states * n.hiddens, sizeof(A.hiddens[0]))
      A.is_valid = <int*>calloc(n.states * n.classes, sizeof(A.is_valid[0]))
      A._max_size = n.states
  ```
- Searching the code shows that there's no `free()` associated with these `calloc()` calls, identifying the cause of the memory leak.

  In this case, restructuring the code with utility functions that allocate and free the memory when these structs are used in `nn_parser.pyx` fixes the problem (see the sketch below): https://github.com/explosion/spaCy/commit/3dfc76457709818fd3675b727d34e056aa6d434c
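  A minimal sketch of that pattern, assuming helper names like `alloc_activations`/`free_activations` (the types and `resize_activations` come from `_parser_model.pyx` as shown above; see the linked commit for the actual implementation):

  ```cython
  from libc.stdlib cimport calloc, free
  from libc.string cimport memset

  # Allocation and cleanup live in one place, so every calloc() gets a
  # matching free(). Callers in nn_parser.pyx call free_activations()
  # when they are done with the struct. (Names assumed for illustration.)
  cdef ActivationsC alloc_activations(SizesC n) nogil:
      cdef ActivationsC A
      memset(&A, 0, sizeof(A))
      resize_activations(&A, n)  # performs the calloc() calls shown above
      return A

  cdef void free_activations(const ActivationsC* A) nogil:
      free(A.token_ids)
      free(A.scores)
      free(A.unmaxed)
      free(A.hiddens)
      free(A.is_valid)
  ```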