[focal/core20][python3.7+] staging conflicts when multiple python parts have the same python dependencies
Issue body
TL;DR: as of Python 3.7, .pyc files by default embed the timestamp and size of the source file, so the hash of a .pyc file changes whenever it is regenerated from a source file with a different modification time. This results in staging conflicts between Python parts.
https://docs.python.org/3/library/py_compile.html#py_compile.compile
https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.TIMESTAMP
Even if SOURCE_DATE_EPOCH is set (see the analysis below), there is still an issue: https://github.com/pypa/pip/issues/8414
Description/analysis:
When building a project in which multiple parts pull in identical Python dependencies, I consistently get an error like this:
Failed to stage: Parts 'openstack-projects' and 'cluster' have the following files, but with different contents:
bin/activate
bin/activate.csh
bin/activate.fish
bin/python3
pyvenv.cfg
bin/python3
lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD
lib/python3.8/site-packages/__pycache__/easy_install.cpython-38.pyc
lib/python3.8/site-packages/certifi/__pycache__/__init__.cpython-38.pyc
lib/python3.8/site-packages/certifi/__pycache__/__main__.cpython-38.pyc
# many other .pyc files ...
While snapcraft suggests that I use something like `organize`, `filesets` and `stage`, the issue is that the source files for those dependencies are identical - there is no reason for any manual work here.
Source hashes are the same:
snapcraft-microstack # sha256sum ./parts/cluster/install/lib/python3.8/site-packages/click/_textwrap.py
6a30b3933165cb9b639bd7e843937dfcc39e69824c063025b6e15aebd9f88976 ./parts/cluster/install/lib/python3.8/site-packages/click/_textwrap.py
snapcraft-microstack # sha256sum ./parts/openstack-projects/install/lib/python3.8/site-packages/click/_textwrap.py
6a30b3933165cb9b639bd7e843937dfcc39e69824c063025b6e15aebd9f88976 ./parts/openstack-projects/install/lib/python3.8/site-packages/click/_textwrap.py
.pyc files are different:
snapcraft-microstack # sha256sum ./parts/openstack-projects/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
398b47a5abfc87e9da73153e42d48dcd5d917bd637a0e0af1eb6999f19fb1085 ./parts/openstack-projects/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
snapcraft-microstack # sha256sum ./parts/cluster/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
d4642cfecd727d228944a1d31ff728e7ef6529a7a88898f6568ea6e96d1f8f82 ./parts/cluster/install/lib/python3.8/site-packages/click/__pycache__/_textwrap.cpython-38.pyc
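To see which bytes actually differ, the .pyc header can be decoded directly. The following is a minimal sketch, assuming the CPython 3.7+ header layout (4-byte magic number, 4-byte flags word, then either source mtime and size, or an 8-byte source hash); the helper name `read_pyc_header` is made up for illustration and is not part of snapcraft or the standard library.

```python
import struct


def read_pyc_header(pyc_path):
    """Decode the 16-byte header of a CPython 3.7+ .pyc file."""
    with open(pyc_path, "rb") as fh:
        header = fh.read(16)
    if len(header) != 16:
        raise ValueError(f"{pyc_path}: too short to be a 3.7+ .pyc file")

    magic = header[:4]
    (flags,) = struct.unpack("<I", header[4:8])

    if flags & 0b01:
        # Hash-based .pyc: bytes 8..16 hold a hash of the source content.
        return {
            "magic": magic.hex(),
            "mode": "hash",
            # Bit 1 distinguishes CHECKED_HASH from UNCHECKED_HASH.
            "check_source": bool(flags & 0b10),
            "source_hash": header[8:16].hex(),
        }

    # Timestamp-based .pyc: bytes 8..12 are the source mtime, 12..16 its size.
    mtime, size = struct.unpack("<II", header[8:16])
    return {"magic": magic.hex(), "mode": "timestamp", "mtime": mtime, "size": size}
```

Running it against the two `_textwrap.cpython-38.pyc` files listed above and comparing the returned dictionaries would show whether the `mtime` field is where the two files diverge. This only inspects the header, so any difference in the marshalled code that follows it would not show up here.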
RECORD files include hashes as well, hence they are also different:
snapcraft-microstack # diff ./parts/openstack-projects/install/lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD ./parts/cluster/install/lib/python3.8/site-packages/Flask-1.1.2.dist-info/RECORD
1c1
< ../../../bin/flask,sha256=VXQqccMeG03Rn8_yN8Kq3Up13rzyaoHsEckFnCxHor4,242
---
> ../../../bin/flask,sha256=NAzPpe84iZFX3PYsCZEirt3fAFObAjBuCpM25792kSU,231
Apparently, as of Python 3.7, .pyc files include the timestamp and size of the source file by default (PycInvalidationMode.TIMESTAMP). This behavior can be overridden by setting the SOURCE_DATE_EPOCH environment variable, which switches py_compile to PycInvalidationMode.CHECKED_HASH:
https://docs.python.org/3/library/py_compile.html
py_compile.compile(file, cfile=None, dfile=None, doraise=False, optimize=-1, invalidation_mode=PycInvalidationMode.TIMESTAMP, quiet=0)
invalidation_mode should be a member of the PycInvalidationMode enum and controls how the generated bytecode cache is invalidated at runtime. The default is PycInvalidationMode.CHECKED_HASH ***if the SOURCE_DATE_EPOCH environment variable is set***, otherwise ***the default is PycInvalidationMode.TIMESTAMP***.
https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.TIMESTAMP
TIMESTAMP
The .pyc file includes the timestamp and size of the source file, which Python will compare against the metadata of the source file at runtime to determine if the .pyc file needs to be regenerated.
https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.CHECKED_HASH
CHECKED_HASH
The .pyc file includes a hash of the source file content, which Python will compare against the source at runtime to determine if the .pyc file needs to be regenerated.
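The difference between the two modes can be reproduced outside of snapcraft. The sketch below compiles the same source twice with different modification times and compares the sha256 of the resulting .pyc files, once per mode. The helper names `pyc_sha256` and `compare_modes` and the file name `demo.py` are illustrative only. Note one assumption: the compiled code object also records a source file name, which this sketch pins via `dfile`; whether embedded paths also play a role in the snapcraft build is not established here.

```python
import hashlib
import os
import py_compile
import tempfile
from py_compile import PycInvalidationMode


def pyc_sha256(source, mtime, mode):
    """Set the source mtime, compile it, and return the sha256 of the .pyc."""
    os.utime(source, (mtime, mtime))
    cfile = source + "c"
    # dfile pins the file name recorded inside the code object.
    py_compile.compile(source, cfile=cfile, dfile="demo.py", doraise=True,
                       invalidation_mode=mode)
    with open(cfile, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def compare_modes():
    """Return, per mode, whether two mtimes yield byte-identical .pyc files."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "demo.py")
        with open(source, "w") as fh:
            fh.write("VALUE = 1\n")
        results = {}
        for mode in (PycInvalidationMode.TIMESTAMP,
                     PycInvalidationMode.CHECKED_HASH):
            first = pyc_sha256(source, 1591640328, mode)
            second = pyc_sha256(source, 1591640328 + 60, mode)
            results[mode.name] = first == second
        return results


if __name__ == "__main__":
    print(compare_modes())
```

With identical source content, the TIMESTAMP entry should come out as not identical (the mtime is stored in the header) and the CHECKED_HASH entry as identical, which matches the symptom of equal source hashes but differing .pyc hashes.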
Adding something like this seems to be needed:
build-environment:
- SOURCE_DATE_EPOCH: '1591640328'
However, see https://bugs.launchpad.net/snapcraft/+bug/1882535/comments/2
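The documented effect of SOURCE_DATE_EPOCH on the default invalidation mode can be checked with a small script. This is a sketch under stated assumptions: it sets the variable in-process and relies on py_compile consulting the environment when `compile()` is called, and the function name `default_mode_is_hash` is hypothetical. It only covers `py_compile`'s default; it says nothing about whether pip honours the variable during installation, which is what the pip issue linked above is about.

```python
import os
import py_compile
import struct
import tempfile


def default_mode_is_hash(source_date_epoch):
    """Compile a throwaway module with the default invalidation mode and
    report whether the resulting .pyc is hash-based.

    source_date_epoch: string to export as SOURCE_DATE_EPOCH, or None to unset.
    """
    saved = os.environ.get("SOURCE_DATE_EPOCH")
    try:
        if source_date_epoch is None:
            os.environ.pop("SOURCE_DATE_EPOCH", None)
        else:
            os.environ["SOURCE_DATE_EPOCH"] = source_date_epoch
        with tempfile.TemporaryDirectory() as tmp:
            source = os.path.join(tmp, "demo.py")
            with open(source, "w") as fh:
                fh.write("VALUE = 1\n")
            cfile = os.path.join(tmp, "demo.pyc")
            # No invalidation_mode argument: let py_compile pick its default.
            py_compile.compile(source, cfile=cfile, doraise=True)
            with open(cfile, "rb") as fh:
                header = fh.read(16)
    finally:
        # Restore the caller's environment.
        if saved is None:
            os.environ.pop("SOURCE_DATE_EPOCH", None)
        else:
            os.environ["SOURCE_DATE_EPOCH"] = saved
    # Bit 0 of the flags word marks a hash-based .pyc.
    (flags,) = struct.unpack("<I", header[4:8])
    return bool(flags & 0b01)
```

Calling `default_mode_is_hash("1591640328")` and `default_mode_is_hash(None)` side by side shows whether the interpreter used for the build switches modes as the documentation describes.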