[File] [PATCH] Improve python magic checks based on PEP 552
Michał Górny
mgorny at gentoo.org
Tue Jul 19 15:14:06 UTC 2022
Replace a large part of the hardcoded Python magic numbers with a simpler
check based on PEP 552, implemented in Python 3.7 (magic 3392+).
According to PEP 552, the .pyc file starts with the following header
(in pseudocode):
  uleshort magic_number
  string "\x0d\x0a"
  ulelong flags
  union {
    struct {
      ulelong timestamp
      ulelong size
    }
    ulequad hash
  }
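
For illustration only (not part of the patch), the layout above could be
decoded in Python roughly as follows. The function name and the returned
dict keys are made up for this sketch; the field order, the widths and the
meaning of flag bit 0 follow the pseudocode and PEP 552.

```python
import struct


def parse_pyc_header(data: bytes) -> dict:
    """Decode the 16-byte PEP 552 header (hypothetical helper)."""
    if len(data) < 16:
        raise ValueError("a PEP 552 header is 16 bytes long")
    # uleshort magic_number, string "\x0d\x0a", ulelong flags
    magic, crlf, flags = struct.unpack_from("<H2sI", data, 0)
    if crlf != b"\x0d\x0a":
        raise ValueError("missing \\r\\n after the magic number")
    header = {"magic": magic, "flags": flags}
    if flags & 0x1:
        # hash-based .pyc: the union holds one 64-bit value (ulequad)
        header["hash"] = int.from_bytes(data[8:16], "little")
    else:
        # timestamp-based .pyc: two 32-bit little-endian fields
        header["timestamp"], header["size"] = struct.unpack_from("<II", data, 8)
    return header
```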
The magic number is monotonically increasing. Starting with Python
3.11, the range for each version is supposed to start at 2900+50n,
where n is the minor version number. However, I am not sure how long
this assumption is going to hold, given that Python 3.11 alone almost
exhausted its 50-number range. Also because of this, it does not seem
a good idea to keep hardcoding all of the known versions.
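
As a quick worked example of the 2900+50n rule (the helper name is
invented for this sketch, and the rule itself is only the convention
described above, not a guarantee):

```python
def expected_range_start(minor: int) -> int:
    """First magic number reserved for Python 3.<minor> under the 2900+50n rule."""
    return 2900 + 50 * minor


# Python 3.11 starts at 3450, so its 50-number range ends just before 3500.
print(expected_range_start(11), expected_range_start(12))
# The high byte of the magic stays 0x0d through the 3.13 range
# and becomes 0x0e with the 3.14 range (3600 == 0x0e10).
print(hex(expected_range_start(13)), hex(expected_range_start(14)))
```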
Instead, try to detect a "generic PEP 552 .pyc file" by looking for:
1. the "\x0d\x0d\x0a" string at offset 1 -- this covers the fixed part
of the header plus the high byte of the magic number (0x0d), which
remains valid up to magic 3583, i.e. until the range expected for
Python 3.14 (magic 3600) begins
2. the flag field being clear except for the two bits currently used
(Python rejects .pyc files with additional bits set)
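
The two checks above can be sketched in Python as follows. This only
mirrors the logic of the magic rules for readers who prefer code; the
function and constant names are hypothetical.

```python
import struct

# Bits 0 and 1 are the only flag bits defined by PEP 552.
KNOWN_FLAG_BITS = 0x3


def looks_like_pep552_pyc(data: bytes) -> bool:
    """Heuristic check for a 'generic PEP 552 .pyc file' (hypothetical helper)."""
    if len(data) < 8:
        return False
    # Check 1: high byte of the magic (0x0d) followed by "\r\n", at offset 1.
    if data[1:4] != b"\x0d\x0d\x0a":
        return False
    # Check 2: no flag bits other than the two known ones may be set.
    (flags,) = struct.unpack_from("<I", data, 4)
    return (flags & ~KNOWN_FLAG_BITS) == 0
```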
Report the Python version by checking against the known version ranges.
Unfortunately, I did not find a solution that does not involve this
somewhat ugly "range tree", or hardcoding the whole range. Also state
explicitly that the magic values in question belong to CPython.
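
For comparison, the same range lookup is a few lines in Python. The
upper bounds below are the ones used in the patch; the names are
illustrative, and the sketch assumes the magic is already known to be
3392 or higher.

```python
from bisect import bisect_right

# Exclusive upper bounds of the known ranges, as used in the patch.
_UPPER_BOUNDS = [3400, 3420, 3430, 3450, 3500]
_LABELS = [
    "CPython 3.7",
    "CPython 3.8",
    "CPython 3.9",
    "CPython 3.10",
    "CPython 3.11",
    "CPython 3.12 or newer",
]


def describe_magic(magic: int) -> str:
    """Map a PEP 552 magic number (3392+) to a version label."""
    # bisect_right counts the bounds that are <= magic, which is
    # exactly the index of the range the magic falls into.
    return _LABELS[bisect_right(_UPPER_BOUNDS, magic)]
```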
Additionally, report the validity checking method (timestamp-
or hash-based), plus the value of the check-source flag and the validity
checking data (timestamp + size, or hash value).
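
A rough Python equivalent of this reporting step is shown below. The
function name is hypothetical and the output wording only approximates
what the magic rules print.

```python
import struct
from datetime import datetime, timezone


def describe_validity(flags: int, field: bytes) -> str:
    """Describe how a .pyc is validated; `field` is header bytes 8..16."""
    if flags & 0x1:
        # Hash-based: bit 1 is the check-source flag, the field is the hash.
        check_source = "set" if flags & 0x2 else "unset"
        value = int.from_bytes(field, "little")
        return f"hash-based, check-source flag {check_source}, hash: {value:#x}"
    # Timestamp-based: source mtime and source size, both 32-bit.
    mtime, size = struct.unpack("<II", field)
    stamp = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return (
        f"timestamp-based, .py timestamp: {stamp:%Y-%m-%d %H:%M:%S} UTC, "
        f".py size: {size} bytes"
    )
```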
Finally, add the magic number used by the current version of PyPy2.7.
I am planning to also include support for PyPy3.9 in the future.
However, the current versions wrongly use CPython magic numbers
due to an implementation bug:
https://foss.heptapod.net/pypy/pypy/-/issues/3783
---
magic/Magdir/python | 103 +++++++++++++-------------------------------
1 file changed, 29 insertions(+), 74 deletions(-)
diff --git a/magic/Magdir/python b/magic/Magdir/python
index ed588859..5b1e5f1b 100644
--- a/magic/Magdir/python
+++ b/magic/Magdir/python
@@ -86,6 +86,8 @@
!:mime application/x-bytecode.python
0 belong 0x04f30d0a python 2.7 byte-compiled
!:mime application/x-bytecode.python
+0 belong 0x0af30d0a PyPy2.7 byte-compiled
+!:mime application/x-bytecode.python
0 belong 0xb80b0d0a python 3.0 byte-compiled
!:mime application/x-bytecode.python
0 belong 0xc20b0d0a python 3.0 byte-compiled
@@ -186,80 +188,33 @@
!:mime application/x-bytecode.python
0 belong 0x3f0d0d0a python 3.7 byte-compiled
!:mime application/x-bytecode.python
-0 belong 0x400d0d0a python 3.7 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x410d0d0a python 3.7 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x420d0d0a python 3.7 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x480d0d0a python 3.8 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x490d0d0a python 3.8 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x520d0d0a python 3.8 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x530d0d0a python 3.8 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x540d0d0a python 3.8 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x550d0d0a python 3.8 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x5c0d0d0a python 3.9 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x5d0d0d0a python 3.9 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x5e0d0d0a python 3.9 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x5f0d0d0a python 3.9 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x600d0d0a python 3.9 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x610d0d0a python 3.9 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x660d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x670d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x680d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x690d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x6a0d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x6b0d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x6c0d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x6d0d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x6e0d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x6f0d0d0a python 3.10 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x7a0d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x7b0d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x7c0d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x7d0d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x7e0d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x7f0d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x800d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x810d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x820d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x830d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x840d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
-0 belong 0x850d0d0a python 3.11 byte-compiled
-!:mime application/x-bytecode.python
+
+# magic 3392+ implements PEP 552: Deterministic pycs
+# uleshort magic followed by \x0d\x0a
+# \x0d as part of magic should suffice till Python 3.14 (magic 3600)
+1 string \x0d\x0d\x0a
+# extra check: only two bits of flag field are currently used
+>4 ulelong <0x4 Byte-compiled Python module for
+!:mime application/x-bytecode.python
+# now look at the magic number to determine the version
+>>0 uleshort <3400 CPython 3.7,
+>>0 default x
+>>>0 uleshort <3420 CPython 3.8,
+>>>0 default x
+>>>>0 uleshort <3430 CPython 3.9,
+>>>>0 default x
+>>>>>0 uleshort <3450 CPython 3.10,
+>>>>>0 default x
+>>>>>>0 uleshort <3500 CPython 3.11,
+>>>>>>0 default x CPython 3.12 or newer,
+# the flag field determines how .pyc validity is checked
+>>4 ulelong&1 0 timestamp-based,
+>>>8 uledate x .py timestamp: %s UTC,
+>>>12 ulelong x .py size: %d bytes
+>>4 ulelong&1 !0 hash-based, check-source flag
+>>>4 ulelong&2 0 unset,
+>>>4 ulelong&2 !0 set,
+>>>8 ulequad x hash: 0x%llx
0 search/1/w #!\040/usr/bin/python Python script text executable
!:strength + 15
--
2.35.1