[File] [PATCH v2] Improve python magic checks based on PEP 552
Christos Zoulas
christos at zoulas.com
Sun Jul 24 23:59:54 UTC 2022
Committed, thanks!
christos
> On Jul 24, 2022, at 2:14 PM, Michał Górny <mgorny at gentoo.org> wrote:
>
> Replace a large part of the hardcoded Python magic numbers with a
> simpler check based on PEP 552, implemented in Python 3.7 (magic 3392+).
>
> According to PEP 552, the .pyc file starts with the following header
> (in pseudocode):
>
>   uleshort magic_number
>   string "\x0d\x0a"
>   ulelong flags
>   union {
>     struct {
>       ulelong timestamp
>       ulelong size
>     }
>     ulequad hash
>   }
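The layout above can be sketched in Python as follows. This is only an illustration under the assumptions stated in the quoted text (16-byte header, all fields little-endian); `parse_pyc_header` is a hypothetical helper name, not something provided by file(1) or CPython. Both views of the union are returned, since which one is meaningful depends on the flags.

```python
import struct


def parse_pyc_header(data: bytes) -> dict:
    """Unpack the raw fields of a 16-byte PEP 552 style .pyc header.

    Hypothetical helper for illustration; it does no validation
    beyond checking the length.
    """
    if len(data) < 16:
        raise ValueError("expected at least 16 header bytes")
    # uleshort magic_number, 2-byte marker string, ulelong flags
    magic, marker, flags = struct.unpack_from("<H2sI", data, 0)
    # The union at offset 8: either (timestamp, size) or one 64-bit hash.
    timestamp, size = struct.unpack_from("<II", data, 8)
    (hash_value,) = struct.unpack_from("<Q", data, 8)
    return {
        "magic": magic,
        "marker": marker,
        "flags": flags,
        "timestamp": timestamp,
        "size": size,
        "hash": hash_value,
    }
```

For example, `parse_pyc_header(open(path, "rb").read(16))` would give the fields of an existing .pyc file, where `path` is whatever file you want to inspect.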
>
> The magic number is monotonically increasing. Starting with Python
> 3.11, the range for each version is supposed to start with 2900+50n
> where n is the minor number. However, I am not sure how long this
> assumption is going to hold, given that Python 3.11 alone almost
> exhausted its 50-number range. Also because of this, it does not seem
> a good idea to keep hardcoding all of the known versions.
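The arithmetic behind the 2900+50n rule can be worked through in a few lines of Python. This is a sketch of the numbering scheme as described in the quoted paragraph, with a hypothetical function name; it says nothing about whether CPython will keep following the scheme.

```python
def first_magic_for_minor(minor: int) -> int:
    """First magic number reserved for CPython 3.<minor> under the
    2900+50n rule, which is only described for 3.11 and later."""
    if minor < 11:
        raise ValueError("the 2900+50n rule is only stated for 3.11+")
    return 2900 + 50 * minor


# Largest 16-bit magic whose high byte (file offset 1) is still 0x0d.
HIGH_BYTE_0D_MAX = 0x0DFF  # 3583

for minor in (11, 12, 13, 14):
    start = first_magic_for_minor(minor)
    print(f"3.{minor}: starts at {start}, high byte 0x{start >> 8:02x}")
```

Under this rule 3.13 starts at 3550, still below 3583, while 3.14 starts at 3600, where the high byte becomes 0x0e.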
>
> Instead, try to detect a "generic PEP 552 .pyc file" by looking for:
>
> 1. the fixed "\x0d\x0a" string at offset 2
>
> 2. the flag field being clear except for the two bits currently used
> (Python rejects .pyc files with additional bits set)
>
> 3. the magic number falling within the range used by CPython versions
> (relying on 0x0d being the high byte of the magic number, which is
> sufficient until CPython 3.14), or matching one of the fixed values
> for known PyPy3 versions
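The three checks listed above could be expressed in Python roughly like this. It is a sketch that mirrors the magic rules in the patch, not the libmagic implementation; the function name and the `KNOWN_PYPY3_MAGICS` constant are made up for illustration, with the PyPy3 values taken from the diff.

```python
import struct

# PyPy3.7, PyPy3.8 and PyPy3.9 magic numbers as listed in the patch.
KNOWN_PYPY3_MAGICS = {240, 256, 336}


def looks_like_pep552_pyc(data: bytes) -> bool:
    """Apply the three generic checks to the first 8 header bytes."""
    if len(data) < 8:
        return False
    magic, marker, flags = struct.unpack_from("<H2sI", data, 0)
    # 1. fixed "\x0d\x0a" string at offset 2
    if marker != b"\x0d\x0a":
        return False
    # 2. only the two lowest flag bits may be set
    if flags >= 0x4:
        return False
    # 3. CPython range (high byte of the magic is 0x0d) or a known PyPy3 value
    return (magic >> 8) == 0x0D or magic in KNOWN_PYPY3_MAGICS
```

The high-byte test accepts every magic from 3328 to 3583, so it is deliberately looser than the list of released versions.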
>
> Report the specific CPython version by checking against the known
> version ranges. Unfortunately, I did not find a solution that does not
> involve this somewhat ugly "range tree", or hardcoding the whole range.
> Also state explicitly that the magic values in question belong to
> CPython.
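Outside the magic file format, the same version lookup can be written without a nested tree. The sketch below uses `bisect` over the upper bounds that appear in the patch (3400, 3420, 3430, 3450, 3500); the names are hypothetical, and the function assumes the magic number has already passed the generic PEP 552 checks.

```python
from bisect import bisect_right

# Exclusive upper bounds taken from the "<N" comparisons in the patch.
_UPPER_BOUNDS = [3400, 3420, 3430, 3450, 3500]
_LABELS = [
    "CPython 3.7",
    "CPython 3.8",
    "CPython 3.9",
    "CPython 3.10",
    "CPython 3.11",
    "CPython 3.12 or newer",
]


def cpython_label(magic: int) -> str:
    """Map a PEP 552 era CPython magic number to a version label."""
    # bisect_right counts how many bounds are <= magic, which is the
    # index of the first range whose upper bound is still above it.
    return _LABELS[bisect_right(_UPPER_BOUNDS, magic)]
```

A table like this is easy to extend, but it has the same limitation noted in the quoted text: every new release still needs a new bound.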
>
> Additionally, report the validity checking method (timestamp-
> or hash-based), plus the value of check-source flag and the validity
> checking data (timestamp + size or hash value).
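How the two flag bits select between the timestamp and hash views can be sketched as below. The wording of the returned string loosely follows the pyc-pep552 rules in the patch, but the date format is my own choice and will not match what file(1) prints; `describe_validation` is a hypothetical name.

```python
import struct
from datetime import datetime, timezone


def describe_validation(data: bytes) -> str:
    """Describe how a PEP 552 header says the .pyc should be validated."""
    (flags,) = struct.unpack_from("<I", data, 4)
    if flags & 0x1:
        # Bit 0 set: hash-based .pyc; bit 1 is the check-source flag.
        check_source = "set" if flags & 0x2 else "unset"
        (hash_value,) = struct.unpack_from("<Q", data, 8)
        return (f"hash-based, check-source flag {check_source}, "
                f"hash: 0x{hash_value:x}")
    # Bit 0 clear: source mtime and size follow the flags field.
    timestamp, size = struct.unpack_from("<II", data, 8)
    when = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return (f"timestamp-based, .py timestamp: {when:%Y-%m-%d %H:%M:%S} UTC, "
            f".py size: {size} bytes")
```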
>
> Finally, add the magic numbers used by the current versions of PyPy2.7,
> PyPy3.7, PyPy3.8 and PyPy3.9. For the latter two versions, this relies
> on a fix made in HG after the 7.3.9 release, as versions up to and
> including 7.3.9 used CPython's magic due to a bug.
> ---
> magic/Magdir/python | 116 ++++++++++++++++----------------------------
> 1 file changed, 42 insertions(+), 74 deletions(-)
>
> diff --git a/magic/Magdir/python b/magic/Magdir/python
> index ed588859..25be8c93 100644
> --- a/magic/Magdir/python
> +++ b/magic/Magdir/python
> @@ -86,6 +86,8 @@
> !:mime application/x-bytecode.python
> 0 belong 0x04f30d0a python 2.7 byte-compiled
> !:mime application/x-bytecode.python
> +0 belong 0x0af30d0a PyPy2.7 byte-compiled
> +!:mime application/x-bytecode.python
> 0 belong 0xb80b0d0a python 3.0 byte-compiled
> !:mime application/x-bytecode.python
> 0 belong 0xc20b0d0a python 3.0 byte-compiled
> @@ -186,80 +188,46 @@
> !:mime application/x-bytecode.python
> 0 belong 0x3f0d0d0a python 3.7 byte-compiled
> !:mime application/x-bytecode.python
> -0 belong 0x400d0d0a python 3.7 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x410d0d0a python 3.7 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x420d0d0a python 3.7 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x480d0d0a python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x490d0d0a python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x520d0d0a python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x530d0d0a python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x540d0d0a python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x550d0d0a python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x5c0d0d0a python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x5d0d0d0a python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x5e0d0d0a python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x5f0d0d0a python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x600d0d0a python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x610d0d0a python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x660d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x670d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x680d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x690d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x6a0d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x6b0d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x6c0d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x6d0d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x6e0d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x6f0d0d0a python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x7a0d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x7b0d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x7c0d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x7d0d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x7e0d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x7f0d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x800d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x810d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x820d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x830d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x840d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0 belong 0x850d0d0a python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> +
> +# magic 3392+ implements PEP 552: Deterministic pycs
> +0 name pyc-pep552
> +# the flag field determines how .pyc validity is checked
> +>4 ulelong&1 0 timestamp-based,
> +>>8 uledate x .py timestamp: %s UTC,
> +>>12 ulelong x .py size: %d bytes
> +>4 ulelong&1 !0 hash-based, check-source flag
> +>>4 ulelong&2 0 unset,
> +>>4 ulelong&2 !0 set,
> +>>8 ulequad x hash: 0x%llx
> +
> +# uleshort magic followed by \x0d\x0a
> +2 string \x0d\x0a
> +# extra check: only two bits of flag field are currently used
> +>4 ulelong <0x4
> +# \x0d as part of magic should suffice till Python 3.14 (magic 3600)
> +>>1 ubyte 0x0d Byte-compiled Python module for
> +!:mime application/x-bytecode.python
> +# now look at the magic number to determine the version
> +>>>0 uleshort <3400 CPython 3.7,
> +>>>0 default x
> +>>>>0 uleshort <3420 CPython 3.8,
> +>>>>0 default x
> +>>>>>0 uleshort <3430 CPython 3.9,
> +>>>>>0 default x
> +>>>>>>0 uleshort <3450 CPython 3.10,
> +>>>>>>0 default x
> +>>>>>>>0 uleshort <3500 CPython 3.11,
> +>>>>>>>0 default x CPython 3.12 or newer,
> +>>>0 use pyc-pep552
> +>>0 uleshort 240 Byte-compiled Python module for PyPy3.7,
> +!:mime application/x-bytecode.python
> +>>>0 use pyc-pep552
> +>>0 uleshort 256 Byte-compiled Python module for PyPy3.8,
> +!:mime application/x-bytecode.python
> +>>>0 use pyc-pep552
> +>>0 uleshort 336 Byte-compiled Python module for PyPy3.9,
> +!:mime application/x-bytecode.python
> +>>>0 use pyc-pep552
>
> 0 search/1/w #!\040/usr/bin/python Python script text executable
> !:strength + 15
> --
> 2.35.1