[File] [PATCH v2] Improve python magic checks based on PEP 552

Christos Zoulas christos at zoulas.com
Sun Jul 24 23:59:54 UTC 2022


Committed, thanks!

christos

> On Jul 24, 2022, at 2:14 PM, Michał Górny <mgorny at gentoo.org> wrote:
> 
> Replace a large part of the hardcoded Python magic numbers with a simpler
> check based on PEP 552, implemented in Python 3.7 (magic 3392+).
> 
> According to PEP 552, the .pyc file starts with the following header
> (in pseudocode):
> 
>    uleshort    magic_number
>    string      "\x0d\x0a"
>    ulelong     flags
>    union {
>      struct {
>        ulelong timestamp
>        ulelong size
>      }
>      ulequad   hash
>    }
> 
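The header layout quoted above can be decoded as in the following sketch. It is only an illustration, assuming the 16-byte little-endian layout given in the pseudocode; the function name parse_pyc_header and the dictionary keys are hypothetical and are not part of the patch or of CPython.

```python
import struct


def parse_pyc_header(data: bytes) -> dict:
    """Decode the 16-byte PEP 552 header at the start of a .pyc file."""
    if len(data) < 16:
        raise ValueError("a PEP 552 header is 16 bytes long")

    # uleshort magic_number, string "\x0d\x0a", ulelong flags
    magic, marker, flags = struct.unpack_from("<H2sI", data, 0)
    if marker != b"\r\n":
        raise ValueError("missing \\x0d\\x0a after the magic number")

    header = {"magic": magic, "flags": flags}
    if flags & 0x1:
        # Hash-based .pyc: bit 1 is the check-source flag, then a 64-bit hash.
        header["check_source"] = bool(flags & 0x2)
        (header["hash"],) = struct.unpack_from("<Q", data, 8)
    else:
        # Timestamp-based .pyc: source mtime followed by source size.
        header["timestamp"], header["size"] = struct.unpack_from("<II", data, 8)
    return header
```

Feeding it the first 16 bytes of a file, e.g. parse_pyc_header(open(path, "rb").read(16)), should give either the timestamp/size pair or the hash, depending on the low flag bit.
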
> The magic number is monotonically increasing.  Starting with Python
> 3.11, the range for each version is supposed to start with 2900+50n
> where n is the minor number.  However, I am not sure how long this
> assumption is going to hold, given that Python 3.11 alone almost
> exhausted its 50-number range.  Also because of this, it does not seem
> a good idea to keep hardcoding all of the known versions.
> 
> Instead, try to detect a "generic PEP 552 .pyc file" by looking for:
> 
> 1. the fixed "\x0d\x0a" string at offset 2
> 
> 2. the flag field being clear except for the two bits currently used
>   (Python rejects .pyc files with additional bits set)
> 
> 3. the magic number falling within the range used by CPython versions
>   (relying on 0x0d being part of the magic number, i.e. sufficient till
>   CPython 3.14) or matching one of the fixed values for known PyPy3 versions
> 
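The three checks listed above can be written out as a short predicate. This is a rough sketch rather than a port of the magic rules: the name looks_like_pep552_pyc and the PYPY3_MAGICS table are hypothetical, and the PyPy3 values are simply the ones the patch below matches (240, 256, 336).

```python
import struct

# Magic numbers matched by the patch for PyPy3.7, PyPy3.8 and PyPy3.9.
PYPY3_MAGICS = {240: "PyPy3.7", 256: "PyPy3.8", 336: "PyPy3.9"}


def looks_like_pep552_pyc(data: bytes) -> bool:
    """Apply the three heuristics to the first 8 bytes of a file."""
    if len(data) < 8:
        return False
    magic, marker, flags = struct.unpack_from("<H2sI", data, 0)

    # 1. the fixed "\x0d\x0a" string at offset 2
    if marker != b"\r\n":
        return False
    # 2. only the two lowest flag bits may be set
    if flags >= 0x4:
        return False
    # 3. high byte of the little-endian magic is 0x0d (CPython),
    #    or the magic is one of the known PyPy3 values
    return data[1] == 0x0D or magic in PYPY3_MAGICS
```

Like the magic rule itself, the 0x0d test accepts any magic from 0x0d00 to 0x0dff, so it is a heuristic and not a strict validity check.
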
> Report the specific CPython version by checking against the known
> version ranges.  Unfortunately, I did not find a solution that does not
> involve this somewhat ugly "range tree", or hardcoding the whole range.
> Make the output state explicitly that the magic values in question belong
> to CPython.
> 
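Outside of magic(5), the "range tree" amounts to a lookup over sorted upper bounds. The sketch below uses the same bounds as the patch (3400, 3420, 3430, 3450, 3500); the helper name cpython_version is hypothetical, and the caller is assumed to have already established that the file is a PEP 552 .pyc (magic 3392 or later).

```python
from bisect import bisect_right

# Exclusive upper bounds of the magic ranges, as used in the patch.
# 3450 and 3500 agree with the 2900+50n rule for n = 11 and n = 12.
_UPPER_BOUNDS = [3400, 3420, 3430, 3450, 3500]
_LABELS = [
    "CPython 3.7",
    "CPython 3.8",
    "CPython 3.9",
    "CPython 3.10",
    "CPython 3.11",
    "CPython 3.12 or newer",
]


def cpython_version(magic: int) -> str:
    """Map a PEP 552 magic number to the label the patch would print."""
    # bisect_right counts the bounds that are <= magic, which is exactly
    # the index of the first range whose upper bound exceeds it.
    return _LABELS[bisect_right(_UPPER_BOUNDS, magic)]
```

A new release would only need one more bound and one more label here, whereas the magic file needs another nesting level.
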
> Additionally, report the validity checking method (timestamp-
> or hash-based), plus the value of check-source flag and the validity
> checking data (timestamp + size or hash value).
> 
> Finally, add the magic numbers used by the current versions of PyPy2.7,
> PyPy3.7, PyPy3.8 and PyPy3.9.  For the latter two versions, this relies
> on a fix that landed in HG after the 7.3.9 release, since versions up to
> and including 7.3.9 used CPython's magic due to a bug.
> ---
> magic/Magdir/python | 116 ++++++++++++++++----------------------------
> 1 file changed, 42 insertions(+), 74 deletions(-)
> 
> diff --git a/magic/Magdir/python b/magic/Magdir/python
> index ed588859..25be8c93 100644
> --- a/magic/Magdir/python
> +++ b/magic/Magdir/python
> @@ -86,6 +86,8 @@
> !:mime application/x-bytecode.python
> 0	belong		0x04f30d0a	python 2.7 byte-compiled
> !:mime application/x-bytecode.python
> +0	belong		0x0af30d0a	PyPy2.7 byte-compiled
> +!:mime application/x-bytecode.python
> 0	belong		0xb80b0d0a	python 3.0 byte-compiled
> !:mime application/x-bytecode.python
> 0	belong		0xc20b0d0a	python 3.0 byte-compiled
> @@ -186,80 +188,46 @@
> !:mime application/x-bytecode.python
> 0	belong		0x3f0d0d0a	python 3.7 byte-compiled
> !:mime application/x-bytecode.python
> -0	belong		0x400d0d0a	python 3.7 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x410d0d0a	python 3.7 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x420d0d0a	python 3.7 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x480d0d0a	python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x490d0d0a	python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x520d0d0a	python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x530d0d0a	python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x540d0d0a	python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x550d0d0a	python 3.8 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x5c0d0d0a	python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x5d0d0d0a	python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x5e0d0d0a	python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x5f0d0d0a	python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x600d0d0a	python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x610d0d0a	python 3.9 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x660d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x670d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x680d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x690d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x6a0d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x6b0d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x6c0d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x6d0d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x6e0d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x6f0d0d0a	python 3.10 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x7a0d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x7b0d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x7c0d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x7d0d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x7e0d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x7f0d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x800d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x810d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x820d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x830d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x840d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> -0	belong		0x850d0d0a	python 3.11 byte-compiled
> -!:mime application/x-bytecode.python
> +
> +# magic 3392+ implements PEP 552: Deterministic pycs
> +0	name		pyc-pep552
> +# the flag field determines how .pyc validity is checked
> +>4	ulelong&1	0		timestamp-based,
> +>>8	uledate		x		.py timestamp: %s UTC,
> +>>12	ulelong		x		.py size: %d bytes
> +>4	ulelong&1	!0		hash-based, check-source flag
> +>>4	ulelong&2	0		unset,
> +>>4	ulelong&2	!0		set,
> +>>8	ulequad		x		hash: 0x%llx
> +
> +# uleshort magic followed by \x0d\x0a
> +2		string		\x0d\x0a
> +# extra check: only two bits of flag field are currently used
> +>4		ulelong		<0x4
> +# \x0d as part of magic should suffice till Python 3.14 (magic 3600)
> +>>1		ubyte		0x0d		Byte-compiled Python module for
> +!:mime application/x-bytecode.python
> +# now look at the magic number to determine the version
> +>>>0		uleshort	<3400		CPython 3.7,
> +>>>0		default		x
> +>>>>0		uleshort	<3420		CPython 3.8,
> +>>>>0		default		x
> +>>>>>0		uleshort	<3430		CPython 3.9,
> +>>>>>0		default		x
> +>>>>>>0		uleshort	<3450		CPython 3.10,
> +>>>>>>0		default		x
> +>>>>>>>0	uleshort	<3500		CPython 3.11,
> +>>>>>>>0	default		x		CPython 3.12 or newer,
> +>>>0		use		pyc-pep552
> +>>0		uleshort	240		Byte-compiled Python module for PyPy3.7,
> +!:mime application/x-bytecode.python
> +>>>0		use		pyc-pep552
> +>>0		uleshort	256		Byte-compiled Python module for PyPy3.8,
> +!:mime application/x-bytecode.python
> +>>>0		use		pyc-pep552
> +>>0		uleshort	336		Byte-compiled Python module for PyPy3.9,
> +!:mime application/x-bytecode.python
> +>>>0		use		pyc-pep552
> 
> 0	search/1/w	#!\040/usr/bin/python	Python script text executable
> !:strength + 15
> --
> 2.35.1
> 
