[File] Identifying Microsoft XPS (XML Paper Specification) / Improve .zip container detection

Christoph Biedl astron.com.bwoj at manchmal.in-ulm.de
Fri May 15 16:20:25 EDT 2026


Debian bug report: https://bugs.debian.org/1134126
This also has links to test samples.

Long story short, this is yet another file format using .zip as
a container. Sometimes the ooxml detection hits because "[Content_Types].xml"
is the first item; if the order is different, file(1) falls back to
"Zip archive data".
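
To illustrate the order dependence: in a zip file that starts with a
local file header, the name of the first member begins at byte offset 30,
right after the fixed-size part of that header. A test at a fixed offset
therefore only ever sees that one name. A minimal Python sketch, not what
file(1) itself runs; both helper names are made up for illustration:

```python
import io
import struct
import zipfile


def first_member_name(data):
    """Return the member name stored in the first local file header."""
    if data[:4] != b"PK\x03\x04":
        return None
    # Name length is a little-endian u16 at offset 26; the name starts at 30.
    (name_len,) = struct.unpack_from("<H", data, 26)
    return data[30:30 + name_len].decode("utf-8", "replace")


def build_zip(names):
    """Hypothetical helper: build an in-memory zip with empty members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    return buf.getvalue()
```

Two archives built with the same members in a different order give
different results from first_member_name(), although a listing of either
archive shows the same set of names.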

Now I cannot see how to handle this with plain magic in a robust way but
perhaps somebody else has ideas.

Otherwise this makes me think, possibly not for the first time, that file
should eventually become able to use the file listing of a .zip file for
identification. Parsing the binary data as in magic/Magdir/msooxml
looks scary and is fairly fragile.
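
A rough sketch of what listing-based identification could look like,
written in Python with the standard zipfile module rather than in file's
own C code. The function name and the returned strings are invented for
this example, and the suffix list is an assumption: .fdseq, .fdoc and
.fpage are commonly seen in XPS packages but are a heuristic here, not a
requirement of the format.

```python
import zipfile

# Assumption: these suffixes are common in XPS packages; the format does not
# require them, so this is a heuristic only.
XPS_SUFFIXES = (".fdseq", ".fdoc", ".fpage")


def guess_from_listing(source):
    """Guess a container flavour from zip member names, in any order."""
    with zipfile.ZipFile(source) as zf:
        names = [name.lstrip("/").lower() for name in zf.namelist()]
    if "[content_types].xml" not in names:
        return "Zip archive data"
    if any(name.endswith(XPS_SUFFIXES) for name in names):
        return "Microsoft XPS document"
    return "Zip data with [Content_Types].xml"
```

Because the check looks at the whole listing, the position of
"[Content_Types].xml" inside the archive no longer matters.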

    Christoph


