[File] Java source file incorrectly identified as HTML document
Aman Sharma
amansha at kth.se
Sun Apr 20 20:53:31 UTC 2025
Hi,
I have two files, Reference.txt<https://github.com/user-attachments/files/19689452/Reference.txt> and Rebuild.txt<https://github.com/user-attachments/files/19689451/Rebuild.txt>. The file command reports their types as follows:
$ file Reference.txt Rebuild.txt
Reference.txt: HTML document, ASCII text, with very long lines (6135)
Rebuild.txt: Java source, ASCII text, with very long lines (6135)
Both are Java source files. However, Reference.txt is incorrectly identified as an HTML document. As Chris suggested here<https://lists.reproducible-builds.org/pipermail/diffoscope/2025-April/002838.html>, if line 30 of Reference.txt, `" <title>(.*?)<\\/title>"`, is removed, the file command correctly classifies it as Java source. This suggests that the <title> tag inside that string literal is what triggers the HTML match.
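For anyone who wants to reproduce the comparison, here is a minimal sketch in Python. It assumes that Reference.txt has been downloaded into the current directory and that the file command is on the PATH. The helper names (drop_line, classify, compare) are made up for illustration and are not part of file or any library.

```python
import subprocess
import tempfile
from pathlib import Path


def drop_line(text: str, lineno: int) -> str:
    """Return `text` with the 1-indexed line `lineno` removed."""
    lines = text.splitlines(keepends=True)
    if not 1 <= lineno <= len(lines):
        raise ValueError(f"line {lineno} is out of range (1..{len(lines)})")
    del lines[lineno - 1]
    return "".join(lines)


def classify(path: Path) -> str:
    """Return the description printed by `file --brief` for `path`."""
    result = subprocess.run(
        ["file", "--brief", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def compare(path: Path, lineno: int = 30) -> tuple[str, str]:
    """Classify `path` as-is, then again with line `lineno` removed."""
    before = classify(path)
    with tempfile.TemporaryDirectory() as tmp:
        # Write the trimmed copy to a temporary directory so the
        # downloaded file is left untouched.
        trimmed = Path(tmp) / path.name
        trimmed.write_text(drop_line(path.read_text(), lineno))
        after = classify(trimmed)
    return before, after


if __name__ == "__main__":
    # Assumption: Reference.txt is in the current working directory.
    before, after = compare(Path("Reference.txt"))
    print("original:       ", before)
    print("without line 30:", after)
```

Based on the observations above, I would expect the first result to mention "HTML document" and the second "Java source", but the exact wording may depend on the version of file and its magic database.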
Regards,
Aman Sharma
PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
<http://www.kth.se>
<https://www.kth.se/profile/amansha>
<https://algomaster99.github.io/>