<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><br id="lineBreakAtBeginningOfMessage"><div><br><blockquote type="cite"><div>On Mar 26, 2024, at 5:22 PM, David Konerding <dakoner@gmail.com> wrote:</div><br class="Apple-interchange-newline"><div><p><defanged_div dir="ltr">Hi,</defanged_div></p><p><defanged_div>I am trying to write a rule to extract more info from BigTIFF. Currently, the TIFF rules extract directory entries and output metadata, while the BigTIFF rule only reports the file type and endianness.<br><br>Working from this page, <a href="http://bigtiff.org/" target="_blank">http://bigtiff.org/</a>, I am trying to read the offset to the first directory entry; in TIFF, this is a long (32-bit), while in BigTIFF it's a quad (64-bit) to support files with very large offsets.</defanged_div></p><defanged_div><p><defanged_div><br></defanged_div></p><defanged_div><p><defanged_div>As such, I am trying to write this continuation:</defanged_div></p><defanged_div><p><defanged_div>>>>(8.Q) use \^bigtiff_ifd<br></defanged_div></p><defanged_div><p><defanged_div>Which IIUC is saying "starting at file offset 8, read a bequad (64-bit) and then recursively call the named magic bigtiff_ifd" (which is slightly different from tiff_ifd).</defanged_div></p><defanged_div><p><defanged_div><br></defanged_div></p><defanged_div><p><defanged_div>When I try this on my test file (a big-endian BigTIFF), I get this debug error:</defanged_div></p><defanged_div><p><defanged_div></defanged_div></p><defanged_div><p><defanged_div>10: >>>> 8(bequad,&0), use,='^bigtiff_ifd',""]<br>lhs/off overflow 28956860354 0<br></defanged_div></p><defanged_div><p><defanged_div><br></defanged_div></p><defanged_div><p><defanged_div>If I understand the code in do_ops correctly (<a href="https://github.com/file/file/blob/master/src/softmagic.c#L1465" 
target="_blank">https://github.com/file/file/blob/master/src/softmagic.c#L1465</a>), </defanged_div></p><defanged_div><p><defanged_div>the values for lhs and off are compared to UINT_MAX and INT_MIN, and a failure is reported if the value is too large.</defanged_div></p><defanged_div><p><defanged_div>On my 64-bit system, UINT_MAX seems to be based on a 32-bit integer, which causes the overflow error when compared against a 64-bit value.</defanged_div></p><defanged_div><p><defanged_div><br></defanged_div></p><defanged_div><p><defanged_div>Before I proceed, I wanted a sanity check here: are 64-bit offsets larger than 2**32 considered invalid by file?</defanged_div></p></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></defanged_div></div></blockquote>Yes, if you look in <a href="https://github.com/file/file/blob/master/src/file.h#L360">https://github.com/file/file/blob/master/src/file.h#L360</a>, offsets are 32 bits right now. </div><div>It would not be too hard to change everything to be 64 bits, but the question is: is it really looking for</div><div>magic data past 2GB, and is that correct (is the data really there)? So far I have not found the need to change</div><div>the code to support 64-bit offsets...</div><div><br></div><div>christos</div><br></body></html>