[File] Patch: make libmagic faster

Christos Zoulas christos at zoulas.com
Wed Feb 27 16:53:01 UTC 2019


On Feb 27,  3:47pm, mls at suse.de (Michael Schroeder) wrote:
-- Subject: [File] Patch: make libmagic faster

| Hi Christos et al,
| 
| I'm the maintainer of the 'rpm' package here at SUSE. Some people
| complained to me that the build time of packages with lots of
| html files has been really bad for some time now.
| 
| So I did some tests and it turned out that the culprit is the
| libmagic library used by rpm to classify every file. Some magic
| entries (e.g. the c/c++ entries) now do more regex matches than
| before.
| 
| (The magic(5) man page says "Regular expressions can take exponential
| time to process, and their performance is hard to predict, so their
| use is discouraged.")
| 
| My idea to solve this is to add "search/8192" lines that check if
| some mandatory part of the regex is present in the buffer. (8192
| being the value of FILE_REGEX_MAX.) This is done by the attached
| patch "file-use-search.dif".
| 
| The patch improved things a little bit, but not much. It turned
| out that the search implementation is not very optimized. So I
| changed the code to use the memmem() function if it is available
| and no search flags are used. See patch "file-use-memmem.dif".
| 
| With those patches the speed of libmagic has improved by a factor
| of two and is now back to what it was before.
| 
| So, what do you think?

I think it is great :-)

thanks,

christos
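
For readers of the archive, the idea in the quoted mail (check for a
mandatory literal with a cheap byte search, and only then run the regex)
can be sketched roughly as below. This is not the code from
file-use-search.dif or file-use-memmem.dif; find_bytes(),
prefiltered_match(), SCAN_MAX and the HAVE_MEMMEM macro are hypothetical
names chosen for illustration. Only regcomp()/regexec()/regfree() and
memmem() are real library calls, and memmem() is an extension that on
glibc needs _GNU_SOURCE defined before any header is included.

```c
/* Sketch only: literal pre-check before a regex. Not libmagic code. */
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Assumption: mirrors the FILE_REGEX_MAX value (8192) quoted in the mail. */
#define SCAN_MAX 8192

/*
 * Return a pointer to the first occurrence of needle in hay, or NULL.
 * HAVE_MEMMEM is a hypothetical configure-style macro; when it is set the
 * build is assumed to also define _GNU_SOURCE so memmem() is declared.
 */
static const void *
find_bytes(const void *hay, size_t hlen, const void *needle, size_t nlen)
{
    assert(hay != NULL && needle != NULL);
#ifdef HAVE_MEMMEM
    return memmem(hay, hlen, needle, nlen);
#else
    const unsigned char *h = hay;
    size_t i;

    if (nlen == 0)
        return hay;
    if (nlen > hlen)
        return NULL;
    /* Naive fallback: compare the needle at every possible offset. */
    for (i = 0; i + nlen <= hlen; i++)
        if (memcmp(h + i, needle, nlen) == 0)
            return h + i;
    return NULL;
#endif
}

/*
 * Return 1 if pattern matches within the first SCAN_MAX bytes of buf,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * literal must be a string that every match of pattern contains.
 */
static int
prefiltered_match(const char *buf, size_t len,
    const char *literal, const char *pattern)
{
    regex_t re;
    char window[SCAN_MAX + 1];
    size_t n = len < SCAN_MAX ? len : SCAN_MAX;
    int rc;

    /* Cheap rejection: no literal in the window means no regex match. */
    if (find_bytes(buf, n, literal, strlen(literal)) == NULL)
        return 0;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | REG_NEWLINE) != 0)
        return -1;

    /* regexec() needs a NUL-terminated string; it stops at the first NUL. */
    memcpy(window, buf, n);
    window[n] = '\0';
    rc = regexec(&re, window, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called with the literal "include" and a pattern such as
"^[[:space:]]*#[[:space:]]*include", an HTML buffer is rejected by the
byte search alone and never reaches regcomp(). The regex only runs on
buffers that already contain the literal, which is where the saving
comes from when most files cannot match.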

