MAKEPP_SIGNATURES(1) | Makepp | MAKEPP_SIGNATURES(1) |
makepp_signatures -- How makepp knows when files have changed
Index: "C", "c_compilation_md5", "md5", "plain", "shared_object", "xml", "xml_space"
Each file is associated with a signature, which is a string that changes if the file has changed. Makepp compares signatures to see whether it needs to rebuild anything. The default signature for files is a concatenation of the file's modification time and its size, unless you're executing a C/C++ compilation command, in which case the default signature is a cryptographic checksum on the file's contents, ignoring comments and whitespace. If you want, you can switch to a different method, or you can define your own signature functions.
How the signature is actually used is controlled by the build check method (see makepp_build_check). Normally, if a file's signature changes, the file itself is considered to have changed, and makepp forces a rebuild.
If makepp is building a file, and you don't think it should be, you might want to check the build log (see makepplog). Makepp writes an explanation of what it thought each file depended on, and why it chose to rebuild.
There are several signature methods included in makepp. Makepp usually picks the most appropriate standard one automatically. However, you can change the signature method for an individual rule by using the ":signature" modifier on the rule which depends on the files you want to check, or for all rules in a makefile by using the "signature" statement, or for all makefiles at once by using the "-m" or "--signature-method" command line option.
The "plain" signature combines the file's modification time and its size. Makepp used to look only at the file's modification time, but if you run makepp several times within a second (e.g., in a script that builds several small things), modification times sometimes won't change. In that case, the file's size will hopefully change instead.
If running makepp several times within a second is a problem for you, you may find that the "md5" method is somewhat more reliable. When makepp builds a file, it flushes its cached MD5 signatures even if the file's date hasn't changed.
For efficiency's sake, makepp won't reread the file and recompute the complex signatures below if this plain signature hasn't changed since the last time it was computed. This can theoretically cause a problem, since it's possible to change the file's contents without changing its date and size. In practice, this is quite hard to do, so it's not a serious danger. In the future, as more filesystems switch to sub-second timestamps, Perl will hopefully give us access to this information, making the check failsafe.
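The time-plus-size idea described above can be illustrated with a short Python sketch. This is not makepp's code (makepp is written in Perl); the function names `plain_signature` and `same_second_demo` and the comma-separated string format are assumptions made purely for illustration.

```python
import os
import tempfile

def plain_signature(path):
    """Hypothetical 'mtime,size' string; makepp's real format may differ."""
    st = os.stat(path)
    # Whole seconds only, as when sub-second timestamps are unavailable.
    return f"{int(st.st_mtime)},{st.st_size}"

def same_second_demo():
    """Rewrite a file twice with an identical timestamp but different sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        signatures = []
        for content in ("short", "longer content"):
            with open(path, "w") as f:
                f.write(content)
            os.utime(path, (1000, 1000))  # pin atime and mtime to the same second
            signatures.append(plain_signature(path))
        return signatures
```

Calling `same_second_demo()` yields two different strings even though the timestamp is identical, because the size component differs. A rewrite that keeps both the timestamp and the size would go unnoticed, which is the theoretical weakness mentioned above.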
The idea behind the "C" (or "c_compilation_md5") method is to be independent of formatting changes. This is done by pulling everything up as far as possible and by eliminating insignificant spaces. Words are exempt from pulling up, since they might be macros containing "__LINE__", so they remain on the line where they were.
// ignored comment #ifdef XYZ #include <xyz.h> #endif int a = 1; #line 20 void f ( int b ) { a += b + ++c; } /* more ignored comment */
is treated as though it were
#ifdef XYZ #include<xyz.h> #endif int a=1; #line 20 void f( int b){ a+=b+ ++c;}
That way you can reindent your code or add or change comments without triggering a rebuild, so long as you don't change the line numbers. (This signature method recompiles if line numbers have changed because that causes calls to "__LINE__" and most debugging information to change.) It also ignores whitespace and comments after the last token. This is useful for preventing a useless rebuild if your version control system adds lines at a "$Log$" tag when checking in.
This method is particularly useful for the following situations:
    %.h %.cxx: %.qtdlg $(HLIB)/Qt/qt_dialog_generator
        $(HLIB)/Qt/qt_dialog_generator $(input)
Every time the input file changed, the resulting .h file also was rewritten, and ordinarily this would trigger a rebuild of everything that included it. However, most of the time the contents of the .h file didn't actually change (except for a comment about the build time written by the preprocessor), so a recompilation was not actually necessary.
In practice this saves fewer recompilations than you'd hope for, because mere comment changes often add lines. For logging with "__LINE__" or the debugger to match your source, such changes still require recompilation. So this signature is especially useless for the "tangle" family of tools from literate programming, where your code resides in some bigger file and even changes to a documentation section irrelevant to the code will be reflected in the extracted source via a "#line" directive.
If you can live with wrong line numbers during development, you can set the variable "makepp_signature_C_flat" (with an uppercase C) to some true value (like 1). Then, whereas the compiler still sees the real file, the above example will be flattened for signing as:
    #ifdef XYZ
    #include<xyz.h>
    #endif
    int a=1;void f(int b){a+=b+ ++c;}
Note that signatures are only recalculated when files change. So you can build for everyone in a repository without this option, and those who want the option can set it when building in their sandbox. When they first locally change a file, even only trivially, that will cause a recompilation, because with this option a totally different signature is calculated. But then they can reformat the file as much as they want without further recompilation.
The opposite is also true: Just omitting this option after it was set and recompiling will not fix your line numbers. So, if line numbers matter, don't do a production build in the same sandbox without cleaning first.
The "md5" method is particularly useful if you have some file which is often regenerated during the build process and which other files depend on, but which usually doesn't actually change. If you use the "md5" signature checking method, makepp will realize that the file's contents haven't changed even if the file's date has changed. (Of course, this won't help if the files have a timestamp written inside of them, as archive files do for example.)
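The behaviour described above can be illustrated with a small Python sketch, again as an illustration and not makepp's own code. The helper names `md5_signature` and `regenerate_demo` and the file name `generated.h` are hypothetical.

```python
import hashlib
import os
import tempfile

def md5_signature(path):
    """MD5 digest of the file's bytes; timestamps play no part."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def regenerate_demo():
    """Write identical content twice with different modification times."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "generated.h")
        results = []
        for mtime in (1000, 2000):
            with open(path, "w") as f:
                f.write("#define ANSWER 42\n")
            os.utime(path, (mtime, mtime))  # simulate a later regeneration
            results.append((int(os.stat(path).st_mtime), md5_signature(path)))
        return results
```

Both entries returned by `regenerate_demo()` carry the same digest although their modification times differ, so a content-based check would see no change where a time-based check would.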
In the following command the parser will detect an implicit dependency on $(LIBDIR)/libmylib.so, and build it if necessary. However, the link command will only be rerun when the library exports a different set of symbols:
    myprog: $(OBJECTS) :signature shared_object
        $(LD) -L$(LIBDIR) -lmylib $(inputs) -o $(output)
This works as long as the functions' interfaces don't change. But in that case you'd change the declaration, so you'd also need to change the callers.
Note that this method only applies to files whose name looks like a shared library. For all other files it falls back to "c_compilation_md5", which may in turn fall back to others.
Common to both the "xml" and "xml_space" methods is that they sign the essence of each xml document. The presence or absence of a BOM or "<?xml?>" header is ignored. Comments are ignored, as is whether text is protected as "CDATA" or with entities. The order and quoting style of attributes don't matter, nor does how you render empty tags.
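As an illustration of what signing the essence of a document can mean, the following Python sketch normalizes a document with the standard library's `xml.etree.ElementTree` before hashing it. It is not how makepp computes its xml signatures (makepp uses Perl XML parsers), it makes no attempt to treat whitespace specially, and the names `xml_essence_signature` and `_canonical` are invented for this example.

```python
import hashlib
import xml.etree.ElementTree as ET

def _canonical(elem):
    """Serialize an element with sorted attributes and explicit end tags."""
    attrs = "".join(f" {k}={v!r}" for k, v in sorted(elem.attrib.items()))
    children = "".join(_canonical(child) for child in elem)
    return (f"<{elem.tag}{attrs}>{elem.text or ''}{children}"
            f"</{elem.tag}>{elem.tail or ''}")

def xml_essence_signature(text):
    """MD5 over a normalized form of the parsed document."""
    # The default parser drops comments and the <?xml?> header, and it
    # resolves CDATA sections and entities to the same character data.
    root = ET.fromstring(text)
    return hashlib.md5(_canonical(root).encode("utf-8")).hexdigest()
```

With this sketch, `<cfg b="2" a='1'><item/></cfg>` and `<?xml version="1.0"?><cfg a="1" b="2"><!-- note --><item></item></cfg>` produce the same digest, while changing an attribute value produces a different one.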
For any file which is not valid xml, or if neither the Expat based "XML::Parser" nor the "XML::LibXML" parser is installed, this falls back to the "md5" method. If you switch your Perl installation from one of the parsers to the other, makepp will think the files are different as soon as their timestamp changes. This is because the results of the two parsers are logically equivalent, but they produce different signatures. In the unlikely case that this is a problem, you can force use of only "XML::LibXML" by setting in Perl:
$Mpp::Signature::xml::libxml = 1;
The "C" or "c_compilation_md5" method has a built-in list of suffixes it recognizes as being C or C-like. If it gets applied to other files, it falls back to simpler signature methods. But many file types are syntactically close enough to C++ for this method to be useful. Close enough means that the file uses C++ comment and string syntax, and that whitespace is meaningless except for a single space between words (and in C++'s problem cases "- -", "+ +", "/ *" and "< <").
It (and its subclasses) can now easily be extended to other suffixes. Anywhere you can specify a signature, you can now tack on one of these syntaxes to make the method accept additional filenames:
Signature methods apply to all files of a rule. So if you have a compiler that takes C-like source code and an XML configuration file, you either need a combined signature method that smartly handles both file types, or you must choose an existing method which will not know whether a change in the other file is significant.
In the future, signature method configuration may be changed to work per filename pattern, optionally per command.
You can, if you want, define your own methods for calculating file signatures and comparing them. You will need to write a Perl module to do this. Have a look at the comments in "Mpp/Signature.pm" in the distribution, and also at the existing signature algorithms in "Mpp/Signature/*.pm" for details.
Here are some cases where you might want a custom signature method:
2021-01-06 | perl v5.32.0 |