I think the C preprocessor was designed after the GPM clone m6, its successor m4, and Ratfor, so I suspect the difficulty in doing things like this is intentional. I guess I should ask McIlroy, who is responsible for pushing m4 to its absolute limits and was present when the C preprocessor was being designed: https://www.cs.dartmouth.edu/~doug/barem4.m4
_ Pure macros as a programming language
_
_ m4 is Turing complete even when stripped to the bare minimum
_ of one builtin: `define'. This is not news; Christopher
_ Strachey demonstrated it in his ancestral GPM, described in
_ "A general-purpose macrogenerator", The Computer Journal 8
_ (1965) 225-241.
_
_ This m4 program more fully illustrates universality by
_ building familiar programming capabilities: unlimited
_ precision integer arithmetic, boolean algebra, conditional
_ execution, case-switching, and some higher-level operators
_ from functional programming. In support of these normal
_ facilities, however, the program exploits some unusual
_ programming idioms:
_
_ 1. Case-switching via macro names constructed on the fly.
_ 2. Equality testing by redefining macros.
_ 3. Representing data structures by nested parenthesized lists.
_ 4. Using macros as associative memory.
_ 5. Inserting nested parameter symbols on the fly.
_
_ Idioms 2 and 5 are "reflective": the program writes code
_ for itself.
It's very easy to get into enormous amounts of trouble in m4, m6, or GPM. The C preprocessor is not without its problems, but it is rare that I have difficulty in understanding why a given gcc -E invocation produces the output it does.
Related: The Preprocessor Iceberg https://jadlevesque.github.io/PPMP-Iceberg/
There you can find a recursive macro expansion implementation (as a gcc hack) that fits on a slide:
#2""3
#define PRAGMA(...) _Pragma(#__VA_ARGS__)
#define REVIVE(m) PRAGMA(push_macro(#m))PRAGMA(pop_macro(#m))
#define DEC(n,...) (__VA_ARGS__)
#define FX(f,x) REVIVE(FX) f x
#define HOW_MANY_ARGS(...) REVIVE(HOW_MANY_ARGS) \
__VA_OPT__(+1 FX(HOW_MANY_ARGS, DEC(__VA_ARGS__)))
int main () {
printf("%i", HOW_MANY_ARGS(1,2,3,4,5)); // 5
}
It sounds like the one in the article works for more compilers, but there doesn't seem to be a copy-pasteable example anywhere to check for myself. Also, the "Our GitHub Org" link on the site just links to github.com.
I did the first week of AOC22 in the C preprocessor: https://github.com/camel-cdr/boline/tree/main/aoc22
One can also (ab)use the build system to run arbitrary preprocessing steps with any language over the "C" input. You can have recursive macros by using M4 or Perl or Python or some other language to expand them, converting your "foo.c.in" into a "foo.c" to hand off to the C preprocessor & compiler. It still feels dirty, but it's often much easier to understand & debug.
I wonder if the author is aware of the __VA_TAIL__ proposal[1]. It covered similar ground and was IMO very well thought out, but unfortunately it was not accepted into C2Y (judging from committee meeting minutes).
[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3307.htm
genuinely remarkable, the altogether perhaps even productive mischief you can get up to, especially with `__VA_OPT__` becoming a proper standard in both C and C++ so you don't have to feel dirty about using it.
i recently made use of plenty of ugly tricks in this vein to take a single authoritative table of macro invocations that defined a bunch of pixel formats, and make them graduate from defining bitfield structs to classes with accessors that performed good old fashioned shifts and masks, all without ever specifying the individual bit offsets of channels, just their individual widths, and macro magic did the rest. no templates, no actual c++, could just as feasibly produce pure c bindings down the line by just changing a few names.
getting really into this stuff makes you stop thinking of c function-like macros as functions of their arguments as such, but rather unary functions of argument lists, where arity roughly becomes the one notion vaguely akin to typing in the whole enterprise, or at least the one place where the compiler exhibits behaviour resembling that of a type checker. this was especially true considering the entries in the table i wound up with were variadic, terminating in variably many (name, width) parenthesised tuples. and i just... had the means to "uncons" them so to speak. fun stuff.
this is worth it, imo, in precisely one context, which is: you want a single source of truth that defines fiddly but formulaic implementations spread across multiple files that must remain coordinated, and this is something you do infrequently enough that you don't consider it worthwhile introducing "real" "big boy" code gen into your build process. mind, you usually do end up having to commit to a little utility header that defines convenient macros (_Ex and such in the article), but hey. c'est la vie. basically x macros (https://en.wikipedia.org/wiki/X_macro) on heart attack quantities of steroids.
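For anyone who hasn't met X macros, here is a minimal sketch of the single-table idea described above. It is not the commenter's actual code: the RGB565_CHANNELS table, the rgb565_ prefix, and the accessor names are all made up for illustration. Only channel widths are listed; an enum trick derives the bit offsets.

```c
#include <stdint.h>

/* Hypothetical single source of truth: (channel name, width in bits),
 * listed from the least significant bits upward. */
#define RGB565_CHANNELS(X) \
    X(blue, 5)             \
    X(green, 6)            \
    X(red, 5)

/* Expansion 1: derive each channel's shift from the widths alone.
 * Every channel emits two enumerators: its shift (which implicitly
 * continues from the previous channel's last bit + 1) and its last bit. */
enum {
#define X(name, width) \
    rgb565_##name##_shift, \
    rgb565_##name##_last = rgb565_##name##_shift + (width) - 1,
    RGB565_CHANNELS(X)
#undef X
    rgb565_total_bits /* one past the final channel's last bit */
};

/* Expansion 2: one shift-and-mask accessor per channel. */
#define X(name, width) \
    static inline uint32_t rgb565_get_##name(uint32_t px) { \
        return (px >> rgb565_##name##_shift) & ((1u << (width)) - 1u); \
    }
RGB565_CHANNELS(X)
#undef X
```

Adding a channel or changing a width means touching one line of the table; the shifts and accessors follow. Calling rgb565_get_green(px) on a packed pixel returns the 6-bit green field.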
In many ways being limited ends up being a feature. Even limited as it is, you get some crimes against humanity like the Bourne shell source, but at least most people agree it is a bad idea.
If it allowed more unlimited metaprogramming, building big complex things as macros might well have become popular.
I wept when the author mentioned implementing SHA256 in macros.
The lack of (easy) recursion in CPP is so frustrating because it was always available in assembly languages, even with very old and very simple macro assemblers, with the caveat that the recursion depth was often very limited and there was no tail call elimination. For example, if you need to fill memory:
; Fill memory with backward sequence
macro fill n
word n
if n != 0
fill n - 1
endif
endm
So "fill 3" expands to:
word 3
word 2
word 1
word 0
There is no way this was not known about when C was created. They must have been burned by recursive macro abuse and banned it (perhaps from m4 experience as others have said).
The other assembly language feature that I missed is the ability to switch sections. This is useful for building tables in a distributed fashion. Luckily you can do it with gcc.
I've read the article 4 times already today and I'm still crying. This looks like the solution to a problem I'm having (C++, but I'm doing things that templates and constexpr can't do), but trying to get it all to work is painful. Kudos to the author for making an attempt to explain it.
#define _H4X0R_CONVERT_ONE(arg) \
((union { unsigned long long u; void *v; }){ \
.u = (unsigned long long)arg, \
}).v
Couldn't this just be the following?
#define _H4X0R_CONVERT_ONE(arg) (void*)(uintptr_t)(arg)
Also, thanks, now I can finally use
void my_printf(const char *fmt, void* args[], size_t argc);
ergonomically:
#define my_printf(fmt, ...) (my_printf)((fmt), \
(void*[]){ H4X0R_VA_VOID_STAR_CONVERT(__VA_ARGS__) }, \
H4X0R_VA_COUNT(__VA_ARGS__))
int main(int argc, char **argv) {
my_printf("int: %d, ptr: %p, str: %s, missing: %d\n", 42, argv, "Hello world!");
}
$ gcc test.c && ./a.out
int: 42, ptr: 0x7FFF46AA3E78, str: Hello world!, missing: %!d(MISSING)
Funnily enough, the difference between passing ... and locally-allocated void*[] is basically who has to spill the data to the stack, the caller or the called function.
> C has many advantages that have led to its longevity (60 years as perhaps the most important language).
53 years by my count. Did something relevant happen in 1960? Maybe author is alluding to B?
Imagine trying to implement the C preprocessor. I had to write it from scratch 3 times before it worked 100%.
Macros are one of the ugliest features available in langs like C/CPP.
Mildly related, sort of, one can prevent expansion of variadic macros as follows:
#define printf(...)
int (printf)(const char *, ...);
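A small sketch of why the parentheses work, using a made-up function `triple` and two made-up helpers rather than printf: a function-like macro is only invoked when its name is immediately followed by `(`, so wrapping the name in parentheses reaches the real function instead.

```c
/* A real function; the parentheses around the name in its definition keep
 * this line valid even if the macro below were defined first. */
int (triple)(int x) { return 3 * x; }

/* Hypothetical function-like macro that shadows the function's name. */
#define triple(x) (-1)

/* `triple` followed directly by `(` is a macro invocation. */
int call_macro(void) { return triple(7); }

/* In `(triple)` the name is followed by `)`, not `(`, so the macro is
 * not invoked and the real function is called. */
int call_function(void) { return (triple)(7); }
```

Here call_macro() yields -1 and call_function() yields 21, with no #undef needed.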
I keep on seeing many random code bases just resort to #undef instead...
A C preprocessor implemented in Python: https://github.com/paulross/cpip
I was writing a preprocessor until I noticed this kind of thing... I stopped writing it after that.
Can I use this technique to expand MACRO(a,b,c,…) into something like F(a,b,c…); G(a,b,c…)?
Is this a DoS risk - code that sends your build chain into an infinite loop?
The c pre processor. C doesn't have macros. It's fucking miserable. Anyone who uses it is a masochist
The behavior of C macros is actually described by a piece of pseudocode from Dave Prosser and it is not in the standard:
* https://www.spinellis.gr/blog/20060626/
* https://www.spinellis.gr/pubs/jrnl/2006-DDJ-Finessing/html/S...
* https://gcc.gnu.org/legacy-ml/gcc-prs/2001-q1/msg00495.html
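One rule that the pseudocode captures, as far as I understand it, is easy to see directly: while a macro is being expanded, its own name is not expanded again, which is why naive recursion never gets going. A minimal sketch; the names `limit`, `bump`, and the two reader functions are made up for illustration.

```c
/* Object-like macro that mentions its own name. */
static int limit = 5;
#define limit (limit + 1)

/* Expands exactly once to (limit + 1); the inner `limit` is left alone
 * and refers to the variable. */
int read_limit(void) { return limit; }

/* Function-like macro that mentions its own name. */
static int bump(int x) { return x + 1; }
#define bump(x) bump((x) * 10)

/* Expands once to bump((2) * 10); the inner `bump` is not re-expanded,
 * so it is an ordinary call to the function. */
int read_bump(void) { return bump(2); }
```

Running either through gcc -E shows the single round of expansion; read_limit() returns 6 and read_bump() returns 21.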