Author: Athena Boose
Date: 2025-06-18
One of the features I miss most when writing C is the complete lack of any sort of sane syntax for generics. Yes, the lack of memory safety is a problem, but for the sorts of hobby projects I end up using C for it’s less of a big deal.
Realistically, I should just use C++, but that’s the boring option. I’ve debated writing a compiler that adds some extensions to C, like templates, sum types, pattern matching, and such, but that’s a lot of work (and who in their right mind would use or standardize my hobby compiler?). You’re also effectively reinventing C++ at that point.
So, I’m left with the features that come inbuilt with C.
Void pointers are the first thing that comes to mind when writing a generic C datastructure. You could implement a generic vector as follows:
struct vector
void *data;
size_t num_elements;
size_t alloc_size;
size_t element_size;
;
int vector_new(struct vector *vec, size_t element_size);
int vector_push(struct vector *vec, void *elem);
int vector_pop(struct vector *vec);
void vector_delete(struct vector *vec);
Of course, vector_new
could return a pointer to a
newly-allocated vector if you wish to hide the internal implementation
of the vector, but you get the idea.
This has some downsides. First, the type erasure.
There’s nothing about the type struct rudp_vector
that
tells you what type it holds, you have to communicate that with comments
or variable names. If you change the type it holds, the compiler isn’t
going to provide errors telling you what to fix. You could grep for
variable names/comments, but even that is error-prone; what if you
misname a variable or don’t add a comment saying the type of the
vector?
It’s also not typesafe. There’s nothing stopping you from storing multiple types of the same size in this vector, either on purpose or, more likely, by accident.
There’s also a small overhead for storing the element size with the vector, but that’s admittedly pretty minimal.
The benefit of this approach is that the code to implement it is
pretty normal and pretty readable. A bit annoying, sure, the constant
calls to memcpy()
are a pain, but still readable. I’m
guilty of using this approach occasionally when I can’t be bothered to
do things “properly”.
Another approach to generics is to intentionally include a file
multiple times. You require users to predefine a macro, say
VECTOR_TYPE
, before including the file containing the
vector file. You can then do something like this:
Usage:
#define VECTOR_TYPE int
#include "vector.h"
VECTOR(int) new_vec;
VECTOR_NEW(int, &new_vec);
int num = 1234;
VECTOR_PUSH(int, &new_vec, &num);
VECTOR_POP(int, &new_vec);
/* ... */
VECTOR_DELETE(int, &new_vec);
vector.h:
#ifndef VECTOR_TYPE
#error "Must define VECTOR_TYPE to the desired type"
#endif
#define VECTOR(TYPE) struct vector_##TYPE
#define VECTOR_NEW(TYPE, VECPTR) vector_new_##TYPE(VECPTR)
#define VECTOR_PUSH(TYPE, VECPTR, ELEMPTR) vector_push_##TYPE(VECPTR, ELEMPTR)
// ...etc
static inline int vector_new_##VECTOR_TYPE(VECTOR(VECTOR_TYPE) *vec)
/* implementation */
// ...etc
#undef VECTOR_TYPE
This works fine. It is typesafe, although it does require that you pass the type to each function you call, which is annoying.
You also have to do the whole #define
and
#include
nonsense for each type you want to use within a
single file. Again, annoying, but manageable.
A related approach is adding back the include guards and using a
macro called something like VECTOR_DEFINE_FUNCTIONS
that
takes the vector type as an argument and expands to the definitions of
the proper functions.
This latter approach is a bit more flexible, as you could have separate macros for defining function declarations and definitions, which could be useful. The downside of the latter approach is multiline macro definitions are horrendously ugly (as you’ll later see).
Another downside to this approach is that it defines a bunch of
functions which might be a bit of a hazard if you ever define any
function with the same name. This issue is probably avoidable if you
call each function something like
vector_push_internal_##TYPE
or make the function names
all-caps.
The final flaw of this approach is the requirement that your types be a single identifier. As someone who doesn’t usually typedef their structs, this is the biggest (and likely pettiest) issue for me.
A fair warning before we start - typeof() is a C23 feature, and before that was supported as an extension by GCC, Clang, and probably some other compilers. If you’re using MSVC, you’re probably out of luck. The same goes for statement-expressions, except they’re not even standardized.
typeof() is similar to sizeof in that it operates on the type of an expression; specifically, it evaluates to the type of a given expression.
Statement-expressions are described here.
I won’t really show an implementation here, as the implementations are annoying. Here’s an example from a project of mine.
This approach is also typesafe; effectively, you specify a type when creating the struct for a given datastructure, and then write some generic macros that use typeof() to figure out what type the datastructure passed in was and act accordingly.
Statement-expressions allow you to have local scoping and to effectively return variables from a macro. They are still quite flawed; any internal variables will need to be named specially to avoid name collisions. But, they allow you to write code that looks as follows:
VECTOR_TYPE(int) newvec;
vector_new(newvec);
vector_push(newvec, 123);
vector_pop(newvec);
vector_delete(newvec);
(The actual code I’ve linked is a bit more complicated because I wanted to provide a facility for the use of a generic allocator.)
Note the fact that pointers aren’t needed as these are macros, rather
than functions. You can even pass these vectors around between
functions, something like
int func(VECTOR_TYPE(int) *vector);
will do the trick.
The major downside of this is, as stated before, the ugly implementations. The issue of name collisions means that variable names look horrible, and multiline macros don’t make things any better.
Seriously, I can’t quite convey how soul-draining implementing some of these was. If something goes wrong with your implementation, the error messages are damn near unreadable, and the code is incredibly ugly.
Another downside is bad interop with C++ code. The article above describes this in detail. Then again, if you’re using C++, you have templates anyways, so I think it’s less of an issue.
I don’t think this approach is viable for complicated datastructures (at least, not if I want to keep my sanity intact), but for simple things (vectors, hashmaps, heaps, deques, etc), it’s fine.
The syntax for using the “functions” made by this approach is perfect, though. It’s kinda similar to actual C++ templates, you even get pseudo-references since macros don’t need pointers to work.