There are two aspects when documenting source code:
- Problem domain (user perspective, why?, what?)
- Solution domain (developer perspective, how?)
A reader of source code who is unfamiliar with the "why" and "what" will have trouble understanding any non-trivial code base. Hence, it may be a good idea to document the why and what in the source code.
While the problem domain depends on code comments (or external documentation), the solution domain does not necessarily need comments. One school of thought is to write self-documenting code. To be self-documenting, code needs to be easily comprehensible:
- Keep it simple
- Write small modules and functions that have a single, well defined purpose
- Use abstractions (reduction to relevant information and selection of appropriate representations)
...and, perhaps most important of all:
- Use descriptive names
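As a minimal sketch of what descriptive names buy the reader, the two functions below implement the same logic. The names `f`, `d`, `n`, `calc_mean`, `samples` and `num_samples` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Terse names: the reader has to guess what d, n and the result mean. */
double f(const double *d, size_t n)
{
    assert(d != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += d[i];
    }
    return n != 0 ? s / (double)n : 0.0;
}

/* Descriptive names: the signature alone explains the purpose. */
double calc_mean(const double *samples, size_t num_samples)
{
    assert(samples != NULL || num_samples == 0);
    double sum = 0.0;
    for (size_t i = 0; i < num_samples; i++) {
        sum += samples[i];
    }
    /* An empty input yields 0.0 here; that is an assumption of this sketch. */
    return num_samples != 0 ? sum / (double)num_samples : 0.0;
}
```

The second version needs no comment to explain what it computes, which is the point of self-documenting code.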
So the questions are: what makes a good name, and how do we find one?
Names stem from two domains:
- Solution Domain
- Problem Domain
Obviously, we have to use the names of concepts from the domain that we use to solve problems, i.e. computer science. Some names originate in abstract theoretical concepts (e.g. names of data structures or algorithms); others have become idiomatic in specific programming language ecosystems.
Secondly, there are concepts from the domain whose problems we try to solve. These can also be reflected in the source code. For that purpose, the Domain-Driven Design (DDD) development approach introduced the "ubiquitous language": a vocabulary shared by domain experts and developers.
Concepts need to be named consistently. That means:
- Don't use different words with the same meaning for one concept (i.e. no synonyms)
- Don't use words that have several meanings to name a concept (i.e. no homonyms)
The mapping between concepts and names needs to be a one-to-one correspondence (bijective).
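The sketch below illustrates the one-to-one rule on a small fixed-size stack. All identifiers (`int_stack`, `INT_STACK_SIZE`, `item`, and so on) are hypothetical. The stored concept is called `item` everywhere (never `element`, `entry` or `value`), and the operations use the `push`/`pop` pair throughout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INT_STACK_SIZE 4 /* fixed capacity, chosen arbitrarily for this sketch */

struct int_stack {
    int items[INT_STACK_SIZE];
    size_t len; /* number of items currently stored */
};

void int_stack_init(struct int_stack *stack)
{
    assert(stack != NULL);
    stack->len = 0;
}

/* Returns false if the stack is full. */
bool int_stack_push(struct int_stack *stack, int item)
{
    assert(stack != NULL);
    if (stack->len == INT_STACK_SIZE) {
        return false;
    }
    stack->items[stack->len++] = item;
    return true;
}

/* Returns false if the stack is empty; otherwise stores the top item. */
bool int_stack_pop(struct int_stack *stack, int *item)
{
    assert(stack != NULL && item != NULL);
    if (stack->len == 0) {
        return false;
    }
    *item = stack->items[--stack->len];
    return true;
}
```

Had the second function been called `int_stack_remove_entry`, a reader would have to check whether an "entry" is something different from an "item".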
Most programs will contain some levels of abstraction. For instance, the concrete concept of a telephone number could be named
If we are in a context where it doesn't matter that it is a telephone number, this detail can be abstracted away and it could be named
So, in case of abstraction, it may be okay to use more abstract names for a concept.
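A possible illustration of the two levels, with purely hypothetical names: the first function cares that its argument is a telephone number and says so, while the second works on any string and therefore uses the more abstract name `str`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Concrete level: the code depends on the argument being a telephone number. */
size_t telephone_number_count_digits(const char *telephone_number)
{
    assert(telephone_number != NULL);
    size_t num_digits = 0;
    for (const char *ptr = telephone_number; *ptr != '\0'; ptr++) {
        if (isdigit((unsigned char)*ptr)) {
            num_digits++;
        }
    }
    return num_digits;
}

/* Abstract level: any string will do, so the telephone detail is dropped. */
bool str_is_empty(const char *str)
{
    assert(str != NULL);
    return str[0] == '\0';
}
```

A caller may pass a telephone number to `str_is_empty`; inside that function the more specific name would only add noise.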
Length & Locality
Hierarchical organization and naming of source code elements make code easier to understand, because they make dependencies between different modules explicit.
In programming languages that support hierarchical namespaces there is no need for hierarchical names, since this feature is already baked into the language.
In languages without this feature, the names of externally visible elements can be composed in a hierarchical manner to achieve the same effect.
Example (in C):
    // extern function (externally visible)
    // sat - satellite software
    // acs - attitude control system
    // init - function
    void sat_acs_init();

    // static function (only visible inside acs module) - see locality.
    static void calc_attitude();
Note that names always start from the top (root) node of a hierarchy.
Solution Domain Names
Solution domain names should be idiomatic. While each programming language has its own idioms (that should be followed), I propose the following universal terms.
    term[ (abbreviation)][ - antonym [ (antonym abbreviation)]][: term description]

    allocate (alloc) - free
    begin - end
    buffer (buf)
    calculate (calc)
    ceiling (ceil) - floor
    command (cmd)
    copy (cp)
    create - destroy
    delete (del)
    difference (diff)
    directory (dir)
    end - begin
    export - import
    first - last
    floor - ceiling (ceil)
    free - allocate (alloc)
    head - tail
    import - export
    index (i)
    initialize (init) - terminate (term)
    iterator (iter)
    last - first
    length (len): Length of variable length data (e.g. length of a C string in a fixed size char array)
    make (mk)
    maximum (max) - minimum (min)
    memory (mem)
    minimum (min) - maximum (max)
    move (mv)
    new - old
    next - previous (prev)
    number (num)
    object (obj)
    open - close
    pointer (ptr)
    pop - push
    position (pos)
    previous (prev) - next
    push - pop
    read - write
    receive - send
    reference (ref)
    remove (rm)
    send - receive
    size: Size of fixed sized containers (e.g. a C array. It can contain data of variable length)
    source (src)
    start - stop
    stop - start
    string (str)
    tail - head
    terminate (term) - initialize (init)
    write - read
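To show how some of these terms read in practice, here is a small sketch of a truncating string copy. The function name `str_copy` and its parameters are hypothetical. Note the distinction made in the list: `buf_size` is the fixed size of the container, while `len` is the length of the variable length data stored in it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copies the string src into buf, truncating it if buf is too small.
 * buf_size is the fixed size of the container (including room for '\0').
 * Returns len, the length of the string actually stored in buf.
 */
size_t str_copy(char *buf, size_t buf_size, const char *src)
{
    assert(buf != NULL && src != NULL && buf_size > 0);
    size_t src_len = strlen(src);
    size_t max_len = buf_size - 1; /* one byte is reserved for the terminator */
    size_t len = src_len < max_len ? src_len : max_len;
    memcpy(buf, src, len);
    buf[len] = '\0';
    return len;
}
```

With a `char buf[8]`, copying `"abc"` returns a `len` of 3 although the `size` of the buffer stays 8.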
Names for files and paths.
Consider a file path of the form /dir1/dir2/stem.ext:

    ./dir2/ - (relative) directory (path)
    ./dir2/stem.ext - (relative) file path
    /dir1/dir2/ - (absolute) directory path
    /dir1/dir2/stem.ext - (full or absolute) file path
    dir1, dir2 - directories or folders
    ext - extension
    stem - (filename) stem
    stem.ext - basename or filename
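The terminology above can be mirrored directly in code. The following is a rough sketch with hypothetical helper names (`path_basename`, `path_ext`, `path_stem_len`); it assumes '/' as the only separator and treats a leading dot in the basename as part of the stem, which is one convention among several.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the basename (filename), i.e. the part after the last '/'. */
const char *path_basename(const char *file_path)
{
    assert(file_path != NULL);
    const char *last_slash = strrchr(file_path, '/');
    return last_slash != NULL ? last_slash + 1 : file_path;
}

/* Returns the extension without the dot, or "" if the basename has none. */
const char *path_ext(const char *file_path)
{
    const char *base_name = path_basename(file_path);
    const char *last_dot = strrchr(base_name, '.');
    /* A leading dot (e.g. ".profile") is treated as part of the stem. */
    if (last_dot == NULL || last_dot == base_name) {
        return base_name + strlen(base_name);
    }
    return last_dot + 1;
}

/* Returns the length of the stem, which starts at path_basename(). */
size_t path_stem_len(const char *file_path)
{
    const char *base_name = path_basename(file_path);
    const char *ext = path_ext(file_path);
    /* With an extension, the stem ends before the dot that precedes ext. */
    return *ext != '\0' ? (size_t)(ext - base_name) - 1 : strlen(base_name);
}
```

For /dir1/dir2/stem.ext, `path_basename` points at stem.ext, `path_ext` points at ext, and `path_stem_len` gives the length of stem.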
Format of Names
An Eye Tracking Study on camelCase and under_score Identifier Styles
Concise and consistent naming
Domain Driven Design
File name? Path name? Base name? Naming standard for pieces of a path
Folklore and science of naming practices