There are two aspects of software documentation:
- Why the program exists and what it does to achieve its goal from a user perspective.
- How the source code works, i.e. the developer perspective.
A reader of source code who is unfamiliar with the "why" and "what" of the program may not be able to fully understand it. Hence, it may be a good idea to document the user perspective in the source code.
Today, most code comments document the "how", which is controversial. Parts of the software community try to write self-documenting code that does not depend on comments to make clear how it works, and they consider "how" comments a code smell.
How do you write self-documenting code? It needs to be easily comprehensible. Some techniques help:
- Keep it simple
- Write small modules and functions that have a single, well defined purpose
- Use abstractions (reduction to relevant information and selection of appropriate representations)
...and of course, perhaps most important of all:
- Use descriptive names
So the questions are: What are good names, and how do we find them?
Names stem from two domains:
- Solution Domain
- Problem Domain
First, we have to name concepts from the domain we use to solve problems, that is, computer science. Some names originate in abstract theoretical concepts (e.g. the names of data structures); others have become idiomatic in specific programming language ecosystems.
Second, there are concepts from the domain where our problems originate. These also need to be reflected in the source code. For this purpose, the Domain-Driven Design (DDD) development approach makes use of a "ubiquitous language".
Concepts need to be named consistently. That means:
- Don't use different words with the same meaning for one concept (i.e. no synonyms)
- Don't use words that have several meanings to name a concept (i.e. no homonyms)
The mapping between concepts and names needs to be a one-to-one correspondence (bijective).
Most programs will contain some levels of abstraction. For instance, the concrete concept of a telephone number could be given a name that says exactly that.
If we are in a context where it does not matter that the number is a telephone number, this detail could be abstracted away and the name could refer simply to a number.
So, in case of abstraction, it can be okay to use different names for the same concept.
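As a minimal sketch of this idea in C: the struct field below keeps the concrete name, while a helper that only cares about digits takes a parameter with the abstract name. The identifiers `contact`, `phone_number`, `number`, `count_digits` and `contact_count_phone_digits` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Concrete level: here it matters that the value is a telephone number. */
typedef struct {
    char name[32];
    char phone_number[24];
} contact;

/* Abstract level: this helper works on any string of digits, so the
   telephone detail is abstracted away and the parameter is just `number`. */
static size_t count_digits(const char *number)
{
    assert(number != NULL);
    size_t num_digits = 0;
    for (const char *ptr = number; *ptr != '\0'; ++ptr) {
        if (isdigit((unsigned char)*ptr)) {
            ++num_digits;
        }
    }
    return num_digits;
}

/* Back at the concrete level, the full name is used again. */
size_t contact_count_phone_digits(const contact *contact_ptr)
{
    assert(contact_ptr != NULL);
    return count_digits(contact_ptr->phone_number);
}
```

The same value is called `phone_number` in one place and `number` in another, but each name is used consistently within its own level of abstraction.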
Length & Locality
Hierarchical organization and naming of source code elements make code easier to understand, because they make dependencies between different modules explicit.
In programming languages that support hierarchical namespaces there is no need for hierarchical names, since this feature is already baked into the language.
In languages without this feature, the names of externally visible elements can be composed in a hierarchical manner to achieve the same effect.
Example (in C):
```c
// extern function (externally visible)
// sat  - satellite software
// acs  - attitude control system
// init - function
void sat_acs_init(void);

// static function (only visible inside acs module) - see locality.
static void calc_attitude(void);
```
Note that names always start from the top (root) node of a hierarchy.
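A slightly larger sketch of the same convention, assuming a hypothetical command queue module inside the satellite software. The hierarchy `sat` -> `cmd` -> `queue` and every identifier below are invented for illustration. Externally visible names carry the full path from the root, while the static helper gets a short local name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hierarchy: sat (satellite software) -> cmd (command
   handling) -> queue. */
enum { SAT_CMD_QUEUE_SIZE = 4 };

typedef struct {
    int    cmds[SAT_CMD_QUEUE_SIZE]; /* fixed-size storage */
    size_t head;                     /* index of the oldest command */
    size_t len;                      /* number of commands currently stored */
} sat_cmd_queue;

/* static: visible only inside this module, so a short name suffices. */
static size_t wrap_index(size_t index)
{
    return index % SAT_CMD_QUEUE_SIZE;
}

void sat_cmd_queue_init(sat_cmd_queue *queue)
{
    assert(queue != NULL);
    queue->head = 0;
    queue->len = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
int sat_cmd_queue_push(sat_cmd_queue *queue, int cmd)
{
    assert(queue != NULL);
    if (queue->len == SAT_CMD_QUEUE_SIZE) {
        return -1;
    }
    queue->cmds[wrap_index(queue->head + queue->len)] = cmd;
    queue->len++;
    return 0;
}

/* Returns 0 and stores the oldest command in *cmd, -1 if the queue is empty. */
int sat_cmd_queue_pop(sat_cmd_queue *queue, int *cmd)
{
    assert(queue != NULL && cmd != NULL);
    if (queue->len == 0) {
        return -1;
    }
    *cmd = queue->cmds[queue->head];
    queue->head = wrap_index(queue->head + 1);
    queue->len--;
    return 0;
}
```

A caller in another module sees only the `sat_cmd_queue_*` names, which tell it where in the hierarchy each function lives; `wrap_index` cannot clash with anything outside the file.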
Solution Domain Names
Solution domain names should be idiomatic. While each programming language has its own idioms (which should be followed), I propose the following universal terms.
term[ (abbreviation)][ - antonym [ (antonym abbreviation)]][: term description]
- allocate (alloc) - free
- begin - end
- buffer (buf)
- calculate (calc)
- ceiling (ceil) - floor
- command (cmd)
- copy (cp)
- create - destroy
- delete (del)
- difference (diff)
- directory (dir)
- end - begin
- export - import
- first - last
- floor - ceiling (ceil)
- free - allocate (alloc)
- head - tail
- import - export
- index (i)
- initialize (init) - terminate (term)
- iterator (iter)
- last - first
- length (len): Length of variable length data (e.g. length of a C string in a fixed size char array)
- make (mk)
- maximum (max) - minimum (min)
- memory (mem)
- minimum (min) - maximum (max)
- move (mv)
- new - old
- next - previous (prev)
- number (num)
- object (obj)
- open - close
- pointer (ptr)
- pop - push
- position (pos)
- previous (prev) - next
- push - pop
- read - write
- receive - send
- reference (ref)
- remove (rm)
- send - receive
- size: Size of fixed sized containers (e.g. a C array. It can contain data of variable length)
- source (src)
- start - stop
- stop - start
- string (str)
- tail - head
- terminate (term) - initialize (init)
- write - read
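The distinction between length and size from the list above can be sketched in C as follows. The helper `str_copy` and its parameter names are hypothetical; the point is only that `*_size` names the fixed capacity of a container, while `*_len` names the variable amount of data in it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into buf, truncating if necessary, and always terminates
   the result. buf_size is the fixed capacity of buf in bytes; the return
   value is the length of the string now stored in buf. */
size_t str_copy(char *buf, size_t buf_size, const char *src)
{
    assert(buf != NULL && src != NULL && buf_size > 0);
    size_t src_len  = strlen(src);
    size_t max_len  = buf_size - 1; /* keep one byte for the terminator */
    size_t copy_len = src_len < max_len ? src_len : max_len;
    memcpy(buf, src, copy_len);
    buf[copy_len] = '\0';
    return copy_len;
}
```

With a `char buf[8]`, the size stays 8 no matter what is stored, while the length returned for `"hello"` is 5.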
Names for files and paths.
Consider a file path of the form /dir1/dir2/stem.ext:
- ./dir2/ - (relative) directory (path)
- ./dir2/stem.ext - (relative) file path
- /dir1/dir2/ - (absolute) directory path
- /dir1/dir2/stem.ext - (full or absolute) file path
- dir1, dir2 - directories or folders
- ext - extension
- stem - (filename) stem
- stem.ext - basename or filename
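To show how these terms might appear as identifiers, here is a minimal sketch in C that picks the filename, extension and stem out of a file path. The functions `path_filename`, `path_ext` and `path_stem_len` are hypothetical, not part of any standard library. The sketch assumes `/` as the only separator and treats a leading dot (as in `.profile`) as part of the stem rather than as the start of an extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the filename (basename) inside file_path. */
const char *path_filename(const char *file_path)
{
    assert(file_path != NULL);
    const char *last_sep = strrchr(file_path, '/');
    return last_sep != NULL ? last_sep + 1 : file_path;
}

/* Returns a pointer to the extension (without the dot) inside file_path,
   or NULL if the filename has no extension. */
const char *path_ext(const char *file_path)
{
    const char *filename = path_filename(file_path);
    const char *last_dot = strrchr(filename, '.');
    /* Assumption: a filename that starts with its only dot has no extension. */
    if (last_dot == NULL || last_dot == filename) {
        return NULL;
    }
    return last_dot + 1;
}

/* Returns the length of the stem, i.e. the filename without ".ext". */
size_t path_stem_len(const char *file_path)
{
    const char *filename = path_filename(file_path);
    const char *ext = path_ext(file_path);
    if (ext == NULL) {
        return strlen(filename);
    }
    return (size_t)(ext - filename) - 1; /* minus the dot */
}
```

For `/dir1/dir2/stem.ext` this yields the filename `stem.ext`, the extension `ext` and a stem length of 4.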
Format of Names
An Eye Tracking Study on camelCase and under_score Identifier Styles
Concise and consistent naming
Domain-Driven Design
File name? Path name? Base name? Naming standard for pieces of a path
Folklore and science of naming practices