Naming in Software Engineering

There are two aspects of software documentation:

  • Why the program exists and what it does to achieve its goal from a user perspective.
  • How the source code works, i.e. the developer perspective.

A reader of source code who is unfamiliar with the "why" and "what" of the program may not be able to fully understand it. Hence, it may be a good idea to document the user perspective in the source code.

Today, most code comments document the "how", which is controversial. Parts of the software community try to write self-documenting code that does not depend on comments to make clear how it works. In that view, "how" comments are considered a code smell.

How do we write self-documenting code? It needs to be easy to comprehend. Some techniques help:

  • Keep it simple
  • Write small modules and functions that have a single, well-defined purpose
  • Use abstractions (reduction to relevant information and selection of appropriate representations)

...and, perhaps most important of all:

  • Use descriptive names
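
As a minimal sketch of what this can look like, the hypothetical functions below (is_divisible_by and is_leap_year are names chosen only for illustration) let the names explain how the result is computed, so no "how" comment is needed:

```c
#include <stdbool.h>

/* Small helper with a single, well-defined purpose. */
static bool is_divisible_by(int value, int divisor)
{
    return value % divisor == 0;
}

/* The intermediate name is_century documents the rule being applied. */
bool is_leap_year(int year)
{
    bool is_century = is_divisible_by(year, 100);
    return is_divisible_by(year, 4) && (!is_century || is_divisible_by(year, 400));
}
```

Compare this with a version that writes year % 4 == 0 && (year % 100 != 0 || year % 400 == 0) inline and then needs a comment to explain it.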

So the questions are: what makes a good name, and how do we find one?

Domain

Names stem from two domains:

  • Solution Domain
  • Problem Domain

Obviously, we have to name concepts from the domain that we use to solve problems, that is, computer science. Some names originate in abstract theoretical concepts (e.g. the names of data structures); others have become idiomatic in specific programming language ecosystems.

Secondly, there are concepts from the domain where our problems originate. These also need to be reflected in the source code. For this purpose, the Domain-Driven Design (DDD) development approach makes use of a "ubiquitous language": a vocabulary shared by developers and domain experts.

Consistency

Concepts need to be named consistently. That means:

  • Don't use different words with the same meaning for one concept (i.e. no synonyms)
  • Don't use words that have several meanings to name a concept (i.e. no homonyms)

The mapping between concepts and names needs to be a one-to-one correspondence (a bijection).
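
A small sketch of the synonym rule, using a hypothetical sensor module (the type and function names are assumptions made for this example only). The inconsistent variant is shown as a comment; the code below it uses one verb for one concept:

```c
/* Inconsistent: three synonyms for the single concept "obtain the latest value".
 *   int sensor_get_temperature(const sensor *s);
 *   int sensor_fetch_pressure(const sensor *s);
 *   int sensor_retrieve_humidity(const sensor *s);
 */

/* Hypothetical data type holding the latest measurements. */
typedef struct {
    int temperature;
    int pressure;
    int humidity;
} sensor;

/* Consistent: the verb "read" is used everywhere for this concept. */
int sensor_read_temperature(const sensor *s) { return s->temperature; }
int sensor_read_pressure(const sensor *s)    { return s->pressure; }
int sensor_read_humidity(const sensor *s)    { return s->humidity; }
```

A reader who has seen one of these functions can guess the names of the others, which is the practical benefit of the one-to-one mapping.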

Abstraction

Most programs will contain some levels of abstraction. For instance, the concrete concept of a telephone number could be named telephone_number.

If we are in a context where it does not matter that the number is a telephone number, this detail could be abstracted away and it could be named number.

So, when abstracting, it can be okay to use a different, more general name for the same concept.
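
The following sketch illustrates the two levels. Both functions are hypothetical, and the digit limits are an assumption chosen for the example, not a validation rule to rely on. The generic helper calls its parameter number, while the domain-level function keeps the concrete name telephone_number:

```c
#include <ctype.h>
#include <stddef.h>

/* Generic level: it does not matter what kind of number this is. */
size_t count_digits(const char *number)
{
    size_t digit_count = 0;
    for (; *number != '\0'; ++number) {
        if (isdigit((unsigned char)*number)) {
            ++digit_count;
        }
    }
    return digit_count;
}

/* Domain level: here the concrete concept matters, so the name says so.
 * The bounds 3 and 15 are illustrative assumptions. */
int is_plausible_telephone_number(const char *telephone_number)
{
    size_t digit_count = count_digits(telephone_number);
    return digit_count >= 3 && digit_count <= 15;
}
```

The same string is called telephone_number in one function and number in the other, yet the naming stays consistent within each level of abstraction.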

Length & Locality

Hierarchy

Hierarchical organization and naming of source code elements make code easier to understand, because they make dependencies between different modules explicit.

In programming languages that support hierarchical namespaces, there is no need for hierarchical names, since this feature is already built into the language.

In languages without this feature, the names of externally visible elements can be composed hierarchically to achieve the same effect.

Example (in C):

// extern function (externally visible)
//   sat  - satellite software
//   acs  - attitude control system
//   init - initialize (the function itself)
void sat_acs_init(void);

// static function (only visible inside the acs module) - see locality.
// No prefix is needed because the name cannot clash outside this file.
static void calc_attitude(void);

Note that names always start from the top (root) node of a hierarchy.

Solution Domain Names

Solution domain names should be idiomatic. While each programming language has its own idioms (that should be followed), I propose the following universal terms.

term[ (abbreviation)][ - antonym [ (antonym abbreviation)]][: term description]

allocate (alloc) - free 
begin - end
buffer (buf)
calculate (calc)
ceiling (ceil) - floor
command (cmd)
copy (cp)
create - destroy
delete (del)
difference (diff)
directory (dir)
end - begin
export - import
first - last
floor - ceiling (ceil)
free - allocate (alloc)
head - tail
import - export
index (i)
initialize (init) - terminate (term)
iterator (iter)
last - first
length (len): Length of variable-length data (e.g. the length of a C string stored in a fixed-size char array)
make (mk)
maximum (max) - minimum (min)
memory (mem)
minimum (min) - maximum (max)
move (mv)
new - old
next - previous (prev)
number (num)
object (obj)
open - close
pointer (ptr)
pop - push
position (pos)
previous (prev) - next
push - pop
read - write
receive - send
reference (ref)
remove (rm)
send - receive
size: Size of a fixed-size container (e.g. a C array), which can hold data of variable length
source (src)
start - stop
stop - start
string (str)
tail - head
terminate (term) - initialize (init)
write - read
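
To show how some of these terms combine, here is a minimal sketch of a bounded string copy. The function name str_copy and its signature are assumptions made for this example; it uses buf, src, size, len and max in the senses listed above:

```c
#include <stddef.h>
#include <string.h>

/* Copies src into buf, truncating if necessary, and always NUL-terminates
 * (unless buf_size is 0).
 *   buf_size - size of the fixed-size container, in bytes
 *   returns  - len of the string now stored in buf
 */
size_t str_copy(char *buf, size_t buf_size, const char *src)
{
    if (buf_size == 0) {
        return 0;
    }
    size_t src_len = strlen(src);
    size_t max_len = buf_size - 1;  /* leave room for the terminating NUL */
    size_t copy_len = (src_len < max_len) ? src_len : max_len;
    memcpy(buf, src, copy_len);
    buf[copy_len] = '\0';
    return copy_len;
}
```

With char buf[8], the call str_copy(buf, sizeof buf, "abc") stores a string of len 3 in a container of size 8, which is exactly the distinction between the two terms.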

Names for files and paths.

Consider a file path of the form /dir1/dir2/stem.ext:

./dir2/ - (relative) directory path
./dir2/stem.ext - (relative) file path
/dir1/dir2/ - (absolute) directory path
/dir1/dir2/stem.ext - (full or absolute) file path
dir1, dir2 - directories or folders
ext - extension
stem - (filename) stem
stem.ext - basename or filename
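
The terms above can be sketched in code as follows. The type path_parts and the function path_split are hypothetical names, and the sketch assumes '/' as the only path separator and treats a leading dot (as in .profile) as part of the stem:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *basename;  /* points at "stem.ext" inside the file path */
    size_t      stem_len;  /* length of "stem" */
    const char *ext;       /* points at "ext", or at the NUL if there is none */
} path_parts;

path_parts path_split(const char *file_path)
{
    path_parts parts;
    const char *last_slash = strrchr(file_path, '/');
    parts.basename = (last_slash != NULL) ? last_slash + 1 : file_path;

    const char *last_dot = strrchr(parts.basename, '.');
    if (last_dot == NULL || last_dot == parts.basename) {
        /* No extension, or a leading dot only. */
        parts.stem_len = strlen(parts.basename);
        parts.ext = parts.basename + parts.stem_len;
    } else {
        parts.stem_len = (size_t)(last_dot - parts.basename);
        parts.ext = last_dot + 1;
    }
    return parts;
}
```

For the file path /dir1/dir2/stem.ext, the result has basename pointing at stem.ext, stem_len equal to 4 and ext pointing at ext.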

Format of Names

There is some empirical evidence that snake_case improves readability. However, use it only if it is idiomatic in your language.

Sources:

An Eye Tracking Study on camelCase and under_score Identifier Styles
Concise and consistent naming
Domain-Driven Design
File name? Path name? Base name? Naming standard for pieces of a path
Folklore and science of naming practices