[DRAFT] Bourne/Korn Shell Coding Conventions

This page is currently work-in-progress until it is approved by the OS/Net community. Please send any comments to `<shell-discuss@opensolaris.org>`.

OpenSolaris.org

Table of Contents

Intro

Rules

General

Basic Format
Commenting
Interpreter magic
Harden the script against unexpected (user) input
Use builtin commands if the shell provides them
Use blocks and not subshells if possible
Use long options for "set"
Use $(...) instead of `...` command substitutions
Always put the result of a $(...) or ${ ...;} command substitution in quotes
Scripts should always set their PATH
Make sure that commands from other packages/applications are really installed on the machine
Check how boolean values are used/implemented in your application
The shell always operates on characters not bytes
Multibyte locales and input
Only use external filters like grep/sed/awk/etc. if you want to process lots of data with them
If the first operand of a command is a variable, use --
Use $ export FOOBAR=val # instead of $ FOOBAR=val ; export FOOBAR #
Use a subshell (e.g. $ ( mycmd ) #) around places which use set -- $(mycmd) and/or shift
Be careful with using TABS in script code, they are not portable between editors or platforms
If you have multiple points where your application exits with an error message create a central function for this purpose
Think about using $ set -o nounset # by default
Avoid using eval unless absolutely necessary
Use the string/array concatenation operator +=
Use source instead of '. '(dot) to include other shell script fragments
Use $"..." instead of gettext ... "..." for strings that need to be localized for different locales
Use set -o noglob if you do not need to expand files
Use IFS= to avoid problems with spaces in filenames
Set the message locale if you process output of tools which may be localised
Clean up after yourself.
Use a proper exit code
Use shcomp -n scriptname.sh /dev/null to check for common errors

Functions

Use functions to break up your code
Do not use function names which are reserved keywords in C/C++/JAVA or the POSIX shell standard
Use ksh-style function
Use a proper return code
Use FPATH to load common functions, not source

if, for and while

Format
test Builtin
Use "[[ expr ]]" instead of "[ expr ]"
Use "(( ... ))" for arithmetic expressions
Compare exit code using arithmetic expressions
Use builtin commands in conditions for while endless loops
Single-line if-statements
Exit Status and if/while statements

Variable types, naming and usage

Names of local, non-environment, non-constant variables should be lowercase
Do not use variable names which are reserved keywords/variable names in C/C++/JAVA or the POSIX shell standard
Always use '{'+'}' when using variable names longer than one character
Always put variables into quotes when handling filenames or user input
Use typed variables if possible.
Store lists in arrays or associative arrays
Use compound variables or associative arrays to group similar variables together

I/O

Avoid using the "echo" command for output
Use redirect and not exec to open files
Avoid redirections per command when the output goes into the same file, e.g. $ echo "foo" >xxx ; echo "bar" >>xxx ; echo "baz" >>xxx #
Avoid the creation of temporary files and store the values in variables instead
If you create more than one temporary file create a unique subdir
Use {n}<file instead of fixed file descriptor numbers
Use inline here documents instead of echo "$x" | command
Use the -r option of read to read a line
Print compound variables using print -C varname or print -v varname
Put the command name and arguments before redirections
Enable the gmacs editor mode when reading user input using the read builtin

Math

Use builtin arithmetic expressions instead of external applications
Use floating-point arithmetic expressions if calculations may trigger a division by zero or other exceptions
Use printf "%a" when passing floating-point values
Put constant values into readonly variables
Avoid string to number (and/or number to string) conversions in arithmetic expressions
Set LC_NUMERIC when using floating-point constants

Misc

Put [${LINENO}] in your PS4

Intro

This document describes the shell coding style used for all the SMF script changes integrated into (Open)Solaris.

All new SMF shell code should conform to this coding standard, which is intended to match our existing C coding standard.

When in doubt, think "What would be the C-style equivalent?" and "What does the POSIX (shell) standard say?"

Rules

General

Basic Format

Similar to cstyle, the basic format is that all lines are indented by TABs or eight spaces, and continuation lines (which in the shell end with "\") are indented by an equivalent number of TABs and then an additional four spaces, e.g.

cp foo bar
cp some_realllllllllllllllly_realllllllllllllly_long_path \
    to_another_really_long_path

The encoding used for the shell scripts is either ASCII or UTF-8; alternative encodings are only allowed when the application requires this.

Commenting

Shell comments are preceded by the '#' character. Place single-line comments in the right-hand margin. Use an extra '#' above and below the comment in the case of multi-line comments:

cp foo bar		# Copy foo to bar

#
# Modify the permissions on bar.  We need to set them to root/sys
# in order to match the package prototype.
#
chown root bar
chgrp sys bar

Interpreter magic

The proper interpreter magic for your shell script should be one of these:

#!/bin/sh        Standard Bourne shell script
#!/bin/ksh -p    Standard Korn shell 88 script.  You should always write ksh
                 scripts with -p so that ${ENV} (if set by the user) is not
                 sourced into your script by the shell.
#!/bin/ksh93     Standard Korn shell 93 script (-p is not needed since ${ENV} is
                 only used for interactive shell sessions).

Harden the script against unexpected (user) input

Harden your script against unexpected (user) input, including command line options, filenames with blanks (or other special characters) in the name, or file input.

Use builtin commands if the shell provides them

Use builtin commands if the shell provides them. For example, ksh93s+ (ksh93, version 's+') delivered with Solaris (as defined by PSARC 2006/550) supports the following builtins: basename, cat, chgrp, chmod, chown, cmp, comm, cp, cut, date, dirname, expr, fds, fmt, fold, getconf, head, id, join, ln, logname, mkdir, mkfifo, mv, paste, pathchk, rev, rm, rmdir, stty, tail, tee, tty, uname, uniq, wc, sync. Those builtins can be enabled via $ builtin name_of_builtin # in shell scripts.

Note that ksh93 builtins implement exact POSIX behaviour, while some commands in the Solaris /usr/bin/ directory implement pre-POSIX behaviour. Add /usr/xpg6/bin/:/usr/xpg4/bin before /usr/bin/ in ${PATH} to test whether your script works with the XPG6/POSIX versions.

Use blocks and not subshells if possible

Use blocks and not subshells if possible, e.g. use $ { print "foo" ; print "bar" ; } # instead of $ (print "foo" ; print "bar") #. Blocks are faster since they do not require saving the subshell context (ksh93) or triggering a shell child process (Bourne shell, bash, ksh88 etc.).
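Besides speed, a block and a subshell also differ in where assignments take effect. The following is a minimal sketch (the variable names are made up for illustration, and only POSIX sh syntax is used so that it also runs under /bin/sh) showing that an assignment made inside a subshell is lost while one made inside a block persists:

```shell
counter=0

# Subshell: the assignment happens in a separate environment and is lost.
( counter=1 )
after_subshell="${counter}"

# Block: the assignment happens in the current shell and persists.
{ counter=2 ; }
after_block="${counter}"

printf 'after subshell: %s\n' "${after_subshell}"
printf 'after block:    %s\n' "${after_block}"
```

This is worth keeping in mind when converting a subshell into a block: side effects which were previously contained now reach the rest of the script.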

Use long options for "`set`"

Use long options for "set": for example, instead of $ set -x # use $ set -o xtrace # to make the code more readable.

Use `$(...)` instead of `...` command substitutions

Use $(...) instead of `...` - `...` is an obsolete construct in ksh and POSIX sh scripts, and $(...) is a cleaner design, requires no escaping rules, allows easy nesting etc.

`${ ...;}`-style command substitutions

ksh93 has support for an alternative version of command substitutions with the syntax ${ ...;} which does not run in a subshell.

Always put the result of a `$(...)` or `${ ...;}` command substitution in quotes

Always put the result of $( ... ) or ${ ...;} in quotes (e.g. foo="$( ... )" or foo="${ ...;}") unless there is a very good reason for not doing it.
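The effect of the missing quotes can be seen in a small sketch. The helper functions count_args and produce_output are hypothetical and exist only for this illustration; Bourne-style function syntax is used here so that the sketch also runs under /bin/sh:

```shell
# Hypothetical helper: reports how many arguments it received.
count_args()
{
    printf '%s\n' "$#"
    return 0
}

# Hypothetical command whose output contains blanks and a newline.
produce_output()
{
    printf 'first line\nsecond line\n'
    return 0
}

# Unquoted: the result is split at every blank and newline (4 words).
unquoted_count="$(count_args $(produce_output))"

# Quoted: the result is passed on as one single value.
quoted_count="$(count_args "$(produce_output)")"

printf 'unquoted=%s quoted=%s\n' "${unquoted_count}" "${quoted_count}"
```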

Scripts should always set their `PATH`

Scripts should always set their PATH to make sure they do not use alternative commands by accident (unless the value of PATH is well-known and guaranteed to be set by the caller).

Make sure that commands from other packages/applications are really installed on the machine

Scripts should make sure that commands in optional packages are really there, e.g. add a "precheck" block in scripts to avoid later failure when doing the main job.
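One possible shape for such a "precheck" is sketched below. It assumes the POSIX command -v lookup is acceptable for the target shell; the function name require_commands and the variable missing_commands are made up for this illustration:

```shell
# Hypothetical helper: succeeds only if every named command can be found.
# The names of the missing commands are collected in "missing_commands".
require_commands()
{
    missing_commands=""
    for cmdname in "$@" ; do
        if ! command -v "${cmdname}" >/dev/null 2>&1 ; then
            missing_commands="${missing_commands}${missing_commands:+ }${cmdname}"
        fi
    done
    if [ -n "${missing_commands}" ] ; then
        return 1
    fi
    return 0
}

# Possible use at the start of a script:
#   require_commands awk sed || { echo "missing: ${missing_commands}" >&2 ; exit 1 ; }
```

Running the check once at the start keeps the failure close to its cause instead of surfacing halfway through the main job.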

Check how boolean values are used/implemented in your application

Check how boolean values are used in your application.

For example:

mybool=0
# do something
if [ $mybool -eq 1 ] ; then do_something_1 ; fi

could be rewritten like this:

mybool=false # (valid values are "true" or "false", pointing
# to the builtin equivalents of /bin/true or /bin/false)
# do something
if ${mybool} ; then do_something_1 ; fi

or like this:

integer mybool=0 # values are 0 or 1
# do something
if (( mybool==1 )) ; then do_something_1 ; fi

The shell always operates on characters not bytes

Shell scripts operate on characters and not bytes. Some locales (called "multibyte locales") use multiple bytes to represent one character.

Note

ksh93 has support for binary variables which explicitly operate on bytes, not characters. This is the only allowed exception.

Multibyte locales and input

Think about whether your application has to handle file names or variables in multibyte locales and make sure all commands used in your script can handle such characters. For example, lots of commands in Solaris's /usr/bin/ are not able to handle such values; either use ksh93 builtin constructs (which are guaranteed to be multibyte-aware) or commands from /usr/xpg4/bin/ and/or /usr/xpg6/bin.

Only use external filters like `grep`/`sed`/`awk`/etc. if you want to process lots of data with them

Only use external filters like grep/sed/awk/etc. if a significant amount of data is processed by the filter or if benchmarking shows that the use of builtin commands is significantly slower (otherwise the time and resources needed to start the filter far outweigh the work of processing the data, creating a performance problem).

For example:

if [ "$(echo "$x" | egrep '.*foo.*')" != "" ] ; then
    do_something
fi

can be re-written using ksh93 builtin constructs, saving several |fork()|+|exec()|'s:

if [[ "${x}" == ~(E).*foo.* ]] ; then
    do_something
fi

If the first operand of a command is a variable, use `--`

If the first operand of a command is a variable, use -- for any command that accepts this as end of argument to avoid problems if the variable expands to a value starting with -.

Note

At least print, /usr/bin/fgrep, /usr/bin/grep, /usr/bin/egrep support -- as "end of arguments"-terminator.
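As a small sketch of the problem, assume a search pattern (here the made-up variable pattern) that happens to look like an option:

```shell
pattern="-v"

# Without "--" grep would treat the value of ${pattern} as its -v option
# instead of using it as the search pattern.
matches="$(printf '%s\n' 'ls -v' 'ls -l' | grep -- "${pattern}")"

printf '%s\n' "${matches}"
```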

Use `$ export FOOBAR=val #` instead of `$ FOOBAR=val ; export FOOBAR #`

Use $ export FOOBAR=val # instead of $ FOOBAR=val ; export FOOBAR # - this is much faster.

Use a subshell (e.g. `$ ( mycmd ) #`) around places which use `set -- $(mycmd)` and/or `shift`

Use a subshell (e.g. $ ( mycmd ) #) around places which use set -- $(mycmd) and/or shift unless the variable affected is a local one or it is guaranteed that this variable will no longer be used (be careful with loadable functions, e.g. ksh/ksh93's autoload!).

Be careful with using TABS in script code, they are not portable between editors or platforms

Be careful with using TABS in script code, they are not portable between editors or platforms.

If you use ksh93 use $'\t' to include TABs in sources, not the TAB character itself.

If you have multiple points where your application exits with an error message create a central function for this purpose

If you have multiple points where your application exits with an error message create a central function for this, e.g.

if [ -z "$tmpdir" ] ; then
        print -u2 "mktemp failed to produce output; aborting."
        exit 1
fi
if [ ! -d $tmpdir ] ; then
        print -u2 "mktemp failed to create a directory; aborting."
        exit 1
fi

should be replaced with

function fatal_error
{
    print -u2 "${progname}: $*"
    exit 1
}
# do something (and save ARGV[0] to variable "progname")
if [ -z "$tmpdir" ] ; then
        fatal_error "mktemp failed to produce output; aborting."
fi
if [ ! -d "$tmpdir" ] ; then
        fatal_error "mktemp failed to create a directory; aborting."
fi

Think about using `$ set -o nounset #` by default

Think about using $ set -o nounset # by default (or at least during the script's development phase) to catch errors where variables are used when they are not set (yet), e.g.

$ (set -o nounset ; print ${foonotset})
/bin/ksh93: foonotset: parameter not set

Avoid using `eval` unless absolutely necessary

Avoid using eval unless absolutely necessary. Subtle things can happen when a string is passed back through the shell parser. You can use name references to avoid uses such as eval $name="$value".

Use the string/array concatenation operator `+=`

Use += instead of manually adding strings/array elements, e.g.

foo=""
foo="${foo}a"
foo="${foo}b"
foo="${foo}c"

should be replaced with

foo=""
foo+="a"
foo+="b"
foo+="c"

Use `source` instead of '`.` '(dot) to include other shell script fragments

Use source instead of '.' (dot) to include other shell script fragments - the new form is much more readable than the tiny dot and a failure can be caught within the script.

Use `$"..."` instead of `gettext ... "..."` for strings that need to be localized for different locales

Use $"..." instead of gettext ... "..." for strings that need to be localized for different locales. gettext will require a fork()+exec() and reads the whole catalog each time it's called, creating a huge overhead for localisation (and $"..." is easier to use, e.g. you only have to put a $ in front of the string and it will be localised).

Use `set -o noglob` if you do not need to expand files

If you don't expect to expand files, you can do set -f (set -o noglob) as well. This way the need to use "" is greatly reduced.
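A minimal sketch of the effect follows. The function name expand_without_glob is hypothetical, and the work is done in a subshell so that neither the noglob setting nor the changed positional parameters leak into the caller:

```shell
# Hypothetical helper: prints the number of words an unquoted "*" expands
# to when pathname expansion is disabled, followed by the first word.
expand_without_glob()
{
    (
        cd / || exit 1
        set -o noglob
        pattern='*'
        set -- ${pattern}
        printf '%s:%s\n' "$#" "$1"
    )
    return $?
}

expand_without_glob
```

With noglob set the asterisk stays a literal asterisk even though / contains files. Word splitting still applies to unquoted values, so noglob reduces the need for quotes but does not remove it.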

Use `IFS=` to avoid problems with spaces in filenames

Unless you want to do word splitting, put IFS= at the beginning of a command. This way spaces in file names won't be a problem. You can do IFS='delims' read -r line to override IFS just for the read command. However, you can't do this for the set builtin.
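The difference can be sketched with a line that has leading and trailing blanks and contains a backslash (the variable names are made up for this illustration):

```shell
input='  path with\backslash  '

# Plain read: strips leading/trailing blanks and consumes the backslash.
read plain_line <<EOF
${input}
EOF

# IFS= and -r together: the line is stored exactly as it was written.
IFS= read -r exact_line <<EOF
${input}
EOF

printf '[%s]\n' "${plain_line}" "${exact_line}"
```

Because the assignment is placed in front of the read command, IFS keeps its previous value for the rest of the script.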

Set the message locale if you process output of tools which may be localised

Set the message locale (LC_MESSAGES) if you process output of tools which may be localised.

Example 1. Set LC_MESSAGES when testing for specific output of the /usr/bin/file utility:

# set french as default message locale
export LC_MESSAGES=fr_FR.UTF-8

...

# test whether the file "/tmp" has the filetype "directory" or not
# we set LC_MESSAGES to "C" to ensure the returned message is in english
if [[ "$(LC_MESSAGES=C file /tmp)" = *directory ]] ; then
    print "is a directory"
fi

Note

The environment variable LC_ALL always overrides any other LC_* environment variables (and LANG, too), including LC_MESSAGES. If there is the chance that LC_ALL may be set, replace LC_MESSAGES with LC_ALL in the example above.

Clean up after yourself.

Clean up after yourself. For example ksh/ksh93 have an EXIT trap which is very useful for this.

Note

Note that the EXIT trap is executed for a subshell and each subshell level can run its own EXIT trap, for example

$ (trap "print bam" EXIT ; (trap "print snap" EXIT ; print "foo"))
foo
snap
bam

Use a proper `exit` code

Explicitly set the exit code of a script; otherwise the exit code from the last command executed will be used, which may trigger problems if the value is unexpected.

Use `shcomp -n scriptname.sh /dev/null` to check for common errors

Use shcomp -n scriptname.sh /dev/null to check for common problems (such as insecure, deprecated or ambiguous constructs) in shell scripts.

Functions

Use functions to break up your code

Use functions to break up your code into smaller, logical blocks.

Do not use function names which are reserved keywords in C/C++/JAVA or the POSIX shell standard

Do not use function names which are reserved keywords (or function names) in C/C++/JAVA or the POSIX shell standard (to avoid confusion and/or future changes/updates to the shell language).

Use ksh-style `function`

It is highly recommended to use ksh style functions (function foo { ... }) instead of Bourne-style functions (foo() { ... }) if possible (and local variables instead of spamming the global namespace).

Warning

The difference between old-style Bourne functions and ksh functions is one of the major differences between ksh88 and ksh93 - ksh88 allowed variables to be local for Bourne-style functions while ksh93 conforms to the POSIX standard and will use the global scope for variables declared in Bourne-style functions.

Example (note that "integer" is an alias for "typeset -li"):

# new style function with local variable
$ ksh93 -c 'integer x=2 ; function foo { integer x=5 ; } ; print "x=$x" ; foo ; print "x=$x" ;'
x=2
x=2
# old style function with an attempt to create a local variable
$ ksh93 -c 'integer x=2 ; foo() { integer x=5 ; } ; print "x=$x" ; foo ; print "x=$x" ;'
x=2
x=5

usr/src/lib/libshell/common/COMPATIBILITY says about this issue:

Functions, defined with name() with ksh-93 are compatible with the POSIX standard, not with ksh-88. No local variables are permitted, and there is no separate scope. Functions defined with the function name syntax, maintain compatibility. This also affects function traces.

This issue also affects /usr/xpg4/bin/sh in Solaris 10 because it is based on ksh88. This is a bug.

Use a proper `return` code

Explicitly set the return code of a function - otherwise the exit code from the last command executed will be used which may trigger problems if the value is unexpected.

The only allowed exception is if a function uses the shell's errexit mode to leave a function, subshell or the script if a command returns a non-zero exit code.

Use `FPATH` to load common functions, not `source`

Use the ksh FPATH (function path) feature, and not source, to load functions which are shared between scripts - this allows such functions to be loaded on demand and not all at once.

`if`, `for` and `while`

Format

To match cstyle, the shell token equivalent to the C "{" should appear on the same line, separated by a ";", as in:

if [ "$x" = "hello" ] ; then
    echo $x
fi

if [[ "$x" = "hello" ]] ; then
    print $x
fi

for i in 1 2 3; do
    echo $i
done

for ((i=0 ; i < 3 ; i++)); do
    print $i
done

while [ $# -gt 0 ]; do
    echo $1
    shift
done

while (( $# > 0 )); do
  print $1
  shift
done

`test` Builtin

DO NOT use the test builtin. Sorry, executive decision.

In our Bourne shell, the test built-in is the same as the "[" builtin (if you don't believe me, try "type test" or refer to usr/src/cmd/sh/msg.c).

So please do not write:

if test $# -gt 0 ; then

instead use:

if [ $# -gt 0 ] ; then

Use "`[[ expr ]]`" instead of "`[ expr ]`"

Use "[[ expr ]]" instead of "[ expr ]" if possible since it avoids going through the whole pattern expansion/etc. machinery and adds additional operators not available in the Bourne shell, such as short-circuit && and ||.
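As a brief sketch (it requires ksh or bash and does not work in the Bourne shell; the variable names are made up for illustration), a value containing a blank can be tested with a pattern match and a short-circuit && without any special precautions:

```shell
filename="annual report.txt"

# No word splitting happens inside [[ ]], so the blank in the value is
# harmless, and the right-hand side of == and != is matched as a pattern.
if [[ ${filename} == *.txt && ${filename} != /* ]] ; then
    filetype="relative text file"
else
    filetype="other"
fi

printf '%s\n' "${filetype}"
```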

Use "`(( ... ))`" for arithmetic expressions

Use "(( ... ))" instead of "[ expr ]" or "[[ expr ]]" for arithmetic expressions.

Example: Replace

i=5
# do something
if [ $i -gt 5 ] ; then

with

i=5
# do something
if (( i > 5 )) ; then

Compare exit code using arithmetic expressions

Use POSIX arithmetic expressions to test for exit/return codes of commands and functions. For example, turn

if [ $? -gt 0 ] ; then

into

if (( $? > 0 )) ; then

Use builtin commands in conditions for `while` endless loops

Make sure that your shell has a "true" builtin (like ksh93) when executing endless loops like $ while true ; do do_something ; done # - otherwise each loop cycle runs a |fork()|+|exec()|-cycle to run /bin/true.

Single-line if-statements

It is permissible to use && and || to construct shorthand for an "if" statement in the case where the if statement has a single consequent line:

[ $# -eq 0 ] && exit 0

instead of the longer:

if [ $# -eq 0 ]; then
  exit 0
fi

Exit Status and `if`/`while` statements

Recall that "if" and "while" operate on the exit status of the statement to be executed. In the shell, zero (0) means true and non-zero means false. The exit status of the last command which was executed is available in the $? variable. When using "if" and "while", it is typically not necessary to use $? explicitly, as in:

grep foo /etc/passwd >/dev/null 2>&1
if [ $? -eq 0 ]; then
  echo "found"
fi

Instead, you can more concisely write:

if grep foo /etc/passwd >/dev/null 2>&1; then
  echo "found"
fi

Or, when appropriate:

grep foo /etc/passwd >/dev/null 2>&1 && echo "found"

Variable types, naming and usage

Names of local, non-environment, non-constant variables should be lowercase

Names of variables local to the current script which are not exported to the environment should be lowercase while variable names which are exported to the environment should be uppercase.

The only exceptions are global constants (=global readonly variables, e.g. $ float -r M_PI=3.14159265358979323846 # (taken from <math.h>)) which may be allowed to use uppercase names, too.

Warning

Uppercase variable names should be avoided because there is a good chance of naming collisions with special variable names used by the shell (e.g. PWD, SECONDS etc.).

Do not use variable names which are reserved keywords/variable names in C/C++/JAVA or the POSIX shell standard

Do not use variable names which are reserved keywords in C/C++/JAVA or the POSIX shell standard (to avoid confusion and/or future changes/updates to the shell language).

Note

The Korn Shell and the POSIX shell standard have many more reserved variable names than the original Bourne shell. All these reserved variable names are spelled uppercase.

Always use `'{'`+`'}'` when using variable names longer than one character

Always use '{'+'}' when using variable names longer than one character unless a simple variable name is followed by a blank, /, ;, or $ character (to avoid problems with array, compound variables or accidental misinterpretation by users/shell).

print "$foo=info"

should be rewritten to

print "${foo}=info"

Always put variables into quotes when handling filenames or user input

Always put variables into quotes when handling filenames or user input, even if the values are hardcoded or the values appear to be fixed. Otherwise at least two things may go wrong:

a malicious user may be able to exploit a script's inner workings to inject his/her own code
a script may (fatally) misbehave for unexpected input (e.g. file names with blanks and/or special symbols which are interpreted by the shell)

Note

As an alternative a script may set IFS='' ; set -o noglob to turn off the interpretation of any field separators and the pattern globbing.

Use typed variables if possible.

For example the following is very inefficient since it transforms the integer values to strings and back several times:

a=0
b=1
c=2
# more code
if [ $a -lt 5 -o $b -gt $c ] ; then do_something ; fi

This could be rewritten using ksh constructs:

integer a=0
integer b=1
integer c=2
# more code
if (( a < 5 || b > c )) ; then do_something ; fi

Store lists in arrays or associative arrays

Store lists in arrays or associative arrays - this is usually easier to manage.

For example:

x="
/etc/foo
/etc/bar
/etc/baz
"
echo $x

can be replaced with

typeset -a mylist
mylist[0]="/etc/foo"
mylist[1]="/etc/bar"
mylist[2]="/etc/baz"
print "${mylist[@]}"

or (ksh93-style append entries to a normal (non-associative) array)

typeset -a mylist
mylist+=( "/etc/foo" )
mylist+=( "/etc/bar" )
mylist+=( "/etc/baz" )
print "${mylist[@]}"

Difference between expanding arrays with mylist[@] and mylist[*] subscript operators

Arrays may be expanded using two similar subscript operators, @ and *. These subscripts differ only when the variable expansion appears within double quotes. If the variable expansion is between double quotes, "${mylist[*]}" expands to a single string with the value of each array member separated by the first character of the IFS variable, and "${mylist[@]}" expands each element of the array to a separate string.

Example 2. Difference between [@] and [*] when expanding arrays

typeset -a mylist
mylist+=( "/etc/foo" )
mylist+=( "/etc/bar" )
mylist+=( "/etc/baz" )
IFS=","
printf "mylist[*]={ 0=|%s| 1=|%s| 2=|%s| 3=|%s| }\n" "${mylist[*]}"
printf "mylist[@]={ 0=|%s| 1=|%s| 2=|%s| 3=|%s| }\n" "${mylist[@]}"

will print:

mylist[*]={ 0=|/etc/foo,/etc/bar,/etc/baz| 1=|| 2=|| 3=|| }
mylist[@]={ 0=|/etc/foo| 1=|/etc/bar| 2=|/etc/baz| 3=|| }

Use compound variables or associative arrays to group similar variables together

Use compound variables or associative arrays to group similar variables together.

For example:

box_width=56
box_height=10
box_depth=19
echo "${box_width} ${box_height} ${box_depth}"

could be rewritten to ("associative array"-style)

typeset -A -E box=( [width]=56 [height]=10 [depth]=19 )
print -- "${box[width]} ${box[height]} ${box[depth]}"

or ("compound variable"-style)

box=(
    float width=56
    float height=10
    float depth=19
    )
print -- "${box.width} ${box.height} ${box.depth}"

I/O

Avoid using the "`echo`" command for output

The behaviour of "echo" is not portable (e.g. System V, BSD, UCB and ksh93/bash shell builtin versions all slightly differ in functionality) and should be avoided if possible. POSIX defines the "printf" command as replacement which provides more flexible and portable behaviour.

Use "`print`" and not "`echo`" in Korn Shell scripts

Korn shell scripts should prefer the "print" builtin which was introduced as replacement for "echo".

Caution

Use $ print -- "${varname}" # when there is the slightest chance that the variable "varname" may contain symbols like "-". Or better, use "printf" instead. For example,

integer fx
# do something
print $fx

may fail if "fx" contains a negative value. A better way may be to use

integer fx
# do something
printf "%d\n" fx

Use `redirect` and not `exec` to open files

Use redirect and not exec to open files - exec will terminate the current function or script if an error occurs while redirect just returns a non-zero exit code which can be caught.

Example:

if redirect 5</etc/profile ; then
    print "file open ok"
    head <&5
else
    print "could not open file"
fi

Avoid redirections per command when the output goes into the same file, e.g. `$ echo "foo" >xxx ; echo "bar" >>xxx ; echo "baz" >>xxx #`

Each of the redirections above triggers an |open()|,|write()|,|close()|-sequence. It is much more efficient (and faster) to group the redirection into a block, e.g. $ { echo "foo" ; echo "bar" ; echo "baz" ; } >xxx #
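Written out over several lines, the grouped form could look like the following sketch. It uses printf instead of echo and assumes that the mktemp utility is available to obtain a scratch file name; the variable names are made up for illustration:

```shell
outfile="$(mktemp)"

# The file is opened once for the whole block instead of once per command.
{
    printf '%s\n' "foo"
    printf '%s\n' "bar"
    printf '%s\n' "baz"
} >"${outfile}"

# Count the lines to confirm that all three commands wrote to the same file.
line_count=$(( $(wc -l <"${outfile}") ))
rm -f -- "${outfile}"

printf 'lines written: %d\n' "${line_count}"
```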

Avoid the creation of temporary files and store the values in variables instead

Avoid the creation of temporary files and store the values in variables instead if possible.

Example:

ls -1 >xxx
for i in $(cat xxx) ; do
    do_something ;
done

can be replaced with

x="$(ls -1)"
for i in ${x} ; do
    do_something ;
done

Note

ksh93 supports binary variables (e.g. typeset -b varname) which can hold any value.

If you create more than one temporary file create a unique subdir

If you create more than one temporary file create a unique subdir for these files and make sure the dir is writable. Make sure you clean up after yourself (unless you are debugging).
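A minimal sketch of this pattern is shown below. It assumes that mktemp -d is available to create the unique directory and uses an EXIT trap for the cleanup; the function name work_in_private_tmpdir and the file names are hypothetical:

```shell
# Hypothetical helper: does its work in a private temporary directory and
# prints the directory name so that the caller can verify the cleanup.
work_in_private_tmpdir()
{
    (
        tmpdir="$(mktemp -d)" || exit 1
        # Remove the directory whenever this subshell exits.
        trap 'rm -rf -- "${tmpdir}"' EXIT

        printf 'one\n' >"${tmpdir}/first"
        printf 'two\n' >"${tmpdir}/second"
        cat "${tmpdir}/first" "${tmpdir}/second" >/dev/null

        printf '%s\n' "${tmpdir}"
    )
    return $?
}
```

Running the work in a subshell ties the EXIT trap to that subshell, so the directory is removed as soon as the function is done and not only when the whole script ends.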

Use {n}<file instead of fixed file descriptor numbers

When opening a file use {n}<file, where n is an integer variable rather than specifying a fixed descriptor number.

This is highly recommended in functions to prevent fixed file descriptor numbers from interfering with the calling script.

Example 3. Open a network connection and store the file descriptor number in a variable

function cat_http
{
    integer netfd

...

    # open TCP channel
    redirect {netfd}<>"/dev/tcp/${host}/${port}"

    # send HTTP request
    request="GET /${path} HTTP/1.1\n"
    request+="Host: ${host}\n"
    request+="User-Agent: demo code/ksh93 (2007-08-30; $(uname -s -r -p))\n"
    request+="Connection: close\n"
    print "${request}\n" >&${netfd}

    # collect response and send it to stdout
    cat <&${netfd}

    # close connection
    exec {netfd}<&-

...

}

Use inline here documents instead of `echo "$x" | command`

Use inline here documents, for example

command <<< $x

rather than

print -r -- "$x" | command
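The two forms can be compared side by side in a short sketch. It requires ksh93 or bash for the <<< operator, uses printf in place of print so that it also runs in bash, and uses the standard tr utility merely as a stand-in for "command":

```shell
text="hello world"

# Pipeline form: an extra process feeds the value to the command.
upper_pipe="$(printf '%s\n' "${text}" | tr 'a-z' 'A-Z')"

# Inline here document form: the shell passes the value directly.
upper_here="$(tr 'a-z' 'A-Z' <<< "${text}")"

printf '%s\n%s\n' "${upper_pipe}" "${upper_here}"
```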

Use the `-r` option of `read` to read a line

Use the -r option of read to read a line. You never know when a line will end in \, and without -r such a backslash acts as a line continuation so that multiple lines can be read as one.

Print compound variables using `print -C varname` or `print -v varname`

Print compound variables using print -C varname or print -v varname to make sure that non-printable characters are correctly encoded.

Example 4. Print compound variable with non-printable characters

compound x=(
    a=5
    b="hello"
    c=(
        d=9
        e="$(printf "1\v3")" 
    )
)
print -v x

will print:

(
        a=5
        b=hello
        c=(
                d=9
                e=$'1\0133' 
        )
)

The vertical tab (\v, octal \013) stored in e is printed in its encoded form.

Put the command name and arguments before redirections

Put the command name and arguments before redirections. You can legally do $ > file date instead of date > file but don't do it.

Enable the `gmacs` editor mode when reading user input using the `read` builtin

Enable the gmacs editor mode before reading user input using the read builtin to enable the use of cursor+backspace+delete keys in the edit line.

Example 5. Prompt user for a string with gmacs editor mode enabled

set -o gmacs 
typeset inputstring="default value"
...
read -v inputstring?"Please enter a string: "
...
printf "The user entered the following string: '%s'\n" "${inputstring}"

...

	set -o gmacs: Enable gmacs editor mode.
	-v: The value of the variable is displayed and used as a default value.
	inputstring: Variable used to store the result.
	?"Please enter a string: ": Prompt string which is displayed on stderr.

Math

Use builtin arithmetic expressions instead of external applications

Use builtin (POSIX shell) arithmetic expressions instead of expr, bc, dc, awk, nawk or perl.

Note

ksh93 supports C99-like floating-point arithmetic including special values such as +Inf, -Inf, +NaN, -NaN.
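As a small integer-only sketch (the variable names are made up for illustration), the following computes the same product once with the builtin arithmetic expansion and once with the external expr utility:

```shell
width=7
height=6

# Builtin arithmetic expansion: evaluated by the shell itself.
area=$(( width * height ))

# The same calculation with the external expr utility, for comparison only.
area_expr="$(expr "${width}" '*' "${height}")"

printf '%d %d\n' "${area}" "${area_expr}"
```

Both give the same result, but the builtin form starts no extra process and needs no quoting of the * operator.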

Use floating-point arithmetic expressions if calculations may trigger a division by zero or other exceptions

Use floating-point arithmetic expressions if calculations may trigger a division by zero or other exceptions - floating point arithmetic expressions in ksh93 support special values such as +Inf/-Inf and +NaN/-NaN which can greatly simplify testing for error conditions, e.g. instead of a trap or explicit if ... then... else checks for every sub-expression you can check the results for such special values.

Example:

$ ksh93 -c 'integer i=0 j=5 ; print -- "x=$((j/i)) "'
ksh93: line 1: j/i: divide by zero
$ ksh93 -c 'float i=0 j=-5 ; print -- "x=$((j/i)) "'
x=-Inf

Use `printf "%a"` when passing floating-point values

Use printf "%a" when passing floating-point values between scripts or as output of a function to avoid rounding errors when converting between bases.

Example:

function xxx
{
    float val

    (( val=sin(5.) ))
    printf "%a\n" val
}
float out
(( out=$(xxx) ))
xxx
print -- $out

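This will print the exact hexadecimal value written by the direct call of xxx, followed by the value of out in the default decimal output format:

-0x1.eaf81f5e09933226af13e5563bc6p-01
-0.9589242747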

Put constant values into readonly variables

For example:

float -r M_PI=3.14159265358979323846

or

float M_PI=3.14159265358979323846
readonly M_PI

Avoid string to number (and/or number to string) conversions in arithmetic expressions

Avoid string to number and/or number to string conversions in arithmetic expressions to avoid performance degradation and rounding errors.

Example 6. (( x=$x*2 )) vs. (( x=x*2 ))

float x
...
(( x=$x*2 ))

will convert the variable "x" (stored in the machine's native |long double| datatype) to a string value in base10 format, apply pattern expansion (globbing), then insert this string into the arithmetic expression and parse the value, which converts it back into the internal |long double| datatype format. This is both slow and generates rounding errors when converting the floating-point value between the internal base2 representation and the base10 representation of the string.

The correct usage would be:

float x
...
(( x=x*2 ))

i.e. omit the '$', because it is (at least) redundant within arithmetic expressions.

Example 7. x=$(( y+5.5 )) vs. (( x=y+5.5 ))

float x
float y=7.1
...
x=$(( y+5.5 ))

will calculate the value of y+5.5, convert it to a base10 string value and assign that string to the floating-point variable x, which converts it back into the internal |long double| datatype format.

The correct usage would be:

float x
float y=7.1
...
(( x=y+5.5 ))

i.e. this will save the string conversions and avoid any base2-->base10-->base2-conversions.

Set `LC_NUMERIC` when using floating-point constants

Set LC_NUMERIC when using floating-point constants to avoid problems with radix-point representations which differ from the representation used in the script. For example, the de_DE.* locales use ',' instead of '.' as the default radix point symbol.

For example:

# Make sure all math stuff runs in the "C" locale to avoid problems with alternative
# radix point representations (e.g. ',' instead of '.' in de_DE.*-locales). This
# needs to be set _before_ any floating-point constants are defined in this script.
if [[ "${LC_ALL}" != "" ]] ; then
    export \
        LC_MONETARY="${LC_ALL}" \
        LC_MESSAGES="${LC_ALL}" \
        LC_COLLATE="${LC_ALL}" \
        LC_CTYPE="${LC_ALL}"
    unset LC_ALL
fi
export LC_NUMERIC=C
...
float -r M_PI=3.14159265358979323846

Note

The environment variable LC_ALL always overrides all other LC_* variables, including LC_NUMERIC. The script should always protect itself against custom LC_NUMERIC and LC_ALL values as shown in the example above.

Misc

Put `[${LINENO}]` in your `PS4`

Put [${LINENO}] in your PS4 prompt so that you get line numbers when you run with -x. If you are looking at performance issues, put $SECONDS in the PS4 prompt as well.
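A minimal sketch of such a prompt is shown below. The variable "trace_output" is a hypothetical name, and tracing is enabled only inside a command substitution so that the trace can be captured and inspected; a real script would normally just run with -x. The exact prefix printed in front of each trace line differs slightly between shells.

```shell
# Single quotes delay the expansion of ${LINENO} until each trace line is printed.
# For performance work, '[${LINENO}:${SECONDS}] ' adds the elapsed time as well.
PS4='[${LINENO}] '

# Enable tracing only inside a command substitution and capture the trace
# (written to stderr) in the hypothetical variable "trace_output".
trace_output="$( { set -o xtrace ; : "traced command" ; } 2>&1 )"

printf "%s\n" "${trace_output}"
```

If PS4 were set with double quotes, ${LINENO} would be expanded once at assignment time and every trace line would show the same number.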
