Soft: A Software Environment Abstraction Mechanism

Rémy Evard and Robert Leslie
Experimental Systems Group, Northeastern University

ABSTRACT

In a traditional UNIX environment, software is installed in many different locations across a distributed filesystem. In order to effectively use the software, users must correctly configure their PATH, MANPATH and other related shell environment variables. A large and dynamic software environment can cause havoc for users as they try to locate programs not in their PATH, resolve filename collisions, and keep up with changes made by administrators, while attempting to update their startup files. In turn, administrators must notify users of new directories or values to put in their files and must spend time debugging users' environments.

A layer of abstraction between the available applications and the way those applications are made available to users through environment variable settings adds a great deal of flexibility for both users and administrators. Administrators can configure and modify software installations without having to notify users of changes. Users may simply indicate which sets of software they desire to use, or they may create arbitrarily complex user-specific modifications. We have implemented this abstraction with a mechanism that adds negligible overhead at login time and does not require any special shells.

Introduction

Two years ago, the systems administration staff and computing systems in the College of Computer Science at Northeastern University underwent a significant reorganization. The entire computing environment was rebuilt from scratch, opening a number of opportunities for inventing unique solutions to common administration problems. A complete description of this endeavor, named Tenwen, can be found elsewhere in these proceedings.

One such problem we faced is familiar to many administrators: informing users how to configure their software environment, starting with the PATH environment variable, but continuing with other important variables, such as MANPATH, TEXINPUTS, XAPPLRESDIR, and so forth.

This seemingly simple problem has few adequate solutions. The name and syntax of the startup file to be modified depend on the user's login shell, and the actual modifications to make depend on how software is installed on the system. Neither of these is necessarily something all users can determine for themselves.

An abstraction mechanism simplifies matters on both sides of the user-administrator continuum by separating the details of how, where, and why software resides where it does from the way users access it. Users can configure their environments easily, reliably, and generally without fuss, while administrators remain free to choose their software installation design. When necessary, changes can be made to all users' environments simultaneously to reflect the dynamic nature of the computing system.

We have named this software abstraction mechanism "Soft".

Site Information

The Experimental Systems Group manages the computers in the College of Computer Science, consisting of approximately 350 computers of various types and around 1200 active users. The group is made up of both full-time staff members and student volunteers, totalling an average of 10 people each quarter.

At last count, over 2500 different software packages had been installed on each of our major architectures in the last two years. Our users range from complete novices to expert UNIX programmers.

Typical Software Installation Factors

The way software is installed greatly influences the way users' environment variables must be configured; there are usually one or more significant directories such as /usr/local/bin where the majority of software is installed. There are also many other directories specific to particular applications or computers. Besides the PATH and MANPATH variables, there are sometimes additional environment variables which must be set correctly to work with certain applications.

All of the above is true regardless of whether a comprehensive method for installing software is used, but having such a method can actually increase the usefulness of an automatically configured environment. For example, if your software installation method emphasizes the separation of large applications into individual directories, then those directories can be combined and manipulated easily by the administrators in such a way that the applications become available painlessly to users, either automatically, or through a simple modification to one file for those users interested in the applications.

Problems

The usual mechanism for configuring a new user's environment is to create a default PATH and MANPATH in the user's default startup files. This works well until a new directory needs to be added, and all of the existing users need to be told to update their files. Many will fail to get the message. Inevitably, others will end up with directories in the wrong order for their particular needs, and name conflicts will mystify users until they give in and ask for help. This mechanism also typically fails to address other important environment variables specific to various applications.

Another common mechanism is to set default environments from global shell configuration files, if such global initializations are supported by all of the various shells in use. However, there are often some odd cases which users must handle, and, inevitably, the directories will be in the wrong order for certain people. Also, many people will want to customize the default environment, a feat which will be difficult for them to do without overriding the defaults completely.

The overall result is that users must know more about your software installation methods than either they or you want, which complicates their lives, constrains your flexibility, and generates unnecessary overhead in the form of endless questions and environment bugs.

Previous Work

envv, by David F. Skoll, as posted to comp.sources.misc, provides a primitive abstraction method by creating shell-independent scripts. However, users must still know which of those scripts to access, and it is expensive at login time.

user-setup, from Auburn University (LISA VI paper), allows users to select software packages and creates dotfiles for them. However, it relies on software installations using Modules (LISA V paper), and it works only for csh-style shells, making it virtually useless for many people.

Depot (LISA VI paper) may appear to be related, but in fact addresses a different issue. Depot and similar packages manage software installation but do not specifically address user environment issues. Both layers of abstraction are needed in a complex environment.

Design Goals

We wanted to create a basic abstraction mechanism that allowed maximum flexibility for both the administrators and the users. We wanted to remain in complete control of the abstraction, yet we felt it important that users be able to customize their environment in even the most detailed way possible.

We wanted a system that was fast, and that provided a suitable default environment without any special effort. We also wanted to be able to specialize an environment easily to include specific subsets of software for people who were beta-testing programs, or for people who needed access to administrative programs, and so forth.

Finally, we wanted to be able to group related software in such a way that the entire group could be added to one's PATH in a simple step. This grouping mechanism would be powerful enough to enable users to resolve their own name collisions. For example, users would have the ability to override the standard UNIX utilities with their GNU counterparts if they desired to do so.

Overview of Soft

Soft works by reading a single file from the user's home directory, .software, to determine how the user's environment should be configured. This file has its own simple syntax, independent of any shell.

Each word in this file is an index key to an administrator-controlled database listing all of the directories and environment variables associated with each group of software available on the system.

There are two kinds of database entries: application entries and macro entries. Application entries contain all of the information needed to use a particular application or class of applications: the necessary PATH and MANPATH components, as well as any other needed environment variables and contents. It is possible to create application entries which do not have any PATH or MANPATH components, but are used merely to introduce other variables into the environment.

Macro entries allow several other entries to be combined into a single keyword. A macro may contain references to application entries, or other macros. When a macro keyword is found in a .software file, the system expands the macro in a recursive manner according to the entries in the database until it resolves completely into a list of application entries. Macro entries are distinctly identified by a preceding `@' character.
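The recursive expansion described above can be sketched in a few lines of Python. This is only an illustration and not the authors' Perl implementation: the function name expand_keywords and the dictionary layout are assumptions, the macro definitions are copied from the sample database in Appendix B, and the rule that the first occurrence of a keyword wins is taken from the software(5) manual page in Appendix A.

```python
def expand_keywords(words, macros, _seen=None, _active=None):
    """Expand macro keywords (those starting with '@') into application keywords.

    `macros` maps a lowercase macro name to the list of words it stands for.
    Keywords are compared case-insensitively and the first occurrence wins.
    """
    seen = set() if _seen is None else _seen        # application keywords already emitted
    active = set() if _active is None else _active  # macros currently being expanded
    result = []
    for word in words:
        key = word.lower()
        if key.startswith("@"):
            if key in active:
                raise ValueError(f"macro {word} refers to itself")
            if key not in macros:
                raise KeyError(f"unknown macro {word}")
            active.add(key)
            result.extend(expand_keywords(macros[key], macros, seen, active))
            active.discard(key)
        elif key not in seen:
            seen.add(key)
            result.append(word)
    return result


# Macro definitions copied from the sample database in Appendix B.
MACROS = {
    "@x-windows": ["X11R5"],
    "@all-apps": ["TeX", "MH"],
    "@standard": ["ccs", "system", "GNU", "@x-windows", "@all-apps"],
}

print(expand_keywords(["GNU", "@standard"], MACROS))
# ['GNU', 'ccs', 'system', 'X11R5', 'TeX', 'MH']
```

Because GNU is listed before @standard, the later GNU produced by the macro is dropped, which is how a user can make GNU utilities precede the vendor ones.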

The system processes the .software file whenever the user logs in. It creates an executable shell script based on the contents of the file that would customize the user's environment. This script is called the .software _cache_ file, because it is recreated only when necessary. The vast majority of the time, no updating is necessary and the cache file can be executed (sourced) quickly. The cache file must be updated only when either the user's .software file is changed, or when the system-wide database is changed.
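The "recreated only when necessary" test might look like the following Python sketch. It assumes that file modification times are compared, which is consistent with the advice in Appendix B to touch the database file, but the actual Soft scripts may decide differently. The function name cache_is_stale is hypothetical.

```python
import os
import tempfile


def cache_is_stale(cache_path, source_paths):
    """Return True if the cache is missing or older than any existing source."""
    try:
        cache_mtime = os.path.getmtime(cache_path)
    except OSError:
        return True  # no cache yet, so it must be built
    for source in source_paths:
        try:
            if os.path.getmtime(source) > cache_mtime:
                return True
        except OSError:
            continue  # this sketch simply skips sources that do not exist
    return False


# Demonstration with throwaway files and explicit timestamps.
with tempfile.TemporaryDirectory() as home:
    dot_software = os.path.join(home, ".software")
    cache = os.path.join(home, ".software-cache.csh")
    for path in (dot_software, cache):
        with open(path, "w"):
            pass
    os.utime(dot_software, (1000, 1000))  # .software last changed at t=1000
    os.utime(cache, (2000, 2000))         # cache built later, at t=2000
    fresh = cache_is_stale(cache, [dot_software])
    os.utime(dot_software, (3000, 3000))  # user edits .software at t=3000
    stale = cache_is_stale(cache, [dot_software])
    os.remove(cache)
    missing = cache_is_stale(cache, [dot_software])

print(fresh, stale, missing)  # False True True
```

In a real installation the list of sources would contain both the user's .software file and the system-wide database, so that a change to either one triggers a rebuild.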

User Interface

We currently give all of our new users a simple, default .software file which gives them our standard supported environment. We have written documentation for the Soft mechanism in the form of man pages, which users can read to learn how to customize their .software file and their environment. The mechanism is heavily customizable, satisfying even the most finicky of our users.

Most users will be content to select from the predefined list of keywords, listing them in the order in which they want the corresponding directories to appear in their PATH and other variables. At any point, users may insert other special directives into their .software file of the form:

  VARIABLE=value

which introduces an environment variable and an associated value. The two variables PATH and MANPATH are treated in a special manner, such that:

  PATH=/proj/demeter/bin

inserts the directory /proj/demeter/bin into the user's PATH relative to the other path modifiers surrounding it.
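To make the positional behaviour concrete, here is a small Python sketch of how a .software file could be read and turned into a PATH. It is an assumption-laden illustration rather than the real make-sw-cache logic: the function names parse_software and build_path are hypothetical, the PATHS table is a hand-copied subset of Appendix B, and macros are assumed to have been expanded already.

```python
def parse_software(text):
    """Split .software text into ('keyword', name) and ('assign', var, value) items."""
    items = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop a trailing comment, if any
        for word in line.split():
            if "=" in word:
                var, value = word.split("=", 1)
                items.append(("assign", var, value))
            else:
                items.append(("keyword", word))
    return items


def build_path(items, path_for_keyword):
    """Assemble PATH components in file order; PATH= words keep their position."""
    components = []
    for item in items:
        if item[0] == "keyword":
            components.extend(path_for_keyword.get(item[1].lower(), []))
        elif item[1] == "PATH":
            components.extend(item[2].split(":"))
    return ":".join(components)


SAMPLE = """\
GNU                     # GNU utilities first
PATH=/proj/demeter/bin  # a project directory
system
"""

# Hypothetical subset of the database, with values taken from Appendix B.
PATHS = {
    "gnu": ["/local/gnu/bin"],
    "system": ["/usr/ucb", "/bin", "/usr/bin", "/etc", "/usr/etc"],
}

print(build_path(parse_software(SAMPLE), PATHS))
# /local/gnu/bin:/proj/demeter/bin:/usr/ucb:/bin:/usr/bin:/etc:/usr/etc
```

The project directory lands between the GNU and system directories because that is where the PATH= word sits in the file.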

Users who would rather avoid this system completely may simply delete their .software file, and no customizations will take place whatsoever. (At our site, several users have done this, but then moved back to using a .software file.)

Since the .software file is shell-independent, users can change shells and have the same software environment without having to modify any startup files. Only the cache file is dependent on what variety of shell is in use. Currently both sh- and csh-style scripts can be generated, and generating cache files for other types of shells would be a simple addition.
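The shell-specific part of cache generation is small, as the following Python sketch suggests. It is not the authors' code: emit_settings is a hypothetical name, and the sketch assumes that values contain no whitespace, which holds for .software words because they are whitespace-separated. The csh form matches the setenv lines shown in Appendix D.

```python
def emit_settings(settings, shell):
    """Render (name, value) pairs as lines for an sh- or csh-style cache file."""
    lines = []
    for name, value in settings:
        if shell == "csh":
            lines.append(f"setenv {name} {value}")
        elif shell == "sh":
            lines.append(f"{name}={value}; export {name}")
        else:
            raise ValueError(f"unsupported shell style: {shell}")
    return "\n".join(lines) + "\n"


SETTINGS = [
    ("TEXPOOL", "/ccs/apps/tex/lib"),
    ("RESEARCH", "$HOME/research"),
]

print(emit_settings(SETTINGS, "csh"), end="")
print(emit_settings(SETTINGS, "sh"), end="")
```

Supporting another shell family would mean adding one more branch. The sketch passes values through untouched; a real generator for csh also has to brace a variable name that is followed by a colon, as the ${ARCH} references in Appendix D show.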

Implementation

In the following list, $HOME represents each user's home directory, while $SOFT represents the Soft installation directory. The complete set of files used by the Soft system is as follows:

$HOME/.software
    the user's configuration file

$HOME/.software-cache.{sh,csh}
    the user's shell-specific startup cache files (never modified by hand)

$SOFT/software-env.{sh,csh}
    the global shell-specific startup files (never change)

$SOFT/software-env.db
    the global configuration database, where administrators make updates

$SOFT/make-sw-cache
    a Perl script to create a cache file from a .software file

Installation of the Soft system requires initially creating the software database, and placing hooks into each supported shell's global initialization files to source the appropriate $SOFT/software-env.* file. (If a shell does not support any global initialization files, the hook can be placed in the user-level startup files.)

In practice, we have found it helpful to define an alias, "resoft", in each user's environment which immediately causes any changes in the user's .software file to be reflected in the environment (updating the cache file if necessary). This alias merely runs the same hook as above, sourcing one of the $SOFT/software-env.* scripts. It must be an alias, rather than an actual command, so that it can modify the current environment.

Administrators update the database when new directories for software are created, or when a special application requires specific environment variables. No update is required when an application is installed in an existing directory and has no special environment variables.

Experiences

Our user feedback has been completely positive. The number of PATH-related questions we have answered in the last year totals approximately five.

We had to modify the global database quite frequently when it was first built, but updates have settled down to about once a month or less, depending on the level of activity related to software installations.

The performance of the system has been quite satisfactory. A small amount of time is needed to rebuild the per-user cache file when changes are made, but this is infrequent, and the cost of the entire system is negligible otherwise.

The ability of the system to cope with both sh- and csh-style shells has proven useful not only because it allows users to change shells without grief, but also because it can work simultaneously with windowing startup files (typically Bourne shell scripts), allowing the windowing environment to parallel the user's login shell environment (which may be of the csh variety). This allows programs invoked from a window manager to be used the same way as programs invoked from the command line.

We have found that, as far as users are concerned, the actual switch from X11R5 to X11R6 will be a simple matter of modifying one macro definition in the system-wide database file. In fact, we plan to move to a completely new software directory structure in the upcoming months without disturbing the user community at all.

Planned Improvements

There are several ways we expect to be able to improve the Soft mechanism.

We plan to write user-interface software that will help even the most novice users configure their .software files to their satisfaction. The interface will consult our independent software installation system and help database to provide detailed information to the user about the software they are selecting. It will provide an easy mechanism to reorder their PATH, and insert custom directories.

We also plan to extend the .software file format and syntax so that various pieces can be made conditional, taking effect only on particular machines or platforms. For example, if a particular directory exists on only one machine, then it should appear in users' PATH variable only when they log in to that machine. Similarly, other variables could be customized in a dynamic fashion depending entirely on the attributes of the host machine. The .software cache files would be constructed with real shell conditionals to achieve this effect.
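As a purely speculative illustration of what such generated conditionals might look like, the Python sketch below emits a host test for either shell family. Nothing here reflects a designed syntax: the function name host_conditional, the host name "alpha", and the choice to append to the end of PATH are all assumptions made for the example. The directory is the CM entry from Appendix B.

```python
def host_conditional(shell, hostname, directory):
    """Return cache-file text that appends `directory` to PATH on one host only."""
    if shell == "sh":
        return (
            f'if [ "`hostname`" = "{hostname}" ]; then\n'
            f"    PATH=$PATH:{directory}; export PATH\n"
            "fi\n"
        )
    if shell == "csh":
        # Braces keep csh from reading ":/..." as a variable modifier.
        return (
            f'if ("`hostname`" == "{hostname}") then\n'
            f"    setenv PATH ${{PATH}}:{directory}\n"
            "endif\n"
        )
    raise ValueError(f"unsupported shell style: {shell}")


print(host_conditional("sh", "alpha", "/usr/cm/bin"), end="")
print(host_conditional("csh", "alpha", "/usr/cm/bin"), end="")
```

Evaluating the test inside the cache file, rather than when the cache is built, would let one cache in a shared home directory serve every machine the user logs in to.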

Conclusion

The Soft abstraction mechanism has made life much simpler for the administrators and the users of our computer systems. The administrators can change the underlying software architecture and configure software packages without having to notify the user community. The users have a very simple startup mechanism with no loss of functionality.

We highly recommend this system to anyone administering a site of any size.

Availability

Our implementation of the Soft abstraction mechanism is available via anonymous FTP from ftp.ccs.neu.edu:/pub/sysadmin/soft-1.1.tar.gz.

Author Information

Rémy Evard has been the leader of the Experimental Systems Group at Northeastern University for two busy years. He received his M.S. in Computer Science from the University of Oregon in 1992, and has worked as a systems administrator and as a consultant for Argonne National Laboratory for the past six years. He may be reached electronically at remy@ccs.neu.edu.

Robert Leslie is a full-time undergraduate student in the College of Computer Science at Northeastern University. He has worked closely with the Systems Group for nearly two years, helping to design and implement many of the changes in the College's computing environment since its restructuring. He is scheduled to graduate with a B.S. in Computer Science in 1996. He may be reached electronically at django@ccs.neu.edu.

Appendix A: Manual Pages

Following are two manual pages, one for the .software file format, and one for the `resoft' command.

SOFTWARE(5)               FILE FORMATS                SOFTWARE(5)

NAME
      .software - configuration file for user environment

DESCRIPTION
     Your software environment in the College of Computer Science
     (namely,  your PATH, MANPATH, and possibly other environment
     variables) is initially determined when  you  login  by  the
     contents of this file in your home directory.  The mechanism
     for doing this is designed to  be  flexible,  easy  to  use,
     efficient, and independent of which shell you use.

     The file contains a list of words  separated  by  whitespace
     (spaces, tabs, or newlines). Comments can appear anywhere in
     the file, following a hash  (#)  character.  Each  word  can
     either  be  a keyword to be expanded by the system, or some-
     thing of the form

          ENVIRONVAR=value

     which defines an environment variable.  The  variables  PATH
     and  MANPATH are special cases, and are treated in a special
     way as described below.

     Each keyword in your .software file is looked up in a system
     dictionary  to  find a translation for your PATH and MANPATH
     environment variables. The order of these keywords is impor-
     tant;  these  variables will be built in the same order that
     the keywords appear. Some keywords are prefixed with an  at-
     symbol  (@);  these  are  used in the same way as other key-
     words, except  the  system  automatically  expands  them  in
     macro-like  fashion  into  a  list  of  equivalent  keywords
     without the at-symbol. If any keyword appears more than once
     (including  as  part  of  macro  expansion),  then the first
     occurrence of the keyword takes precedence.

     As mentioned above, words of the form  ENVIRONVAR=value  can
     be placed anywhere in your .software file to set environment
     variables automatically for you when you login.  The  advan-
     tage  of  doing this over setting them in your .login, .pro-
     file or other shell initialization file is that these  vari-
     ables  are guaranteed to be set regardless of what shell you
     use (provided your shell is officially supported by the sys-
     tem).  There  are  two special cases in the way this is han-
     dled:

          PATH=/some/bin/directory

          MANPATH=/some/man/directory

     Each of these directives will append  the  specified  direc-
     tories to the respective variable, not replace its contents.
     You can  freely  mix  PATH=  and  MANPATH=  with  any  other
     keywords,  and  thereby easily create a fairly comprehensive
     environment. You are free to use $VAR  syntax  to  represent
     the value of an existing variable in these directives.

     The exact list of keywords that you  can  use  to  configure
     your  environment  is  determined by a system-wide database.
     Its contents are subject to change periodically, however the
     following  keywords  are  generally  guaranteed  to be valid
     (case is not significant):

     system         The  set  of  system  directories  containing
                    standard UNIX commands.

     ccs            The set of  local  (CCS)  directories,  which
                    should almost always precede system.

     sys5           The set of System V directories for Sun  sys-
                    tems.

     GNU            All GNU software and utilities.

     X11            The latest revision of the  X  Window  System
                    software (version 11).

     TeX            All TeX publishing  software,  including  the
                    proper  special  environment variables needed
                    to use it correctly.

     adm            Directories with special administrative  com-
                    mands,  usually  only  useful  to users doing
                    systems administration.

     beta           Directories with  software  undergoing  beta-
                    testing.  This software is technically unsup-
                    ported, but possibly useful or interesting.

     home           The directories $HOME/bin/$ARCH and $HOME/bin
                    (from your home directory).

     dot            The single current directory "."

     Unless you have specific needs to arrange the above keywords
     in a specific manner, the following macros are probably more
     useful:

     @base               All essential local  and  system  direc-
                         tories  (ccs and system). If you are not
                         using any of the following  macros,  you
                         should   at   least  use  this  in  your
                         .software file, or  you  may  find  many
                         common commands missing.

     @standard           This gives you  a  standard  environment
                         that includes essential directories, GNU
                         software,  X,  and  any  other   special
                         applications  as  defined  by the system
                         database.

     @gnu-standard       This is the same as @standard, except that
                         the  GNU  software precedes the standard
                         system software,  so  that  certain  GNU
                         commands   will  override  their  vendor
                         counterparts. This is  recommended  only
                         if you understand its consequences.

     @all                This  is  a  more  general  macro   that
                         includes  a  few more things than @stan-
                         dard, particularly sys5 directories.

     @gnu-all            Finally, this is the same as @all except
                         that   GNU   software   precedes  system
                         software, similar to @gnu-standard.

     Note that none of the above macros include home  or  dot  in
     them,  so you should include them yourself wherever you want
     them. It is not recommended that you use dot at all, because
     it  can  be  a  security  problem. If you must use it, it is
     recommended that  you  put  it  at  the  very  end  of  your
     .software file.

     The information in your .software file is  cached.  Normally
     when you login, the system checks to see if this cache needs
     to be updated. If so, the system rereads your .software file
     to  rebuild  the cache; if you are using the system interac-
     tively, you will see a message informing you of  this  fact.
     The   system  caches  your  .software  file  into  the  file
     .software-cache.shell also in  your  home  directory,  where
     shell  is  replaced  by  either sh or csh depending on which
     variant of shell you use.

     If you make a change to your .software file,  you  will  not
     see  the  effect  of  those  changes until the next time you
     login. However, the command resoft can  be  used  to  update
     your  environment  on-the-fly.  If  you find for some reason
     that even after doing this your changes do not  seem  to  be
     taking  effect,  your  cache file is probably faulty and you
     can simply delete it and try again.

BUGS
     If there are syntax errors in your .software file, the  sys-
     tem  generally  will  not  try  to use it at all, and simply
     gives you a minimal default environment and a  warning  mes-
     sage.  To  discover  the problem, review your cache file and
     look for the error.

     If you have no .software file at all in your home directory,
     this  entire description does not apply, and the system will
     not provide you with  any  default  environment.  While  not
     strictly  a bug, this is not recommended practice. For those
     users with special needs, however, this may be considered  a
     feature.

FILES
     $HOME/.software
     $HOME/.software-cache.{csh,sh}
     /ccs/etc/software-env.db
     /ccs/etc/software-env.{csh,sh}
     /ccs/etc/make-sw-cache

SEE ALSO
     resoft(1)


RESOFT(1)                USER COMMANDS                  RESOFT(1)

NAME
     resoft - effect changes in software environment

DESCRIPTION
     The resoft command is specific to the  software  environment
     in  the  College of Computer Science at Northeastern Univer-
     sity. It is used to make your current  software  environment
     reflect that defined by your .software file.

     resoft is actually defined as an alias;  each  of  the  sup-
     ported  shells will automatically define this alias for you.
     The alias normally does something similar to the following:

          source /ccs/etc/software-env.csh

     The actual definition of the alias depends  on  which  shell
     you use.

SEE ALSO
     software(5)

Appendix B: Sample software-env.db

The following is a sample $SOFT/software-env.db system-wide database file.

#
# Database format:
#   - One logical line per entry; physical lines may be crossed using \
#   - Comment lines beginning with `#' are ignored
#   - Words are separated by whitespace
#   - Keys are case-insensitive
#   - Ordering in this file is not significant
#   - Environment variables may be used freely in path settings
#   - Either the PATH or MANPATH may be set to `-' to ignore it
#   - No need to enclose environ vars between { }
#   - Warning: in case of duplicate keywords, only the last is effective
#
# To update every user's .software cache, touch this file.
#
#  (keyword)	PATH	[ MANPATH	[ ENVIRONVAR	SETTING		... ] ]
#

#
# System-wide standard settings:

(system)	/usr/ucb:/bin:/usr/bin:/etc:/usr/etc	/usr/man
(sys5)		/usr/5bin
(ccs)		/ccs/bin:/local/bin			/ccs/man:/local/man
(home)		$HOME/bin/$ARCH:$HOME/bin		$HOME/man
(dot)		.

#
# Special system/group settings:

(adm)		/ccs/adm/bin:/priv/adm/bin:/local/adm/bin:/local/etc	\
			/ccs/adm/man:/priv/adm/man:/local/adm/man
(beta)		/ccs/beta/bin:/local/beta/bin				\
			/ccs/beta/man:/local/beta/man
(sun4-games)	/usr/games

#
# Collections of software:

(GNU)		/local/gnu/bin			/local/gnu/man
(X11R5)		/local/apps/X11R5/bin		/local/apps/X11R5/man
(X11R6)		/usr/X11R6/bin:/ccs/X11R6/bin:/local/X11R6/bin		\
			/usr/X11R6/man:/ccs/X11R6/man:/local/X11R6/man
(Openwin)	/local/apps/Openwin/bin		/local/apps/Openwin/man

#
# Application packages:

(TeX)	-	-	\
	TEXFORMATS					\
	  .:/local/apps/tex/lib/formats:/ccs/apps/tex/lib/formats \
	TEXINPUTS	.:/ccs/apps/tex/lib/inputs	\
	TEXPOOL		/ccs/apps/tex/lib		\
	BIBINPUTS	.:/ccs/apps/tex/lib/bib

(ObjectCenter)		\
    /local/apps/objectcenter/bin:/local/apps/objectcenter/sparc-sunos4/bin  \
    /local/apps/objectcenter/man

(ECLiPSe)	/local/apps/eclipse/bin/$ARCH	/local/apps/eclipse/man	\
		KEGIDIR	/local/apps/eclipse

(gcc-2.5)	/local/gnu/apps/gcc-2.5.8/bin	/local/gnu/apps/gcc-2.5.8/man

#
# Research software

(CM)		/usr/cm/bin			/usr/cm/man
(MasPar)	/usr/maspar/bin			/usr/maspar/man		\
		MP_PATH		/usr/maspar

#
# Miscellaneous

(MH)		/local/apps/mh/bin		/local/apps/mh/man
(NetHack)	/local/apps/nethack/bin		/local/apps/nethack/man
(Netrek)	/local/apps/netrek/bin

#
# The following are macros for groups of apps or "latest version".
# They should resolve to a list of other keywords or macros in this database.

(@x-windows)	X11R5
(@games)	NetHack Netrek sun4-games
(@all-apps)	TeX MH

#
# Bundle the essential paths into a macro:

(@base)		ccs system

#
# Choose-your-own-software-bundle ... "home" and "dot" not included:

(@standard)	ccs system GNU @x-windows @all-apps
(@gnu-standard)	ccs GNU system @x-windows @all-apps

#
# More bundles, more inclusive:

(@all)		@standard sys5 @games
(@gnu-all)	@gnu-standard sys5 @games

#
# For those users with empty/bad .software, this entry is required:

(@default)	@standard home dot
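The format rules listed at the top of the sample database are enough to write a small reader for it. The Python sketch below is one possible reading of those rules, not the parser used by make-sw-cache; the names logical_lines and parse_database and the returned dictionary layout are assumptions. The sample text is an excerpt of the entries above.

```python
def logical_lines(text):
    """Yield logical lines, joining physical lines that end in a backslash."""
    pending = ""
    for raw in text.splitlines():
        stripped = raw.strip()
        if not pending and (not stripped or stripped.startswith("#")):
            continue  # skip blank lines and comments between entries
        if stripped.endswith("\\"):
            pending += stripped[:-1] + " "
        else:
            yield pending + stripped
            pending = ""
    if pending.strip():
        yield pending


def parse_database(text):
    """Return (applications, macros) dictionaries keyed by lowercase keyword."""
    applications, macros = {}, {}
    for line in logical_lines(text):
        words = line.split()
        key = words[0]
        if not (key.startswith("(") and key.endswith(")")):
            raise ValueError(f"malformed entry: {line!r}")
        name = key[1:-1].lower()  # keys are case-insensitive
        fields = words[1:]
        if name.startswith("@"):
            macros[name] = fields  # a later duplicate replaces an earlier one
            continue
        path = fields[0] if fields else "-"
        manpath = fields[1] if len(fields) > 1 else "-"
        extras = fields[2:]
        if len(extras) % 2:
            raise ValueError(f"variable without a setting: {line!r}")
        applications[name] = {
            "path": [] if path == "-" else path.split(":"),
            "manpath": [] if manpath == "-" else manpath.split(":"),
            "env": dict(zip(extras[0::2], extras[1::2])),
        }
    return applications, macros


SAMPLE_DB = r"""
# Excerpt from the sample database
(GNU)   /local/gnu/bin   /local/gnu/man
(dot)   .
(TeX)   -   -   \
        TEXINPUTS   .:/ccs/apps/tex/lib/inputs   \
        TEXPOOL     /ccs/apps/tex/lib
(@base) ccs system
"""

apps, macros = parse_database(SAMPLE_DB)
print(apps["tex"]["env"])
print(macros["@base"])
```

Storing entries in dictionaries gives the documented duplicate behaviour for free, since a later assignment to the same key replaces the earlier one.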

Appendix C: Sample .software Files

An example of a simple .software file:

@base		# Get basic system commands - leave this!
@x-windows	# All basic X utilities
GNU		# GNU software
@all-apps	# All other special applications

A more complex .software file:

GNU			# Make GNU utils override standard system commands
@base			# Get basic system commands - leave this!
PATH=/lib:/usr/lib	# Include /lib stuff
@x-windows		# All basic X utilities
MH			# MH mail system commands

RESEARCH=$HOME/research			# Create a custom environ var

PATH=$RESEARCH/bin/$ARCH:$RESEARCH/bin	# Include a custom path
MANPATH=$RESEARCH/man			# Custom man pages

home		# Include $HOME/bin/$ARCH + $HOME/bin
TeX		# TeX publishing package
@games		# We like to have fun once in a while...

dot		# Include "." in the path, at the very end

Appendix D: Sample .software Cache File

A sample cache file that was generated from the previous example of a complex .software file:

# DO NOT MODIFY THIS FILE DIRECTLY
#
# This file was created automatically by the software system. You can force
# its recreation by altering or touching the following file:
#
# -rw-r--r--  1 django        692 Aug  2 10:16 /home/django/.software
#
# For more information, refer to software(5) and resoft(1).
#
setenv RESEARCH $HOME/research
setenv TEXFORMATS .:/local/apps/tex/lib/formats:/ccs/apps/tex/lib/formats
setenv TEXINPUTS .:/ccs/apps/tex/lib/inputs
setenv TEXPOOL /ccs/apps/tex/lib
setenv BIBINPUTS .:/ccs/apps/tex/lib/bib
#
setenv PATH /local/gnu/bin:/ccs/bin:/local/bin:/usr/ucb:/bin:/usr/bin:/etc:\
/usr/etc:/lib:/usr/lib:/local/apps/X11/bin:/local/apps/mh/bin:\
$RESEARCH/bin/${ARCH}:$RESEARCH/bin:$HOME/bin/${ARCH}:$HOME/bin:\
/local/apps/nethack/bin:/local/apps/netrek/bin:/usr/games:.
#
setenv MANPATH /local/gnu/man:/ccs/man:/local/man:/usr/man:\
/local/apps/X11/man:/local/apps/mh/man:$RESEARCH/man:$HOME/man:\
/local/apps/nethack/man
#
# End of cache (v4.4)