Kelly O'Hair's Weblog (blogs.sun.com)

Thursday Jun 12, 2008

Removing duplicate PATH entries

It seemed like everywhere people just kept adding things to PATH without regard to whether they were already in PATH. I don't suspect that long PATH entries are a performance problem on Linux and Solaris, but Windows??? I don't pretend to completely understand all the places in a Windows system where PATH is processed and repeatedly scanned and the directories repeatedly probed, but it seemed like an easy thing to fix for all platforms, ... or so I thought. So I investigated how I could remove duplicate entries in the PATH variable using some kind of shell commands, and I ended up with an awk command that worked pretty well.

Below I added newlines inside the single-quoted argument to awk so it was easier to read. You may need to mush those lines together into one long line.


# Given a PATH like string and a separator, remove duplicate entries
removeDups() # string sep
{
  if [ "${osname}" = "windows" ] ; then
    printf "%s\n" "$1" | \
      sed -e 's@\\@/@g' | \
      ${AWK} -F"$2" \
       '{ \
          a[toupper($1)]; \
          printf "%s",$1; \
          for(i=2;i<=NF;i++){ \
            if(!(toupper($i) in a)){ \
              a[toupper($i)]; \
              printf "%s%s",FS,$i; \
            } \
          }; \
          printf "\n"; \
        }'
  else
    printf "%s\n" "$1" | \
      ${AWK} -F"$2" \
       '{ \
          a[$1]; \
          printf "%s",$1; \
          for(i=2;i<=NF;i++){ \
            if(!($i in a)){ \
              a[$i]; \
              printf "%s%s",FS,$i; \
            } \
          }; \
          printf "\n"; \
        }'
  fi
}

# OS name: Linux or SunOS, pot luck on Windows
osname="`uname -s`"
sep=":"
if [ "`printf "%s\n" "${PATH}" | grep ';'`" != "" ] ; then
  osname="windows"
  sep=";"
fi
if [ "`echo ${osname} | grep -i CYGWIN`" != "" ] ; then
  osname="windows"
fi

# Need particular AWK
AWK=awk
if [ "${osname}" = "SunOS" ] ; then
  AWK=nawk
fi

# Get new path setting
newpath=`removeDups "${PATH}" "${sep}"`

# Redefine your PATH setting
if [ "${PATH}" != "${newpath}" ] ; then
  echo "# Resetting PATH to remove duplicates"
  PATH_ORIGINAL="${PATH}"
  export PATH_ORIGINAL
  PATH="${newpath}"
  export PATH
fi
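
To make the first-occurrence-wins behavior concrete, here is a condensed sketch of the non-Windows branch. It is not a replacement for the script above: the name dedup_list and the sample string are made up for illustration, and it calls plain awk, so on SunOS you would probably still want nawk as noted earlier.

```shell
# dedup_list is a hypothetical name, not part of the original script.
# Keeps the first occurrence of each entry and preserves the original order.
dedup_list() # string sep
{
  printf "%s\n" "$1" | awk -F"$2" '{
    out = ""
    kept = 0
    for (i = 1; i <= NF; i++) {
      if (!($i in seen)) {
        seen[$i] = 1
        # Prefix the separator for every kept entry except the first
        out = (kept++ ? out FS : "") $i
      }
    }
    print out
  }'
}

# Sample value made up for this example
dedup_list "/usr/bin:/bin:/usr/bin:/opt/x/bin:/bin" ":"
# prints /usr/bin:/bin:/opt/x/bin
```

The comparison here is case-sensitive, like the non-Windows branch. The Windows branch differs only in folding each entry with toupper() before the lookup.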

Maybe someone else can get something out of this. Or suggest an even better way. ;^)

-kto

Comments:

#! /bin/bash

OFS="$IFS"
IFS=":"
DELIM=""

for c in $1 ;do
  echo $c
done | sort | uniq | while read ;do
  echo -n "$DELIM$REPLY"
  DELIM=":"
done
IFS="$OFS"
echo

Posted by nikita on June 12, 2008 at 01:40 PM PDT #

I think you forgot about the PATH order needing to be preserved.

Also, Windows allows spaces in pathnames and uses the \ character instead of the / character. So consider a PATH like "C:\Program Files\x86\bin;C:\winnt" (just an example).
Some "echo" commands will turn that "\x86" into the character code 0x86, which is not what you want. This is why I used printf instead of echo, to avoid the \x character conversions.

-kto

Posted by Kelly O'Hair on June 12, 2008 at 03:22 PM PDT #
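
Both points in the reply above can be checked with a small sketch. The sample values are made up for illustration, and LC_ALL=C is only there to make the sort order predictable. What echo does with the same string depends on the shell, so the sketch only checks the printf side.

```shell
# Sample values below are made up for illustration.
list="/usr/local/bin:/usr/bin:/bin:/usr/bin"

# sort | uniq drops the duplicate but also reorders the entries,
# so /usr/local/bin no longer comes first.
sorted=$(printf "%s\n" "$list" | tr ':' '\n' | LC_ALL=C sort | uniq | paste -s -d: -)
printf "%s\n" "$sorted"
# prints /bin:/usr/bin:/usr/local/bin

# A Windows-style value with backslashes, single-quoted so the shell
# passes it through unchanged.
winpath='C:\Program Files\x86\bin;C:\winnt'

# printf with a %s format copies its argument without interpreting
# backslash sequences; echo may or may not, depending on the shell.
copied=$(printf "%s\n" "$winpath")
if [ "$copied" = "$winpath" ] ; then
  printf "%s\n" "printf kept the backslashes intact"
fi
```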

Speaking of paths - I've wondered at the output of 'strace java -version' for a while. There is an enormous amount of seemingly senseless 'stat'-ing and 'open'-ing of nonexistent files and folders.

Why?

Best
Anders

execve("/usr/bin/java", ["java", "-version"], [/* 38 vars */]) = 0
brk(0) = 0x8052000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f88000
readlink("/proc/self/exe", "/usr/lib/jvm/java-6-openjdk/jre/bin/java", 4096) = 40
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/sse2/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/sse2/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/sse2/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/sse2", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/i686", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/sse2/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/sse2/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/sse2/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/sse2", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/tls", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/sse2/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/sse2/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/sse2/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/sse2", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/i686", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/sse2/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/sse2/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/sse2/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/sse2", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../lib/i386/jli", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../jre/lib/i386/jli/tls/i686/sse2/cmov/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../jre/lib/i386/jli/tls/i686/sse2/cmov", 0xbfa696f0) = -1 ENOENT (No such file or directory)
open("/usr/lib/jvm/java-6-openjdk/jre/bin/../jre/lib/i386/jli/tls/i686/sse2/libz.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/jvm/java-6-openjdk/jre/bin/../jre/lib/i386/jli/tls/i686/sse2", 0xbfa696f0) = -1 ENOENT (No such file or directory)

http://people.binf.ku.dk/biehl/strace.java.out

Posted by Anders on June 13, 2008 at 01:30 AM PDT #

The pathmunge function on RedHat systems (in /etc/profile) prevents duplicates from ever getting added to the PATH -

pathmunge () {
  if ! echo $PATH | /bin/egrep -q "(^|:)$1($|:)" ; then
    if [ "$2" = "after" ] ; then
      PATH=$PATH:$1
    else
      PATH=$1:$PATH
    fi
  fi
}

Posted by naveed on June 13, 2008 at 01:00 PM PDT #
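
One caveat with the egrep test: $1 is interpreted as a regular expression, so a directory name containing a character such as "." can match more loosely than intended. Below is a sketch of the same idea using a literal match with case. The name path_add is hypothetical, and this is not what RedHat ships.

```shell
# path_add is a hypothetical variant of pathmunge, for illustration only.
# The quoted "$1" inside the case pattern is matched literally.
path_add() # dir [after]
{
  case ":${PATH}:" in
    *":$1:"*)
      ;;  # already present, leave PATH alone
    *)
      if [ "$2" = "after" ] ; then
        PATH="${PATH}:$1"
      else
        PATH="$1:${PATH}"
      fi
      ;;
  esac
}
```

Calling path_add /opt/demo/bin after (a made-up directory) appends the entry once. Repeating the call leaves PATH unchanged.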

On the 'strace java -version' question: this looks like the runtime linker looking for shared libraries, and that's just the way it works.
See what directories are in LD_LIBRARY_PATH. A shared library has to be looked for in every entry in LD_LIBRARY_PATH, then in /usr/lib as I recall.
The cost of these failed opens is probably low, but an overpopulated LD_LIBRARY_PATH is not in general a good thing, assuming that's where this is coming from.
I have traditionally tried not to use LD_LIBRARY_PATH at all. In fact the JDK builds want it to be empty to start. I consider it a special-case variable, just my opinion.

-kto

Posted by Kelly O'Hair on June 13, 2008 at 07:42 PM PDT #
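
A quick way to see what is in LD_LIBRARY_PATH, as suggested above, is to print it one entry per line. This is only a sketch: list_entries is a hypothetical helper name, and whether LD_LIBRARY_PATH is actually behind the probing in that trace is an assumption, as the reply itself notes.

```shell
# list_entries is a hypothetical helper: prints one entry per line.
list_entries() # string sep
{
  printf "%s\n" "$1" | tr "$2" '\n'
}

# Show each LD_LIBRARY_PATH entry and flag the ones that are not directories.
if [ "${LD_LIBRARY_PATH:-}" != "" ] ; then
  list_entries "${LD_LIBRARY_PATH}" ":" | while IFS= read -r dir ; do
    if [ -d "${dir}" ] ; then
      printf "%s\n" "ok:      ${dir}"
    else
      printf "%s\n" "missing: ${dir}"
    fi
  done
else
  printf "%s\n" "LD_LIBRARY_PATH is empty"
fi
```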
