Removing Duplicate PATH Entries | Linux Journal
The goal here is to remove duplicate entries from the PATH variable.
But before I begin, let's be clear: there's no compelling reason
to do this. The shell will, in essence, ignore duplicate PATH entries;
only the first occurrence of any one path is important.
Two motivations drive this exercise.
The first is to look at an awk one-liner that initially
doesn't really appear to do much at all.
The second is to feed the needs of those who are annoyed by
such things as having duplicate PATH entries.
I first had the urge to do this when working with Cygwin.
On Windows, which puts almost every executable in a different
directory, your PATH variable quickly can become overwhelming,
so removing duplicates makes it slightly less confusing
when you're trying to decipher what's actually in your PATH variable.
Your first thought about how to do this might be to break up the path
into the individual elements with sed and
then pass that through sort and uniq to get rid of duplicates.
But you'd quickly realize that that doesn't work, since you've
now reordered the paths, and you don't want that. You want to keep
the paths in their original order, just with duplicates removed.
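To see the reordering problem concretely, here is a minimal sketch. It uses tr rather than sed to split the path, simply because that is shorter, and the sample path is made up for illustration:

```shell
# Sample path, chosen for illustration only.
sample=/usr/bin:/bin:/usr/local/bin:/usr/bin:/bin

# Split on colons, sort, drop adjacent duplicates, and rejoin with colons.
# LC_ALL=C keeps the sort order independent of the current locale.
sorted=$(printf '%s\n' "$sample" | tr ':' '\n' | LC_ALL=C sort | uniq | paste -sd: -)

echo "$sorted"
```

The duplicates are gone, but /bin now comes before /usr/bin, which changes which directory the shell searches first.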
The original idea for this was not mine. I found the basic
code for it on the internet. I don't remember exactly where, but
I believe it was on Stack Exchange.
The original bash/awk code was something like this:
PATH=$(echo $PATH | awk -v RS=: -v ORS=: '!($0 in a) {a[$0]; print}')
And it's close. It almost works, but before looking at the output,
let's look at why/how it works.
To do that, first notice the -v options. Those set the input
and output Record Separator variables (RS and ORS) that awk uses to
separate the input data into individual records
and to reassemble them on output.
The default is to separate them by newlines; that is, each
line of input is a separate record.
Instead of newlines, let's use colons as the separators,
which gives each of the individual paths in the PATH variable
as a separate record.
You can see how this works in the following, where you change only
the input separator and leave the output separator as the newline,
and come up with a simple awk one-liner to print each of the elements
of the path on a separate line:
$ cat showpath.sh
export PATH=/usr/bin:/bin:/usr/local/bin:/usr/bin:/bin
awk -v RS=: '{print}' <<<$PATH

$ bash showpath.sh
/usr/bin
/bin
/usr/local/bin
/usr/bin
/bin
So, back to the original code.
To help understand it, let's make it look a bit more awkish by reformatting
it so that it has the more standard pattern { action }
or condition { action } look to it:
!($0 in a) {
    a[$0];
    print
}
The condition here is !($0 in a).
In this, $0 is the current input record, and a is an awk variable
(the use of the in operator tells you that a is an array).
Remember, each input record is an individual path from the PATH variable.
The part inside the parentheses, $0 in a, tests to see if the path
is in the array a.
The exclamation point and the parentheses negate the condition.
So, if the current path is not in a, the action executes.
If the current path is in a, the action doesn't execute,
and since that's all there is to the script, nothing happens in that case.
If the current path is not in the array,
the code in the action uses the path as a key to
reference into the array.
In awk, arrays are associative arrays, and referencing a
non-existent element in an associative array automatically creates
the element.
By creating the element in the array, you've now set the array so
that the next time you see the same path element, your condition !($0 in a)
will fail and the action will not execute.
In other words, the action will execute only the first time that you see a path.
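The automatic creation of array elements is easy to check in isolation. The following is a small sketch, not part of the original code; the key "/bin" is just an example value:

```shell
result=$(awk 'BEGIN {
    print ("/bin" in a)   # 0: the key does not exist yet
    a["/bin"]             # a bare reference creates the element
    print ("/bin" in a)   # 1: the key exists now
}')

echo "$result"
```

Note that the in test itself does not create the element; only the reference a["/bin"] does, which is why the test has to come first in the one-liner.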
And finally, after referencing the array, you print the current path,
and awk automatically adds the output separator.
Note that an empty print is the same as print $0.
Let's see it in action:
$ cat nodupes.sh
export PATH=/usr/bin:/bin:/usr/local/bin:/usr/bin:/bin
echo $PATH | awk -v RS=: -v ORS=: '!($0 in a) {a[$0]; print}'

$ bash nodupes.sh
/usr/bin:/bin:/usr/local/bin:/bin
:
As I said, it almost works.
The only problem is there's an extra newline and an extra colon on
the following line.
The extra newline comes from the fact that echo is adding a newline
onto the end of the path, and since awk isn't treating newlines as
separators, it gets added to the end of the last path,
which, in this case, causes it to look like awk failed to remove a duplicate.
But awk doesn't see them as duplicates; it sees
/bin and /bin\n (that is, /bin followed by a newline).
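One way to make the hidden newline visible is to print the length of each record. This is a quick sketch using a two-element sample path chosen for illustration:

```shell
# Both elements read "/bin", but echo appends a newline to the second one.
lengths=$(echo /bin:/bin | awk -v RS=: '{ print length($0) }')

echo "$lengths"
```

The first record is four characters long and the second is five, so as far as awk is concerned they are different array keys.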
You can eliminate the trailing newline by using the -n option to echo:
$ cat nodupes2.sh
export PATH=/usr/bin:/bin:/usr/local/bin:/usr/bin:/bin
echo -n $PATH | awk -v RS=: -v ORS=: '!($0 in a) {a[$0]; print}'

$ bash nodupes2.sh
/usr/bin:/bin:/usr/local/bin:
And you're almost there, except for the trailing colon. That colon is worth
removing: the shell treats an empty PATH element as the current directory,
which is rarely what you want. Besides, since you've come this
far on this fairly pointless journey, you might as well go the distance.
To fix the problem, use awk's printf command rather than print.
Unlike print, printf does not automatically include output record separators,
so you have to output them yourself:
$ cat nodupes3.sh
export PATH=/usr/bin:/bin:/usr/local/bin:/usr/bin:/bin
echo -n $PATH | awk -v RS=: '!($0 in a) {a[$0]; printf("%s%s", length(a) > 1 ? ":" : "", $0)}'

$ bash nodupes3.sh
/usr/bin:/bin:/usr/local/bin
You may be a bit confused by this at first glance.
Rather than eliminating the trailing separator,
you've reversed the logic, and you're outputting the separator first,
then the PATH element, so instead of needing to eliminate the
trailing separator, you need to suppress a leading separator.
The record separator is output by the first %s format specifier
and comes from the length(a) > 1 ? ":" : "",
so it's printed only when there is more than one element in the array
(that is, the second and subsequent times).
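If your awk does not accept length() applied to an array, a plain counter can do the same job. The following is a sketch of that variation, not the original code: the function name dedupe_path, the array name seen and the counter n are all made up for illustration, and printf '%s' stands in for echo -n.

```shell
# dedupe_path is a hypothetical helper name, not from the original code.
dedupe_path() {
    # printf '%s' emits the value with no trailing newline, like echo -n.
    printf '%s' "$1" | awk -v RS=: '
        !($0 in seen) {
            seen[$0]                             # remember this path
            printf("%s%s", n++ ? ":" : "", $0)   # separator before all but the first
        }'
}

cleaned=$(dedupe_path /usr/bin:/bin:/usr/local/bin:/usr/bin:/bin)
echo "$cleaned"
```

To apply it to the real variable, you could run PATH=$(dedupe_path "$PATH"), preferably after checking the output by eye first.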
As I said at the outset, there's no reason you have to remove
duplicate path entries; they cause no harm.
However, for some, the simple fact that they are there is
reason enough to eliminate them.