Understanding the ‘tr’ command

The tr command in UNIX is a text-processing command used to replace (translate) or delete characters. Its syntax is:

tr [OPTION] string1 string2

Remember, anything that is written in square brackets is optional. Now, let us understand how tr works.

tr "a" "A" (enter)

When this command runs, tr waits for input: you will not see the $ prompt again until you finish. Every line you type is read as input, and tr prints it back with the changes applied. Press Ctrl+D to end the input and return to the prompt.

tr "a" "A"

avantika (this is the string that I have given as input. On pressing enter, tr processes the line and displays the result)

AvAntikA (this is the output string: every a is replaced, not only the first one)

Common Errors:

1. tr "a": When you run this, you get an error like "tr: two strings must be given when translating". So make sure you provide both strings when using tr.

2. tr "a" "": If the second string is empty, you get the error "when not truncating set1, string2 must be non-empty". However, if the first string is empty, no error is shown. Since you have not said what needs to be changed, nothing gets changed. Example: tr "" "avantika". Here tr does not know which characters to replace, so whatever you give as input is printed as it is.

3. tr "a" "A" file_name: tr does not accept a file name. You can give it only three things: the option (which is optional), the first string, and the second string. tr always reads from standard input, so to change the contents of a file you need to use a pipe and the cat command.
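The three mistakes above can be checked from a script by looking at the exit status of tr instead of reading the error text. This is only a sketch, assuming GNU coreutils tr and a POSIX shell; the variable names are made up for illustration, and file_name does not need to exist because tr rejects the extra argument before reading anything.

```shell
# Each variable stays 0 unless tr fails, in which case it holds tr's exit status.
# Input comes from /dev/null so nothing waits for typing; messages are discarded.
one_string=0
tr "a" </dev/null >/dev/null 2>&1 || one_string=$?

empty_second=0
tr "a" "" </dev/null >/dev/null 2>&1 || empty_second=$?

file_operand=0
tr "a" "A" file_name </dev/null >/dev/null 2>&1 || file_operand=$?

# A non-zero value means tr refused to run that command.
echo "one string: $one_string, empty string2: $empty_second, file name: $file_operand"
```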

UNDERSTANDING THE USE OF CAT AND PIPE IN tr

Most of the time we don't want to change a single typed line; we want to change an entire file. As mentioned earlier, that requires the cat command and a pipe. How does it work?

cat file_name | tr "a" "A"

Understand: cat reads the file and writes its contents to the pipe, and the pipe passes them to tr (in a pipe, the output of the command on the left is sent as input to the command on the right). Now that tr has its input, it makes the changes and displays the result, just as it did before when you typed the input and pressed enter.
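To try the pipe without touching a real file, here is a small sketch that creates a throwaway file first. The file contents and the variable names (file_name, piped) are assumptions chosen for the example.

```shell
# Create a temporary file so the example is self-contained.
file_name=$(mktemp)
printf 'avantika\nbanana\n' > "$file_name"

# cat writes the file into the pipe; tr reads it and prints the translation.
piped=$(cat "$file_name" | tr "a" "A")
echo "$piped"

# Clean up the temporary file.
rm -f "$file_name"
```

With GNU tr this prints AvAntikA and bAnAnA on separate lines.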

UNDERSTANDING tr AND REDIRECTION

Most of the time, when you change a file's contents, you want to keep those changes. Simply running tr does not change the file; it only prints the result. If you want to keep the changes, you have to store them in a file.

To do this, you need the tr command, the cat command, a pipe and the redirection operator.

cat file_name | tr "a" "A" | cat >> new_file

Understand: First cat reads the file and its contents are passed to tr. tr makes the changes to its input, but this time, instead of being printed, the changed content is stored in new_file. Note that >> appends: if new_file already exists, the changed content is added to its end.

When you now run cat new_file, you will see the changed content.
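The following sketch runs the same pipeline on temporary files and then compares the two files, to show that the original is left untouched. The file contents and the variables original and changed are assumptions for illustration.

```shell
# Two temporary files: the source and the (initially empty) destination.
file_name=$(mktemp)
new_file=$(mktemp)
printf 'avantika\n' > "$file_name"

# Same pipeline as in the text: the translated text is appended to new_file.
cat "$file_name" | tr "a" "A" | cat >> "$new_file"

original=$(cat "$file_name")   # still the untranslated text
changed=$(cat "$new_file")     # the translated copy
echo "original: $original"
echo "changed:  $changed"

rm -f "$file_name" "$new_file"
```

Because >> appends, running the pipeline a second time would add a second copy of the text to new_file; use > instead if you want the file overwritten each time.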

DELVING DEEPER

tr "a" "A" (enter)

avantikaaa (input)

AvAntikAAA (output)

tr -s "a" "A" (enter)

avantikaaa

AvAntikA

What happened here? Where did those extra a's go? This is the magic of the -s (squeeze) option. After the translation, every run of repeated characters from string 2 is squeezed into a single character.

So here, the three a's at the end become AAA, which is then squeezed into a single A.
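The two runs above can be compared side by side in a script. A minimal sketch, with the input piped in through printf rather than typed; plain and squeezed are illustrative variable names.

```shell
# Same input, once without and once with the -s option.
plain=$(printf 'avantikaaa\n' | tr "a" "A")
squeezed=$(printf 'avantikaaa\n' | tr -s "a" "A")

echo "$plain"      # AvAntikAAA: every a translated
echo "$squeezed"   # AvAntikA: the trailing run AAA squeezed to one A
```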

                                                                                                                             

tr -s "ava" "A" (enter)

ava (input)

A (output)

Why a single A?

Well, in this command, every character you write in string 1 is assigned the value A. String 2 is shorter than string 1, so (in GNU tr) its last character is reused for the remaining characters: a gets A, v gets A, and then a again gets A.

If we run this command without the -s option, we get AAA.

With -s, all of these A's are squeezed and a single A is printed.
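Here is a quick check of that behaviour, assuming GNU tr. POSIX leaves the result unspecified when string 2 is shorter than string 1, so other implementations may behave differently. The variable names are illustrative.

```shell
# String 2 ("A") is shorter than string 1 ("ava"); GNU tr repeats its last character.
padded=$(printf 'ava\n' | tr "ava" "A")
squeezed=$(printf 'ava\n' | tr -s "ava" "A")

echo "$padded"     # AAA with GNU tr
echo "$squeezed"   # A with GNU tr
```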


tr "bc" "fd" (enter)

bcbcbcbcbc

fdfdfdfdfd

In this command, b is replaced by f and c is replaced by d.

                                                                                                                                              

tr -s "bc" "fd" (enter)

bcbcbcbc

You know what happened without -s, so you might expect that with -s the output gets squeezed into a single fd.

Well, this is not what happens. You instead get the output

fdfdfdfd. But why?

Because tr does not treat bc as a single string to be replaced by the single string fd. Instead, you are specifying that b should be changed to f and c to d. Since neither f nor d ever appears twice in a row, -s has nothing to squeeze.

Now, let us try the above command with a different input:

bbbcccbbbccc

fdfd

WHY?

Here, every b is converted to f and every c to d, and then each run of repeated characters is squeezed into one.
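The difference between the two inputs is easy to see when both are run through the same command. A minimal sketch; alternating and runs are illustrative variable names.

```shell
# b and c alternate, so no character repeats and -s changes nothing.
alternating=$(printf 'bcbcbcbc\n' | tr -s "bc" "fd")

# b and c come in runs of three, so each run collapses to one character.
runs=$(printf 'bbbcccbbbccc\n' | tr -s "bc" "fd")

echo "$alternating"   # fdfdfdfd
echo "$runs"          # fdfd
```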

                                                                                                                                   

tr "ava" "bcd"

avantika

dcdntikd

Why is it printing dcd? Shouldn't it have printed bcd? Well, string 1 lists a twice. When tr pairs up the characters of string 1 with those of string 2, a is first paired with b and later paired with d. The last pairing wins (this is how GNU tr behaves), so every a in the input becomes d.
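One way to convince yourself of this is to compare the command with an equivalent one that lists a only once. This sketch assumes GNU tr; the handling of a character repeated in string 1 may differ in other implementations, and the variable names are illustrative.

```shell
# a appears twice in string 1; with GNU tr the later pairing (a -> d) wins.
repeated=$(printf 'avantika\n' | tr "ava" "bcd")

# The same effective mapping written without the repeat: v -> c, a -> d.
explicit=$(printf 'avantika\n' | tr "va" "cd")

echo "$repeated"   # dcdntikd with GNU tr
echo "$explicit"   # dcdntikd
```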

                                                                                                                                         

tr -s "ava" "bcd"

avaavaavaavaava

What happens?

tr knows it needs to replace a with d and v with c, and because of -s, every run of consecutive identical characters in the result has to be reduced to a single character.

So, a is replaced by d and v by c, which gives dcddcddcddcddcd. Wherever two a's were next to each other there are now two d's, and each such pair is squeezed into a single d, giving dcdcdcdcdcd.
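Running the command with and without -s makes the two steps visible. Again a sketch assuming GNU tr, with illustrative variable names.

```shell
# Step 1 only: translate a -> d and v -> c.
translated=$(printf 'avaavaavaavaava\n' | tr "ava" "bcd")

# Translate, then squeeze each run of repeated characters.
squeezed=$(printf 'avaavaavaavaava\n' | tr -s "ava" "bcd")

echo "$translated"   # dcddcddcddcddcd
echo "$squeezed"     # dcdcdcdcdcd
```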

Note: You can do the same with character ranges, for example tr "[a-z]" "[A-Z]" (the square brackets are not needed; tr "a-z" "A-Z" works too). In this case, every lowercase letter is assigned the corresponding uppercase letter.
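As a small sketch of the range form, here is the sample word converted to uppercase. The variable name upper is only for illustration.

```shell
# Map every lowercase letter to its uppercase counterpart.
upper=$(printf 'avantika\n' | tr 'a-z' 'A-Z')
echo "$upper"   # AVANTIKA
```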

Happy Learning 🙂


