String Manipulations in Bash

Whatever your role, you will need string manipulation. You may be a developer on the software team, a Linux administrator on the systems team, or a member of the DevOps team. System and DevOps engineers in particular have to work in several languages, and it is nearly impossible to be an expert in string manipulation in all of them. If you are an expert in Bash, however, it matters much less which language you are using when you need an advanced string operation, because Bash is available practically everywhere in the Linux world and you can call a system command from your main language at any time. So let’s become experts at string manipulation in Bash. I am sure it will make your life easier. :)

Bash supports a surprising number of string manipulation operations, too many to cover them all here. But do not panic: I have been working on Linux for many years and I know the most useful ones. We will first glance at some basic operations, even though they may look very simple. After these fundamentals, we will study real examples of more advanced operations.

Note: I assume you already know the basics of Linux and Bash.

We can access the length of a string using the hash (#) operator.

$ VAR=Batur
$ echo ${#VAR}
5
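As a small sketch of where the length operator is handy, the function below checks that a value has a minimum number of characters. The name is_long_enough and the threshold of 8 are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when "$1" has at least "$2" characters.
is_long_enough() {
    local value=$1 min=$2
    (( ${#value} >= min ))
}

VAR=Batur
echo "${#VAR}"                      # prints 5

if is_long_enough "$VAR" 8; then
    echo "long enough"
else
    echo "too short"                # this branch runs, because 5 < 8
fi
```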



We can extract a substring using the colon (:) operator.

$ VAR=Batur
$ echo ${VAR:1}
atur
$ echo ${VAR:2}
tur
$ echo ${VAR:1:3}
atu
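Here is a minimal sketch of the same operator on a fixed-width value, plus a negative offset that counts from the end of the string. The date string is a made-up sample.

```shell
#!/usr/bin/env bash
# Offsets are zero-based: ${var:offset:length}
stamp="2020-03-20"          # hypothetical sample value
year=${stamp:0:4}           # 2020
month=${stamp:5:2}          # 03
day=${stamp:8:2}            # 20
echo "$day/$month/$year"    # prints 20/03/2020

VAR=Batur
# A negative offset counts from the end. The space before the minus sign is
# required, otherwise Bash reads ":-" as the default-value operator.
last3=${VAR: -3}
echo "$last3"               # prints tur
```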



The following syntax deletes or replaces a match of $substring in $string.


${string##substring} deletes the longest match of $substring from the front of $string.

$ VAR="Batur Orkun"
$ echo ${VAR##Batur}
Orkun




${string%%substring} deletes the longest match of $substring from the back of $string.

$ VAR="Batur Orkun"
$ echo ${VAR%%Orkun}
Batur
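The difference between longest and shortest match only shows up when the pattern contains a wildcard. Each operator has a shortest-match sibling with a single character (# and %). The sketch below uses a sample path and archive name that are purely illustrative.

```shell
#!/usr/bin/env bash
path="/var/log/nginx/access.log"    # hypothetical sample path
file=${path##*/}            # longest "*/" from the front -> access.log
dir=${path%/*}              # shortest "/*" from the back -> /var/log/nginx

archive="backup.tar.gz"     # hypothetical sample name
ext_short=${archive#*.}     # shortest "*." from the front -> tar.gz
ext_long=${archive##*.}     # longest  "*." from the front -> gz
base_short=${archive%.*}    # shortest ".*" from the back  -> backup.tar
base_long=${archive%%.*}    # longest  ".*" from the back  -> backup

echo "$file $dir $ext_short $ext_long $base_short $base_long"
```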




${string/pattern/replacement} matches the pattern in $string and replaces only the first match of the pattern with the replacement.

$ VAR="Batur Orkun"
$ echo ${VAR/r/R}


BatuR Orkun


${string//pattern/replacement} replaces all the matches:

$ VAR="Batur Orkun"
$ echo ${VAR//r/R}


BatuR ORkun
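To put the replacement forms side by side, here is a short sketch. It also shows the anchored variants, where /# matches only at the beginning and /% only at the end of the value. Note that these patterns are shell glob patterns, not regular expressions.

```shell
#!/usr/bin/env bash
VAR="Batur Orkun"
first=${VAR/r/R}              # first match only      -> BatuR Orkun
all=${VAR//r/R}               # every match           -> BatuR ORkun
prefix=${VAR/#Batur/Hatice}   # only at the beginning -> Hatice Orkun
suffix=${VAR/%Orkun/O.}       # only at the end       -> Batur O.
nospace=${VAR// /_}           # every space           -> Batur_Orkun
printf '%s\n' "$first" "$all" "$prefix" "$suffix" "$nospace"
```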

Note: Regular expressions (RegEx) are an important tool for string manipulation, and you should learn at least the basics. Knowing them makes life much easier when you are struggling with strings, but you must be careful when using them.

For example:

digit="456"
if [[ $digit =~ [0-9] ]]; then
    echo "$digit is a digit"
else
    echo "$digit is NO digit"
fi


456 is a digit

This is a simple example of using RegEx in Bash, but the logic is not right: [0-9] only checks that the value contains a digit somewhere. If you set digit to “456a”, the output still says it is a digit. So we should fix it.

digit="456a"
if [[ $digit =~ ^-?[0-9]+$ ]]; then
    echo "$digit is a digit"
else
    echo "$digit is NO digit"
fi


456a is NO digit

Basic & Common RegEx Operators:

  • The ^ indicates the beginning of the input string
  • The - is a literal "-"
  • The ? means "0 or 1 of the preceding (-)"
  • The + means "1 or more of the preceding ([0-9])"
  • The $ indicates the end of the input string
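The operators above can be wrapped in a small reusable test. This is only a sketch: the function name is_integer and the sample package string are hypothetical. Keeping the pattern in a variable avoids quoting surprises inside [[ ]], and parentheses in the pattern fill the BASH_REMATCH array with the captured parts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for an optionally signed whole number.
is_integer() {
    local re='^-?[0-9]+$'
    [[ $1 =~ $re ]]
}

for value in 456 456a -12 ""; do
    if is_integer "$value"; then
        echo "'$value' is an integer"
    else
        echo "'$value' is NOT an integer"
    fi
done

# Parentheses create capture groups; Bash stores them in BASH_REMATCH.
package="nginx-1.18"                  # hypothetical sample value
re='^([a-z]+)-([0-9]+)\.([0-9]+)$'
if [[ $package =~ $re ]]; then
    name=${BASH_REMATCH[1]}           # nginx
    major=${BASH_REMATCH[2]}          # 1
    minor=${BASH_REMATCH[3]}          # 18
    echo "$name $major $minor"
fi
```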

Bash has plenty of magic of its own for string operations, but you can also use utilities that come preinstalled on Linux. For example, grep, sed, and awk are very useful command-line utilities.

grep = global regular expression print

It is arguably the most popular and most widely used of them. It searches for a string in the output of a command or in one or more files.

$ grep "batur" myfile

In this example, grep goes through every line of the file “myfile” and prints every line that contains the string “batur”.

“grep” accepts lots of options, and some of them are very useful. For example, if you need line numbers, use the “-n” option.

$ grep -n "batur" myfile

You can search in many files at once:

$ grep "batur" *

It prints every matching line prefixed with its filename. If you just need the filenames, use the “-l” option.

$ grep -l "batur" *

grep has an option for nearly everything you might need, so you should glance at the output of “man grep”. Here are a few of the most useful options:

  • “-c”: Print only a count of the lines that contain the pattern.
  • “-l”: Print only the names of files with matching lines, one per line.
  • “-i”: Ignore upper/lower case distinctions during comparisons.
  • “-n”: Prefix each matching line with its line number in the file (the first line is 1).
  • “-v”: Print all lines except those that contain the pattern.
  • “-r”: Search recursively in all the files in the current directory and all its sub-directories.
  • “-w”: Match the pattern only as a whole word.
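To see how a few of these options differ, the sketch below runs them against a throwaway file. The file content is made up for the example, and the counts in the comments refer to that content only.

```shell
#!/usr/bin/env bash
# Build a temporary sample file (hypothetical content).
sample=$(mktemp)
printf '%s\n' 'batur orkun' 'Batur Orkun' 'baturalp' 'ankara' > "$sample"

count=$(grep -c  "batur" "$sample")    # lines containing "batur"  -> 2
icount=$(grep -ci "batur" "$sample")   # same, ignoring case       -> 3
words=$(grep -cw "batur" "$sample")    # whole word only           -> 1
others=$(grep -cv "batur" "$sample")   # lines WITHOUT "batur"     -> 2
numbered=$(grep -n "ankara" "$sample") # with line number          -> 4:ankara

echo "$count $icount $words $others $numbered"
rm -f "$sample"
```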

The “-i” option deserves a closer look. By default, searches are case-sensitive. If you want a case-insensitive search, use the “-i” option.

$ grep -i "batur" myfile

It also finds words like “Batur”, “BATUR”, or “baTur”. Even better, you can use RegEx with “grep”.

For example, if you want only the lines ending with “Orkun”:

$ grep "Orkun$" myfile

If you want only the lines that consist of exactly “Batur Orkun”:

$ grep "^Batur Orkun$" myfile
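A quick way to convince yourself how the anchors behave is to count matches in a small sample file. In this sketch the file content is invented; “-x” (match the whole line) and “-E” (extended regular expressions, used here for alternation) are standard grep options that the text above does not cover.

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf '%s\n' 'Batur Orkun' 'Batur Orkun Ankara' 'Mr Batur Orkun' > "$sample"

ends=$(grep -c 'Orkun$' "$sample")           # lines ending with Orkun    -> 2
starts=$(grep -c '^Batur' "$sample")         # lines starting with Batur  -> 2
exact=$(grep -cx 'Batur Orkun' "$sample")    # whole line must match      -> 1
titled=$(grep -cE '^(Mr|Ms) ' "$sample")     # starts with "Mr " or "Ms " -> 1

echo "$ends $starts $exact $titled"
rm -f "$sample"
```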

Notice: pgrep is a special grep-like command. The name is said to stand for “Process-ID Global Regular Expressions Print”. pgrep looks through the currently running processes and lists the IDs of those that match. It is useful when all you want to know is the process ID of a process.

$ pgrep nginx

If there are running processes whose names match “nginx”, their PIDs are displayed on the screen. If no matches are found, the output is empty.

Example successful output


sed: stream editor

SED performs editing operations on text coming from standard input or a file. It can do many things with files, such as searching, finding and replacing, inserting, or deleting. SED supports regular expressions, which allow it to perform complex pattern matching.

For example, suppose you want to change the content of a file.

$ sed 's/devops/DevOps/' myfile

It changes the first “devops” on each line to “DevOps”. To change every occurrence on a line, add the “g” flag: 's/devops/DevOps/g'.

The output is the changed content of the file; the file itself is not modified. You can save the output to another file with the “>” redirection operator, or edit the original file in place.

$ sed "s/devops/DevOps/" myfile > newfile


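$ sed -i "s/devops/DevOps/" myfile

“-i” : Edit files in place

“-n” : Suppress automatic printing of pattern space. (--quiet, --silent)

Be careful not to combine “-n” with “-i” here: with “-i” the output goes to the file rather than to the screen, so “-n” would suppress every line and leave the file empty. “-n” is useful together with the “p” command, when you want to print only the lines you choose.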
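The following sketch contrasts replacing the first match on each line with replacing every match, and then edits a file in place while keeping a backup. The sample content is invented. The “.bak” suffix attached to “-i” is an addition of mine that asks sed to keep the original file next to the edited one.

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf '%s\n' 'devops and devops' 'no match here' > "$sample"

first_only=$(sed 's/devops/DevOps/' "$sample")    # first match on each line
every=$(sed 's/devops/DevOps/g' "$sample")        # "g": every match on each line

# In-place edit; the untouched original is kept as "$sample.bak".
sed -i.bak 's/devops/DevOps/g' "$sample"
edited=$(cat "$sample")
backup=$(cat "$sample.bak")

printf '%s\n' "$first_only" | head -n 1   # DevOps and devops
printf '%s\n' "$edited" | head -n 1       # DevOps and DevOps
printf '%s\n' "$backup" | head -n 1       # devops and devops
rm -f "$sample" "$sample.bak"
```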

If you use sed at all, you will probably want to know these commands.

The “s” command is probably the most important command in sed and has a lot of different options. Its basic concept is simple: the s command attempts to match the pattern space against the supplied regexp. If the match is successful, the portion of the pattern space that was matched is replaced with the replacement. We used it above.

Syntax: "s/regexp/replacement/flags"

The “p” command prints out the pattern space (to the standard output).

Syntax: "/pattern/ command"

For example, you want to list the files or directories whose names include “picus”:

$ ls -l | sed -n '/picus/ p'

Maybe you need just the directories, not the files:

$ ls -l | sed -n '/picus/ p' | grep '^d'

The “d” command deletes the pattern space, so you can delete unwanted lines with SED:

$ sed -i '/picus/ d' myfile

This command deletes the lines that include the word “picus” and, because of the “-i” option, edits your original file.
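The “p” and “d” commands are mirror images of each other, which a short sketch makes visible. The list of names is made up. The last command uses a line-number range as the address instead of a pattern, which sed also accepts.

```shell
#!/usr/bin/env bash
listing=$'picus-agent\nnginx\npicus-manager\nredis'    # hypothetical names

matching=$(printf '%s\n' "$listing" | sed -n '/picus/p')   # keep only matches
remaining=$(printf '%s\n' "$listing" | sed '/picus/d')     # drop the matches
middle=$(printf '%s\n' "$listing" | sed -n '2,3p')         # lines 2 to 3 only

echo "$matching"     # picus-agent, picus-manager
echo "$remaining"    # nginx, redis
echo "$middle"       # nginx, picus-manager
```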


awk: a text pattern scanning and processing language

Yes, you read that right! AWK is a text-processing programming language. It is a direct predecessor of Perl and is still very useful on modern systems.

Awk can be used to:

  • Perform arithmetic and string operations.
  • Scan a file line by line.
  • Split each input line into fields.
  • Compare input lines/fields to a pattern.
  • Perform actions on matched lines.
  • Produce formatted reports.
  • Use conditionals and loops.

An AWK program can have three optional parts: the BEGIN {}, main {}, and END {} sections.

BEGIN { … initialization awk commands … }
{ … main awk commands … }
END { … finalization awk commands … }
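As a minimal sketch of how the three sections cooperate, the program below computes the average of a few numbers. The input values are arbitrary and the variable names total and count are my own.

```shell
#!/usr/bin/env bash
average=$(printf '%s\n' 70 95 75 | awk '
    BEGIN { total = 0; count = 0 }                # runs once, before any input is read
    { total += $1; count++ }                      # runs for every input line
    END { if (count > 0) print total / count }    # runs once, after the last line
')
echo "$average"    # prints 80
```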

When you run the “ls -l” command in the terminal, you may see output like the one below.

-rw-rw-r--. 1 centos centos 535 Mar 20 2020
-rw-rw-r--. 1 centos centos 1890 Feb 24 2020
-rw-rw-r--. 1 centos centos 7836 Feb 14 2020

For example, find the total size of the files listed:

$ ls -l | awk 'BEGIN {sum=0} {sum=sum+$5} END {print sum}'

10261

If you split this output on spaces, the size column is the fifth one.
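A real “ls -l” listing usually also contains a “total” line and directories. One way to count only regular files, sketched below, is to add a condition on the first field. The listing is a hand-written sample fed through printf so that the result is reproducible; the “total” and directory lines are my additions.

```shell
#!/usr/bin/env bash
# Hypothetical sample in "ls -l" layout (file names left out).
listing='total 24
drwxrwxr-x. 2 centos centos 4096 Mar 20 2020
-rw-rw-r--. 1 centos centos 535 Mar 20 2020
-rw-rw-r--. 1 centos centos 1890 Feb 24 2020
-rw-rw-r--. 1 centos centos 7836 Feb 14 2020'

# Only lines whose first field starts with "-" (regular files) are summed.
# "sum + 0" makes awk print 0 instead of an empty line when nothing matches.
files_total=$(printf '%s\n' "$listing" | awk '$1 ~ /^-/ { sum += $5 } END { print sum + 0 }')
echo "$files_total"    # prints 10261
```

An awk variable that has never been assigned counts as 0 in arithmetic, so the BEGIN block is optional here.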

An input line is typically made up of fields separated by white space. If you want to use a different separator, set the FS variable (or use the “-F” option); it can be a regular expression. The fields are called $1, $2, …, while $0 refers to the entire line. If FS is null, the input line is split into one field per character.

Now you can see where “$5” comes from.

$ echo "A-B-C-D-E" | awk -F "-" '{ print $2 }'
B

$ echo "A-B-C-D-E" | awk -F "-" '{ print $1,$5 }'
A E
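A few more separator tricks, as a sketch: a colon-separated record, a separator given as a regular expression, and a different output separator through the OFS variable. The sample strings are invented. Assigning $1 to itself forces awk to rebuild the line with OFS.

```shell
#!/usr/bin/env bash
account="root:x:0:0:root:/root:/bin/bash"    # sample line in /etc/passwd layout

login_shell=$(echo "$account" | awk -F: '{ print $1, $7 }')          # root /bin/bash
mixed=$(echo "A-B_C" | awk -F '[-_]' '{ print $2, NF }')             # B 3
joined=$(echo "A-B-C-D-E" | awk -F- -v OFS=, '{ $1 = $1; print }')   # A,B,C,D,E

printf '%s\n' "$login_shell" "$mixed" "$joined"
```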



You can operate on files too. Take an example file called “myfile”:

1) Name Surname City
2) Batur Orkun Ankara 70
3) HaticeEbru Orkun Istanbul 95

For example, print the name and city columns in a simple format:

$ awk '{print $2 "=" $4}' myfile

Name=City
Batur=Ankara
HaticeEbru=Istanbul



$ awk '/Orkun/ {print $0}' myfile

Output (the lines that include “Orkun”):

2) Batur Orkun Ankara 70
3) HaticeEbru Orkun Istanbul 95

$ awk '/Orkun/{++cnt} END {print "Count =", cnt}' myfile


Count = 2

$ awk 'length($2) > 5' myfile

The “length” function returns the length of its argument.
Output (the lines whose second field is longer than 5 characters):

3) HaticeEbru Orkun Istanbul 95

What if you want the lines whose fifth column is greater than 70:

$ awk '{if ($5>70) print}' myfile


3) HaticeEbru Orkun Istanbul 95

What if you want to get the line numbers but get rid of the parenthesis:

$ awk '{ print substr( $1, 1,1 ) }' myfile

The “substr” function returns the portion of the string specified by the offset and length parameters. The offset of the first character is 1, not 0.

1
2
3
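Taking a single character stops working once the line numbers reach two digits. A slightly more general sketch removes just the last character of the first field by combining “substr” with “length”. String positions in awk start at 1. The input lines are made up.

```shell
#!/usr/bin/env bash
# Everything except the last character of the first field.
numbers=$(printf '%s\n' '1) first' '12) twelfth' |
    awk '{ print substr($1, 1, length($1) - 1) }')

prefix=$(echo "Batur" | awk '{ print substr($0, 1, 3) }')   # Bat
rest=$(echo "Batur" | awk '{ print substr($0, 3) }')        # tur (no length: to the end)

echo "$numbers"    # 1 and 12, one per line
echo "$prefix $rest"
```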



What if you don’t want to print the first line:

$ awk '{if(NR>1)print}' myfile

The NR variable holds the current line number.


2) Batur Orkun Ankara 70
3) HaticeEbru Orkun Istanbul 95

What if you want to list the lines that have 4 columns (the NF variable holds the number of fields in the current line):

$ awk '{if (NF==4) print}' myfile


1) Name Surname City
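The conditions used so far can also be written in front of the braces and combined, which is the more common awk style. The sketch below recreates the sample file in a temporary location (an assumption, so that it runs anywhere) and also shows $NF and a fixed-width report with printf.

```shell
#!/usr/bin/env bash
myfile=$(mktemp)
printf '%s\n' '1) Name Surname City' \
              '2) Batur Orkun Ankara 70' \
              '3) HaticeEbru Orkun Istanbul 95' > "$myfile"

# A condition in front of the braces replaces an explicit "if".
high=$(awk 'NR > 1 && $5 > 70 { print $2 }' "$myfile")     # HaticeEbru

# $NF is the last field of a line, however many fields it has.
last=$(awk '{ print $NF }' "$myfile")                       # City, 70, 95

# printf builds a report: left-aligned text columns, right-aligned number.
report=$(awk 'NF == 5 { printf "%-12s %-10s %3d\n", $2, $4, $5 }' "$myfile")

echo "$high"
echo "$last"
echo "$report"
rm -f "$myfile"
```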

Notice: You may come across different awk implementations:

NAWK stands for “New AWK”. This is AT&T’s version of AWK.

MAWK is a fast implementation that supports mostly the standard features. It is smaller and faster than gawk, but has limits on NF and the “sprintf” buffer size.

GAWK stands for “GNU AWK”. Many Linux distributions come with GAWK. It is fully compatible with AWK and NAWK.

Take care and do not forget…:)

All the best people in life seem to like LINUX. (S. Wozniak)

