Unix / Linux: Remove ANSI Escape Sequences

Objective: Remove ANSI escape sequences from an input file on UNIX / Linux.

ANSI escape codes (or escape sequences) are a method of in-band signaling used to control the formatting, color, and other output options on video text terminals. To encode this formatting information, certain sequences of bytes are embedded in the text; the terminal looks for these and interprets them as commands rather than as character codes.

ANSI escape sequences start with the ESC byte (0x1B). The most common kind is the CSI sequence (CSI stands for Control Sequence Introducer or Control Sequence Initiator), which starts with the ‘ESC’ (0x1B) and ‘[’ (left bracket, 0x5B) characters.
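To see these bytes directly, a sequence can be dumped as hex. This is only a small sketch: the helper name `csi_bytes` is made up for illustration, and it assumes the standard `printf`, `od`, and `tr` utilities are available.

```shell
# Print the bytes of a short CSI sequence as hex, with od's padding stripped.
# csi_bytes is a hypothetical helper name used only for this illustration.
csi_bytes() {
  printf '\033[30m' | od -An -tx1 | tr -d ' \n'
}

csi_bytes; echo   # 1b5b33306d: ESC, '[', '3', '0', 'm'
```

The first two bytes, 1b and 5b, are the ESC and ‘[’ characters that introduce the sequence.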

For ANSI color and styling (SGR – Select Graphic Rendition), the CSI sequence ends with the ‘m’ character. An example ANSI escape sequence is shown below; it switches the foreground color to black.

\x1b[30m

The above has one SGR parameter, 30. Below are examples with two and three SGR parameters. Parameters are separated by the ‘;’ character.

\x1b[1;33m

\x1b[1;33;41m
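As a rough illustration of how such sequences are put together, the sketch below joins any number of SGR parameters with ‘;’ and wraps them in ESC ‘[’ … ‘m’. The function name `sgr` is hypothetical and not part of any standard tool.

```shell
# Build an SGR sequence from any number of parameters, joined with ';'.
# The function body runs in a subshell so the IFS change does not leak out.
sgr() (
  IFS=';'
  printf '\033[%sm' "$*"
)

# Print a sample line in bold yellow on red, then reset with SGR 0.
printf '%s bold yellow on red %s\n' "$(sgr 1 33 41)" "$(sgr 0)"
```

Here `sgr 1 33 41` produces the same bytes as the three-parameter example above.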

To remove ANSI SGR escape sequences from a file “ansi.log”, use the following GNU sed syntax. It handles up to two SGR parameters of one or two digits each.

$ sed -r "s/\x1b\[([0-9]{1,2}(;[0-9]{1,2})?)?m//g" < ansi.log > noansi.log

To handle both SGR and EL (Erase in Line) sequences, the sed expression has to be modified slightly: the final character is matched with the bracket expression [mK], which accepts either ‘m’ or ‘K’.

$ sed -r "s/\x1b\[([0-9]{1,2}(;[0-9]{1,2})?)?[mK]//g" < ansi.log > noansi.log

If the ANSI sequence has three or more SGR parameters, the above will not work, because the ‘?’ quantifier after the (;[0-9]{1,2}) group matches only zero or one occurrence of that group. To match any number of additional parameters, replace it with the ‘*’ quantifier.

$ sed -r "s/\x1b\[([0-9]{1,2}(;[0-9]{1,2})*)?[mK]//g" < ansi.log > noansi.log
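The difference between the two quantifiers can be checked on a three-parameter sequence. This is a minimal sketch assuming GNU sed (for `-r` and `\x1b`) and a `cat` that supports `-v`; the function names `strip_upto2` and `strip_any` are made up for the comparison.

```shell
# '?' variant: accepts at most two SGR parameters.
strip_upto2() { sed -r "s/\x1b\[([0-9]{1,2}(;[0-9]{1,2})?)?[mK]//g"; }
# '*' variant: accepts any number of SGR parameters.
strip_any()   { sed -r "s/\x1b\[([0-9]{1,2}(;[0-9]{1,2})*)?[mK]//g"; }

# cat -v renders any ESC byte that survives as ^[ so it is visible.
printf '\033[1;33;41mtext\033[0m\n' | strip_upto2 | cat -v   # ^[[1;33;41mtext
printf '\033[1;33;41mtext\033[0m\n' | strip_any   | cat -v   # text
```

With ‘?’ only the trailing reset sequence is removed and the three-parameter sequence is left in place; with ‘*’ both are stripped.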

The following printf statement will print yellow text on a red background. sed will then remove the ANSI escape sequence and print the text without any formatting.

printf "\033[1;33;41m %s \033[0m\n" "YELLOW on RED" | sed -r "s/\x1b\[([0-9]{1,2}(;[0-9]{1,2})*)?m//g"
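The `\x1b` escape inside the sed expression is a GNU extension. On systems without GNU sed, one possible workaround is to generate the ESC byte with `printf` and splice it into the expression. The sketch below assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do); the function name `strip_ansi_portable` is hypothetical.

```shell
# Strip SGR and EL sequences without relying on sed's \x1b escape.
strip_ansi_portable() {
  esc=$(printf '\033')   # a literal ESC byte
  sed -E "s/${esc}\[([0-9]{1,2}(;[0-9]{1,2})*)?[mK]//g"
}

printf '\033[1;33;41m %s \033[0m\n' "YELLOW on RED" | strip_ansi_portable
```

Because it reads standard input, the function can also be used on a file, for example `strip_ansi_portable < ansi.log > noansi.log`.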

Mohamed Ibrahim
