Objective: Fetch the contents of an HTTP URL and save them in a variable in a shell script.
Either the wget or curl utility can be used to fetch the contents of a web page.
To download a web page using wget and assign the output to a variable htmlcontent, use the following syntax:

```shell
htmlcontent=$(wget -qO - http://stackpointer.io)
```
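Command substitution hides whether the download actually succeeded unless the exit status of the assignment itself is checked. A minimal sketch of that pattern, reusing the example URL from this article:

```shell
#!/bin/sh
# After `htmlcontent=$(wget ...)`, the exit status of the whole
# assignment is wget's exit status, so it can be tested directly.
if htmlcontent=$(wget -qO - http://stackpointer.io); then
    printf 'fetched %s characters\n' "${#htmlcontent}"
else
    echo 'download failed' >&2
fi
```

Without the check, a failed request silently leaves htmlcontent empty.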
To download a web page using curl and assign the output to a variable htmlcontent, use the following syntax:

```shell
htmlcontent=$(curl -sL http://stackpointer.io)
```
Both commands will handle HTTP 3xx response codes. In other words, if the requested URL has moved to a different location and the server gives an HTTP response of 301, 302, 303, and so on, the request will be repeated at the new location. wget follows redirects by default, while curl only does so because of the -L flag shown above.
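To see where a redirect chain actually ends, curl can report the final URL via its -w write-out variable %{url_effective}; a sketch, again using the article's example URL:

```shell
# -s: quiet, -L: follow redirects, -o /dev/null: discard the body,
# -w '%{url_effective}': print the URL curl ultimately fetched.
final_url=$(curl -sL -o /dev/null -w '%{url_effective}' http://stackpointer.io) \
    && printf 'landed at: %s\n' "$final_url" \
    || echo 'request failed' >&2
```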
The content can also be fetched using netcat, although it is a bit trickier: netcat speaks raw TCP, so the HTTP request line and headers have to be written out by hand rather than being handled by the utility. The command below will not handle HTTP 3xx response codes.
```shell
htmlcontent=$(netcat stackpointer.io 80 <<EOF
GET / HTTP/1.0
Host: stackpointer.io

EOF
)
```

Note the blank line after the Host header: HTTP requires an empty line to terminate the request headers.
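Unlike wget and curl, netcat captures the raw response, status line and headers included. One way to keep only the body is to delete everything up to the first blank line; a sketch using sed on a canned HTTP/1.0 response:

```shell
# A raw HTTP/1.0 response: status line, headers, a blank line, then the body.
response='HTTP/1.0 200 OK
Content-Type: text/html

<html>body here</html>'

# Delete from line 1 through the first blank (or whitespace-only) line,
# leaving just the message body. [[:space:]] also matches a stray CR.
body=$(printf '%s\n' "$response" | sed '1,/^[[:space:]]*$/d')
echo "$body"   # → <html>body here</html>
```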
The $htmlcontent variable will now contain the contents of the web page. The content can be viewed using the echo command.
```shell
echo "$htmlcontent"
```
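Quoting the variable matters here: an unquoted expansion is subject to word splitting and filename globbing, so the newlines in the fetched page collapse into single spaces. A small demonstration:

```shell
# Unquoted expansion is word-split, so newlines collapse into spaces;
# the quoted form preserves the content exactly as fetched.
htmlcontent='first line
second line'
echo "$htmlcontent"   # two lines, as fetched
echo $htmlcontent     # one line: "first line second line"
```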