After some time, pages became fancier. So-called "web designers" appeared: they broke content up into tables and stuffed pages with lots of tiny pictures forming frames and other design elements. Form interactivity was no longer enough, and sites started to employ JavaScript heavily to run all sorts of programs client-side. Simple, content-serving sites became rarer and rarer in the age of web 2.0. Things became "ajaxy"; sometimes even a simple user/password login form requires JavaScript and downloading several tens of kilobytes of "web design".
Web 3.0 is a new interface to the same sites: one optimized for content. Since most of the content we are talking about is made of letters, this interface is a CLI, not a GUI. It takes simple commands, drives the ugly web 2.0 interface in place of the user, and returns simple, content-only results.
I have been doing this for ages, ever since web 2.0 first started to slow me down. There are great tools out there: wget to handle everything at the HTTP level, awk/grep/sed to process the result. Unfortunately that is not the whole story: processing the result is a pain, especially the moment when you finally have a long, cryptic script that extracts the tiny bit of information you need from an ocean of senseless web design, and the next day the web designer changes some divs or tables and your script fails. In such cases it was sometimes easier to rewrite the script from scratch...
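To make the fragility concrete, here is a minimal sketch of the kind of pipeline I mean. The HTML snippet, the file names, and the sed pattern are all made up for illustration; a real run would start with something like wget fetching the page.

```shell
# Illustrative only: a typical fragile scraping pipeline.
# In real life the page would come from something like:
#   wget -q -O page.html http://example.com/article
# Here we fake it with a canned snippet.
cat > page.html <<'EOF'
<table><tr><td class="hdr"><div>Article title</div></td></tr></table>
EOF

# Extract the title by pattern-matching on the markup.
# This works today...
sed -n 's/.*<div>\(.*\)<\/div>.*/\1/p' page.html
# ...and breaks the day the designer wraps the title in one more div.
```

The sed line prints `Article title` for this snippet, but the regex is welded to the exact tag layout, which is precisely why such scripts die on every redesign.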
After a while I had enough of this. I had an idea for an ultimate solution, but I am not good enough to implement it. Instead, I implemented a set of shell and awk libraries called web3.0 that make processing all that HTML easy.
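This is NOT the web3.0 library itself, just a sketch of the underlying idea, with all names invented for illustration: instead of regexing raw pages, first tokenize the HTML into a stream of tags and text, one item per line, so downstream shell scripts can match on structure rather than on exact markup.

```shell
# Hypothetical sketch, not the actual web3.0 API: an awk tokenizer
# that turns HTML into a line-oriented stream of TAG/TEXT items.
cat > tokenize.awk <<'EOF'
{
    line = $0
    # Peel off one <...> tag at a time, emitting any text before it.
    while (match(line, /<[^>]*>/)) {
        pre = substr(line, 1, RSTART - 1)
        tag = substr(line, RSTART, RLENGTH)
        if (pre != "") print "TEXT " pre
        print "TAG  " tag
        line = substr(line, RSTART + RLENGTH)
    }
    if (line != "") print "TEXT " line
}
EOF

echo '<td class="hdr"><div>Article title</div></td>' | awk -f tokenize.awk
```

On the sample input this prints one line per tag plus `TEXT Article title`; a script that asks for "the text inside the first div" against such a stream survives many redesigns that would kill a raw regex.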