htmlSQL class: quite a nice way of parsing HTML

Today I found a PHP class that implements the idea of using SQL syntax to parse HTML documents: htmlSQL. The idea of its author, Jonas David John, is quite simple. If you want to parse an HTML document, let’s say you want all divs with the class “row”, the query looks like normal SQL:

SELECT * FROM div WHERE $class == "row"

Looks familiar, doesn’t it? There is also a connect() function, which defines the HTML source, and fetch_array(), which returns the results of the query.
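To make it concrete what the div query above selects, here is a rough equivalent written with PHP’s bundled DOM extension. It is only a sketch and not how htmlSQL works internally: the sample markup and the variable names are made up for illustration, and the shape of each row (the tag’s attributes plus tagname and text) is modelled on the result array shown in the library’s own example.

```php
<?php
// Sketch only: mimics SELECT * FROM div WHERE $class == "row"
// using PHP's DOM extension instead of htmlSQL.

// Hypothetical sample markup, made up for this illustration.
$html = '<div class="row">first</div>'
      . '<div class="other">skip</div>'
      . '<div class="row">second</div>';

$doc = new DOMDocument();
// The @ hides warnings that libxml raises for imperfect markup.
@$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

$rows = [];
// Exact attribute match, like the == comparison in the query.
foreach ($xpath->query('//div[@class="row"]') as $div) {
    $row = [];
    // Copy every attribute of the tag into the result row.
    foreach ($div->attributes as $attr) {
        $row[$attr->name] = $attr->value;
    }
    $row['tagname'] = $div->nodeName;
    $row['text'] = $div->textContent;
    $rows[] = $row;
}

print_r($rows);
```

The XPath version does the same job, but the SQL form is arguably easier to read if you spend most of your day writing queries anyway.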

There are a few examples included in the library package, and here is the simplest one:


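    <?php
    /*
    ** htmlSQL - Example 1
    ** Shows a simple query
    */

    // Load htmlSQL and the Snoopy class it uses for URL sources.
    // File names and paths are assumptions; adjust them to your setup.
    include_once("snoopy.class.php");
    include_once("htmlsql.class.php");

    $wsql = new htmlsql();

    // connect to a URL (the page address goes in the second argument)
    if (!$wsql->connect('url', '')){
        print 'Error while connecting: ' . $wsql->error;
        exit;
    }

    /* execute a query:
       This query extracts all links with the classname = nav_item
    */
    if (!$wsql->query('SELECT * FROM a WHERE $class == "nav_item"')){
        print "Query error: " . $wsql->error;
        exit;
    }

    // show results:
    foreach($wsql->fetch_array() as $row){
        print_r($row);
        /*
        $row is an array and looks like this:
        Array (
            [href] => /feedback.htm
            [class] => nav_item
            [tagname] => a
            [text] => Feedback
        )
        */
    }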

I really like this approach. It looks very familiar to me: at least 99% of the time you do exactly the same thing in your apps, fetching some data.

Well, in CakePHP it’s not like this ;), but I really like this lib and I will definitely use it in my projects!
