Future Imperfect & Software Stream of Consciousness : new SQL functions

The entry below on the putative consciousness of Google got me thinking about "The Semantic Web". It was, and still is, a W3C initiative to make web pages machine-readable.

A good example of making dumb web pages smart is the "Apples for Sale" example. Picture this. An HTML web page has apples for sale. It is a simple page. There is a picture of an apple, a piece of text that says "Apples For Sale", another piece of text that says $1.00 and another that says "Each". A machine reading that page's HTML would not know that it was a commerce page offering something for sale. It would not know that $1.00 is the price. It would not know that apples are the item being offered, and it would not know that "Each" is the unit the price applies to.

The Semantic Web would change all that. It would mark up a web page so that each piece of content carries its meaning alongside the HTML, and a machine could sort through it.
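To make the difference concrete, here is a minimal sketch in Python. The key names ("type", "item", "price", "unit") and the USD currency code are my own assumptions for illustration, not any real Semantic Web vocabulary; the point is only that once the meaning travels with the content, a program can act on it.

```python
# What a machine gets from the plain page: three unrelated strings.
plain_page = ["Apples For Sale", "$1.00", "Each"]

# The same content with hypothetical semantic annotations attached.
# These key names are illustrative assumptions, not a W3C vocabulary.
annotated_page = {
    "type": "Offer",  # declares that the page offers something for sale
    "item": "Apples",
    "price": {"amount": 1.00, "currency": "USD"},
    "unit": "Each",
}


def describe_offer(page):
    """Return a one-line summary, or None if the page is not an offer."""
    if page.get("type") != "Offer":
        return None
    price = page["price"]
    return (
        f"{page['item']} for sale at {price['amount']:.2f} "
        f"{price['currency']} per {page['unit'].lower()}"
    )


print(describe_offer(annotated_page))
```

Nothing comparable can be written against `plain_page` without guessing which string is the price and which is the unit.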

A few years back, the "next big thing" was a rules engine. A rules engine would be incorporated into an application, and if the business rules changed, you wouldn't have to change your application. You would just change a rules file that the rules engine read.
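A rough sketch of that idea in Python, assuming a made-up JSON rules format (the field names, the operators and the shipping example are all hypothetical). The application code in `apply_rules` never changes; only the rules text does.

```python
import json
import operator

# Comparison operators the hypothetical rules format is allowed to use.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

# Stand-in for the contents of a rules file; in practice this would be read from disk.
RULES_JSON = """
[
  {"field": "order_total", "op": ">=", "value": 50, "set": {"shipping": "free"}},
  {"field": "order_total", "op": "<",  "value": 50, "set": {"shipping": "standard"}}
]
"""


def apply_rules(record, rules):
    """Return a copy of record with the updates of every matching rule applied."""
    result = dict(record)
    for rule in rules:
        compare = OPS[rule["op"]]
        if compare(record[rule["field"]], rule["value"]):
            result.update(rule["set"])
    return result


rules = json.loads(RULES_JSON)
print(apply_rules({"order_total": 72.5}, rules))
```

Changing the free-shipping threshold means editing the number 50 in the rules text, not touching the function.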

I used a rules engine for a network policy tool that decided which server would provide which services in a LAN. I expected rules engines to progress a lot further, but they have become a sideline rather than going mainstream.

A rules engine fits into the Semantic Web because a Rule Interchange Format is part of its infrastructure. One must agree on rules if machines are to read and understand web pages. Rules engines can be forward chaining (data-driven: start from the known facts and fire every rule whose condition is met) or backward chaining (goal-driven: start from a question and work back to the facts that would answer it). For example, a forward chaining rules engine can tell humans or other machines when inventory items are getting low as stock levels change, while a backward chaining rules engine can work out whether a given credit application meets the conditions for a loan.
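The two chaining styles can be sketched in a few lines of Python. This is a toy under stated assumptions: facts are plain strings, each rule is a (premises, conclusion) pair, the rule set has no cycles, and the inventory fact names are invented for the example.

```python
# Each rule: (set of premises that must all hold, conclusion that then holds).
RULES = [
    ({"stock_below_reorder_point"}, "item_is_low"),
    ({"item_is_low", "supplier_available"}, "place_order"),
]


def forward_chain(facts, rules):
    """Data-driven: keep firing rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules):
    """Goal-driven: try to prove the goal, recursing into rule premises.

    Assumes the rules contain no cycles; a cyclic rule set would recurse forever.
    """
    if goal in facts:
        return True
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(premise, facts, rules) for premise in premises
        ):
            return True
    return False


facts = {"stock_below_reorder_point", "supplier_available"}
print(sorted(forward_chain(facts, RULES)))
print(backward_chain("place_order", facts, RULES))
```

Forward chaining derives everything it can from the facts, whether anyone asked or not. Backward chaining only touches the rules needed to answer the one question it was given.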

Rules engines have not been widely used, and in my shortsighted humble opinion, it is because they are bulky, non-intuitive and put a performance hit on applications. However, I may have an algorithm for a rules engine that rocks.

Consider the following code. It is an example from the Rule Interchange Format (RIF), which is a W3C Recommendation:

    Prefix(ex )
    (* ex:rule_1 *)
    Forall ?customer ?purchasesYTD (
      If   And( ?customer#ex:Customer
                ?customer[ex:purchasesYTD->?purchasesYTD]
                External(pred:numeric-greater-than(?purchasesYTD 5000)) )
      Then Do( Modify(?customer[ex:status->"Gold"]) ) )

A RIF rule like this boils down to "If (some condition) then (do this)". What this bit of RIF does is let a commercial entity check each customer's year-to-date purchases and, if they are greater than $5,000, upgrade that customer's status to "Gold".

The thought struck me that one could have a rules engine that operated directly on the database. It would parse the RIF language and automagically convert it to SQL. (I will race you to the patent office on this idea).

My rules engine would generate an SQL statement that opens a cursor with "SELECT * FROM CustomerTable WHERE YearToDate > 5000". Then it would loop through the cursor and update each customer's status to "Gold".
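Here is a minimal sketch of the rule-to-SQL step in Python with the built-in sqlite3 module. It makes several assumptions: the rule arrives as an already-parsed dictionary (this is not a RIF parser), the Status and Name columns are invented for the example, and instead of looping over a cursor it emits a single set-based UPDATE, which has the same effect here.

```python
import sqlite3

# Hypothetical, already-parsed form of the "Gold" rule; not real RIF parsing.
rule = {
    "table": "CustomerTable",
    "condition": ("YearToDate", ">", 5000),
    "action": ("Status", "Gold"),
}

ALLOWED_OPS = {">", ">=", "<", "<=", "="}


def rule_to_sql(rule):
    """Build a parameterized UPDATE statement from the rule dictionary."""
    column, op, threshold = rule["condition"]
    target, new_value = rule["action"]
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    # Identifiers come from the trusted rule definition; values are bound as parameters.
    sql = f'UPDATE {rule["table"]} SET {target} = ? WHERE {column} {op} ?'
    return sql, (new_value, threshold)


# Throwaway in-memory database with two sample customers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomerTable (Name TEXT, YearToDate REAL, Status TEXT)")
conn.executemany(
    "INSERT INTO CustomerTable VALUES (?, ?, ?)",
    [("Ann", 7200.0, "Standard"), ("Bob", 1500.0, "Standard")],
)

sql, params = rule_to_sql(rule)
conn.execute(sql, params)
rows = conn.execute("SELECT Name, Status FROM CustomerTable ORDER BY Name").fetchall()
print(sql)
print(rows)
```

Letting the database apply the rule in one statement avoids pulling every matching row back to the application, which is where I would expect most of the performance win to come from.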

The great thing about this is that a rules engine that rocks could revolutionize data mining and database reporting. The more I think about it, the more I am convinced that this could be the NEXT BIG THING in data mining.

And as for the Semantic Web, in my opinion it is a no-go. Who is going to mark up the few billion pages that are already out there? The entire history of the Internet won't be reworked, so it will stay useless to the Semantic Web. I see this job being done at a single point, at the web server level, with context engines that recognize content and mark up each page as it is served. Now that is a workable plan.

I'd write more on this, but I have to open up an IDE and test this rules engine idea. Later.

We humans are producing and storing data at an ever-accelerating rate. Most of it gets tucked into a database somewhere and is then available for retrieval using an SQL (Structured Query Language) call.

SQL was first created at IBM in the 1970s. There have been enhancements through the years, but the middleware (it is called middleware because it represents the link between the database and the user) hasn't evolved to meet the realities of the modern internet paradigm.

We are storing data, meta-data, graphics, music, and all sorts of digital stuff like we never have before. And what do we do with it? We plunk it into a database and for the most part, there it sits.

Companies have come to realize that static stored data can be monetized and contribute to the bottom line. So they purchase all of these add-on data mining and data warehouse software packages to slice and dice their data. And what do they use? SQL that hasn't changed very much from when it was conceived.

What we need now is a serious SQL upgrade. We need functions like SELECT REGRESSION(ColumnA, ColumnB) that will help us analyze the data. We need stuff like SELECT WEIGHTED_MOVING_AVERAGE(ColumnA, 100 values). We need to know if a table insert is an outlier or sits in a fat tail. We need to know the R-SQUARED of column values. We need to be able to read the contents of videos in BLOB columns. As far as I can tell, very little of this is available out of the box.
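As a sketch of what such functions could feel like, Python's built-in sqlite3 module lets you register your own aggregates. The names REGRESSION_SLOPE and R_SQUARED, the Samples table and the argument order (x first, then y) are my own assumptions for illustration, not features of any SQL dialect.

```python
import sqlite3


class LinearFit:
    """Accumulates the running sums needed for a least-squares fit of y on x."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def step(self, x, y):
        # Skip rows where either value is NULL.
        if x is None or y is None:
            return
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y

    def _parts(self):
        # Scaled variance of x, variance of y, and covariance of x and y.
        var_x = self.n * self.sxx - self.sx ** 2
        var_y = self.n * self.syy - self.sy ** 2
        cov = self.n * self.sxy - self.sx * self.sy
        return var_x, var_y, cov


class RegressionSlope(LinearFit):
    def finalize(self):
        var_x, _, cov = self._parts()
        return cov / var_x if var_x else None


class RSquared(LinearFit):
    def finalize(self):
        var_x, var_y, cov = self._parts()
        return (cov * cov) / (var_x * var_y) if var_x and var_y else None


conn = sqlite3.connect(":memory:")
conn.create_aggregate("REGRESSION_SLOPE", 2, RegressionSlope)
conn.create_aggregate("R_SQUARED", 2, RSquared)

# Sample data lying exactly on the line y = 2x + 1.
conn.execute("CREATE TABLE Samples (ColumnA REAL, ColumnB REAL)")
conn.executemany("INSERT INTO Samples VALUES (?, ?)", [(1, 3), (2, 5), (3, 7), (4, 9)])

slope, r2 = conn.execute(
    "SELECT REGRESSION_SLOPE(ColumnA, ColumnB), R_SQUARED(ColumnA, ColumnB) FROM Samples"
).fetchone()
print(slope, r2)
```

The analysis runs inside the query itself, so the client gets back two numbers instead of the whole table. A weighted moving average or an outlier check could be bolted on the same way.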

We still labor on with primitive functions in the middleware, with rich clients or heavy back-end stored procedures to compensate. It is time that someone looked at making intelligent middleware. We need to change it from Structured Query Language to Superb Questioning Language.

The Semantic Web and a Possible Rules Engine that Rocks

SQL Update Needed