Barriers to finding my ideal bank account
January 30, 2008
I want to be able to search in Google for the most cost-effective account to use overseas. With certain assumptions about my usage pattern and a spreadsheet of the fee structure of all the appropriate products in Australia, this is easily calculable. The spreadsheet, however, is hard to come by.
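To show how small the calculation is once the fee data exists, here is a minimal sketch. Everything in it is an assumption made for illustration: the account names, the fee figures, the three fee fields and the usage pattern are all invented, and real products have more fee types than this.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """One row of the (hypothetical) fee spreadsheet."""
    name: str
    monthly_fee: float        # flat account-keeping fee, in dollars
    overseas_atm_fee: float   # flat fee per overseas withdrawal, in dollars
    fx_margin: float          # currency conversion margin, as a fraction


@dataclass
class Usage:
    """Assumptions about how the account will be used."""
    months: int
    overseas_withdrawals: int
    average_withdrawal: float  # in dollars


def total_cost(account: Account, usage: Usage) -> float:
    """Total fees paid over the period for the given usage pattern."""
    keeping = account.monthly_fee * usage.months
    atm = account.overseas_atm_fee * usage.overseas_withdrawals
    withdrawn = usage.overseas_withdrawals * usage.average_withdrawal
    conversion = account.fx_margin * withdrawn
    return keeping + atm + conversion


def cheapest(accounts: list[Account], usage: Usage) -> Account:
    """The account with the lowest total cost for this usage pattern."""
    return min(accounts, key=lambda a: total_cost(a, usage))


# Invented figures, not real products.
ACCOUNTS = [
    Account("Account A", monthly_fee=5.0, overseas_atm_fee=4.0, fx_margin=0.03),
    Account("Account B", monthly_fee=0.0, overseas_atm_fee=5.0, fx_margin=0.02),
]
USAGE = Usage(months=12, overseas_withdrawals=20, average_withdrawal=300.0)

if __name__ == "__main__":
    for account in ACCOUNTS:
        print(account.name, round(total_cost(account, USAGE), 2))
    print("Cheapest:", cheapest(ACCOUNTS, USAGE).name)
```

The arithmetic is trivial; the hard part, as I say, is filling in the rows.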
Commenting on my last post, Jack objected that the problem is too hard, because everything is important, to various degrees. To take this into account, a “hand-crafted approach” is needed. On the topic of everything being important: this is true. However, the vast majority of content on the Internet is simply not the kind of information that the semantic web is concerned with. You might change, slightly, the way you link to a product or item to make it more trackable. You might standardise the way you treat tags. You might have greater standardisation in terms of flagging, for example, a review as a review, and a rating as a rating. However, the changes you would make are relatively small compared with the amount of content out there. The time investment from individuals is minimal, and the benefits are large- once the tipping point has been reached.
Karan objected on the basis that the problem is too computationally difficult. However, as I suggest above, once the data is assembled, the problem is not at all difficult. The question, then, is how to make that data available to Google. I borrow heavily from Berners-Lee in the following logic.
How to assemble data about transaction accounts in Australia? First, I suppose, we need to get a list of banks. APRA has such a list. Now if APRA wanted to provide a bank search function, they could either try to lift the information off each bank’s website using textual cues or- far more easily- take advantage of behind-the-scenes tagging. Each product page would be tagged to say: this page describes such-and-such a product, which is of a particular kind (link to definition on APRA website), and each fee would be tagged as being a fee of a type (again, linking to an APRA definition). The only difference from the current situation is the links. The content is all in place.
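As a rough sketch of what that tagging buys you, suppose each tagged page boils down to a handful of (subject, property, value) statements, in the spirit of Berners-Lee’s article. The URLs, property names and fee amounts below are all hypothetical placeholders; APRA publishes no such definitions that I know of, and a real system would use a W3C format rather than Python tuples.

```python
# Hypothetical definition URLs, standing in for definitions a regulator might publish.
TYPE = "type"
TRANSACTION_ACCOUNT = "https://example.org/definitions/transaction-account"
SAVINGS_ACCOUNT = "https://example.org/definitions/savings-account"
MONTHLY_FEE = "https://example.org/definitions/monthly-fee"
OVERSEAS_ATM_FEE = "https://example.org/definitions/overseas-atm-fee"

# What the tags on two invented bank pages might reduce to:
# (page, property, value) statements.
STATEMENTS = [
    ("https://bank-one.example/everyday", TYPE, TRANSACTION_ACCOUNT),
    ("https://bank-one.example/everyday", MONTHLY_FEE, 5.0),
    ("https://bank-one.example/everyday", OVERSEAS_ATM_FEE, 4.0),
    ("https://bank-two.example/saver", TYPE, SAVINGS_ACCOUNT),
    ("https://bank-two.example/saver", MONTHLY_FEE, 0.0),
]


def fee_table(statements, product_type):
    """Collect the tagged fees of every page declared to be of product_type.

    No text is parsed: pages are selected by their type link, and each fee
    is identified by the definition it links to.
    """
    pages = {s for s, p, o in statements if p == TYPE and o == product_type}
    table = {}
    for subject, prop, value in statements:
        if subject in pages and prop != TYPE:
            table.setdefault(subject, {})[prop] = value
    return table


TABLE = fee_table(STATEMENTS, TRANSACTION_ACCOUNT)

if __name__ == "__main__":
    for page, fees in TABLE.items():
        print(page, fees)
```

The result is exactly the kind of fee spreadsheet the search needs, built without guessing at what the words on the page mean.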
Once this happens, the APRA search can easily find the bank account I want. But how to get these results directly in Google? If all the tagging and linking is done in a standards-compliant way- say, as described by W3C- Google can pick up on those hints and make use of the data too. Better yet, Google can easily use its PageRank mechanism to identify that the APRA search function is a good match for my query. If the APRA search function is itself appropriately designed and tagged, Google can pass my query directly to it, and return the results without any further action on my part.
Identifying the search is easy, but Google automatically using it is somewhat complicated. This is by no means insurmountable, however. We’re talking about Google, after all. All of this is easily within the capabilities of technology we have right now. All that’s missing is the co-operation.
Finding answers
January 5, 2008
In last week’s post I asked:
Is it even in-theory possible to have a general-purpose search engine that you can just drop in to a knowledge-rich environment and, hey-presto, get the answers you’re looking for?
After thinking about this for a week, and reading Tim Berners-Lee’s 2001 article in Scientific American, I’m inclined to think that: no, it’s not possible. And it’s pretty obvious that it’s not possible. As Tim writes, “To date, the Web has developed most rapidly as a medium of documents for people rather than for data and information that can be processed automatically.” This goes double for the pile of information I need to digest at work. The trouble is that I have a bunch of unstructured data, and to work out what it all means and how it all relates, I need to do a lot of work. Of course it’s not possible for a search engine to do this currently!
They don’t even come close. Tim has a few examples of questions that, given a sufficiently structured data source, are very easily computable. Here’s my example: what bank in Australia has the lowest fee structure, with a preference for low overseas withdrawal fees? Plug that into Google, and you’ll not get an answer. Oh, you might get linked to half a dozen comparison sites, but they never seem to quite be interested in what you’re interested in. You might get linked to a Choice Magazine article that seems to have the answer: if you’re willing to pay $15 to access the article for 3 days. But you won’t get tabulated results saying: these are the bank offerings that most closely match your needs.
The semantic web could change that. Wouldn’t it be lovely? And yet it hasn’t happened. Are the technologies out there? Well, that’s my next question. Will it be as obvious as my last?
Studds knows
December 12, 2007
My job involves knowing a reasonable amount of stuff. Not as much as, say, someone investigating the development of nerves in the gut and how that affects feels. But a deal more than you need to know in order to serve beer. To help me know all of this stuff we have a Knowledge Management System. This KM system involves what are technically referred to as “buckets.” Inside these buckets, we place PDF files. We access these buckets through a website. The buckets are organised in a tree structure. There can be as many as 7 clicks to get to a file. There is a rudimentary search feature. The normal way to find an answer is to know where it is, or be told where it is. Those things that are frequently needed are bookmarked in IE.
Now, I said that this is to help me know all this stuff: that’s not really true. It’s more a knowledge repository. The idea is to get all the knowledge out of people’s heads before they leave the company. New staff have to learn it the old-fashioned way. I’m finding out the hard way that this is not so easy. Which has got me thinking: what could be done to make it easier? This blog is a place for my thoughts and discoveries on this.
I’ve got a bunch of ideas already, which need developing. I’ve clicked through to Wikipedia and ooh’d and aah’d at terms like “Information Theory” and “Expert Systems.” What I propose is a two-pronged approach. On the one hand, I’ll expand and work out some of my thinking. On the other, I’ll explore some reactions to the literature. I start tomorrow!