Learn to autoscale with the best source of information, Wikipedia.
7 May
Wikipedia is such a great source of information. Though originally controversial as a source of legitimate information, it has become increasingly accepted as a reliable source. In fact, in my day job of business consulting for companies (including Fortune 50 organizations), Wikipedia is one of the first places I check when conducting research.
However, if you have a website and would like to automate the process of pulling data from Wikipedia, it’s not a simple task: you would need a very sophisticated scraper. Wouldn’t it be convenient if you could just query Wikipedia like a database?
Well, it seems like you actually can… with the help of DBpedia.
What is DBpedia?
As explained on its site (http://wiki.dbpedia.org/): “DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. We hope this will make it easier for the amazing amount of information in Wikipedia to be used in new and interesting ways, and that it might inspire new mechanisms for navigating, linking and improving the encyclopaedia itself.”
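Those “sophisticated queries” are written in SPARQL, and DBpedia exposes a public SPARQL endpoint at http://dbpedia.org/sparql. Here is a minimal Python sketch of how you might build such a query; the example query (large German cities and their populations) is just an illustration of the kind of Wikipedia-derived facts you can pull out, and I haven’t battle-tested this against the live endpoint myself.

```python
# Build a SPARQL query URL for DBpedia's public endpoint.
# The endpoint and the dbo:/dbr: namespaces are DBpedia's own;
# the specific query is just an illustrative example.
from urllib.parse import urlencode

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?city ?population WHERE {
  ?city a dbo:City ;
        dbo:country dbr:Germany ;
        dbo:populationTotal ?population .
  FILTER (?population > 1000000)
}
"""

endpoint = "http://dbpedia.org/sparql"
params = {"query": query, "format": "application/sparql-results+json"}
url = endpoint + "?" + urlencode(params)

# To actually run it (requires network access):
# import json, urllib.request
# results = json.load(urllib.request.urlopen(url))
```

In other words, instead of scraping Wikipedia article pages, your site can fire off a structured query and get structured results (JSON here) back.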
As of January 2010, the DBpedia data set contains 3.5 million “things” with over half a billion “facts.” That’s certainly not a bad source for your autoscaling needs!
You can download the data sets from DBpedia here:
http://wiki.dbpedia.org/Downloads36
As a disclaimer, I have not personally used DBpedia. But, after just spending a few moments browsing the site, I’m conjuring up a number of ideas for some autoscale, autopilot, value-add sites.
What thoughts have you conjured up?
dave