Search Engines

8th October 2003

Yahoo News Search RSS feeds

It's not a new idea (Feedster has been doing it for a while) but it's a first for a major search engine: Yahoo are now offering RSS feeds of the results of searches within Yahoo News. The feeds are ad...

17th September 2003

Google conspiracy theories

Microdoc News have a poorly researched story suggesting that Google have been engineering their search results to favour their own properties: It could be argued that the most important site that s...

2nd August 2003

On Metadata

Tim Bray's series On Search now has a table of contents page linking to each of the previous entries. The most recent article covers metadata, and includes some insightful commentary on the huge probl...

24th July 2003

Learn to search!

Slate: Digging for Googleholes: Type in the make and model of a new DVD player, and you'll get dozens of online electronic stores in the top results, all of them eager to sell you the item. But y...

1st July 2003

time_since() on Feedster

This is pretty cool: Scott's taken Nat's time-since function and added it to Feedster, giving a quick indication of how long ago an item was posted....
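For anyone who hasn't seen a function like this, here is a rough Python sketch of the general idea: turn a post's timestamp into a "how long ago" phrase. This is not Nat's code or what Feedster runs; the unit table, the wording and the `time_since` signature below are all my own assumptions.

```python
from datetime import datetime

# Assumed unit sizes in seconds (months and years are approximations).
_UNITS = [
    ("year", 365 * 24 * 3600),
    ("month", 30 * 24 * 3600),
    ("week", 7 * 24 * 3600),
    ("day", 24 * 3600),
    ("hour", 3600),
    ("minute", 60),
]


def time_since(posted, now=None):
    """Describe how long ago `posted` was, using the largest unit that fits."""
    now = now or datetime.now()
    seconds = int((now - posted).total_seconds())
    if seconds < 60:
        return "less than a minute ago"
    for name, size in _UNITS:
        count = seconds // size
        if count >= 1:
            plural = "" if count == 1 else "s"
            return f"{count} {name}{plural} ago"
    return "less than a minute ago"  # not reached; kept as a safe fallback
```

Passing `now` explicitly keeps the function easy to test; in normal use you would leave it out and let it default to the current time.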

16th June 2003

Tim Bray on search

I love it when bloggers stick to their word. The other day, while describing a quick Perl hack that really impressed a major client a few years ago, Tim Bray mentioned the following: Then I turne...

1st May 2003

Feedster AND searching

Feedster finally supports AND as the default search operator. This is a very good thing. I've decided to leave this site's search engine as using OR, mainly because I feel for a small search set (appr...
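To make the AND versus OR distinction concrete, here is a toy Python sketch. It has nothing to do with how Feedster or this site's search actually work; the `search` function and the tiny document set are purely hypothetical.

```python
def search(docs, query, operator="AND"):
    """Return the ids of documents matching the query terms.

    With "AND" every term must appear in a document; with "OR" any one will do.
    """
    terms = set(query.lower().split())
    combine = all if operator == "AND" else any
    results = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        if terms and combine(term in words for term in terms):
            results.append(doc_id)
    return results


# Hypothetical three-document collection.
docs = {1: "rss feed search", 2: "rss reader", 3: "css layout"}
and_hits = search(docs, "rss search", "AND")
or_hits = search(docs, "rss search", "OR")
```

With a small collection, AND can easily return nothing at all, while OR casts a wider net at the cost of precision.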

28th April 2003

More fun with Search

While browsing around my phoenix/ directory I spotted a sub-directory called searchplugins, which appears to control the list of search engines available in the very useful search box at the top right...

25th April 2003

Site search finally available

I've finally got around to adding a search page to this site. It uses MySQL's full text indexing, which is extremely fast and provides good results but comes at the expense of flexibility. Search term...

13th April 2003

100 random pictures

100 random AltaVista pictures is fascinating, if not guaranteed work-safe....

12th April 2003

Yahoo Search uses CSS

In all the fuss about Yahoo's new search interface over the past few days, the extensive use of CSS in the results pages was almost completely overlooked, probably because the page still contains a sm...

7th April 2003

More on the new Yahoo

Unsurprisingly, the new Yahoo is generating a whole load of commentary. There's a good thread going on Signals vs Noise, and ia/ has coverage as well. I've been playing with it a bit and it's definite...

A new Yahoo

New York Times: Yahoo Plans Improvements in Effort to Regain Lost Ground. I'm guessing this is what it's going to look like (via thelist)....

9th March 2003

Thirty five year old cookies

I'm finding myself slightly confused about the Google backlash washing around the blogosphere, which is summarised quite well by Gavin Sheridan. Most of the arguments against using Google unsurprising...

8th March 2003

Roogle

Scott Johnson has put together a blog search engine with a difference: it indexes RSS feeds rather than crawling the blogs themselves. Roogle is still under heavy development (and Scott is blogging it...

1st March 2003

Vector search engines

Building a Vector Space Search Engine in Perl: Vector-space search engines use the notion of a term space, where each document is represented as a vector in a high-dimensional space. There are as...
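The article works in Perl; as a quick illustration of the term-space idea, here is a minimal Python sketch of my own. It assumes raw word counts as vector components and cosine similarity for ranking, which may well differ from the article's implementation, and the function names are made up for this example.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent text as a sparse vector: one dimension per distinct term."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(docs, query):
    """Return names of documents with non-zero similarity, best match first."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(text)), name) for name, text in docs.items()]
    scored.sort(key=lambda pair: -pair[0])
    return [name for score, name in scored if score > 0]
```

Each document and the query become points in the same term space, so ranking comes down to measuring the angle between vectors rather than checking for exact matches.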

19th December 2002

Hotbot redesign

Douglas Bowman provides some background to the new HotBot redesign, which uses CSS for layout and almost, but not quite, validates. It was all looking great until the HotBot Skins page told me I shou...

21st November 2002

Search engines don't care!

I've suspected this for ages, but finally it can be categorically announced that search engines just don't care about the meta keywords attribute. The only major engine that still notices it is Inktom...

31st August 2002

How the wayback machine works

How the Wayback Machine Works is a must-read for anyone geeky enough to be interested in cheap clustered databases on a huge scale. The interview includes some fascinating details on the cost effectiv...

17th August 2002

Netscape Google?

Sam Buchanan: The Netscape Google mystery. A user complains of a non-functional web application, and when asked what browser they are using replies "Netscape Google". Sam suspects that this is because...

14th August 2002

Alchemist contest

AlltheWeb.com introduced an innovative feature called Alchemist a while ago which allows visitors to customise the site by specifying the URL to their own style sheet. They have now announced a CSS de...

Controlled vocabularies

Christina Wodtke: Mind your phraseology!, a tutorial on controlled vocabularies. The concept is very similar to that used by TopicMaps - relationships are defined between terms that take into account...

28th July 2002

XHTML ODP attribution

The ODP require you to display an attribution on any page that reuses ODP data. The recommended attribution fails to validate as XHTML, so I created an XHTML compliant alternative which looks visually...

27th July 2002

Syndicating the ODP

Having looked at some of these tools for syndicating content from the ODP, it seems that the standard method is to grab and parse the actual HTML files from the site rather than grabbing the huge RDF ...

DMOZ for Bath

I've had my application for editorship of the DMOZ University of Bath Category accepted. Bath's main site has notoriously bad navigation, so hopefully I'll be able to use DMOZ to build an alternative....

23rd July 2002

Excite UK now powered by AllTheWeb

The Register: Tiscali to launch Excite across Europe. From Excite.co.uk: We are back - and we have created the most complete search channel on the web for you. Take advantage of our collection of 2,100...