Archive for the ‘Search’ Category

January

03

by Kaj Kandler

I recently decided to replace the Lucene-based search engine on Plan-B for OpenOffice.org with a Google Custom Search engine. At first glance this seemed an easy task: remove the old code and replace it with some Google JavaScript. However, that is not how it turned out.
I was aiming for a layout where the search box is part of the general navigation menu bar and the results appear on their own page. However, the HTML/CSS code generated by Google is rather inflexible. The two-page template came closest, as it generates two separate code snippets: one for the search box and button, and one for the search results.
So I had to add some CSS to make the div and its generated child elements inline elements:

div#cse-search-form {
  display: inline-block;
  zoom: 1; /* triggers hasLayout in older Internet Explorer */
  ...
}
div#cse-search-form * {
  display: inline;
  ...
}

Another inconvenience is that the JavaScript includes an absolute URL for the results page. However, it also works when I omit the protocol and hostname:

options.enableSearchboxOnly("/search/index");
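Why the shortened path works can be illustrated with ordinary URL resolution: a root-relative path is resolved against the origin of whatever page it appears on. The following is a minimal sketch using the standard `URL` class. The hostnames and page paths are placeholders, not the site's real addresses, and `resolveResultsUrl` is a hypothetical helper; the sketch says nothing about how Google's script handles the value internally.

```javascript
// Hypothetical page addresses, used only for illustration.
const pages = [
  "http://www.example.com/tutorials/writer",
  "https://staging.example.org/index",
];

// Resolve a results-page path against the page it is embedded in,
// the way a browser resolves a root-relative link.
function resolveResultsUrl(resultsPath, pageUrl) {
  return new URL(resultsPath, pageUrl).href;
}

for (const page of pages) {
  console.log(resolveResultsUrl("/search/index", page));
}
```

The same path therefore points at the right results page on any host or protocol the site is served from, which is handy when moving between a test and a production server.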

January

03

by Kaj Kandler

I have replaced the Plan-B for OpenOffice / LibreOffice search engine with Google Custom Search.

The local search engine, based on Lucene, was heavy on resources and required a lot of effort to keep the indices up to date with new or changing content. So I decided to switch to a Google Custom Search Engine.

I hope this change makes the site an even better resource for OpenOffice and LibreOffice users. Please let me know if you have any suggestions on how to improve search on the site.

June

02

by Kaj Kandler

I can’t believe what I just found on the Yahoo! Search Blog about removing pages from a website. The author says “The best way to remove dead URLs from the Yahoo! Search index is to return an HTTP Error 404 when our crawler requests the page.”

Are they serious, really serious?

The HTTP spec clearly says that status code 404 means “Not Found”, with no indication of whether the condition is temporary or permanent, while 410 means “Gone” permanently. The explanation for code 404 even says “The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address.”

Yahoo! Slurp is free to treat a 404 page as if it had been removed, although I don’t think that serves the searching public well. However, I can’t understand why the Yahoo! Search Blog teaches webmasters to send a 404 when a 410 status code is the appropriate one.
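To make the distinction concrete, here is a minimal sketch of a server that answers 410 for pages it knows were removed for good and 404 for everything else it cannot find. The path lists and the names `livePages`, `removedForGood`, `statusFor`, and `handler` are hypothetical and exist only for illustration; a real site would look these paths up in its own content store.

```javascript
// Hypothetical path lists; a real site would derive these from its content store.
const livePages = new Set(["/", "/search/index"]);
const removedForGood = new Set(["/old-tutorial"]);

// Pick the status code for a request path:
// 200 if the page exists, 410 if it is known to be permanently removed,
// 404 otherwise (nothing is known about the path).
function statusFor(path) {
  if (livePages.has(path)) return 200;
  if (removedForGood.has(path)) return 410;
  return 404;
}

// Request handler with the (req, res) shape expected by Node's http.createServer.
function handler(req, res) {
  const path = new URL(req.url, "http://localhost").pathname;
  const status = statusFor(path);
  res.writeHead(status, { "Content-Type": "text/plain" });
  res.end(status === 200 ? "OK" : status === 410 ? "Gone" : "Not Found");
}
```

Passing `handler` to `require("http").createServer(handler).listen(8080)` would serve these responses locally. The point of the design is that the server, not the crawler, states whether a page is gone for good.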

I just needed to rant about this, because that blog surely has a wide readership.

July

27

by Kaj Kandler

Google has a new Labs project, “Google Accessible Search”, which ranks results by how accessible they are.

This new service (currently on Google Labs) adds a small twist to Google web search: in addition to finding the most relevant results from Google as usual, Accessible Search further prioritizes results based on the simplicity of their page layouts. When you search from the Accessible site, you’ll get results that are prioritized based on their usability. This tends to favor pages with few visual distractions, and pages that are likely to render well with images turned off. Google Accessible Search is built on Google Co-op’s technology, which emphasizes search results based on specialized interests. (from Friends of Google newsletter)

Search for “tutorials” on regular Google and you find 468 million results, with some graphics-rich sites at the top. The first item is Section 508 compliant. However, the second item, Sun’s Java tutorials, fails this important accessibility test.

Search for “tutorials” on Accessible Search and you’ll find a different set of supposedly clear-cut, text-based websites with few or no images. In my test the difference was not too obvious. Number one, the University at Albany, has only a header image. However, number two, CProgramming.com, creates pop-ups, is quite image-rich, and fails the Section 508 test as well.

I’m not sure this is all that helpful for blind people or those with other impairments.