Search Engines: biases and problems

A recent post of mine disappeared from the WordPress listings, and shortly afterward it disappeared almost entirely from search engine results as well. In Google, the post survived only as a shadow, in the form of indirect links and some cached pages from when WordPress had listed it. In Yahoo, it disappeared without a trace. The last time I checked, it didn’t show up at all in other search engines. This got me wondering how search engines work. Both Google and Yahoo had originally shown and cached the direct link to the post, so their web crawlers had already discovered it. However, when it disappeared from the WordPress listings, the search engines followed suit. Were the web crawlers no longer able to see my post even though Google and Yahoo previously had the direct link to it?

Also, I’d noticed in the past that the search engines seem to treat the various blogging sites differently. For a while, I had several blogs going on several hosting sites because I was testing them out. I was posting the exact same things to each of them, but I noticed that the My Opera blog often showed up higher in search results than my other blogs. Now, I use only WordPress because I like its functionality the best. This recent event, however, made me wonder how often my posts might not show up at all in search results.

To test it out, I searched for the title of a post from when I was using all of the blogging sites. In the Yahoo results, only the My Opera post was given a direct link. The other copies, such as the one on WordPress, were given only indirect links through the blog’s home page, through tag listings, or through other websites’ hyperlinks. Google gave very different results: it showed direct links to the post on all of the blogging sites, but put WordPress at the top. Did Google put WordPress on top because it’s the only blog of mine that is active right now? If so, why did Yahoo give preference to My Opera, which I haven’t used in recent months? Also, why didn’t Google show a direct link to my recently disappeared post on WordPress?

I did another comparison search between Google and Yahoo using a different early post of mine. This time Google showed direct links to my post on all of the blogging sites except WordPress. Yahoo, for some reason, didn’t show a direct link to my post on any of the blogging sites, but did show several indirect links. As a further experiment, I searched for the WordPress web address of that post, and it didn’t show up at all in either Google or Yahoo.

Another question that comes to mind is the matter of the biases of search engines. Do search engines filter their results to fit my past searches? I’d be fine with that as long as they tell me they’re doing it. And to what degree do advertising and vested interests influence results? Furthermore, what about the government? Covert government sites get erased from Google Earth, for example. It wouldn’t surprise me if those sites aren’t simply erased but are even replaced with natural-looking terrain so that no one would realize something was missing. Without a doubt, the government censors some information on the internet. The question is what kind of information, and how often?

But not everything is nefarious or intentional. Quite possibly, my disappeared post was just a glitch. So, how typical are such technical failures? If a search engine doesn’t show that something exists, how does someone know it exists? Even if someone knows it exists and even knows an exact title or phrase, how do they seek it out if search engines aren’t helpful? Do traces remain of disappeared, removed, and lost information? How can someone recognize a trace of something that once existed or that still exists unseen? How often can those traces lead someone to the information?

The first example that made me aware of problems with search engines had to do with the fairly popular writer Acharya S. She comes up a lot on the internet. She was partly involved with the heavily watched Zeitgeist film, which created a bigger buzz on the internet than any web-released film before it. She runs a website that has tons of useful info about her field of expertise. There really is no other website that comes close if you’re interested in researching the subject of astrotheology. However, when in the past I did an exact-phrase Google search (in quotes) for the name of her website, I didn’t find it in the top results. The direct link to her website only showed up several pages beyond the first page of results. The first several pages were filled with her detractors and with other websites linking to her website. If I do a Google search for an exact title, why doesn’t it give me the most exact result right at the top? Why does it give pages of indirect links before showing the direct link itself?

Are there search engines that give you more control instead of feeding you the info they think you want? Is there a search engine that is upfront and transparent about its biases?