  • Could you elaborate on how it’s ableist?

    As far as I’m aware, not only are they working on a version that doesn’t require JS at all, but the JS is only needed for the challenge itself; once the browser solves it, the pages can be viewed and parsed entirely without JS. Things like screen readers should still handle the content perfectly fine after the challenge is passed.
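    To make that concrete, here’s a minimal sketch (the cookie name and URL are made up; this isn’t the project’s actual API) of how a non-JS client could consume a page once the challenge has been solved:

    ```python
    # Minimal sketch, assuming the challenge hands out an ordinary session
    # cookie once solved; the cookie name and URL below are hypothetical.
    import urllib.request
    from html.parser import HTMLParser

    CHALLENGE_COOKIE = "challenge-pass=abc123"  # hypothetical, obtained after solving once

    class TextExtractor(HTMLParser):
        """Collects visible text, roughly the way a screen reader walks the DOM."""
        def __init__(self):
            super().__init__()
            self.chunks = []
        def handle_data(self, data):
            if data.strip():
                self.chunks.append(data.strip())

    req = urllib.request.Request(
        "https://example.org/article",          # placeholder URL
        headers={"Cookie": CHALLENGE_COOKIE},   # proof the challenge was solved
    )
    with urllib.request.urlopen(req) as resp:
        html = resp.read().decode("utf-8", errors="replace")

    parser = TextExtractor()
    parser.feed(html)
    print("\n".join(parser.chunks))  # readable content, no JS engine involved
    ```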


  • Because the easiest solution for them is a simple web scraper. If they don’t give a shit about ethics, something that just crawls every page it can find is loads easier to set up than a custom implementation per source: torrent downloads for Wikipedia, running lemmy/mastodon/pixelfed instances to pull from the fediverse, consuming RSS feeds and checking whether they carry full or only partial articles, proper checks to prevent downloading the same content twice (or more), etc.
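    For a sense of scale, a careless breadth-first scraper is only a couple dozen lines of stdlib Python (the seed URL is a placeholder), while every item in the list above is its own engineering project:

    ```python
    # Rough sketch of why "just crawl everything" is the lazy default.
    import urllib.request
    from urllib.parse import urljoin, urlparse
    from html.parser import HTMLParser
    from collections import deque

    class LinkCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(seed, limit=100):
        seen, queue = set(), deque([seed])
        while queue and len(seen) < limit:
            url = queue.popleft()
            if url in seen:
                continue
            seen.add(url)
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    html = resp.read().decode("utf-8", errors="replace")
            except Exception:
                continue  # a careless crawler just shrugs and moves on
            collector = LinkCollector()
            collector.feed(html)
            for href in collector.links:
                absolute = urljoin(url, href)
                if urlparse(absolute).scheme in ("http", "https"):
                    queue.append(absolute)
        return seen

    print(crawl("https://example.org/"))  # placeholder seed
    ```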




  • Presearch is not fully decentralized.

    The services that manage advertising, staking/marketplace/rewards functionality, and unnamed “other critical Presearch services” are all “centrally managed by Presearch,” according to their own documentation.

    The nodes that actually help scrape and serve content are also reliant on Presearch’s centralized servers. Every search must go through Presearch’s “Node Gateway Server,” which is centrally managed by them; that gateway is what strips identifying metadata and IP info from your request.
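    In spirit, the gateway’s scrubbing step is something like the toy function below (the field and header names are my own assumptions, not Presearch’s actual implementation):

    ```python
    # Toy model of the gateway scrubbing a request before fan-out.
    IDENTIFYING_HEADERS = {"cookie", "user-agent", "x-forwarded-for", "referer"}

    def scrub(request: dict) -> dict:
        """Drop the client's IP and identifying headers, keep only the query."""
        return {
            "query": request["query"],
            "headers": {
                k: v for k, v in request.get("headers", {}).items()
                if k.lower() not in IDENTIFYING_HEADERS
            },
            # note: the client IP is simply never forwarded
        }

    anonymized = scrub({
        "query": "federated search",
        "client_ip": "203.0.113.7",
        "headers": {"User-Agent": "Firefox/128", "Accept": "text/html"},
    })
    print(anonymized)  # {'query': ..., 'headers': {'Accept': 'text/html'}}
    ```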

    That central server then determines where your request goes: it could go to open nodes run by volunteers, or to Presearch’s own nodes, and the structure of the network gives you no way to verify which.

    Presearch’s search index is not decentralized. It’s a frontend for other indexes (e.g. it outsources queries to other search engines, databases, and APIs it’s configured to use). This means it doesn’t actually have an index independent of those central services. I’ll give it a pass here since most search engines work this way today, but many of them are developing their own indexes that are far more robust than what Presearch seems to be doing.
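    A “frontend for other indexes” amounts to something like this sketch (the upstream endpoints and response shape are invented for illustration):

    ```python
    # Toy meta-search: no index of its own, just fan-out and merge.
    import json
    import urllib.request
    from urllib.parse import quote

    UPSTREAMS = [
        "https://upstream-a.example/search?q={q}",   # hypothetical
        "https://upstream-b.example/api?query={q}",  # hypothetical
    ]

    def metasearch(query: str) -> list:
        results, seen_urls = [], set()
        for template in UPSTREAMS:
            try:
                with urllib.request.urlopen(template.format(q=quote(query)), timeout=10) as resp:
                    hits = json.loads(resp.read())  # assume each upstream returns a JSON list
            except Exception:
                continue  # a dead upstream just drops out of the merged results
            for hit in hits:
                url = hit.get("url")
                if url and url not in seen_urls:  # crude de-duplication across upstreams
                    seen_urls.add(url)
                    results.append(hit)
        return results
    ```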

    A node can then return results to the gateway, and there doesn’t seem to be any way for the gateway to verify that what it’s handed is actually what was available on the open web. For example, a node could send back results whose links are all affiliate links to services vaguely relevant to the query, and the gateway would assume those results are valid.
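    As a toy model of that trust gap (none of this is Presearch code; it just illustrates the argument):

    ```python
    # The gateway has no independent copy of the web, so it cannot tell
    # an honest node from this one.
    def dishonest_node(query: str) -> list:
        # Fabricate plausible-looking hits that are really affiliate links.
        return [
            {"title": f"Best {query} deals", "url": f"https://shop.example/?aff=me&q={query}"}
            for _ in range(3)
        ]

    def gateway(query: str, node) -> list:
        results = node(query)
        # No verification step is possible here: checking the results would
        # mean re-scraping the web, which is exactly the work the nodes
        # were supposed to take off the gateway's hands.
        return results

    print(gateway("running shoes", dishonest_node))
    ```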

    For the gateway to verify the results are accurate, it would have to scrape those services itself, which would render the entire purpose of the nodes pointless. The docs claim it can “ensure that each node is only running trusted Presearch software,” but Presearch does not control the root of trust, so this has the same pitfalls games have hit for years trying to enforce anticheat: it’s simply impossible to guarantee unless Presearch could do all the processing inside a TPM they entirely control, which they don’t (and that would raise a number of privacy issues of its own).

    A better model would use nodes solely for hosting, to take the index-storage burden off the central server: each chunk sent to a node is hashed, and the hash is stored on the central server. When the central server needs a chunk to answer a query, it requests it from a node, verifies the hash matches, and forwards it to the user. That shifts the storage burden to the nodes and leaves bandwidth as the only cost bottleneck, but that’s not what Presearch is doing here.
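    A minimal sketch of that verify-by-hash flow, with the node’s storage mocked as a dict (all names are illustrative):

    ```python
    # Nodes only store data; the central server keeps a sha256 per chunk
    # and verifies on retrieval, so a node can at worst refuse to serve,
    # never silently tamper.
    import hashlib

    def sha256(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    # Central server: distribute a chunk, remember only its hash.
    chunk = b"...index shard for query terms 'foo bar'..."
    stored_hash = sha256(chunk)          # kept on the central server
    node_storage = {"chunk-42": chunk}   # kept on the volunteer node

    # Later, answering a query:
    def fetch_verified(chunk_id: str, expected_hash: str) -> bytes:
        data = node_storage[chunk_id]     # a network fetch in reality
        if sha256(data) != expected_hash:
            raise ValueError("node returned tampered data")
        return data                       # safe to forward to the user

    print(fetch_verified("chunk-42", stored_hash)[:20])
    ```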

    This doesn’t make Presearch bad in itself, but it’s most definitely not decentralized. All core search functionality relies on their servers alone, and the design adds extra risk of bad actors being able to manipulate search results.