

Interesting, thanks for doing the research!
As an extreme non-expert, I would say “deliberate removal of a part of a model in order to study the structure of that model” is a somewhat different concept from “intrinsic and inexorable averaging of language by LLM tools as they currently exist”. They may well involve similar mechanisms, though, and that may be what the OP is referencing; I don’t know enough of the technical side to say.
That paper looks pretty interesting in itself; other issues aside, LLMs are really fascinating in the way they build (statistical) representations of language.
This definition of social media is new to me as well, thanks for sharing it. This sort of clarifies a term I really dislike, and which you’ve used: “the algorithm”. It’s always seemed a little murky to me which algorithms it refers to. It’s like saying “don’t eat food with chemicals in it”.
Lemmy does have “an algorithm”; it’s just a relatively simple one based on the communities one is subscribed to, plus some vote/comment data for the various sort orderings.
Lemmy also absolutely implements a social graph – the data about who has interacted with whom is all stored by the system. It’s not explicitly stored as a graph structure, but at that point we’re arguing about database schemas.
As I understand it, however, you’re saying “social media” arises when the “social graph” data structure is used as an input to “the algorithm”. That seems like a pretty robust definition to me.
One bit of pedantry: user blocks on Lemmy are, by a general definition, a form of social graph, and they do affect what content people see. So Lemmy could technically qualify as social media by the definition I’ve written here. I’m not sure what a more precise definition could be that avoids this technicality.
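To make that technicality concrete, here is a toy sketch in Python. It is not Lemmy's actual code or schema: `Post`, `build_feed`, `subscriptions` and `blocked_users` are all names I made up for illustration, and the "algorithm" is just "show subscribed communities, sort by score". The point is only that the block list is the one user-to-user input, and it does change the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    community: str
    score: int  # hypothetical: upvotes minus downvotes


def build_feed(posts, subscriptions, blocked_users=frozenset()):
    """Toy feed: keep posts from subscribed communities, sort by score.

    `blocked_users` is the only user-to-user (social graph) input here.
    """
    visible = [
        p for p in posts
        if p.community in subscriptions and p.author not in blocked_users
    ]
    return sorted(visible, key=lambda p: p.score, reverse=True)


posts = [
    Post("alice", "linux", 12),
    Post("bob", "linux", 30),
    Post("carol", "cooking", 99),
    Post("dave", "linux", 5),
]

no_blocks = build_feed(posts, subscriptions={"linux"})
with_block = build_feed(posts, subscriptions={"linux"}, blocked_users={"bob"})

print([p.author for p in no_blocks])   # ['bob', 'alice', 'dave']
print([p.author for p in with_block])  # ['alice', 'dave']
```

Without the block list, the feed depends only on subscriptions and votes. With it, a (negative) edge between two users alters what one of them sees, which is why blocks seem to meet the definition on a technicality.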