Discussion about this post

Jam:

I run a personal website where I share my knowledge, perspectives, and creations, not in search of revenue but as a form of self-expression. I'd like to respond to your article point by point.

1. "The implicit bargain of web content—”I let you index my work, you send me traffic”—was never actually a contract." --- Full agree

2. "What we’re witnessing isn’t theft." ---Hard disagree there. Just because a contract wasn't signed. Copyright and authorship are codified by legal principle without the need of a contract. The act of a person producing something new (image, audio, text) implies authorship and copyright, and while I am no legal expert, authorship and copyright go a long way to establish ownership. And I don't know about you but it is natural to want to defend what you own.

3. "Current AI systems often can’t tell you where specific knowledge originated because that knowledge has been stripped of provenance during training." I wish I could respond with a GIF and that GIF would be Leonardo DiCaprio whistling, snapping, and pointing at the screen, you definitely know the one. "Knowledge stripped of provenance" is the original sin, if this wasn't a malicious choice, it reflects the unprepared nature of the LLM rollout. We live in the era of attribution automation, whether it's content ID, affiliate and tag managers...I could go on. If LLM training did not account for provenance then it points to two possibilities: 1. LLMs are trained with copyright and attribution circumvention by default (malicious) 2. LLM training was designed for research and testing purposes in a closed and controlled environment which was then coopted and exploited by malicious actors who extracted it from a research context without due dilligence and unleashed it publicly because of the sociopathic and myopic greed.

However, what is a red line for me, especially when it comes to knowledge systems and to learning as a human being, is the lack of provenance by design (whether malicious or underbaked). My English, history, geography, and social-science teachers in middle school, high school, and college would have crucified me (and rightfully so) for not citing the sources I used to develop research and arguments. Any information system that does not accurately cite, and, more importantly, has been trained to circumvent attribution or provenance, is fundamentally untrustworthy, vulnerable to manipulation, and profoundly harmful. This point says so much about LLMs: unreliable by design, and an abdication of responsibility (that last point applies to all the business leaders who have myopically foisted LLMs on society in spite of their colossal shortcomings and documented harms -- LLMs are the mechanism through which business leaders circumvent accountability).
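Continuing the hypothetical sketch above: surfacing that provenance at answer time is just as mechanical as attaching it at ingestion. Again, the names here (CitedAnswer, answer_with_sources) are illustrative assumptions, not any real system's API; the point is that a citing system is buildable, so its absence is a design decision.

```python
# Hypothetical continuation of the ingestion sketch above: an answer
# that refuses to ship without the sources that back it.
from dataclasses import dataclass


@dataclass
class CitedAnswer:
    """An answer with its sources attached, the way a term paper cites."""
    text: str
    sources: list  # ProvenanceRecord instances from the ingestion sketch


def answer_with_sources(text: str, supporting_examples: list) -> CitedAnswer:
    """Build an answer only if it can point back to its sources."""
    sources = [ex["provenance"] for ex in supporting_examples]
    if not sources:
        # No provenance means no answer; declining beats fabricating.
        raise ValueError("no provenance available for this answer")
    return CitedAnswer(text=text, sources=sources)


# Usage, reusing `example` from the ingestion sketch:
# answer = answer_with_sources("The essay argues...", [example])
# for src in answer.sources:
#     print(src.source_url, src.license_name)
```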

4. "The content worth producing is the work AI cannot do: ..." --- Agree with these points, BUT you miss a crucial one: self-expression. Sure, not exactly a business- or LinkedIn-friendly term, but this was the point of the web before it was ruined by big business. Before the rot-economy, growth-cult silos of social media, the web functioned as Gutenberg's second coming: anyone could publish. The cost of expressing oneself and of accessing information was drastically slashed. Sure, search engines came along and commodified the convenience of finding things, but at the end of the day the purest expression of the internet is a series of interlinked files (who knew the "series of tubes" guy was onto something; if you know, you know). More crucially, in the same way that Grok cannot respond to a request for comment about why the xAI LLM produces CSAM the way a dedicated human spokesperson can, LLMs cannot express themselves, because a language model does not have agency. Any crude attempt at anthropomorphizing a language model reveals more about the person doing the anthropomorphizing than about the LLM.

Aside from that, I agree with the rest of your article, with the caveats expressed above (spelling mistakes, warts, and all).
