this post was submitted on 19 Feb 2024
Technology
fucking bonkers that institutionalized racism can exist to such a degree that it shows up IN OUR COMPUTERS.
we’re so racist we made the computers discriminatory too.
I don't think you know how LLMs are trained, then. A model can become racist by mistake.
An example: say a society has 100,000 white people and 50,000 black people, and the statistics show that twice as many white people as black people have been hired. What does this tell you?
Obvious! There are also twice as many white people to begin with, so black and white people are hired at the same rate! But what does the AI see?
It sees twice as many white hires. And then it can lean towards doing the same.
You see how the hiring data was in no way racist, but the outcome ends up racist anyway, as a consequence of something completely different.
TLDR: People are still racist, but that's not always why the AI is.
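The arithmetic behind this hiring example can be checked with a short sketch. The group sizes are the ones from the comment; the hire counts are made-up numbers chosen only so that both groups are hired at the same rate, and `hire_rate` is a hypothetical helper, not anything a real model uses.

```python
def hire_rate(hired: int, population: int) -> float:
    """Fraction of a group that was hired."""
    return hired / population


# Group sizes from the comment above.
population = {"white": 100_000, "black": 50_000}
# Assumed hire counts, invented for illustration only.
hired = {"white": 2_000, "black": 1_000}

# What a naive reading of the raw counts sees: one group is hired far more often.
raw_ratio = hired["white"] / hired["black"]

# What the per-capita rates show: both groups are hired at the same rate.
rates = {group: hire_rate(hired[group], population[group]) for group in population}
rate_ratio = rates["white"] / rates["black"]

print(f"raw hire count ratio: {raw_ratio}")
print(f"per-capita rate ratio: {rate_ratio}")
```

A system that only ever sees the raw counts has no way of telling that the imbalance comes from group size rather than from preference, which is the point being made above.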
The bias is really introduced at the design stage. Designers should be aware of demographic differences and incorporate them into the model to produce something more balanced. It's far from impossible to design models that do not become biased in this way, even from biased data, although that is not to say it's easy.
I suppose it depends on how you define "by mistake". Your example is an odd bit of narrowing the dataset, which I would certainly describe as an unintended error in the design. But the original is more pertinent: it wasn't intended to be sexist (etc.), but since it was designed to mimic us, it also copied our bad decisions.
Oh, there is so much racist data that the AI is being trained on.
Your example is a simple one. But there is also discrimination based on names, for instance: Johns are hired more often than Quachins, and that is done by people, before the data ever gets to the AI.
you are right, i don’t know how LLMs are trained, but ironically, this is a perfect example of a minority being privileged by a system, and racism is still very much involved.
an important assumption you have to consider: in your example, why did the AI know what race people are in the first place? it seems a small consideration but it’s so wildly significant.
the modern understanding of race was not present throughout all of history, and only arose in the 17th century. without getting into the weeds, the fact that your fictional AI can distinguish between whiteness and non-whiteness already means it was designed by someone who understands those structures, and who let them slip into the AI itself.
a perfectly well-meaning and anti-racist designer would prevent the AI from even recognizing race at all costs, both directly by sanitizing training data to remove race from the inputs, and indirectly by noting correlations with other data (such as sports, in this article) and controlling for that.
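As a rough illustration of the two steps described above (removing the protected attribute directly, then looking for other columns that stand in for it), here is a minimal sketch. Everything in it is an assumption for illustration: the column names, the toy rows, the 0.8 threshold, and the choice to simply drop correlated columns, which is only one crude way to "control for" a proxy.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0


def find_proxies(rows, protected, threshold=0.8):
    """Names of columns whose correlation with the protected column meets the threshold."""
    target = [row[protected] for row in rows]
    features = [name for name in rows[0] if name != protected]
    return [
        name
        for name in features
        if abs(pearson([row[name] for row in rows], target)) >= threshold
    ]


def sanitize(rows, protected, threshold=0.8):
    """Drop the protected column and any column flagged as a proxy for it."""
    drop = set(find_proxies(rows, protected, threshold)) | {protected}
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


# Toy data, invented for illustration: "group" is the protected attribute,
# "plays_sport_x" tracks it perfectly, "years_experience" does not.
rows = [
    {"group": 1, "plays_sport_x": 1, "years_experience": 5},
    {"group": 1, "plays_sport_x": 1, "years_experience": 2},
    {"group": 0, "plays_sport_x": 0, "years_experience": 4},
    {"group": 0, "plays_sport_x": 0, "years_experience": 3},
]

proxies = find_proxies(rows, "group")
clean_rows = sanitize(rows, "group")
print(proxies)
print(clean_rows)
```

Note that the audit itself needs the protected column in order to find its proxies, so in this sketch the attribute is hidden from the model's inputs rather than from the people checking the data. A pairwise correlation check also misses proxies that only emerge from combinations of columns.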