Google Captcha Class Action Controversy

Google Antitrust

Coming out of Massachusetts is a new antitrust lawsuit against Google. That is hardly an unheard-of situation, but this one is pretty interesting. The plaintiff is alleging that Google is unfairly using unpaid labor for its own profit. How? It’s really quite ingenious. You know those captcha/recaptcha things we all deal with from time to time so we can convince the algorithm that we aren’t bots? As you may have noticed, they tend to be some form of distorted text: letters that are blurry, poorly written or, for some reason, have extra lines through them. Sometimes you even need to solve two of them.

Well, the lawsuit is alleging that at least that second captcha text is being used by Google to train its text recognition AI. That matters because Google Books scans in thousands upon thousands of books, digitizes them and uploads them to the internet for free. Those scans, though, are often from rough copies that might have pen marks on them, suffer from damage due to age or just have artifacts from the original printing. By making use of the captcha system, Google is teaching its AI to better deal with those problems and thus create more accurate digital versions of the works.

So what’s the big deal? Sure, they’re sort of tricking people into doing work for them, but at least they are doing it to make more knowledge available to more people. Obviously that’s a good thing in itself. It’s a little shady that they didn’t really tell people about it, but if that were all it was, no harm, no foul. Yet, as often happens, Google goes right ahead and takes it a step further, eating up some of the goodwill that we would otherwise have. How so?

There are newspapers (yes, they still exist) and magazines that are interested in digitizing their archives. Universities and governments are also trying to get their documents, books and research converted to a digital format. Along comes Google, offering its scanning and conversion software to take care of that. For a substantial price, of course. That is where the problems arise, because now Google is profiting off software that you (and everyone else) helped develop. It’s pretty understandable why that might bother someone. In all honesty, if Google were upfront about what it was doing with the data gleaned from the captcha system, that would be fine. People would at least have the opportunity to know what they were doing and why. But again, Google doesn’t tell people about that. And since Google is making a profit, it seems only fair that it should offer at least a shekel or two for the trouble.

In fact, if Google were willing both to be upfront about how it is using the data and to offer something to the people helping with it, it would be great to see the program expanded. Google could scan documents not just individually but as part of a searchable and cross-referenced database. That would be a massive benefit to researchers everywhere, making it easy to find not just the one item you’re looking for but several related documents that could then be compared and contrasted. It would make Wikipedia look like LiveJournal. Hopefully, Google or someone else gets to work providing something like that in the near future.

In the meantime, this kind of situation is exactly why TARTLE exists. For a long time now, businesses have been benefiting from data generated by others. We’re offering you the ability to take control of your data again by signing up with us and funneling all of it through TARTLE, which allows you to actually be rewarded when you share it. If you even want to share it. The choice is yours.

What’s your data worth? Sign up and join the TARTLE Marketplace.
