« Last post by Code Refugee on April 09, 2017, 10:18:38 AM »
A little while back I was asked to implement something to do polling. Among polls of interest are polling employees, polling customers and polling the general public.
The general problem to avoid is people voting more than once and gaming the poll.
We know this is a big problem since the media is constantly running polls with responses that turn out to be bogus, such as who are you going to vote for in the next election, even when they do random phone screening. However, I think random phone screening is pretty solid if you do a big enough sample; the problem was with their "adjustments" to the raw data in order to push their agenda. Let's assume that's not a problem here: we're going with the raw counts and no skewing. I'm also not concerned with whether the sample is a valid cross section in this case.
When big companies get their polls gamed we laugh at them. The general public says Clinton has a 99% chance of winning. The general public wants the boat to be named Boaty McBoatface. The general public says their new chip flavor should be an ode to Hitler. Well obviously none of these things were the actual opinions of the general public. The polls were either rigged or culture jammed. What idiots the poll workers and tech guys are. Obviously they used dumb tech noobs. Surely if they implemented simple safeguards none of that would have happened. However, whether they are noobs or not, the problem they are up against is a lot more of a challenge than most people suggest with solutions such as "just require user accounts", "just collect a cell phone number", "just use captchas", etc.
Now the case of polling employees reliably is not a hard problem. We know who those employees are and they can be assigned a voting token that allows them to vote once, anonymously. If you have an employee email account on the system, I can see that and see that you're current, and there's not a bunch of fake accounts or such, since accounts are handed out and not grabbed by random anonymous people. It probably works the same as online election engines in some countries, I imagine. Sure, maybe their spouse or friend voted, but that's OK. The problem is with multiple votes and with non-existent people voting, both of which are the same problem.
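To make the voting token idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a description of any real system: the `TokenPoll` class and its method names are hypothetical, tokens are handed out once per known employee through some channel not shown here, and the poll keeps only a hash of each unspent token with no link back to who received it.

```python
import hashlib
import secrets
from collections import Counter


class TokenPoll:
    """Hypothetical one-token-one-vote poll (illustrative names only)."""

    def __init__(self, options):
        self.options = set(options)
        self.unspent = set()      # SHA-256 hashes of tokens not yet used
        self.tally = Counter()    # choice -> number of accepted votes

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue_token(self):
        # Assumption: called once per known employee. The token is not
        # stored next to any identity, so a vote cannot be traced back.
        token = secrets.token_urlsafe(16)
        self.unspent.add(self._digest(token))
        return token

    def vote(self, token, choice):
        digest = self._digest(token)
        # Reject unknown options, forged tokens and already-spent tokens.
        if choice not in self.options or digest not in self.unspent:
            return False
        self.unspent.remove(digest)   # the token is now spent
        self.tally[choice] += 1
        return True


poll = TokenPoll(["yes", "no"])
token = poll.issue_token()
first = poll.vote(token, "yes")       # accepted
second = poll.vote(token, "no")       # rejected, token already spent
forged = poll.vote("made-up", "no")   # rejected, token was never issued
```

The whole thing only works because the list of people who get a token is closed and controlled. That is exactly what is missing in the general public case.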
Polling customers is similar to polling employees and is handled in a similar way. So no problems there, other than that of trust: some people don't trust that the system is anonymous and that their opinions won't be tracked back to them, so they don't vote. Also people who think they don't care about an issue are less likely to vote. Their not caring can be a useful data point, but it's OK to just assume from their non-vote that they accept the results in advance.
The problem of polling the general public is very different from either of these.
Now a phone poll is maybe a bit more reliable. 4chan can't game it. Maybe you call the same person twice on their two cell phones, but that's not them gaming the system and isn't really going to affect results much.
But anything involving user-created and user-selected accounts open to the general public can be gamed and subverted by a motivated opponent. And that opponent doesn't even need financial incentive. 4chan in particular will spend infinite hours gaming a system to make sure the new Doritos flavor is called "Hitler Did Nothing Wrong". These guys are far more motivated to game systems, and more able to do so, than the most motivated state actors pushing an agenda for actual personal gain. 4chan's motivation of getting "lolz" is much stronger than any other force.
Anything involving cookies and IP addresses can be gamed as well.
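A toy simulation shows why. This is a sketch under my own assumptions, not anyone's production code: `NaivePoll` and `simulate_stuffing` are hypothetical names, and the "attacker" is just a loop that presents a new proxy address and a cleared cookie on every request.

```python
from collections import Counter


class NaivePoll:
    """Hypothetical poll that de-duplicates on the (IP, cookie) pair."""

    def __init__(self):
        self.seen = set()         # (ip, cookie) pairs that already voted
        self.tally = Counter()    # choice -> number of accepted votes

    def vote(self, ip, cookie, choice):
        key = (ip, cookie)
        if key in self.seen:
            return False          # same visitor as far as the poll can tell
        self.seen.add(key)
        self.tally[choice] += 1
        return True


def simulate_stuffing(poll, choice, attempts):
    """One attacker voting repeatedly with a fresh IP and cookie each time."""
    accepted = 0
    for i in range(attempts):
        ip = f"10.0.{i // 256}.{i % 256}"   # stand-in for rotating proxies
        cookie = f"fresh-{i}"               # stand-in for a cleared cookie jar
        if poll.vote(ip, cookie, choice):
            accepted += 1
    return accepted


poll = NaivePoll()
honest_first = poll.vote("192.0.2.7", "abc", "A")    # accepted
honest_repeat = poll.vote("192.0.2.7", "abc", "A")   # rejected as duplicate
stuffed = simulate_stuffing(poll, "B", 500)          # every attempt accepted
```

The check stops an honest person who clicks twice and does nothing against someone who changes both values on each request.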
The only thing that works at all is weaponized tracking. By this I mean methods of dodgy legality such as persistent zombie cookies that take advantage of security defects in Flash and in browsers, browser fingerprinting, and using toolkits of dubious origin that are able to break the veil of Tor secrecy. These methods work, and you get more valid survey results if you use them. You'll also definitely find that any survey that becomes notable is massively gamed.
To be clear, I am not asking for any advice at all and don't want any. This is just sharing info, like a public lecture. I feel I've been all up and down studying and experimenting with this for some time. The issue isn't that I need advice; the issue is that I understand the fundamental problem and I have an insight to share with you: it's impossible to have a valid anonymous public vote unless you resort to underhanded back door NSA style tracking methods.