In the first part of this article, we introduced the idea of reputation systems and started looking at how they can be tied into abuse reporting systems. The goal is to help an abuse team prioritize which problems to investigate first, and to adapt the system based on user actions and reputations so that constructive behavior is encouraged and destructive users find it harder to cause problems.

When we left off, we had talked about the actions a system should take against users once it decides a given tweet is abusive and deletes it. Let’s carry that forward to see how that decision can affect content throughout the system, and how we can leverage it to solve other problems as a side effect.

Since we’ve now flagged the URL as inappropriate in the URL reputation system, we can do a number of things. Assuming that reputation system resolves URLs to the final content they point at, we can build a list of ALL URLs on Twitter that point to that piece of content and flag them with a negative-infinity reputation. That would trigger Twitter to remove every tweet containing any of those URLs, and removing those tweets would in turn trigger reputation changes for their posters, as described above.
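To make that cascade concrete, here’s a minimal in-memory sketch. The class and function names (UrlReputation, cascade_delete), the 100-point penalty, and the data shapes are all my own illustrations, and the sketch assumes URLs have already been resolved to their canonical destination; it is not a description of Twitter’s actual internals.

```python
# Minimal sketch of the URL-flagging cascade. All names and numbers are illustrative.
BANNED = float("-inf")

class UrlReputation:
    def __init__(self):
        self.scores = {}                      # canonical URL -> reputation score

    def flag(self, canonical_url):
        self.scores[canonical_url] = BANNED   # the "negative infinity" reputation

    def is_banned(self, canonical_url):
        # Queried by the posting path so new tweets with banned URLs are rejected too.
        return self.scores.get(canonical_url) == BANNED

def cascade_delete(canonical_url, url_rep, tweets, user_rep, penalty=100):
    """Flag the content, then remove every tweet pointing at it and penalize
    each poster, rippling a single admin decision across the whole system."""
    url_rep.flag(canonical_url)
    for tweet in list(tweets):                # copy so we can remove while iterating
        if canonical_url in tweet["urls"]:    # assumes URLs are already canonicalized
            tweets.remove(tweet)
            user_rep[tweet["author"]] = user_rep.get(tweet["author"], 1000) - penalty

# One admin action removes both tweets that share the flagged content.
url_rep = UrlReputation()
tweets = [
    {"id": 1, "author": "alice", "urls": {"http://spam.example/payload"}},
    {"id": 2, "author": "bob",   "urls": {"http://spam.example/payload"}},
]
user_rep = {"alice": 1000, "bob": 650}
cascade_delete("http://spam.example/payload", url_rep, tweets, user_rep)
print(tweets, user_rep)    # [] {'alice': 900, 'bob': 550}
```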

So, by using user actions (positive and negative) about individual tweets to generate a ranked list of the tweets generating the largest negative response, we can bring a tweet to the notice of the abuse team, who can evaluate it. If they decide the tweet is abusive, they can delete it, and that act will affect the reputations of everyone who interacted with it. If the tweet includes a URL (and things like graphics have internal URLs, so they are included too), then that action can ripple out to every tweet that includes that URL, or any URL that ultimately links to the same content, and the same actions can be taken on all users interacting with all of those tweets. With thoughtful systems design in how content is tracked, a single administrative action can remove a problematic piece of content from the entire system.
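Here is a rough sketch of how that ranked listing might be scored, with every interaction weighted by the acting user’s reputation. The scoring function and the example numbers are invented for illustration, not drawn from any real system.

```python
# Sketch of the ranked abuse queue: reports push a tweet up, likes push it down,
# each weighted by the acting user's current reputation.
def abuse_priority(interactions, user_rep):
    """interactions: list of (user_id, kind) where kind is 'report' or 'like'."""
    score = 0
    for user_id, kind in interactions:
        weight = user_rep.get(user_id, 1000)
        score += weight if kind == "report" else -weight
    return score

user_rep = {"alice": 1500, "bob": 200, "carol": 1000}
tweets = {
    "t1": [("alice", "report"), ("bob", "report"), ("carol", "like")],
    "t2": [("bob", "report")],
}
queue = sorted(tweets, key=lambda t: abuse_priority(tweets[t], user_rep), reverse=True)
print(queue)   # ['t1', 't2'] -- t1 drew the larger weighted negative response
```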

On the flip side: let’s say the admins evaluate the content and decide it is NOT abusive, so they take no action. The result is that all users who reported the tweet as a problem get their reputation reduced (because they were wrong), and all of the users who liked it get their reputation increased (because they were right).

The numbers I’m using here are all arbitrary and intended to show the scale of the changes: an individual report doesn’t mean a lot, so the change is minor. A decision reviewed by the abuse team causes a much larger change, because the team is (at least in theory) trained and able to make judgements against company policy consistently.
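As a rough illustration of that difference in scale, a reviewed verdict might move every participant’s reputation by an order of magnitude more than a single raw report would. The deltas below are as arbitrary as the ones in the text.

```python
ADMIN_DELTA = 10   # an abuse-team verdict; a raw, unreviewed report would move
                   # things by roughly 1, an order of magnitude less

def apply_verdict(user_rep, reporters, likers, tweet_is_abusive):
    """Adjust the reputations of everyone who interacted with a reviewed tweet."""
    for user in reporters:
        # Reporters were right if the tweet really was abusive, wrong otherwise.
        user_rep[user] += ADMIN_DELTA if tweet_is_abusive else -ADMIN_DELTA
    for user in likers:
        # Likers were right if the tweet was NOT abusive.
        user_rep[user] += -ADMIN_DELTA if tweet_is_abusive else ADMIN_DELTA

user_rep = {"carol": 1000, "dave": 1000}
apply_verdict(user_rep, reporters=["carol"], likers=["dave"], tweet_is_abusive=False)
print(user_rep)   # {'carol': 990, 'dave': 1010}
```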

There’s a third component we should add that can modify reputations: with the emergence of machine learning systems, we can teach the system to watch the firehose of content and learn how to make administrative judgements. This can have great value in helping these systems scale, but such models have to be trained carefully.

One place where machine learning could be very useful is in bot identification and disarming. Bots show specific patterns, from using the same set of profile images and descriptions, to sharing the same URLs in parallel across many accounts, to using the same tweets and phrases. Tracking these kinds of repeated patterns across a large number of users is one way to better identify bots, and when a bot is found doing something abusive, it can be disabled along with the other bots the system has identified as related to it in activity. By rolling those relationships across the entire botnet, machine learning can help defang and disable the large botnets being controlled from a single command source.
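Here’s a deliberately simplified sketch of the clustering idea: group accounts that push the same content fingerprint (a shared URL, phrase, or profile signature) and act on the whole group when one member is judged abusive. Real botnet detection needs far richer signals and models; this only shows the shape of the approach, and every name in it is invented.

```python
from collections import defaultdict

def cluster_by_shared_content(posts, min_shared=3):
    """posts: list of (account, content_fingerprint).
    Returns fingerprint -> accounts for fingerprints pushed by at least
    `min_shared` accounts in parallel."""
    groups = defaultdict(set)
    for account, fingerprint in posts:
        groups[fingerprint].add(account)
    return {fp: accts for fp, accts in groups.items() if len(accts) >= min_shared}

def disable_related(abusive_account, clusters, disabled):
    """When one account is disabled for abuse, disable every account that
    shares a suspicious fingerprint with it."""
    disabled.add(abusive_account)
    for accounts in clusters.values():
        if abusive_account in accounts:
            disabled.update(accounts)

posts = [("bot1", "http://scam.example"), ("bot2", "http://scam.example"),
         ("bot3", "http://scam.example"), ("human1", "http://news.example")]
clusters = cluster_by_shared_content(posts)
disabled = set()
disable_related("bot2", clusters, disabled)
print(disabled)   # {'bot1', 'bot2', 'bot3'} (set order may vary)
```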

Throughout this discussion I’m putting a focus on abusive material, but these systems don’t need to be limited to that. They can be set up to handle all types of inappropriate material, including copyright violations and the challenges we’re seeing with the distribution of fake news.

A final note to close this part of the series: reputations are internal numbers only; users should have no way to know what their reputation is. This will be controversial, I think, but it’s necessary, because otherwise you risk turning “winning the reputation game” into its own problem. The more visible this number is, the more users will change their behavior and look for ways to avoid judgement, and the more data you give organized groups looking for ways to hack the system. To be most effective, reputation has to be something the system uses to make judgements, not something users treat as a scoreboard.

We’ve talked a bit about some of the ways we can use this data to solve problems on the system. Now let’s dive into applications of reputation data more deeply.

Let’s think about how we can turn Twitter or some other social network into a safe space for its users. There’s some basic functionality that ought to be available for users to protect themselves:

  • Users should be able to report a tweet or a user. Reporting a tweet or a user also automatically adds a block against that user or content.
  • Users should be able to extend that block to the social network of the reported user, both the accounts they follow and their followers.
  • Twitter needs to flag bots the way verified users are flagged, so it’s clear content is coming from a bot. And users need ways to mute or block bots, either temporarily or permanently.

But with these reputation systems, users have new options as well. For instance, users can be given the ability to mute any posting by someone below a certain reputation level. If a new user’s reputation starts at 1,000, that threshold could default to, say, 500. We can do the same with data from the URL reputation system, letting users choose to mute any tweet pointing to content with a reputation below some given number.

That allows users to clear low-value tweets out of their stream. This filtering is invisible to the senders, so they will for the most part assume what they’re doing is working, limiting how often they try to escalate the abuse. If users want to see their unfiltered stream, they can look at it on the Twitter web page while logged out, since the filters apply only to the logged-in stream.
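A minimal sketch of that filter, using the example thresholds above; the function name, data shapes, and defaults are invented for illustration.

```python
DEFAULT_USER_THRESHOLD = 500   # against a 1,000-point starting reputation
DEFAULT_URL_THRESHOLD = 0

def visible_stream(tweets, author_rep, url_rep, logged_in,
                   user_threshold=DEFAULT_USER_THRESHOLD,
                   url_threshold=DEFAULT_URL_THRESHOLD):
    """Return the tweets a viewer actually sees; authors are never told they were filtered."""
    if not logged_in:                        # the logged-out view stays unfiltered
        return list(tweets)
    visible = []
    for tweet in tweets:
        if author_rep.get(tweet["author"], 1000) < user_threshold:
            continue                         # silently mute low-reputation authors
        if any(url_rep.get(u, 0) < url_threshold for u in tweet["urls"]):
            continue                         # silently mute low-reputation content
        visible.append(tweet)
    return visible

stream = [
    {"author": "alice", "urls": set()},
    {"author": "spammer", "urls": {"http://spam.example"}},
]
print(len(visible_stream(stream, {"alice": 1000, "spammer": 200}, {}, logged_in=True)))   # 1
print(len(visible_stream(stream, {"alice": 1000, "spammer": 200}, {}, logged_in=False)))  # 2
```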

To me, one of the strongest aspects of this system is how the reporting system interacts with the URL reputation database, and how that data ripples out across the system to defang problematic URLs wherever they are used, not just in the single tweet the abuse team reviewed.

Using reputations to enable user capabilities

If you think about it, the use of weighted averages to evaluate user reports is a way of affecting a user’s capability to impact the system. The lower the reputation for a user, the less chance they have to impact other users with their actions.

You can take that a step further: at some point, a user’s reputation is so low you can choose to throw out their reports completely, and set their ability to impact other users to zero.

Then consider the flip side: the higher a user’s reputation, the more impact they have on whether a tweet ends up high in the abuse evaluation queue. At some point, a few key users will build high enough reputations that when they report a problem, you can skip the queue and simply act on the report. Think about it: if enough of their previous reports were evaluated by your abuse team and found to be correct, you can at some point conclude that their future reports will be correct as well. You need to set this bar high, but it gives you a way to use the reputation systems to identify key users who can supplement your abuse team by having their actions trusted, which amplifies the abuse team’s ability to scale and cover more issues.
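One way those capability tiers might be encoded is sketched below. The threshold values and function names are placeholders; real values would come out of policy decisions and testing.

```python
import heapq

IGNORE_BELOW = 100       # reports from users below this are thrown out entirely
AUTO_TRUST_ABOVE = 5000  # reports from users above this are acted on immediately

review_queue = []        # heapq is a min-heap, so push negated reputations

def handle_report(reporter_rep, tweet_id, act_on):
    if reporter_rep < IGNORE_BELOW:
        return "ignored"                                      # zero impact on others
    if reporter_rep > AUTO_TRUST_ABOVE:
        act_on(tweet_id)                                      # trusted reporter: skip the queue
        return "auto-acted"
    heapq.heappush(review_queue, (-reporter_rep, tweet_id))   # weighted into the review queue
    return "queued"

print(handle_report(50, "tweet-1", act_on=print))      # ignored
print(handle_report(800, "tweet-2", act_on=print))     # queued
print(handle_report(9000, "tweet-3", act_on=print))    # prints tweet-3, then auto-acted
```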

You can extend this out one more level. Even if a specific tweet never gets reviewed by the abuse team, the weighted average can be used to let the system make an automated decision on the content. Perhaps it’s the ratio of negative to positive reports on the content, and/or the speed those reports come in, combined with the weighted reputations of the reporting users, that allows the system to bypass human review and come to a decision on a tweet. I’d probably NOT ripple an automated decision across the system the way a human-reviewed one is, but that’s another thing that ought to be investigated and evaluated.
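Here’s a hedged sketch of what such an automated call might look like, combining the weighted report ratio with report velocity. The thresholds and names are placeholders meant to illustrate the idea, not recommended values.

```python
def auto_decision(reports, window_minutes, ratio_threshold=5.0, velocity_threshold=20):
    """reports: list of (reporter_reputation, is_negative) for one tweet."""
    negative = sum(rep for rep, is_neg in reports if is_neg)
    positive = sum(rep for rep, is_neg in reports if not is_neg)
    ratio = negative / max(positive, 1)                # weighted negative vs positive
    velocity = len(reports) / max(window_minutes, 1)   # reports per minute
    if ratio >= ratio_threshold and velocity >= velocity_threshold:
        return "remove"            # confident enough to act without a human
    return "send-to-queue"         # otherwise fall back to the human-reviewed queue

reports = [(1200, True)] * 150 + [(300, False)] * 5
print(auto_decision(reports, window_minutes=3))   # remove
```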

With these kinds of tools, you can build community management and abuse tools that work with a reasonably sized team overseeing their actions and tuning the results, and do it at web scale. Even better, all of these techniques are well understood and fairly straightforward to implement, even into the tens of millions of users and beyond. These sites just need to decide to do so. Given how poorly the existing abuse systems work at Twitter and other social sites like Facebook, I’m hoping this will at least generate some discussion inside the social companies about ways they can enhance what they’re doing by adopting some or all of these ideas.

Challenges and problems with these reputation systems

These systems aren’t perfect, and they are going to cause some disruptions and problems. Here are some of the issues I’ve identified that should be investigated by teams looking to implement these ideas.

First, as I said above, the numbers I’ve used are examples of the kind of numbers to start with; what those numbers should actually be will involve policy decisions and experimentation. Perhaps the reputation modification might be 1 for copyright violations, 3 for fake news reports, and 5 for abuse reports. What I feel is important is that reports validated by the abuse team, where a person does the evaluation against policy and standards, have a much greater effect than non-evaluated decisions. An order of magnitude may seem huge, but I think the difference between an automated decision and a human-curated one deserves that significant a difference. Having said that, anyone implementing systems like this needs to make decisions based on their own policies and do the testing and evaluation to understand how those policies affect their site and user behavior.
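For instance, a policy table along these lines (with values that are, again, purely illustrative) would capture both the per-category deltas and the order-of-magnitude gap for human-reviewed decisions:

```python
BASE_DELTA = {"copyright": 1, "fake_news": 3, "abuse": 5}
HUMAN_REVIEW_MULTIPLIER = 10   # human-curated decisions count an order of magnitude more

def reputation_delta(category, human_reviewed):
    return BASE_DELTA[category] * (HUMAN_REVIEW_MULTIPLIER if human_reviewed else 1)

print(reputation_delta("abuse", human_reviewed=False))  # 5
print(reputation_delta("abuse", human_reviewed=True))   # 50
```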

Second, these systems will bias a community toward a reduced diversity of opinion, because outcomes will be shaped by the reporting tendencies of the larger sets of users. That’s inevitable, and it’s one reason to weight the abuse team’s decisions heavily: to give them the influence to counteract it. Communities tend toward this reduced diversity over time with or without systems like this, and I haven’t yet seen a reporting system that doesn’t introduce some bias against diversity, but it’s something to be aware of so that your management policies can try to minimize it. Echo chambers seem to be inevitable; they appear to be human nature. I used to feel they needed to be actively discouraged, but these days I’m not so sure the fight is worth the effort and stress to the community. It’s a big, subjective grey area.

Finally, all systems are vulnerable to gaming, especially from skilled and coordinated groups looking to gain advantage by hacking the system. You can’t design a system that is fully safe from this kind of activity, but you can minimize it and stay aware of the possibility, so that it can be recognized and neutralized before it gets widely established. It’s one of the realities of social systems that you simply can’t algorithm away, although companies continue to try (and fail).

A final thought

Of the failures in management I’ve seen in the social networks, a few common elements appear. One is an over-reliance on technology to solve problems: an attempt to replace the human judgement of community management instead of leveraging and amplifying it. Algorithms, no matter how well you build them, are stupid, and people will find ways to exploit and circumvent them. When you’re dealing with human beings on your system, you need a human element managing those interactions.

The other common problem: these networks try too hard not to have an opinion. Allowing a diversity of views is good, but some views don’t deserve a platform, and pretending you can stay out of that discussion is a mistake sites invariably regret. The people behind those opinions will take advantage of your unwillingness to take a stand and leverage your site toward their goals, and those sites end up being seen as supporters of those views, whether they want to be or not. Decide what you’re for and against, then set and enforce limits based on that; if you don’t, others will set those limits for you and you won’t like the results. Putting this genie back in the bottle once you’ve let it loose is difficult, and rehabilitating your reputation afterward is slow and painful.

Just ask Twitter about that.