How to stop spam contributions?

auryn · March 15, 2021, 5:21pm

In clr.fund round 4, we hit the message cap for the first time, which meant that no new messages could be broadcast and a handful of contributors were unable to vote. Nothing suggested this was malicious; it was just a sign of early success.

However, in Gitcoin Grants round 9, there has been some contribution behaviour that is clearly spam.

Bot accounts making hundreds of tiny contributions to every project in the round, presumably to farm future airdrops that use contributing to Gitcoin as recipient validation.

Along with potential airdrop farming, in clr.fund’s case, there is additional incentive for spam transactions. As an attacker, you can gain disproportionate influence over a round if you can be certain that no new messages can be sent.

How can we solve for this issue?

I haven’t put any significant amount of thought into it yet, but wanted to kick off a discussion nonetheless.

A naive solution might be a flat message fee that is either burned or contributed to the next matching round. This would hurt the UX, make it more complex to do things like using a relayer for messages, and essentially create a fixed cost for denying service to the round. So long as owning the round was more valuable than the cost of all of the messages, it would still be a viable attack.
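To make the fixed-cost point concrete, here is a rough sketch. The function names and all numbers are hypothetical, not clr.fund parameters. With a flat fee, filling the remaining message slots costs roughly slots × (fee + gas), so the attack stays viable whenever owning the round is worth more than that.

```python
def flat_fee_attack_cost(remaining_slots: int, fee: float, gas_per_message: float) -> float:
    """Total cost for one attacker to fill every remaining message slot."""
    return remaining_slots * (fee + gas_per_message)


def attack_is_profitable(round_value: float, remaining_slots: int,
                         fee: float, gas_per_message: float) -> bool:
    """The attack is viable whenever owning the round is worth more than filling it."""
    return round_value > flat_fee_attack_cost(remaining_slots, fee, gas_per_message)


# Hypothetical numbers: 10,000 free slots, $0.10 fee, $0.05 gas per message.
print(flat_fee_attack_cost(10_000, 0.10, 0.05))
print(attack_is_profitable(50_000, 10_000, 0.10, 0.05))
```

Raising the fee moves the break-even point, but it never removes it, and every legitimate contributor pays the same fee.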

Another might be a dynamic message fee that is a function of the rate/demand for publishing messages (sounds kind of like Ethereum’s gas market). Assuming the round was not being attacked, then this fee would probably be insignificant, but the cost would grow significantly for any user trying to spam messages. In this case, owning the round is probably cost prohibitive, but an attacker could make it cost prohibitive for other users to contribute (at least for short periods of time).
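One possible shape for such a fee, as a sketch only: the rule below is an assumption for illustration (it is loosely inspired by gas markets and is not Ethereum's actual fee mechanism), and `target_per_window`, `base_fee` and `growth` are made-up parameters.

```python
def dynamic_fee(recent_messages: int, target_per_window: int,
                base_fee: float = 0.01, growth: float = 1.1) -> float:
    """Fee for the next message, given how many were published in the current window.

    At or below the target rate the fee stays at base_fee; each message above
    the target multiplies it by `growth`.
    """
    excess = max(0, recent_messages - target_per_window)
    return base_fee * growth ** excess


def cost_of_burst(n: int, already_published: int, target_per_window: int) -> float:
    """Total fee paid for n consecutive messages sent within one window."""
    return sum(dynamic_fee(already_published + i, target_per_window) for i in range(n))


# A quiet round costs almost nothing per message; a burst gets expensive quickly.
print(dynamic_fee(50, 100))
print(cost_of_burst(200, 0, 100))
```

Note that after a burst the next honest contributor faces the same elevated fee, which is the short-term pricing-out problem mentioned above.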

What other mechanisms could we use to combat this?

proofoftom · March 15, 2021, 5:48pm

What if the fee were specific to each user rather than shared across all users, so that the cost to submit a new batch of messages is exponentially more expensive than the last?

A third, less flexible solution may be to put a hard cap on the amount of times a given user can submit a new batch of messages.

Both of these could be applied to publishing single messages as well, like you can only change keys so many times.

spengrah · March 15, 2021, 5:52pm

One advantage clrfund has over Gitcoin is that there is an established set of registered users. We could take advantage of that in a couple of different ways:

Hard cap on the number of messages a single user can send in a round. The cap could be set as a percentage of the total message cap for the round, for example. The main downsides here are that this threshold is arbitrary and that it could create some issues if hitting that cap was a useful signal to bribers. (is the number of messages sent per user public?)
Per-user message price schedules. For example, past a certain number of messages, we could require that a user start paying per message (potentially increasing).
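A minimal sketch of the hard-cap variant, assuming the contract can attribute each message to a registered user (which may not hold in practice). The class name and the basis-point parameter are hypothetical.

```python
from collections import Counter


class PerUserMessageCap:
    """Rejects messages from a user once they reach their share of the round's cap."""

    def __init__(self, total_message_cap: int, max_share_bps: int):
        # max_share_bps is in basis points: 100 means one user may use at most 1%.
        self.per_user_cap = max(1, total_message_cap * max_share_bps // 10_000)
        self.sent = Counter()

    def try_send(self, user_id: str) -> bool:
        """Record the message and return True, or return False if the user is capped."""
        if self.sent[user_id] >= self.per_user_cap:
            return False
        self.sent[user_id] += 1
        return True


cap = PerUserMessageCap(total_message_cap=10_000, max_share_bps=100)
print(cap.per_user_cap)
```

The per-user counter in `sent` is exactly the kind of state that could leak signal to a briber if it were publicly readable.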

A general issue these approaches might create is if these user-centric costs introduce new signal for attackers to detect defection from attempted collusion or a bribe.

auryn · March 15, 2021, 7:00pm

I don’t think that charging a per-address / per-user fee is a viable option, as a user’s messages must be able to be published by any Ethereum address in order to be resistant against bribery.

spengrah · March 15, 2021, 7:15pm

My rough intuition is that making contributions by other users cost-prohibitive is just as effective an attack as owning the entire round (i.e. sending the vast majority of the message cap).

More generally, I’m worried that any costs we impose on messages in order to deter attackers impose the same costs on legitimate users. And since an attacker has more to gain from sending a marginal message compared to a legitimate user, and since message space competition is a zero sum game, those costs would actually benefit attackers.

I suppose the previous paragraph assumes a single attacker or little/no competition between attackers. Competition between multiple attackers complicates the situation a bit, but I’m not sure whether that helps us much.

auryn · March 15, 2021, 7:24pm

That’s why I suggested that these fees should either be burned or contributed to the next matching round.

For a message fee to work, it essentially needs to make the cost of owning the round, or pricing legitimate contributors out of the round, high enough that it’s not worth it for an attacker.

weijiekoh · March 15, 2021, 7:25pm

At the risk of adding more work to my plate, this is my suggestion:

Upgrade MACI to support this flow:

Make there no limit to the number of messages. We achieve this by creating a new message tree when the current one is full. To create a new tree, either the clrfund team calls an onlyCoordinator function, or the contract does this autonomously (if numMessages + 1 > limit: deployNewTree()). The former is probably better due to the high gas cost of deploying a new tree.
Set an exponentially increasing voting fee for each address but start only after some cap. e.g. the first 20 messages are free (sans gas). The 21st message costs $1. The 22nd message costs $1.50. The 23rd message costs $3, etc. The contract burns half the fee and pays the coordinator the other half (this disincentivises the coordinator from spamming MACI).
If MACI is actually spammed, then the spammer has either paid a lot of fees or they have created a bunch of separate EOAs at their own cost. The downside for the coordinator is that they have to spend much more time with proof generation.
To avoid having to pay a lot of gas to submit all proofs to the chain, I’d recommend a multisig + social agreement approach. The multisig signers and the public can verify the proofs; no need to pay the gas to verify on-chain. Otherwise the round’s funds will be locked and if the system is spammed, it might not be worth it to submit the proof txes.
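The first two steps above could be sketched roughly as follows. This is an illustration, not MACI code: `MessageTrees`, `message_fee` and `split_fee` are hypothetical names, and the fee uses a constant 1.5x multiplier as an assumption, so it gives $1, $1.50, $2.25 rather than the exact example figures.

```python
class MessageTrees:
    """Toy model: start a new message tree whenever the current one is full."""

    def __init__(self, capacity_per_tree: int):
        self.capacity = capacity_per_tree
        self.trees = [[]]

    def publish(self, message) -> None:
        if len(self.trees[-1]) >= self.capacity:
            self.trees.append([])  # stands in for deployNewTree()
        self.trees[-1].append(message)


def message_fee(message_index: int, free_messages: int = 20,
                base_fee: float = 1.0, multiplier: float = 1.5) -> float:
    """Fee for an address's Nth message (1-indexed); the first ones are free."""
    if message_index <= free_messages:
        return 0.0
    return base_fee * multiplier ** (message_index - free_messages - 1)


def split_fee(fee: float) -> tuple:
    """Return (burned, paid_to_coordinator), half of the fee each."""
    half = fee / 2
    return half, fee - half


print([message_fee(i) for i in (20, 21, 22, 23)])
print(split_fee(message_fee(22)))
```

Because the schedule is per address, a spammer can reset it by using fresh EOAs, which is the trade-off noted in the third step.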

auryn · March 15, 2021, 7:34pm

I see no reason not to make this a public function, but maybe the UI doesn’t prompt people to do it unexpectedly. That said, the gas should be relatively insignificant on L2, no?

In L2, this cost of creating addresses and funding them with {native asset} would be pretty insignificant, no?

I really like this idea, as I’m generally a fan of lazy or optimistic solutions. It obviously implies some trust and/or governance though.

xuhcc · March 15, 2021, 10:24pm

It is much harder to farm airdrops using clrfund because contributors are verified and contributions are private. However, in some cases it could be profitable for a funding recipient to do an airdrop to all contributors on the condition that their project receives a certain amount of funding (this can be codified in a smart contract). I think the only way to deal with this is to remove the recipient from the recipient registry as soon as an attack is detected.

This is a good idea. If the coordinator sees that creating a new message tree does not make sense anymore (when proving becomes too expensive), they can cancel the funding round.
Though simply increasing the limit should be sufficient in the short term. The clrfund UI can be changed to prevent people from publishing too many messages. The difficulty of sending messages without UI could serve as a deterrent.

@weijiekoh As the long term solution, what do you think about rate-limiting using zero-knowledge proofs (as described in this article https://vac.dev/rln-relay)? Could a similar technique be used in MACI?

weijiekoh · March 15, 2021, 10:38pm

True! Ok, I agree with this being a public function. No reason to limit it to onlyCoordinator.

In L2, this cost of creating addresses and funding them with {native asset} would be pretty insignificant, no?

I agree. It seems that sybil resistance for messages (not signups) is unavoidable as anti-bribery requires us to allow any EOA to vote.

I really like this idea, as I’m generally a fan of lazy or optimistic solutions. It obviously implies some trust and/or governance though.

How about keeping both approaches?

If no spam -> use the existing round finalisation contract logic to release funds.
If there is spam (e.g. the number of messages is larger than a predefined threshold) -> coordinator and multisig have the option to release funds, maybe after some time delay.
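As a sketch of how the two paths could be selected (names, the threshold and the delay are all hypothetical, and a real version would live in the round finalisation contract):

```python
from enum import Enum


class ReleasePath(Enum):
    ON_CHAIN_PROOFS = "on_chain_proofs"    # existing finalisation logic
    MULTISIG_PENDING = "multisig_pending"  # spam detected, delay still running
    MULTISIG_ALLOWED = "multisig_allowed"  # coordinator + multisig may release


def release_path(num_messages: int, spam_threshold: int,
                 seconds_since_round_end: int, delay_seconds: int) -> ReleasePath:
    """Pick how funds may be released, based on message count and elapsed time."""
    if num_messages <= spam_threshold:
        return ReleasePath.ON_CHAIN_PROOFS
    if seconds_since_round_end < delay_seconds:
        return ReleasePath.MULTISIG_PENDING
    return ReleasePath.MULTISIG_ALLOWED


print(release_path(5_000, 10_000, 0, 86_400))
print(release_path(50_000, 10_000, 90_000, 86_400))
```

The delay gives the public time to verify the proofs off-chain before the multisig acts.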

weijiekoh · March 15, 2021, 11:05pm

@weijiekoh As the long term solution, what do you think about rate-limiting using zero-knowledge proofs (as described in this article https://vac.dev/rln-relay)? Could a similar technique be used in MACI?

This is really interesting. On the first read it looks pretty complex for this on-chain use case. I’ll have to think a bit more. We can ping Barry on Telegram too.

samajammin · June 7, 2021, 6:16pm

Hey @weijiekoh @auryn @xuhcc - have we reached a decision on how we plan to address this? Do we think this is something important to solve for in our next Eth2 CLR round?

@weijiekoh are any of your suggestions in the plan for MACI v1.0? Either “Make there no limit to the number of messages” or “Set an exponentially increasing voting fee for each address” seems like it would solve the issue.

If these are MACI upgrades that will not be prioritized soon, I think @auryn’s original suggestion to add a flat message fee (to be contributed to the matching pool?) could be a nice way to go. I’m just not sure what an appropriate fee would be or what sort of modeling we could do to arrive at a number here. In theory, this is exactly what gas fees help prevent but given our Eth2 round will be on an L2, perhaps we want to add this matching pool fee for additional security.

Thanks all.

ryan · June 8, 2021, 8:53am

To add some color, here’s an exploration of how we might portray this in the UI – it’s not ideal (showing a target total, some funds to pay, and instructions if the user under/over allocates) so if we can remove the limit on the number of messages before eth2 clr that would be hella sweeeet!

auryn · June 8, 2021, 9:06am

The best option is definitely to remove the hard caps to the number of messages (and ideally to the number of recipients and contributors as well). Since a fee would force people to send messages from an address with funds, it would likely weaken the anti-collusion properties (because people are bad at separating their addresses).

@ryan, this relates to the relayer stuff we’ve been talking about. If we don’t implement the relayer, then most users will just send messages from the same address they contributed with (as they do currently). The proposed fee would be easy to levy, but an attacker could really see when someone votes again.

If we do use the relayer, then it’s much more difficult to take a fee. One option might be to rate limit messages: each unique user gets x free messages per day via the relayer. Any other address must pay the flat fee.
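Roughly what I have in mind for the relayer quota, as a sketch. `RelayerQuota` is a hypothetical name, and charging the flat fee to users who exceed their daily quota is my assumption; it could equally mean rejecting those messages.

```python
from collections import defaultdict
from typing import Optional

SECONDS_PER_DAY = 86_400


class RelayerQuota:
    """x free relayed messages per user per day; everything else pays a flat fee."""

    def __init__(self, free_per_day: int, flat_fee: float):
        self.free_per_day = free_per_day
        self.flat_fee = flat_fee
        self.used = defaultdict(int)  # (user_id, day number) -> free messages used

    def fee_for(self, user_id: Optional[str], timestamp: int) -> float:
        """Fee owed for this message. user_id=None means an unrecognised address."""
        if user_id is None:
            return self.flat_fee
        key = (user_id, timestamp // SECONDS_PER_DAY)
        if self.used[key] < self.free_per_day:
            self.used[key] += 1  # consume one free slot for today
            return 0.0
        return self.flat_fee


quota = RelayerQuota(free_per_day=2, flat_fee=0.5)
print([quota.fee_for("alice", 0) for _ in range(3)])
```

The relayer would need some way to recognise a unique user without linking their messages, which this sketch does not address.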

weijiekoh · June 12, 2021, 9:09am

Without a fee, an attacker with enough gas can spam the system whether or not MACI has a message cap. At this stage I’m reluctant to modify MACI to have unlimited messages because it doesn’t seem worth the added technical complexity.

I have an alternative suggestion.

When users vote, their votes are recorded in a contract H.

def vote(v):
    H.votes.push(v)

After the voting period, the coordinator spins up a MACI instance which supports the number of votes received, and inserts all votes from H into the MACI instance.

for v in H.votes:
    MACI.publishMessage(v)

Finally, generate and submit proofs as per usual operation.

This way, there is no need to predict how many votes the clrfund instance will receive beforehand. Moreover, raising the message limit does not prevent anyone from spamming the system in the first place.

samajammin · June 12, 2021, 5:34pm

But isn’t this the whole purpose of gas?

weijiekoh · June 12, 2021, 5:50pm

By “without a fee”, I’m referring to an additional anti-spam fee mechanism, on top of gas.

auryn · June 12, 2021, 6:46pm

Also, this would be more efficient to compute, no?

ezra · June 13, 2021, 12:59am

This definitely seems like the best option suggested so far to me, but does this make deploying the MACI instance too trustful?

I guess we can guarantee the coordinator can’t deploy a MACI instance for a given vote round with the wrong number of votes, right? Something about the vote contract H only releasing funds to a MACI instance with the correct vote number?

auryn · June 13, 2021, 1:23am

H could have a public function that deploys the MACI instance with its parameters based on the number of votes recorded. Or an onlyOwner function where the correct parameters are passed by the owner. In either case, there is no more trust placed in the coordinator.
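A toy sketch of the public-function variant, to show why the caller has nothing to choose. Everything here is a stand-in: `MaciInstance` is not the real MACI contract, and the arity of 5 and the depth formula are assumptions for illustration.

```python
def min_tree_depth(num_messages: int, arity: int = 5) -> int:
    """Smallest depth d (at least 1) such that arity**d >= num_messages."""
    depth, capacity = 1, arity
    while capacity < num_messages:
        depth += 1
        capacity *= arity
    return depth


class MaciInstance:
    """Toy stand-in for a MACI deployment with a fixed message capacity."""

    def __init__(self, message_tree_depth: int, arity: int = 5):
        self.capacity = arity ** message_tree_depth
        self.messages = []

    def publish_message(self, message) -> None:
        if len(self.messages) >= self.capacity:
            raise OverflowError("message tree is full")
        self.messages.append(message)


class H:
    """Toy stand-in for the vote-recording contract."""

    def __init__(self):
        self.votes = []

    def vote(self, v) -> None:
        self.votes.append(v)

    def deploy_maci(self) -> MaciInstance:
        # Parameters come from recorded state, not from whoever calls this.
        maci = MaciInstance(min_tree_depth(len(self.votes)))
        for v in self.votes:
            maci.publish_message(v)
        return maci


h = H()
for i in range(26):
    h.vote(f"vote-{i}")
maci = h.deploy_maci()
print(maci.capacity, len(maci.messages))
```

Since `deploy_maci` takes no arguments, anyone can call it and get the same instance size for the same set of recorded votes.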