It turns out people have researched how to do algorithmic recommendations without users having to reveal their personal preferences, and I am intrigued. Apparently, in principle we could have the good parts of, say, Netflix suggesting more things you might want to watch, without exposing ourselves to entities like Facebook selling all our data.
See "Distributed Differential Privacy and Applications" by Narayan, for example. (Also that's the first CC-BY licensed PhD thesis I've seen!)
@b_cavello Okay, I've now skimmed the Leaking in Data Mining paper and watched Octavio Good's talk. They were both interesting and I learned things, but I'm not yet seeing how either one is related to either deidentification or differential privacy. Could you explain more?
At this point I'm nervous about any deidentification technique that doesn't have a differential privacy proof. There have been too many successful reidentification attacks; this feels like "don't roll your own crypto" again.
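For anyone wondering what "has a differential privacy proof" even looks like in practice: the textbook mechanism is just noise calibrated to a query's sensitivity. A minimal sketch (mine, not from any particular paper) for a counting query:

```python
import random

def laplace_noise(scale: float) -> float:
    """Laplace(0, scale), built as the difference of two exponentials."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(records, predicate, epsilon: float = 0.1) -> float:
    """A counting query has sensitivity 1: adding or removing one person
    changes the count by at most 1. Laplace(1/epsilon) noise then gives
    an epsilon-differential-privacy guarantee for the released answer."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 35, 41, 29, 62, 38]
print(private_count(ages, lambda a: a > 30))  # true answer is 4, plus noise
```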
@b_cavello I still don't see the relation, but I agree that the use of adversarial networks to limit over-training was a really interesting part of that talk. I've seen work before on trying to remove bias from word2vec embeddings so that, for example, "doctor" doesn't get associated with "man" and "nurse" doesn't get associated with "woman", and I could imagine using the GAN approach to try to tackle that kind of problem too.
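I can't sketch the GAN version off the top of my head, but the linear trick from the word2vec-debiasing work gives the flavour (toy vectors of my own invention, not real embeddings): estimate a gender direction from a word pair and project it out of everything.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def debias(vectors, pair):
    """Linear 'hard debiasing': remove each embedding's component along
    the direction defined by a word pair such as ("man", "woman").
    (This is the projection approach, not the adversarial/GAN one.)"""
    direction = vectors[pair[0]] - vectors[pair[1]]
    direction = direction / np.linalg.norm(direction)
    return {w: v - np.dot(v, direction) * direction for w, v in vectors.items()}

# Toy 3-d "embeddings" where the first axis carries a spurious gender signal.
vecs = {
    "man":    np.array([ 1.0, 0.2, 0.1]),
    "woman":  np.array([-1.0, 0.2, 0.1]),
    "doctor": np.array([ 0.6, 0.9, 0.3]),
    "nurse":  np.array([-0.6, 0.9, 0.3]),
}
fixed = debias(vecs, ("man", "woman"))
print(cosine(vecs["doctor"], vecs["man"]), cosine(vecs["doctor"], vecs["woman"]))
print(cosine(fixed["doctor"], fixed["man"]), cosine(fixed["doctor"], fixed["woman"]))
# Before: "doctor" leans toward "man"; after: equidistant from both.
```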
@alcinnz Ah, yes, I keep looking at the IPFS PubSub stuff because it sure seems like that ought to be useful for something. 😅
The design goals and implementation details aren't well documented, as far as I can find, and superficially it doesn't look like it provides any anonymity at present. At the very least, it looks like you can tell who the peers are in any group you've joined?
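If I'm reading the API right, any subscriber can enumerate a topic's peers via the local daemon's HTTP API. Something like this — the topic name is a placeholder, and the details may vary by go-ipfs version:

```python
import requests

# Ask a local IPFS daemon (run with --enable-pubsub-experiment) which
# peers it sees on a pubsub topic. Port 5001 is the default API port.
resp = requests.post(
    "http://127.0.0.1:5001/api/v0/pubsub/peers",
    params={"arg": "my-test-topic"},  # placeholder topic name
)
print(resp.json()["Strings"])  # a plain list of peer IDs -- no anonymity
```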