Monday, November 06, 2006

open communities: you can't win even when you fight

i just chuckled when i read this. essentially a guy with a bone to pick with wikipedia claims there's probably lots of plagiarism. to back it up he does the necessary research and then shares it with the world. this is supposed to show that the collective editing process doesn't work.

but wait! with the evidence in hand, wikipedia editors started combing through the flagged articles and cleaning up the actual offenders. which actually shows the collective editing process does work. in this case someone was trying to show how messed up it was, but the only way to do so was to (inadvertently) get involved and in the end help things improve.

and how long before someone comes up with a way to (largely?) automate the process he went through so that plagiarism is watched for and caught in a systemic manner?

i suppose he could have made the claim without proof. but then would the claim have been credible? no. or he could have said nothing at all, which would have left his point true but with nobody knowing about it.

self-balancing systems that have systemic defense and reaction mechanisms are a bitch, aren't they? you can join us or you can join us, them's your options. ;)

5 comments:

Anonymous said...

How's the work with the NSA?

MK said...

> and how long before someone comes up with a way to (largely?) automate the process he went through so that plagiarism is watched for and caught in a systemic manner?

I wrote a tool that does something similar for new articles in Wikipedia (at the moment German Wikipedia only). The script extracts 5-6 words from each of several sentences and looks them up in Google. If several text fragments stem from the same source, this is in general a strong indication of plagiarism. Human checking is nevertheless needed, e.g., the source might be in the public domain.
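The approach described in this comment can be sketched roughly as follows. This is not MK's actual script, only a minimal illustration of the idea. The function names, the fragment length, the `min_hits` threshold, and the `search` callable (a stand-in for whatever search engine lookup is used) are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import Callable, Iterable, List


def extract_fragments(text: str, words_per_fragment: int = 6,
                      max_fragments: int = 5) -> List[str]:
    """Take the first few words of several sentences as search phrases."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    fragments = []
    for sentence in sentences:
        words = sentence.split()
        # Skip sentences too short to give a distinctive phrase.
        if len(words) >= words_per_fragment:
            fragments.append(" ".join(words[:words_per_fragment]))
        if len(fragments) == max_fragments:
            break
    return fragments


def likely_sources(text: str,
                   search: Callable[[str], Iterable[str]],
                   min_hits: int = 2) -> List[str]:
    """Return URLs that match at least `min_hits` different fragments.

    `search` is a hypothetical callable: it takes an exact phrase and
    returns the URLs of pages containing it. Plug in a real search
    backend here; none is assumed by this sketch.
    """
    counts: Counter = Counter()
    for fragment in extract_fragments(text):
        # Count each URL at most once per fragment.
        for url in set(search(fragment)):
            counts[url] += 1
    return [url for url, hits in counts.most_common() if hits >= min_hits]
```

Anything this returns is only a candidate for human review: as the comment notes, the matching source may be in the public domain, or may itself have copied from Wikipedia.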

Alan Horkan said...

> a way to automate the process

unpossible!

automation in wikipedia? why bother when problems like spellchecking and abysmally poor writing can so much more easily be solved by brute force and throwing more monkeys at the problem

/endsarcasm

Wikipedia needs vast amounts of automation and it needed it yesterday. Spellchecking for starters, though more browsers with built-in spellchecking might help.
Despite having so many writers contributing, I expect it is very difficult to actually get developers to work on Wikimedia. I have seen similar problems in other projects that need to attract separate groups of contributors: one group tends to get more praise, and it is hard to get volunteers to work on areas with less perceived respect.

Anonymous said...

So let me see now. He found 142 cases out of 12,000 checked, a 'handful' are false positives, so say 120 out of 12,000 are real cases of plagiarism, so 1% of Wikipedia is plagiarised.
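For reference, the arithmetic behind that estimate can be checked in a few lines. The figure of 120 is the commenter's own guess at the number of real cases after discounting false positives, not a measured value, and the helper name `rate` is made up for this example.

```python
checked = 12_000      # articles checked, per the comment
flagged = 142         # cases found, per the comment
assumed_real = 120    # commenter's guess after removing false positives


def rate(hits: int, total: int) -> float:
    """Return hits as a percentage of total."""
    return 100 * hits / total


print(f"flagged:      {rate(flagged, checked):.2f}%")       # 1.18%
print(f"assumed real: {rate(assumed_real, checked):.2f}%")  # 1.00%
```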

Wow. Massive. Someone call the President.

Give me a break, that's statistically insignificant.

free one way links said...

hi,

Even after finding the plagiarism we can't stop them, so what is the use of doing all this? It does not help, because according to copyright, if you change one word in 1000 then it's out of copyright.

thank you
keyur parmar