The Secrets Of Digitalkoot: Lessons Learned Crowdsourcing Data Entry To 50,000 People (For Free)

Many of you will know that Microtask and the National Library of Finland launched a project called Digitalkoot several months ago. As the first Microtask-powered public service, Digitalkoot put a lot to the test: the use of volunteers to complete microtasks, our ideas about gamification and rewards and, in general, us as good, upstanding citizens.

With over 50,000 volunteers to date, the project has been a success (which is nice for us, because it means we can sleep again). We’ve learned a lot from this experience, and we’d like to share it with you, the crowd that made it happen.

Making work play: Move over Super Mario, hello Super a-Mole-d

For those of you who have not understood a word I’ve said so far, the goal of Digitalkoot is to convert the National Library’s huge archives into searchable digital form. It enlists online volunteers to enter text that OCR software has difficulty with (for example, handwriting or text printed in obsolete typefaces, as in old issues of the newspaper Aamulehti).

Digitalkoot combines computers, people, and a gaming twist. This is how it works in practice: OCR software breaks old newspaper text down into individual words. We made two online games (Mole Hunt and Mole Bridge) that send these words to volunteers, who must transcribe them correctly in order to complete game objectives, such as helping moles across a bridge or keeping them out of a garden.

For volunteers, playing the games is a fun pastime; for the National Library, it is an affordable way to digitize the complete Aamulehti collection.

Engineering success: Golden tasks and diligent pests

Microtask’s engineers had their work cut out for them, with some difficult obstacles to overcome (we will present a paper on Digitalkoot at HCOMP 2011). One issue, for example, was how to handle malicious players who enter the wrong words on purpose. One incredible volunteer spent over an hour and a half doing so; we like to believe he or she is either a terrible typist or has an intense loathing for moles… or maybe someone let their dog volunteer.

To identify such volunteers, the system begins the game by feeding the player only “golden tasks,” verification tasks to which we already know the answer. The number of verification tasks then steadily rises whenever a player appears to be trying to game the system rather than play properly. All of this is completely hidden from the player, so even spammers who understand how it works will not be able to cheat it.
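To make the golden-task idea concrete, here is a minimal sketch in Python. It is not Microtask’s actual implementation: the class name `PlayerTrust`, the 10-point step, the 10 percent floor and the rule that correct answers lower the share of golden tasks are all assumptions made for illustration. Only the starting point (golden tasks only) and the rise for misbehaving players come from the description above.

```python
import random
from dataclasses import dataclass


@dataclass
class PlayerTrust:
    """Hypothetical per-player state: how often to show golden tasks."""
    golden_percent: int = 100  # start with golden tasks only
    min_percent: int = 10      # assumed floor: never stop verifying entirely
    step: int = 10             # assumed size of each adjustment

    def record_golden_answer(self, typed: str, known: str) -> bool:
        """Adjust the golden-task share after the player answers a golden task."""
        correct = typed.strip() == known.strip()
        if correct:
            # Assumption: a correct answer earns a little more trust.
            self.golden_percent = max(self.min_percent,
                                      self.golden_percent - self.step)
        else:
            # A wrong answer raises the share of verification tasks again.
            self.golden_percent = min(100, self.golden_percent + 2 * self.step)
        return correct

    def next_task_is_golden(self, rng: random.Random) -> bool:
        """Decide at random whether the next word is a golden task."""
        return rng.randrange(100) < self.golden_percent
```

In this sketch a brand-new player sees nothing but golden tasks, an honest player gradually sees more real words, and a player who keeps typing nonsense is pushed back towards verification only, so very little of that input would ever reach the archive.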

Other challenges included thorny game-mechanics problems such as deciding what sort of gameplay to use (the first prototypes used input methods other than typing, which proved to be very inefficient), questions about how many players need to work in parallel to cross-check answers, and system scalability. There were also more mundane difficulties, such as some people being unwilling to log in to the games with their Facebook account (a rather vocal minority requested login by email, so we added it after the launch).
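The cross-checking question can be illustrated with a short Python sketch. This is only one plausible scheme, not a description of what Digitalkoot does: the function name `cross_check` and the default threshold of two matching answers are assumptions chosen for the example.

```python
from collections import Counter
from typing import Optional


def cross_check(answers: list[str], min_agreement: int = 2) -> Optional[str]:
    """Return the agreed transcription, or None if more players are needed.

    `min_agreement` is a hypothetical threshold: the number of identical
    answers required before a word is accepted.
    """
    if not answers:
        return None
    ranked = Counter(a.strip() for a in answers).most_common(2)
    top_word, top_votes = ranked[0]
    # A tie between the two most common answers means no consensus yet.
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if top_votes >= min_agreement and not tied:
        return top_word
    return None
```

With these assumptions, `cross_check(["Helsinki", "Helsinki"])` accepts the word, while `cross_check(["Helsinki", "Helsingfors"])` returns `None` and the word would go to another player. Raising `min_agreement` trades more redundant work for higher confidence.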

Holy moley: some staggering statistics

The figures for Digitalkoot are as intriguing as they are remarkable. The amount of time each player spent playing varied considerably, from a few seconds to more than 100 hours. The average time was 9 minutes and 18 seconds, which is not as modest as it sounds. Although women have a slight edge in number of participants, they lag far behind when it comes to topping the charts (the top four players are men).

At the time of writing, 55,000 individuals (and perhaps one dog) have taken part. Bear in mind that this is a nation of only 5 million people. Together they have volunteered over 3,400 hours and achieved an amazing 99 percent accuracy in transcribing the Aamulehti archive.

Tasks for the future

Although the project has gone very well, there is still a lot we can improve. Next time around, for example, we would like to: minimize redundancy when checking the accuracy of tasks; distinguish between words that belong to a title and words in the body text; add soft keys for keyboards without the letters “å, ö, ä”; add new types of microtasks (typing is not the only thing we can do, of course); and improve gameplay mechanics and reward mechanisms by working side by side with professionals (or figure out in which contexts the game interface undermines efficiency instead of increasing it).

It’s clear from the lengthy list above that Microtask will not be running out of things to do anytime soon.
