VABALOG

Prizes: setbacks and new horizons

I recently discovered, much to my surprise, that my master's thesis (PDF) is cited in “And the winner is …”: Capturing the promise of philanthropic prizes, a McKinsey & Company study on the use of prizes. Anyone not interested in the full study can settle for the New York Times article Philanthropy in the Form of a Fat Cash Prize, which offers advice to philanthropists who would rather offer prizes than simply hand out money.

Inevitably, a master's thesis completed almost four years ago has already quietly begun to go out of date. Quite a lot has happened in the meantime that has significantly raised the profile of prizes, yet I have had no particular reason to write about it. That is a mistake that can be corrected, at least to some extent.

When I last wrote about the Netflix Prize, there were two potential winners. Netflix eventually made its well-reasoned choice, which was covered by BusinessWeek, among others:

More than 50,000 people registered for the prize and downloaded the Netflix data. And on Sept. 21, Netflix (NFLX) CEO Reed Hastings handed a $1 million check to the winners, BellKor’s Pragmatic Chaos, a seven-member international coalition.

The winning team, which includes scientists from AT&T (T) Research, Yahoo’s (YHOO) Israel lab, and computer scientists from Austria and Canada, blended more than 700 different statistical models into their formula. They studied every conceivable angle. They looked at the correlations of one movie to the next (Which Godfather Part II lovers are most likely to rent Scarface?). They studied users’ moods. (Once people start panning movies, it turns out, they give lower-than-normal ratings even to movies they like.) And they found that when people rank lots of movies in a single day, from dozens to 5,000, they think differently than when rating a movie they just saw.

At the award ceremony, a Netflix representative promised new prizes and a continuation of the successful program:

While announcing the prize winners, Netflix officials announced plans for a second Netflix Prize. Details, they say, will be revealed in coming days. But it’s clear that data miners will be combing through much richer sets of user info. While the first set of data included only the renting and recommending behavior of anonymous customers, the second contest, says Hunt, will include demographic information, such as geography, age, and gender, along with details of movies the users have previously rented. The goal, he says, is to be able to size up customers when they first arrive on the site—without waiting for them to establish a data footprint. “We want to predict people earlier in the cycle,” says Hunt.

The announcement of the new prize and its goals, however, dragged on and on and on. In the end it turned out that the dataset Netflix had released was not sufficiently anonymous and, in the view of some, infringed people's right to privacy, which for video rentals rests on an older law:

The lawsuit claims that Netflix’s information disclosure was illegal under the Video Privacy Protection Act (VPPA), which was itself passed in response to a bizarre video rental data leak. In the late 1980s, during the infamous Robert Bork Supreme Court confirmation process, a reporter from Washington’s City Paper simply went down to a local video store, asked the manager for Bork’s rental history, and was given a photocopied set of records… which he then wrote about.

There was no “Brokeback Factor” here, though, just a set of films unremarkable for anything salacious. The privacy break was so egregious, however, that Congress soon passed the VPPA, which requires those renting video cassettes and similar items to abide by a host of consumer privacy protections. The issue in the current lawsuit is whether Netflix was negligent in releasing the “anonymized” and “perturbed” data set.

Had the case gone to court, the worst-case scenario for Netflix could have meant payouts running into millions of dollars, something every company tries to avoid where possible. Netflix is no exception:

Netflix has canceled its $1 million contest aimed at finding a better recommendation engine in the wake of a privacy lawsuit settlement. The company informed its users today via the company blog, noting that it had “reached an understanding” with the Federal Trade Commission, leading it to ditch the Netflix Prize contest.

So much for that.

It seems the lawsuit rested mainly on a single study (pdf), whose authors have also put together a list of frequently asked questions about the anonymity of the Netflix datasets. The authors' open letter to Netflix is interesting as well, especially its comments section, which discusses anonymity and privacy in databases and what happens to scientific progress once privacy demands begin to exceed the bounds of reason. In one of his comments, Dan Kaminsky arrives at an important point:

The reality is that the first open competition with real, solid data, has ended with a coda that says “Don’t do this, the privacy people might get the lawyers to fire you.” And you (should) know, every large company is really run by the lawyers. So the game is probably over.

It is sad, though. Social science was about to have more, and better data, than the hard sciences.

A regrettable development, but fortunately one that is fatal not so much to the use of prizes as a whole as to a single prize, one that dealt with analyzing large datasets to find specific relationships.

On the positive side, a little over a month ago US federal agencies were given the right to start using prizes officially, on terms similar to those for grants:

An Office of Management and Budget (OMB) memo last week gave federal agencies the green light to use more grand challenges and prizes to spur innovation. The memo, signed by Jeffrey D. Zients, OMB’s Deputy Director for Management, points out that agencies with funds for grants can use that funding and authority to sponsor grand challenges and prizes.

Zients encouraged agencies to use such competitions, a form of “crowdsourcing” that gathers broad public input in the search for innovative solutions to problems. Agencies were urged to collaborate with outside organizations for the design and management of these prize competitions. OMB promised that within four months the Administration would have a Web-based platform for agencies to post their prize and challenge competitions and invite communities of problem solvers to take part.

Hopefully this will give a push to broader experimentation with prizes, which may make it possible to determine somewhat more precisely under which conditions and in which fields prizes have the greatest potential, as noted in the New York Times article mentioned at the beginning of this post:

Five years into their prize, the Kravises are happy with how their prize is evolving — having learned from past mistakes. “It’s trial and error,” Mrs. Kravis said. “You’re not going to hit a home run each time.”

