Publications

[1] Vincent Simonet. Classifying YouTube channels: a practical system. In Proceedings of the 2nd International Workshop on Web of Linked Entities (WOLE 2013), in Proceedings of the 22nd International conference on World Wide Web companion, pages 1295-1304, Rio de Janeiro (Brasil), May 2013. ©ACM. [ bib | PDF | At Google's | At ACM's ]
This paper presents a framework for categorizing channels of videos in a thematic taxonomy with high precision and coverage. The proposed approach consists of three main steps. First, videos are annotated by semantic entities describing their central topics. Second, semantic entities are mapped to categories using a combination of classifiers. Last, the categorization of channels is obtained by combining the results of both previous steps. This framework has been deployed on the whole corpus of YouTube, in 8 languages, and used to build several user-facing products. Beyond the description of the framework, this paper gives insight into practical aspects and experience: rationale from product requirements to the choice of the solution, spam filtering, human-based evaluations of the quality of the results, and measured metrics on the live site.

[2] […] [ bib | PDF | At ACM's ]
[…], polymorphic recursion, and guarded algebraic data types. Guarded algebraic data types subsume the concepts known in the literature as […]. Constraint solving is decidable, at least for some instances of X, but prohibitively expensive. Effective type inference for guarded algebraic data types is left as an issue for future research.

[3] […] Research Report 5462, INRIA, […] 2005. [ bib | PDF | At INRIA's ]
[…] feature that, when typechecking a function defined by cases, every branch may be checked under […], guarded algebraic data types, and polymorphic recursion. We prove that the type system is sound and that, provided recursive function definitions carry a type annotation, type inference may be reduced to constraint solving. Then, because solving arbitrary constraints is expensive, we further restrict the form of type annotations and prove that this allows producing so-called tractable constraints. Last, in the specific setting of equality, we explain how to solve tractable constraints. To the best of our knowledge, this is the first generic and comprehensive account of type inference in the presence of guarded algebraic data types.

[4] […] 2004. [ bib | […] ]
[…] polymorphism. An instance of […] algorithm […] to synthesize types in Flow Caml […]

[5] […]: A faithful formalization […]. In Asian Symposium on Programming Languages and Systems (APLAS'03), volume 2895 of Lecture Notes in Computer Science, pages 283-302, Beijing, China, November 2003. ©Springer. [ bib | PDF | At Springer's ]
[…] in type inference in the presence of structural subtyping from a pragmatic perspective. This work combines theoretical and practical contributions: first, it provides a faithful description of an efficient algorithm for solving and simplifying constraints, whose correctness is formally proved. Besides, the framework has been implemented in Objective Caml, yielding a generic type inference engine. Its efficiency is assessed by a complexity result and […].

[6] […] [ bib | PDF | At ACM's ]
[…] judgments. In the second part, we propose a […] algorithm for the case of structural subtyping, which handles the non-standard construct of the constraint language generated by type inference: a form of bounded universal quantification.

[7] […] [ bib | HTML | PDF | At INRIA's ]
[…] information flow. Its purpose is basically to make it possible to write “real” programs and to automatically check that they obey some confidentiality or integrity policy. In Flow Caml, standard ML types are annotated with security levels chosen in a user-definable lattice. Each annotation gives an approximation of the information that the described expression may convey. Because it has full type inference, the system verifies, without requiring source code annotations, that every information flow caused by the analyzed program is legal with regard to the security policy specified by the programmer.

[8] […] In […] APPSEM-II workshop, pages 152-165, Nottingham, United Kingdom, […] 2003. [ bib | […] ]
[…] information flow. It automatically checks information flows within Flow Caml programs, then translates them to regular Objective Caml code that can be compiled by the ordinary compiler to produce secure programs. In […], we give […] of this system, from […].

[9] […] [ bib | PDF | At ACM's ]
[…] reasonably light-weight, thanks to the use of a number of orthogonal techniques. First, a syntactic segregation between values and expressions allows a lighter formulation of the type system. Noninterference is reduced to subject reduction for a nonstandard language extension. Lastly, a semi-syntactic approach to type soundness allows dealing with constraint-based polymorphism separately.

[10] […] In 15th IEEE Computer Security Foundations Workshop (CSFW 15), pages 223-237, Cape Breton, Nova Scotia (Canada), June 2002. ©IEEE. [ bib | PDF ]
[…] information flow for a λ-calculus equipped with polymorphic “let” and with sums (a.k.a. union types). […] (weak) non-interference properties. Thanks to original […] constructs, it is more accurate than […] analyses. Through a straightforward encoding of exceptions into sums, this paper also provides a type-based information flow analysis for programming languages featuring exceptions. From these systems, one may derive constraint-based formulations in the style of HM(X), which have decidable type inference.

[11] […] [ bib | PDF | […] ]

[12] […] Inférence de flots d'information pour ML. Master's thesis, DEA « Programmation : Sémantique, Preuves et Langages », […] 2001. In French. [ bib | […] ]

Software