Gary Robinson: Difference between revisions

rm links in quote
 
(28 intermediate revisions by 19 users not shown)
Line 1: Line 1:
{{Short description|American software engineer and mathematician}}
{{For|the Canadian football player|Gary Robinson (Canadian football)}}
{{Infobox person
{{Infobox person
|name = Gary Robinson
|name = Gary Robinson
|image = Gary Robinson.jpg
|image = Gary Robinson.jpg
|caption =
|caption =
|birth name =
|birth_name =
|birth date = {{birth date and age|1956|2|6}}
|birth_date = {{Birth date and age|1956|2|6|mf=yes}}
|birth place = [[Bronxville, New York]], US
|birth_place = [[Bronxville, New York|Bronxville]], [[New York State|NY]], [[United States|US]]
|known for = [[SpamBayes]], [[SpamAssassin]], [[Recommender system|Recommendation engine]], [[Collaborative filtering]]
|known for = [[SpamBayes]], [[SpamAssassin]], [[Recommender system|Recommendation engine]], [[Collaborative filtering]]
|education = [[Bard College]];[[Courant Institute of Mathematical Sciences|Courant Institute]]<ref name=twsSepaxxf/>
|education = [[Bard College]];[[Courant Institute of Mathematical Sciences|Courant Institute]]<ref name=twsSepaxxf/>
|employer = Emergent Music LLC<ref name=twsSepaxxf>{{cite web
|employer = Emergent Music LLC<ref name=twsSepaxxf>{{cite web
|title= Gary Robinson
|title= Gary Robinson
|quote= I make the music recommendation technology at FlyFi — Where I grew up Bronxville, NY — Companies I've worked for Athenium, OLI Systems, Lambda Technology — Schools I've attended Bard College; Courant Institute of Mathematical Sciences
|publisher= ''Google''
|quote= I make the music recommendation technology at [http://www.flyfi.com/ FlyFi] — Where I grew up [[Bronxville, NY]] — Companies I've worked for [[Athenium]], OLI Systems, Lambda Technology — Schools I've attended [[Bard College]]; [[Courant Institute of Mathematical Sciences]]
|date= 2010-09-18
|date= 2010-09-18
|url= http://spambayes.sourceforge.net/
|url= http://spambayes.sourceforge.net/
Line 22: Line 23:
|website = [http://www.garyrobinson.net/ GaryRobinson.net]
|website = [http://www.garyrobinson.net/ GaryRobinson.net]
|footnotes =
|footnotes =
|box width =
}}
}}
'''Gary Robinson''' is an American [[software engineer]] and inventor notable for his mathematical algorithms to fight [[spam (electronic)|spam]].<ref name=twsSepa>{{cite news
'''Gary Robinson''' is an American [[software engineer]] and [[mathematician]]<ref name=twsBBJ/> and inventor notable for his mathematical algorithms to fight [[spam (electronic)|spam]].<ref name=twsSepa>{{cite news
|title= SpamBayes Project Page
|title= SpamBayes Project Page
|publisher= ''SpamBayes''
|publisher= SpamBayes
|quote= Gary Robinson provided a lot of the serious maths and theory, as well as his essay on "how to do it better" (see the background page for a link).
|quote= Gary Robinson provided a lot of the serious maths and theory, as well as his essay on "how to do it better" (see the background page for a link).
|date= 2010-09-18
|date= 2010-09-18
|url= http://spambayes.sourceforge.net/
|url= http://spambayes.sourceforge.net/
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}</ref> In addition, he patented a method to use web browser cookies to track consumers across different web sites, allowing marketers to better match advertisements with consumers.<ref name=USPatent> US 5918014 A, Application number US 08/774,180, Publication date Jun 29, 1999, Filing date Dec 26, 1996, [http://www.google.com/patents/US5918014 Automated collaborative filtering in world wide web advertising], "... This invention combines techniques for: determining the subject's community, and determining which ads to show ... to determine whether a given individual should be in the subject's community is gleaned from the individual's activities ... Means are provided to track a consumer's activities ... e.g. by means of "cookies"...."</ref><ref> Patent Buddy, [http://www.patentbuddy.com/Inventor/Robinson-Gary-B/2873083 Gary B Robinson Inventor], Patent years: 1999, 2001, "...Automated collaborative filtering in world wide web advertising..."</ref> The patent was bought by [[DoubleClick]], and then DoubleClick was bought by [[Google]].<ref> TechCrunch, Apr 13, 2007 by Michael Arrington, [http://techcrunch.com/2007/04/13/google-spends-31-billion-for-doubleclick/ Breaking: Google Spends $3.1 Billion To Acquire DoubleClick], Accessed March 12, 2014, "...About 20 minutes ago Google announced that they have agreed to acquired DoubleClick for $3.1 billion in cash..."</ref><ref> Bill Slawski, Apr 14, 2007, SEO by the Sea, [http://www.seobythesea.com/2007/04/doubleclick-google-looking-at-some-of-the-doubleclick-patent-filings/ Doubleclick + Google: Looking at Some of the Doubleclick Patent Filings], Accessed March 12, 2014, "...smart ad box showing on a page that displays different advertisements to users over time, based upon a recommendations system. ..."</ref>
}}</ref> In addition, he patented a method to use web browser cookies to track consumers across different web sites, allowing marketers to better match advertisements with consumers.<ref name=USPatent>US 5918014 A, Application number US 08/774,180, Publication date Jun 29, 1999, Filing date Dec 26, 1996, [http://www.google.com/patents/US5918014 Automated collaborative filtering in world wide web advertising], "... This invention combines techniques for: determining the subject's community, and determining which ads to show ... to determine whether a given individual should be in the subject's community is gleaned from the individual's activities ... Means are provided to track a consumer's activities ... e.g. by means of "cookies"..."</ref><ref>Patent Buddy, [http://www.patentbuddy.com/Inventor/Robinson-Gary-B/2873083 Gary B Robinson Inventor], Patent years: 1999, 2001, "... Automated collaborative filtering in world wide web advertising ..."</ref> The patent was bought by [[DoubleClick]], and then DoubleClick was bought by [[Google]].<ref>TechCrunch, Apr 13, 2007 by Michael Arrington, [https://techcrunch.com/2007/04/13/google-spends-31-billion-for-doubleclick/ Breaking: Google Spends $3.1 Billion To Acquire DoubleClick], Accessed March 12, 2014, "... About 20 minutes ago Google announced that they have agreed to acquired DoubleClick for $3.1 billion in cash ..."</ref><ref>Bill Slawski, Apr 14, 2007, SEO by the Sea, [http://www.seobythesea.com/2007/04/doubleclick-google-looking-at-some-of-the-doubleclick-patent-filings/ Doubleclick + Google: Looking at Some of the Doubleclick Patent Filings], Accessed March 12, 2014, "... smart ad box showing on a page that displays different advertisements to users over time, based upon a recommendations system. 
..."</ref> He is credited as being one of the first to use automated [[collaborative filtering]] technologies to turn word-of-mouth recommendations into useful data.<ref name=twsBBJ>Matthew French, May 20, 2002, Boston Business Journal, [http://www.bizjournals.com/boston/blog/mass-high-tech/2002/05/romantic-beginnings-have-worldwide-effect.html Romantic beginnings have worldwide effect], Retrieved August 6, 2016, "... Gary Robinson ... a mathematician by training ... first automated collaborative filtering applications ..."</ref>


==Algorithms to identify spam==
In 2003, Robinson's article in ''[[Linux Journal]]'' detailed a new approach to [[computer programming]] perhaps best described as a ''general purpose classifier'' which expanded on the usefulness of [[Naive Bayes spam filtering|Bayesian filtering]]. Robinson's method used math-intensive [[algorithm]]s combined with [[Chi-square test|Chi-square statistical testing]] to enable computers to examine an unknown file and make intelligent guesses about what was in it.<ref name=twsSep2/> The technique had wide applicability; for example, Robinson's method enabled computers to examine a file and guess, with much greater accuracy, whether it contained [[pornography]], or whether an incoming email to a corporation was a technical question or a sales-related question.<ref name=twsKamens>Ben Kamens, Fog Creek Publishing, [http://www.fogcreek.com/fogbugz/downloads/kamenspaper.pdf Bayesian Filtering: Beyond Binary Classification ], Retrieved February 7, 2015, "...Of these, Robinson’s technique ... borrowed from R.A. Fischer’s combination of probabilities into a chi-squared distribution, has been extensively tested and is used by the most successful filters, including SpamBayes. Robinson provides ample theoretical justification for this improvement in practical accuracy over the original filters..."</ref> The method became the basis for [[anti-spam techniques]] used by [[Tim Peters (software engineer)|Tim Peters]] and Rob Hooft of the influential [[SpamBayes]] project.<ref name=twsSep4>{{cite web
In 2003, Robinson's article in ''[[Linux Journal]]'' detailed a new approach to [[computer programming]] perhaps best described as a ''general purpose classifier'' which expanded on the usefulness of [[Naive Bayes spam filtering|Bayesian filtering]]. Robinson's method used math-intensive [[algorithm]]s combined with [[Chi-square test|Chi-square statistical testing]] to enable computers to examine an unknown file and make intelligent guesses about what was in it.<ref name=twsSep2/> The technique had wide applicability; for example, Robinson's method enabled computers to examine a file and guess, with much greater accuracy, whether it contained [[pornography]], or whether an incoming email to a corporation was a technical question or a sales-related question.<ref name=twsKamens>Ben Kamens, Fog Creek Publishing, [http://www.fogcreek.com/fogbugz/downloads/kamenspaper.pdf Bayesian Filtering: Beyond Binary Classification ] {{webarchive|url=https://web.archive.org/web/20150924014132/http://www.fogcreek.com/fogbugz/downloads/kamenspaper.pdf |date=2015-09-24 }}, Retrieved February 7, 2015, "... Of these, Robinson's technique ... borrowed from R.A. Fischer's combination of probabilities into a chi-squared distribution, has been extensively tested and is used by the most successful filters, including SpamBayes. Robinson provides ample theoretical justification for this improvement in practical accuracy over the original filters ..."</ref> The method became the basis for [[anti-spam techniques]] used by Tim Peters and Rob Hooft of the influential [[SpamBayes]] project.<ref name=twsSep4>{{cite web
|author= T.A Meyer and B Whateley
|author= T.A. Meyer and B Whateley
|title= SpamBayes: Effective open-source, Bayesian based, email classification system.
|title= SpamBayes: Effective open-source, Bayesian based, email classification system.
|publisher= ''Massey University, Auckland, New Zealand''
|publisher= Massey University, Auckland, New Zealand
|quote= G. Robinson, "Spam Detection", [online] 2002, ... G. Robinson, "Instructions for Training to Exhaustion", (Gary' Longer Rants), [online] 2004, (see page 8)
|quote= G. Robinson, "Spam Detection", [online] 2002, ... G. Robinson, "Instructions for Training to Exhaustion", (Gary' Longer Rants), [online] 2004, (see page 8)
|date= 2010-09-18
|date= 2010-09-18
|url= http://docs.google.com/viewer?a=v&q=cache:5ce4WIDJEVwJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.3.9543%26rep%3Drep1%26type%3Dpdf+%22gary+robinson%22+%28%22flyfi%22+OR+%22emergent+music%22+OR+%22ActiveState%22+OR+%22SpamAssassin%22+OR+%22SpamBayes%22+OR+%22SpamSieve%22+OR+%22MicroVox%22+OR+%22212-Romance%22+OR+%22collaborative+filtering%22+OR+%22recommendation+engine%22%29&hl=en&gl=us&pid=bl&srcid=ADGEESjOwTBZueQJffNf0oQU9G4tuyAYGppW5w-V3zl8bi3ip7whZFuddbjQkynjDxtHmvaN0E-hAshLivz2l4FTLMKIHcY9JEZ6TRqHojqyfDprrwtfDhhkaxGPzmhRfYgjQYb7hPBm&sig=AHIEtbSumwKidrQ4Vgdaih_GEJdqwt21gQ
|url= https://docs.google.com/viewer?a=v&q=cache:5ce4WIDJEVwJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.3.9543%26rep%3Drep1%26type%3Dpdf+%22gary+robinson%22+%28%22flyfi%22+OR+%22emergent+music%22+OR+%22ActiveState%22+OR+%22SpamAssassin%22+OR+%22SpamBayes%22+OR+%22SpamSieve%22+OR+%22MicroVox%22+OR+%22212-Romance%22+OR+%22collaborative+filtering%22+OR+%22recommendation+engine%22%29&hl=en&gl=us&pid=bl&srcid=ADGEESjOwTBZueQJffNf0oQU9G4tuyAYGppW5w-V3zl8bi3ip7whZFuddbjQkynjDxtHmvaN0E-hAshLivz2l4FTLMKIHcY9JEZ6TRqHojqyfDprrwtfDhhkaxGPzmhRfYgjQYb7hPBm&sig=AHIEtbSumwKidrQ4Vgdaih_GEJdqwt21gQ
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}</ref><ref name=twsSepb>{{cite news
}}</ref><ref name=twsSepb>{{cite magazine
|author= Gary Robinson
|author= Gary Robinson
|title= A Statistical Approach to the Spam Problem: Using Bayesian statistics to detect an e-mail's spamminess.
|title= A Statistical Approach to the Spam Problem: Using Bayesian statistics to detect an e-mail's spamminess.
|publisher= ''Linux Journal''
|magazine= Linux Journal
|quote= This article discusses one of many possible mathematical foundations for a key aspect of spam filtering—generating an indicator of “spamminess” from a collection of tokens representing the content of an e-mail.
|quote= This article discusses one of many possible mathematical foundations for a key aspect of spam filtering—generating an indicator of "spamminess" from a collection of tokens representing the content of an e-mail.
|date= Mar 1, 2003
|date= Mar 1, 2003
|url= http://www.linuxjournal.com/article/6467
|url= http://www.linuxjournal.com/article/6467
Line 52: Line 53:
|author= David Anderson
|author= David Anderson
|title= Statistical Spam Filtering — EECS595, Fall 2006
|title= Statistical Spam Filtering — EECS595, Fall 2006
|quote= Gary Robinson proposes an improved method for calculating the word value of a token W. His method modifies Graham's by adding a confidence factor to scale the word value by the amount of historical data that is available for the token. Let N be ...
|publisher= ''Google''
|quote= Gary Robinson proposes an improved method for calculating the word value of a token W. His method modifies Graham’s by adding a confidence factor to scale the word value by the amount of historical data that is available for the token. Let N be the...
|date= September 2006
|date= September 2006
|url= http://docs.google.com/viewer?a=v&q=cache:jp8T80yMPDUJ:www.eecs.umich.edu/~rthomaso/courses/nlp/projects.06/David_Anderson.pdf+%22gary+robinson%22+%28%22flyfi%22+OR+%22emergent+music%22+OR+%22ActiveState%22+OR+%22SpamAssassin%22+OR+%22SpamBayes%22+OR+%22SpamSieve%22+OR+%22MicroVox%22+OR+%22212-Romance%22+OR+%22collaborative+filtering%22+OR+%22recommendation+engine%22%29&hl=en&gl=us&pid=bl&srcid=ADGEESjqc45TcjNyVMuvqO53HSX9ZiXaOjdj4EwB7R6Bh3OabJ8pzAhANPcEymv6Abhnhl8DFBOKeOkBsVAVLCFG6532dr2cmC8ZvWhMPJJsZLEt_O50xcL13nknegrEE5wRwgJxKDJF&sig=AHIEtbQVftspT_ZkwKt4GBmgAUM1vLyjIA
|url= https://docs.google.com/viewer?a=v&q=cache:jp8T80yMPDUJ:www.eecs.umich.edu/~rthomaso/courses/nlp/projects.06/David_Anderson.pdf+%22gary+robinson%22+%28%22flyfi%22+OR+%22emergent+music%22+OR+%22ActiveState%22+OR+%22SpamAssassin%22+OR+%22SpamBayes%22+OR+%22SpamSieve%22+OR+%22MicroVox%22+OR+%22212-Romance%22+OR+%22collaborative+filtering%22+OR+%22recommendation+engine%22%29&hl=en&gl=us&pid=bl&srcid=ADGEESjqc45TcjNyVMuvqO53HSX9ZiXaOjdj4EwB7R6Bh3OabJ8pzAhANPcEymv6Abhnhl8DFBOKeOkBsVAVLCFG6532dr2cmC8ZvWhMPJJsZLEt_O50xcL13nknegrEE5wRwgJxKDJF&sig=AHIEtbQVftspT_ZkwKt4GBmgAUM1vLyjIA
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}</ref> SpamBayes assigned probability scores to both ''spam'' and ''ham'' (useful emails) to guess intelligently whether an incoming email was spam; the scoring system enabled the program to return a value of ''unsure'' if both the ''spam'' and ''ham'' scores were high.<ref name=twsSep2>{{cite news
}}</ref> SpamBayes assigned probability scores to both ''spam'' and ''ham'' (useful emails) to guess intelligently whether an incoming email was spam; the scoring system enabled the program to return a value of ''unsure'' if both the ''spam'' and ''ham'' scores were high.<ref name=twsSep2>{{cite news
|title= Background Reading
|title= Background Reading
|publisher= ''SpamBayes project''
|publisher= SpamBayes project
|quote= Sharpen your pencils, this is the mathematical background (such as it is).* The paper that started the ball rolling: [[Paul Graham (computer programmer)|Paul Graham]]'s [http://www.paulgraham.com/spam.html A Plan for Spam].* Gary Robinson has an interesting essay suggesting some improvements to Graham's original approach.* Gary Robinson's Linux Journal article discussed using the chi squared distribution.
|date= 2010-09-18
|date= 2010-09-18
|url= http://spambayes.sourceforge.net/background.html
|url= http://spambayes.sourceforge.net/background.html
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}
}}</ref> Robinson's method was used in other anti-spam projects such as [[SpamAssassin]].<ref name=twsSep3>{{cite web
"Sharpen your pencils, this is the mathematical background (such as it is).
* The paper that started the ball rolling: [[Paul Graham (computer programmer)|Paul Graham]]'s [http://www.paulgraham.com/spam.html A Plan for Spam].
* Gary Robinson has an interesting essay suggesting some improvements to Graham's original approach.
* Gary Robinson's Linux Journal article discussed using the chi squared distribution."
</ref> Robinson's method was used in other anti-spam projects such as [[SpamAssassin]].<ref name=twsSep3>{{cite web
|author= The SpamAssassin Project
|author= The SpamAssassin Project
|title= train SpamAssassin's Bayesian classifier
|title= train SpamAssassin's Bayesian classifier
|publisher= ''SpamAssassin website''
|publisher= SpamAssassin website
|quote= Gary Robinson's f(x) and combining algorithms, as used in SpamAssassin
|quote= Gary Robinson's f(x) and combining algorithms, as used in SpamAssassin
|url= http://spamassassin.apache.org/full/3.2.x/doc/sa-learn.html
|url= http://spamassassin.apache.org/full/3.2.x/doc/sa-learn.html
Line 73: Line 77:
}}</ref><ref name=twsSep14xx>{{cite news
}}</ref><ref name=twsSep14xx>{{cite news
|title= Credits — the Perl Programming Language — Algorithms
|title= Credits — the Perl Programming Language — Algorithms
|publisher= ''Perl''
|publisher= Perl
|quote= Algorithms: The Bayesian-style text classifier used by SpamAssassin's BAYES rules is based on an approach outlined by Gary Robinson. Thanks, Gary!
|quote= Algorithms: The Bayesian-style text classifier used by SpamAssassin's BAYES rules is based on an approach outlined by Gary Robinson. Thanks, Gary!
|date= 2010-09-18
|date= 2010-09-18
Line 79: Line 83:
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}</ref><ref name=twsSep14yy>{{cite web
}}</ref><ref name=twsSep14yy>{{cite web
|title= Installation
|title = Installation
|publisher= ''Ubuntu manuals''
|publisher = Ubuntu manuals
|quote= Gary Robinson’s f(x) and combining algorithms, as used in SpamAssassin
|quote = Gary Robinson's f(x) and combining algorithms, as used in SpamAssassin
|date= 2010-09-18
|date = 2010-09-18
|url= http://manpages.ubuntu.com/manpages/gutsy/man1/sa-learn.1p.html
|url = http://manpages.ubuntu.com/manpages/gutsy/man1/sa-learn.1p.html
|accessdate= 2010-09-18
|accessdate = 2010-09-18
|url-status = dead
|archiveurl = https://web.archive.org/web/20100929165032/http://manpages.ubuntu.com/manpages/gutsy/man1/sa-learn.1p.html
|archivedate = 2010-09-29
}}</ref> Robinson commented in ''[[Linux Journal]]'' on how fighting spam was a collaborative effort:
}}</ref> Robinson commented in ''[[Linux Journal]]'' on how fighting spam was a collaborative effort:


{{quote|The approach described here truly has been a distributed effort in the best open-source tradition. [[Paul Graham (computer programmer)|Paul Graham]], an author of books on [[Lisp (programming language)|Lisp]], suggested an approach to filtering spam in his on-line article, "A Plan for Spam". I took his approach for generating probabilities associated with words, altered it slightly and proposed a Bayesian calculation for dealing with words that hadn't appeared very often ... an approach based on the chi-square distribution for combining the individual word probabilities into a combined probability (actually a pair of probabilities—see below) representing an e-mail. Finally, Tim Peters of the Spambayes Project proposed a way of generating a particularly useful spamminess indicator based on the combined probabilities. All along the way the work was guided by ongoing testing of embodiments written in Python by Tim Peters for [[Spambayes]] and in C by Greg Louis of the Bogofilter Project. The testing was done by a number of people involved with those projects.|Gary Robinson, 2003.<ref name=twsSepb/>}}
<blockquote>
The approach described here truly has been a distributed effort in the best open-source tradition. [[Paul Graham (computer programmer)|Paul Graham]], an author of books on [[Lisp (programming language)|Lisp]], suggested an approach to filtering spam in his on-line article, “A Plan for Spam”. I took his approach for generating probabilities associated with words, altered it slightly and proposed a Bayesian calculation for dealing with words that hadn't appeared very often ... an approach based on the chi-square distribution for combining the individual word probabilities into a combined probability (actually a pair of probabilities—see below) representing an e-mail. Finally, [[Tim Peters (software engineer)|Tim Peters]] of the Spambayes Project proposed a way of generating a particularly useful spamminess indicator based on the combined probabilities. All along the way the work was guided by ongoing testing of embodiments written in Python by [[Tim Peters (software engineer)|Tim Peters]] for [[Spambayes]] and in C by Greg Louis of the Bogofilter Project. The testing was done by a number of people involved with those projects.Gary Robinson, 2003.<ref name=twsSepb/>
</blockquote>
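
The steps named in the quotation (a Bayesian adjustment for rarely seen words, a chi-square combination of the per-word probabilities, and a single indicator derived from the resulting pair) can be illustrated with a minimal sketch. It is not code from SpamBayes, Bogofilter, or SpamAssassin. The function names, the prior settings `s` and `x`, and the `ham_cutoff` and `spam_cutoff` thresholds are assumptions chosen for illustration, and the inputs are assumed to lie strictly between 0 and 1.

```python
import math


def word_value(p, n, s=1.0, x=0.5):
    """Pull a word's raw spam probability p toward a prior x.

    n is how many messages contained the word; s is the weight given to
    the prior (s and x are illustrative defaults, not project settings).
    """
    return (s * x + n * p) / (s + n)


def chi2_survival_even(chi_sq, dof):
    """P(X >= chi_sq) for a chi-square variable with an even number of
    degrees of freedom, using the closed-form series for that case."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    m = chi_sq / 2.0
    term = math.exp(-m)  # underflows to 0.0 for very large m
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def spamminess(word_values):
    """Combine per-word values into one indicator between 0 and 1.

    Two Fisher-style combined probabilities are computed, one from the
    values and one from their complements, then folded into one number.
    """
    n = len(word_values)
    if n == 0:
        return 0.5
    p_high = chi2_survival_even(
        -2.0 * sum(math.log(f) for f in word_values), 2 * n)
    p_low = chi2_survival_even(
        -2.0 * sum(math.log(1.0 - f) for f in word_values), 2 * n)
    return (1.0 + p_high - p_low) / 2.0


def classify(word_values, ham_cutoff=0.2, spam_cutoff=0.9):
    """Map the indicator to 'ham', 'unsure' or 'spam' (assumed cutoffs)."""
    score = spamminess(word_values)
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"
```

In this sketch a word that has never been seen (`n = 0`) receives the neutral prior value, and a message whose words point strongly in both directions scores near 0.5 and falls into the ''unsure'' band rather than being forced into either class.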


In 1996, Robinson patented a method to help marketers focus their online advertisements to consumers. He explained:
In 1996, Robinson patented a method to help marketers focus their online advertisements to consumers. He explained:


{{Quote|As far as I have been able to tell, it's the very first patent ... to mention using web browser ''cookies'' to track consumers across different web sites and build a profile of their interests in order to determine what ads to show them ... There was an aspect in the way browser cookies were implemented that allowed them to be used ... I hired programmers to do the programming to actually test it ... the hypothesis turned out to be correct.|Gary B. Robinson, 2014}}
{{quote|As far as I have been able to tell, it's the very first patent ... to mention using web browser ''cookies'' to track consumers across different web sites and build a profile of their interests in order to determine what ads to show them ... There was an aspect in the way browser cookies were implemented that allowed them to be used ... I hired programmers to do the programming to actually test it ... the hypothesis turned out to be correct.|Gary B. Robinson, 2014}}


==Entrepreneurial activity==
In 2010, Robinson was the chief technology officer at FlyFi, an online music service owned by [[Maine]]-based<ref name=twsOctExcs>{{cite web
In 2010, Robinson was the chief technology officer at FlyFi, an online music service owned by [[Maine]]-based<ref name=twsOctExcs>{{cite web
|title= Contact "Emergent Discovery"
|title= Contact "Emergent Discovery"
|publisher= ''Emergent Discovery''
|publisher= Emergent Discovery
|quote= Emergent Discovery — 565 Congress Street — Suite 201 —Portland, ME 04101
|quote= Emergent Discovery — 565 Congress Street — Suite 201 —Portland, ME 04101
|date= 2010-10-14
|date= 2010-10-14
Line 104: Line 110:
}}</ref> Emergent Discovery which uses his anti-spam programming techniques along with [[collaborative filtering]] technologies to help make music recommendations to web users.<ref name=twsSep11>{{cite news
}}</ref> Emergent Discovery which uses his anti-spam programming techniques along with [[collaborative filtering]] technologies to help make music recommendations to web users.<ref name=twsSep11>{{cite news
|author= Kevin Dangoor
|author= Kevin Dangoor
|title= Gary Robinson’s Three Steps to Freedom
|title= Gary Robinson's Three Steps to Freedom
|publisher= ''BlueSkyOnMars''
|publisher= BlueSkyOnMars
|quote= Gary Robinson, the head of Emergent Music has an article on his blog about the Three Steps To Freedom. His opinion on this definitely counts, because EM might very well be the future of music. I’m going to chime in with my thoughts here and copy them over to EM’s forum as well.
|quote= Gary Robinson, the head of Emergent Music has an article on his blog about the Three Steps To Freedom. His opinion on this definitely counts, because EM might very well be the future of music. I'm going to chime in with my thoughts here and copy them over to EM's forum as well.
|date= April 30, 2002
|date= April 30, 2002
|url= http://www.blueskyonmars.com/2002/04/30/gary-robinsons-three-steps-to-freedom/
|url= http://www.blueskyonmars.com/2002/04/30/gary-robinsons-three-steps-to-freedom/
Line 112: Line 118:
}}</ref><ref name=twsSep1>{{cite news
}}</ref><ref name=twsSep1>{{cite news
|title= Management Team
|title= Management Team
|publisher= ''FlyFi''
|publisher= FlyFi
|quote= Gary Robinson, CTO, is both a musician and leader in the "recommendation engine" field. Gary’s background reflects his pioneering work in mathematics, technology and collaborative filtering.
|quote= Gary Robinson, CTO, is both a musician and leader in the "recommendation engine" field. Gary's background reflects his pioneering work in mathematics, technology and collaborative filtering.
|date= 2010-09-18
|date= 2010-09-18
|url= http://www.emergentmusic.com/about/team/
|url= http://www.emergentmusic.com/about/team/
Line 120: Line 126:
|author= Gary Robinson
|author= Gary Robinson
|title= Request for Your Input Regarding Three Steps To Freedom: THE 3 STEPS TO FREEDOM
|title= Request for Your Input Regarding Three Steps To Freedom: THE 3 STEPS TO FREEDOM
|publisher= ''Gary Robinson's Rants: Rants on spam, business, digital music, patents, and other assorted random stuff.''
|publisher= Gary Robinson's Rants: Rants on spam, business, digital music, patents, and other assorted random stuff.
|quote= So, as a "thought experiment," I have imagined the following path to creating an alternative music industry.
|quote= So, as a "thought experiment," I have imagined the following path to creating an alternative music industry.
|date= 2006-01-30
|date= 2006-01-30
Line 127: Line 133:
}}</ref><ref name=twsSep14uu>{{cite web
}}</ref><ref name=twsSep14uu>{{cite web
|title= FlyFi iTunes Helper 2.0.0.1 for Mac
|title= FlyFi iTunes Helper 2.0.0.1 for Mac
|publisher= ''CNet''
|publisher= CNet
|quote= The FlyFi iTunes Helper sends the contents of your iTunes data file (a behind the scenes part of your iTunes library) to FlyFi server to be analyzed. By looking at your iTunes music, which is one of the best reflections of your musical tastes, FlyFi can make better new music suggestion. FlyFi can also use this information to better serve other members.
|quote= The FlyFi iTunes Helper sends the contents of your iTunes data file (a behind the scenes part of your iTunes library) to FlyFi server to be analyzed. By looking at your iTunes music, which is one of the best reflections of your musical tastes, FlyFi can make better new music suggestion. FlyFi can also use this information to better serve other members.
|date= 2010-09-18
|date= 2010-09-18
Line 134: Line 140:
}}</ref> Robinson helped develop [[recommendation engine]] technology which applies high-power mathematical techniques using software algorithms to have a computer guess intelligently about what a consumer might like.<ref name=twsSep5>{{cite news
}}</ref> Robinson helped develop [[recommendation engine]] technology which applies high-power mathematical techniques using software algorithms to have a computer guess intelligently about what a consumer might like.<ref name=twsSep5>{{cite news
|title= Management Team
|title= Management Team
|publisher= ''Emergent Discovery''
|publisher= Emergent Discovery
|quote= Gary Robinson, CTO, is a leader in the "recommendation engine" field. Gary’s background reflects his pioneering work in mathematics, technology and collaborative filtering. For instance, as a Research Director at ActiveState, Gary’s work on spam detection is now being widely adopted by the anti-spam industry, including such leading filters as SpamAssassin (PC Magazine's Editor's Choice for spam filtering), SpamSieve (MacWorld's Software of the Year) and SpamBayes (PC World's Editor's Choice for spam filtering).
|quote= Gary Robinson, CTO, is a leader in the "recommendation engine" field. Gary's background reflects his pioneering work in mathematics, technology and collaborative filtering. For instance, as a Research Director at ActiveState, Gary's work on spam detection is now being widely adopted by the anti-spam industry, including such leading filters as SpamAssassin (PC Magazine's Editor's Choice for spam filtering), SpamSieve (MacWorld's Software of the Year) and SpamBayes (PC World's Editor's Choice for spam filtering).
|date= 2010-09-18
|date= 2010-09-18
|url= http://devwww.emergentdiscovery.com/dev/about_us/team/
|url= http://devwww.emergentdiscovery.com/dev/about_us/team/
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}</ref> For example, if a consumer likes music by artists such as the ''Beach Boys'', ''Bob Dylan'' and the ''Talking Heads'', the computer software will match these preferences with a much larger dataset of other consumers who ''also like'' those three artists but which cumulatively has much greater musical knowledge than the single consumer. Accordingly, the computer will find music that the user might like but hasn't been exposed to, and therefore hopefully offer intelligent recommendations. But the mathematics behind such comparisons can become quite complex and involved. Robinson studied [[mathematics]] at [[Bard College]] and graduated in 1979 and studied further at the [[Courant Institute of Mathematical Sciences|Courant Institute]] of [[New York University]].<ref name=twsSepaxxf/> In the 1980s, Robinson worked on an entrepreneurial start-up dating service called ''212-Romance'' which used similar computer algorithms to match singles romantically.<ref name=twsSep14pp>{{cite news
}}</ref> For example, if a consumer likes music by artists such as the ''Beach Boys'', ''Bob Dylan'' and ''Talking Heads'', the computer software will match these preferences with a much larger dataset of other consumers who ''also like'' those three artists but which cumulatively has much greater musical knowledge than the single consumer. Accordingly, the computer will find music that the user might like but hasn't been exposed to, and therefore hopefully offer intelligent recommendations, in a process which has come to be called [[knowledge management]].<ref name=twsBBJ/> But the mathematics behind such comparisons can become quite complex and involved. Robinson studied [[mathematics]] at [[Bard College]] and graduated in 1979 and studied further at the [[Courant Institute of Mathematical Sciences|Courant Institute]] of [[New York University]].<ref name=twsSepaxxf/> In the 1980s, Robinson worked on an entrepreneurial start-up dating service called ''212-Romance'' which used similar computer algorithms to match singles romantically.<ref name=twsBBJ/><ref name=twsSep14pp>{{cite news
|title= New York Magazine
|title= New York Magazine
|publisher= ''Google Books''
|quote= (ad for 212-Romance on left side of page)
|quote= (ad for 212-Romance on left side of page)
|date= Sep 12, 1988
|date= Sep 12, 1988
|url= http://books.google.com/books?id=quUCAAAAMBAJ&pg=PA229&lpg=PA229&dq=%22212-Romance%22&source=bl&ots=CId8ffvaP4&sig=81uOWUTPXBVLtvJaTNYu4YtlQ_E&hl=en&ei=tAyVTMSlNoT7lwf2ib2mCg&sa=X&oi=book_result&ct=result&resnum=3&ved=0CBoQ6AEwAg#v=onepage&q=%22212-Romance%22&f=false
|url= https://books.google.com/books?id=quUCAAAAMBAJ&q=%22212-Romance%22&pg=PA229
|accessdate= 2010-09-18
|accessdate= 2010-09-18
}}</ref> The New York City-based voice mail dating service created community-based automated recommendations and used ''collaborative filtering'' technologies which Robinson developed further in other capacities.
}}</ref> The New York City-based voice mail dating service created community-based automated recommendations and used [[collaborative filtering]] technologies which Robinson developed further in other capacities.


==References==
==References==
{{reflist|2}}
{{Reflist}}


==External links==
==External links==
Line 155: Line 160:
* [http://www.google.com/patents/US5918014 Automated collaborative filtering] patent
* [http://www.google.com/patents/US5918014 Automated collaborative filtering] patent


{{Authority control}}
{{Persondata <!-- Metadata: see [[Wikipedia:Persondata]]. -->

| NAME = Robinson, Gary
| ALTERNATIVE NAMES =
| SHORT DESCRIPTION = American software engineer
| DATE OF BIRTH = 1956-02-06
| PLACE OF BIRTH = [[Bronxville, New York|Bronxville]], [[New York State|N.Y.]] [[United States|U.S.A.]]
| DATE OF DEATH =
| PLACE OF DEATH =
}}
{{DEFAULTSORT:Robinson, Gary}}
{{DEFAULTSORT:Robinson, Gary}}
[[Category:Bard College alumni]]
[[Category:Bard College alumni]]
Line 171: Line 169:
[[Category:Living people]]
[[Category:Living people]]
[[Category:1956 births]]
[[Category:1956 births]]
[[Category:American chief technologists]]
[[Category:American chief technology officers]]
[[Category:Courant Institute of Mathematical Sciences alumni]]
[[Category:Courant Institute of Mathematical Sciences alumni]]
[[Category:Engineers from New York (state)]]

Latest revision as of 22:51, 16 March 2024

Gary Robinson
Born: February 6, 1956 (age 68)
Education: Bard College; Courant Institute[1]
Occupation: Computer programmer
Employer: Emergent Music LLC[1]
Known for: SpamBayes, SpamAssassin, Recommendation engine, Collaborative filtering
Title: Chief Technology Officer[1]
Website: GaryRobinson.net

Gary Robinson is an American software engineer, mathematician,[2] and inventor notable for his mathematical algorithms to fight spam.[3] He also patented a method that uses web browser cookies to track consumers across different web sites, allowing marketers to better match advertisements with consumers.[4][5] The patent was bought by DoubleClick, which was in turn bought by Google.[6][7] He is credited as one of the first to use automated collaborative filtering technologies to turn word-of-mouth recommendations into useful data.[2]

Algorithms to identify spam

In 2003, Robinson's article in Linux Journal detailed a new approach, perhaps best described as a general-purpose classifier, which expanded on the usefulness of Bayesian filtering. Robinson's method combined math-intensive algorithms with chi-square statistical testing to enable computers to examine an unknown file and make intelligent guesses about what was in it.[8] The technique had wide applicability; for example, it enabled computers to examine a file and guess, with greater accuracy than the original Bayesian filters, whether it contained pornography, or whether an incoming email to a corporation was a technical question or a sales-related question.[9] The method became the basis for anti-spam techniques used by Tim Peters and Rob Hooft of the influential SpamBayes project.[10][11] Spamming is the abuse of electronic messaging systems to send unsolicited, undesired bulk messages.[12] SpamBayes assigned probability scores to both spam and ham (useful emails) to guess intelligently whether an incoming email was spam; the scoring system enabled the program to return a value of unsure if both the spam and ham scores were high.[8] Robinson's method was also used in other anti-spam projects such as SpamAssassin.[13][14][15] Robinson commented in Linux Journal on how fighting spam was a collaborative effort:

The approach described here truly has been a distributed effort in the best open-source tradition. Paul Graham, an author of books on Lisp, suggested an approach to filtering spam in his on-line article, "A Plan for Spam". I took his approach for generating probabilities associated with words, altered it slightly and proposed a Bayesian calculation for dealing with words that hadn't appeared very often ... an approach based on the chi-square distribution for combining the individual word probabilities into a combined probability (actually a pair of probabilities—see below) representing an e-mail. Finally, Tim Peters of the Spambayes Project proposed a way of generating a particularly useful spamminess indicator based on the combined probabilities. All along the way the work was guided by ongoing testing of embodiments written in Python by Tim Peters for Spambayes and in C by Greg Louis of the Bogofilter Project. The testing was done by a number of people involved with those projects.

— Gary Robinson, 2003.[11]
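As a rough illustration of the chi-square combining described in the quotation above, the following Python sketch adjusts a word probability for how often the word has been seen, combines the per-word probabilities into a pair of indicators, and merges them into a single spamminess score. It is not the SpamBayes or Bogofilter source code: the function names, the defaults `s=1.0` and `x=0.5`, and the cutoffs `0.2` and `0.9` are assumptions chosen for illustration.

```python
import math


def chi2_survival(x2, dof):
    """P(chi-square with `dof` degrees of freedom >= x2). Valid for even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    # Closed form for even dof: exp(-m) * sum_{i < dof/2} m**i / i!
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def adjusted_word_probability(p_w, n, s=1.0, x=0.5):
    """Pull a raw word probability p_w toward a prior x when the word is rare.

    n is how many times the word has been seen; s is the weight given to the
    prior. The defaults here are illustrative assumptions.
    """
    return (s * x + n * p_w) / (s + n)


def combine(word_probs):
    """Combine per-word spam probabilities (each strictly between 0 and 1)."""
    n = len(word_probs)
    if n == 0:
        return 0.5
    # Sum logs instead of multiplying to avoid floating-point underflow.
    ln_p = sum(math.log(p) for p in word_probs)
    ln_q = sum(math.log(1.0 - p) for p in word_probs)
    spam_evidence = 1.0 - chi2_survival(-2.0 * ln_q, 2 * n)
    ham_evidence = 1.0 - chi2_survival(-2.0 * ln_p, 2 * n)
    # Near 1: spam. Near 0: ham. Near 0.5: both indicators agree (both high or both low).
    return (1.0 + spam_evidence - ham_evidence) / 2.0


def classify(score, ham_cutoff=0.2, spam_cutoff=0.9):
    """Map a combined score to 'spam', 'ham' or 'unsure' (cutoffs are assumed)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"
```

When a message contains strong evidence in both directions, for example `combine([0.99, 0.01] * 3)`, both indicators are high and the score lands near 0.5, which is how a classifier of this kind can answer "unsure" rather than forcing a decision.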

In 1996, Robinson filed a patent application for a method to help marketers target online advertisements to consumers. He explained:

As far as I have been able to tell, it's the very first patent ... to mention using web browser cookies to track consumers across different web sites and build a profile of their interests in order to determine what ads to show them ... There was an aspect in the way browser cookies were implemented that allowed them to be used ... I hired programmers to do the programming to actually test it ... the hypothesis turned out to be correct.

— Gary B. Robinson, 2014

Entrepreneurial activity

In 2010, Robinson was the chief technology officer at FlyFi, an online music service owned by Maine-based[16] Emergent Discovery, which uses his anti-spam programming techniques along with collaborative filtering technologies to help make music recommendations to web users.[17][18] His blog Gary Robinson's Rants has been quoted by others in the computer and online music industries[17] and cited by academic papers.[12][19][20] Robinson helped develop recommendation engine technology, which applies mathematical techniques in software algorithms so that a computer can guess intelligently what a consumer might like.[21] For example, if a consumer likes music by artists such as the Beach Boys, Bob Dylan and Talking Heads, the software will match these preferences against a much larger dataset of other consumers who also like those three artists but who collectively have much greater musical knowledge than the single consumer. The computer can then find music that the user might like but has not been exposed to, and so offer intelligent recommendations, in a process which has come to be called knowledge management.[2] The mathematics behind such comparisons can become quite complex.

Robinson studied mathematics at Bard College, graduating in 1979, and studied further at the Courant Institute of New York University.[1] In the 1980s, Robinson worked on an entrepreneurial start-up dating service called 212-Romance, which used similar computer algorithms to match singles romantically.[2][22] The New York City-based voice mail dating service created community-based automated recommendations and used collaborative filtering technologies which Robinson developed further in other capacities.
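The matching of one listener's preferences against a larger community, as in the Beach Boys, Bob Dylan and Talking Heads example, can be sketched in a few lines of Python. This is a deliberately simple overlap-counting form of user-based collaborative filtering, not the method used by FlyFi or described in Robinson's patent; the function name `recommend` and the scoring rule (weight each neighbour by the number of shared artists) are assumptions made for illustration.

```python
from collections import Counter


def recommend(user_likes, community, top_n=3):
    """Suggest artists the user has not listed, weighted by taste overlap.

    user_likes: iterable of artist names the user already likes.
    community:  iterable of iterables, one collection of liked artists per
                other consumer.
    """
    user_likes = set(user_likes)
    scores = Counter()
    for other in community:
        other = set(other)
        overlap = len(user_likes & other)
        if overlap == 0:
            continue  # no shared taste, so this consumer contributes nothing
        # Each artist the user does not know gains weight equal to the overlap.
        for artist in other - user_likes:
            scores[artist] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [artist for artist, _ in ranked[:top_n]]
```

Consumers who share more artists with the user have more influence on the result, and consumers with no artists in common are ignored. Production systems typically replace the raw overlap count with a statistical similarity measure, which is where the mathematics becomes more involved.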

References

  1. ^ a b c d "Gary Robinson". 2010-09-18. Retrieved 2010-09-18. I make the music recommendation technology at FlyFi — Where I grew up Bronxville, NY — Companies I've worked for Athenium, OLI Systems, Lambda Technology — Schools I've attended Bard College; Courant Institute of Mathematical Sciences
  2. ^ a b c d Matthew French, May 20, 2002, Boston Business Journal, Romantic beginnings have worldwide effect, Retrieved August 6, 2016, "... Gary Robinson ... a mathematician by training ... first automated collaborative filtering applications ..."
  3. ^ "SpamBayes Project Page". SpamBayes. 2010-09-18. Retrieved 2010-09-18. Gary Robinson provided a lot of the serious maths and theory, as well as his essay on "how to do it better" (see the background page for a link).
  4. ^ US 5918014 A, Application number US 08/774,180, Publication date Jun 29, 1999, Filing date Dec 26, 1996, Automated collaborative filtering in world wide web advertising, "... This invention combines techniques for: determining the subject's community, and determining which ads to show ... to determine whether a given individual should be in the subject's community is gleaned from the individual's activities ... Means are provided to track a consumer's activities ... e.g. by means of "cookies"..."
  5. ^ Patent Buddy, Gary B Robinson Inventor, Patent years: 1999, 2001, "... Automated collaborative filtering in world wide web advertising ..."
  6. ^ TechCrunch, Apr 13, 2007 by Michael Arrington, Breaking: Google Spends $3.1 Billion To Acquire DoubleClick, Accessed March 12, 2014, "... About 20 minutes ago Google announced that they have agreed to acquired DoubleClick for $3.1 billion in cash ..."
  7. ^ Bill Slawski, Apr 14, 2007, SEO by the Sea, Doubleclick + Google: Looking at Some of the Doubleclick Patent Filings, Accessed March 12, 2014, "... smart ad box showing on a page that displays different advertisements to users over time, based upon a recommendations system. ..."
  8. ^ a b "Background Reading". SpamBayes project. 2010-09-18. Retrieved 2010-09-18. "Sharpen your pencils, this is the mathematical background (such as it is).
    • The paper that started the ball rolling: Paul Graham's A Plan for Spam.
    • Gary Robinson has an interesting essay suggesting some improvements to Graham's original approach.
    • Gary Robinson's Linux Journal article discussed using the chi squared distribution."
  9. ^ Ben Kamens, Fog Creek Publishing, Bayesian Filtering: Beyond Binary Classification Archived 2015-09-24 at the Wayback Machine, Retrieved February 7, 2015, "... Of these, Robinson's technique ... borrowed from R.A. Fischer's combination of probabilities into a chi-squared distribution, has been extensively tested and is used by the most successful filters, including SpamBayes. Robinson provides ample theoretical justification for this improvement in practical accuracy over the original filters ..."
  10. ^ T.A. Meyer and B Whateley (2010-09-18). "SpamBayes: Effective open-source, Bayesian based, email classification system". Massey University, Auckland, New Zealand. Retrieved 2010-09-18. G. Robinson, "Spam Detection", [online] 2002, ... G. Robinson, "Instructions for Training to Exhaustion", (Gary' Longer Rants), [online] 2004, (see page 8)
  11. ^ a b Gary Robinson (Mar 1, 2003). "A Statistical Approach to the Spam Problem: Using Bayesian statistics to detect an e-mail's spamminess". Linux Journal. Retrieved 2010-09-18. This article discusses one of many possible mathematical foundations for a key aspect of spam filtering—generating an indicator of "spamminess" from a collection of tokens representing the content of an e-mail.
  12. ^ a b David Anderson (September 2006). "Statistical Spam Filtering — EECS595, Fall 2006". Retrieved 2010-09-18. Gary Robinson proposes an improved method for calculating the word value of a token W. His method modifies Graham's by adding a confidence factor to scale the word value by the amount of historical data that is available for the token. Let N be ...
  13. ^ The SpamAssassin Project. "train SpamAssassin's Bayesian classifier". SpamAssassin website. Retrieved 2010-09-18. Gary Robinson's f(x) and combining algorithms, as used in SpamAssassin
  14. ^ "Credits — the Perl Programming Language — Algorithms". Perl. 2010-09-18. Retrieved 2010-09-18. Algorithms: The Bayesian-style text classifier used by SpamAssassin's BAYES rules is based on an approach outlined by Gary Robinson. Thanks, Gary!
  15. ^ "Installation". Ubuntu manuals. 2010-09-18. Archived from the original on 2010-09-29. Retrieved 2010-09-18. Gary Robinson's f(x) and combining algorithms, as used in SpamAssassin
  16. ^ "Contact "Emergent Discovery"". Emergent Discovery. 2010-10-14. Retrieved 2010-10-14. Emergent Discovery — 565 Congress Street — Suite 201 —Portland, ME 04101
  17. ^ a b Kevin Dangoor (April 30, 2002). "Gary Robinson's Three Steps to Freedom". BlueSkyOnMars. Retrieved 2010-09-18. Gary Robinson, the head of Emergent Music has an article on his blog about the Three Steps To Freedom. His opinion on this definitely counts, because EM might very well be the future of music. I'm going to chime in with my thoughts here and copy them over to EM's forum as well.
  18. ^ "Management Team". FlyFi. 2010-09-18. Retrieved 2010-09-18. Gary Robinson, CTO, is both a musician and leader in the "recommendation engine" field. Gary's background reflects his pioneering work in mathematics, technology and collaborative filtering.
  19. ^ Gary Robinson (2006-01-30). "Request for Your Input Regarding Three Steps To Freedom: THE 3 STEPS TO FREEDOM". Gary Robinson's Rants: Rants on spam, business, digital music, patents, and other assorted random stuff. Retrieved 2010-09-18. So, as a "thought experiment," I have imagined the following path to creating an alternative music industry.
  20. ^ "FlyFi iTunes Helper 2.0.0.1 for Mac". CNet. 2010-09-18. Retrieved 2010-09-18. The FlyFi iTunes Helper sends the contents of your iTunes data file (a behind the scenes part of your iTunes library) to FlyFi server to be analyzed. By looking at your iTunes music, which is one of the best reflections of your musical tastes, FlyFi can make better new music suggestion. FlyFi can also use this information to better serve other members.
  21. ^ "Management Team". Emergent Discovery. 2010-09-18. Retrieved 2010-09-18. Gary Robinson, CTO, is a leader in the "recommendation engine" field. Gary's background reflects his pioneering work in mathematics, technology and collaborative filtering. For instance, as a Research Director at ActiveState, Gary's work on spam detection is now being widely adopted by the anti-spam industry, including such leading filters as SpamAssassin (PC Magazine's Editor's Choice for spam filtering), SpamSieve (MacWorld's Software of the Year) and SpamBayes (PC World's Editor's Choice for spam filtering).
  22. ^ "New York Magazine". Sep 12, 1988. Retrieved 2010-09-18. (ad for 212-Romance on left side of page)

External links