Tut031 Zhu
Modeling at Scale
Qiang Zhu¹, Songtao Guo¹, Paul Ogilvie², Yan Liu¹
Business Analytics¹ and Engineering² at LinkedIn Corporation
2029 Stierlin Ct, Mountain View, CA 94043 USA
{qzhu, soguo, pogilvie, yliu}@linkedin.com
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Copyright is held by the owner/author(s).
KDD '16, August 13-17, 2016, San Francisco, CA, USA
ACM 978-1-4503-4232-2/16/08.
http://dx.doi.org/10.1145/2939672.2945388

1 https://aws.amazon.com/machine-learning
2 https://databricks.com
3 https://azure.microsoft.com/en-us/services/machine-learning
4 https://cloud.google.com/ml/
5 http://www.h2o.ai
6 https://dato.com
Open source software
o Vowpal Wabbit⁷
o Spark MLlib⁸
o DMLC⁹
o Scikit-learn¹⁰
o R¹¹
Modoop - example of a scaled framework

3. PRESENTER INFORMATION
Qiang Zhu is a Staff member of the Business Analytics Data Mining team at LinkedIn. He and his team apply advanced Data Mining techniques to drive LinkedIn's monetization efforts, ranging from a machine learning platform that powers member Email Marketing to Sales Intelligence tools that help salespeople sell smarter. Prior to joining LinkedIn, he worked at StumbleUpon as a Data Scientist. Qiang holds a PhD in Computer Science from University of California, Riverside. His work has appeared in many top-tier Data Mining conferences and journals, including a paper that won the Best Paper Award at SIGKDD 2012.

Songtao Guo is a Principal Data Scientist and tech lead of the Data Mining team at LinkedIn, where he leads many data-driven products and analytics systems. His work involves building a large-scale knowledge base as one of the foundations of LinkedIn's Economic Graph, inventing data mining platforms to scale business analytics, and partnering with product, sales, and marketing to deliver impactful solutions. Before joining LinkedIn, Songtao was a senior researcher at AT&T Interactive, focusing on improving data quality and search relevancy for local business search. He holds a PhD in computer science from University of North Carolina at Charlotte, where he studied privacy-preserving data mining.

Paul Ogilvie manages the Machine Learning Algorithms team in the Engineering organization of LinkedIn. The team's mission is to research and develop the learning algorithm libraries and datasets that help research scientists more productively build state-of-the-art relevance models. He earned his PhD in Language and Information Technologies from Carnegie Mellon University in 2010, where he studied semi-structured information retrieval with applications to web search, XML element retrieval, and question answering systems. He has previously worked on news recommendation at a startup (mSpoke) and at LinkedIn.

Yan Liu manages the Data Mining team in the LinkedIn Analytics group. She leads various data mining initiatives and efforts in building advanced intelligence solutions and scalable data mining platforms to create leverage and drive business impact across functions. Before joining LinkedIn, she worked on search relevance and personalization at NexTag. Yan holds a Ph.D. in statistics from University of Virginia and a B.S. in computer science from China.

4. REFERENCES
[1] Agarwal, D., Chen, B.C., Gupta, R., Hartman, J., He, Q., Iyer, A., Kolar, S., Ma, Y., Shivaswamy, P., Singh, A., and Zhang, L. 2014. Activity ranking in LinkedIn feed. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14). ACM, New York, NY, USA, 1603-1612.
[2] Agarwal, D., Chen, B.C., He, Q., Hua, Z., Lebanon, G., Ma, Y., Shivaswamy, P., Tseng, H.P., Yang, J., and Zhang, L. 2015. Personalizing LinkedIn Feed. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15). ACM, New York, NY, USA, 1651-1660.
[3] Lebanon, G. 2015. Making Your Feed More Relevant – Part I. November 17, 2015. Retrieved June 12, 2016 from https://engineering.linkedin.com/blog/2015/11/making-your-feed-more-relevant--part-i
[4] Lebanon, G. 2016. Making Your Feed More Relevant – Part 2: Relevance models and features. March 15, 2016. Retrieved June 12, 2016 from https://engineering.linkedin.com/blog/2016/03/making-your-feed-more-relevant--part-2--relevance-models-and-fea
[5] Rosenberg, C. 2015. B2B Predictive Analytics Technology Report: Best practices, tools, and vendor evaluations to help marketing and sales organizations adopt predictive analytics. July, 2015. Retrieved June 12, 2016 from Infer: https://www.infer.com/wp-content/uploads/2015/08/TOPO-Predictive-Analytics-08-03-15.pdf
[6] Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., and Young, M. 2014. Machine Learning: The High-Interest Credit Card of Technical Debt. Software Engineering for Machine Learning (NIPS 2014 Workshop).

7 https://github.com/JohnLangford/vowpal_wabbit/wiki
8 http://spark.apache.org/docs/latest/mllib-guide.html
9 http://dmlc.ml
10 http://scikit-learn.org
11 https://www.r-project.org