Yandex Blog

YSDA-HSE’s Annual Machine Learning Summer School Heads to Germany

We’ve long felt that being one of Europe’s largest tech companies means we have a responsibility to help educate current and future generations of data scientists.  We’re continually looking for ways to advance machine learning for our users and the greater AI community, and one way of doing that is to encourage data science learning.  Our education initiatives offer opportunities for a broad range of learners, from those interested in online courses to professionals looking for career advancement in computer science.  Many of our education programs stem from our collaborations with higher education institutions, which enable us to work with the brightest scientific minds to teach diverse topics in machine learning. 

The annual Machine Learning in High Energy Physics (MLHEP) summer school, which we help organize, is an excellent example of our commitment to academic collaboration. The Yandex School of Data Analysis and the Laboratory of Methods for Big Data Analysis at Moscow’s Higher School of Economics (HSE) have staged the summer school every year since 2015. Each year, we work with a different scientific partner in Europe to host the summer program. This year, the DESY research center in Hamburg, Germany, will host the fifth MLHEP summer school from July 1st to July 10th. The program will welcome 71 postgraduates and postdoctoral researchers from 17 countries, with most coming from the EU, the United States, and Russia.

The MLHEP summer school focuses on the emerging fields of data analysis and computational research in High Energy Physics (HEP), also known as particle physics.  Machine learning helps solve essential problems in HEP that range from online data filtering and reconstruction to offline data analysis. Over ten days, students at the summer school will have both a theoretical and practical introduction to machine learning in HEP, covering topics from decision trees to deep learning and hyperparameter optimisation.  Students will have the opportunity to apply what they learn with concrete examples and hands-on tutorials.

Participants in previous years have come from all over the world with diverse backgrounds to enhance their machine learning skills.  

“During the MLHEP school, I widened my understanding of machine learning methods,” says Mikkel Bjorn, a DPhil student in Elementary Particle Physics at the University of Oxford.  “I learned new ideas about where the techniques we studied can be useful in the work of myself and my group.” 

Alexey Kharlamov, a recent graduate of HSE, adds that “Most of all I liked the atmosphere of the program, which cultivated an interest in machine learning as a result of working with both motivated students and excellent teachers who love their subject.  In such an environment, it’s exciting to develop your data science skills.”

The MLHEP summer program emphasizes both theoretical knowledge and practical application to ensure students come away with applicable skills.  We organize a related machine learning competition that spans two to three months to provide a continued opportunity for students to apply their knowledge.  The competition is inspired by Yandex’s long-standing relationship with CERN, where researchers from Yandex have been working with physicists to solve issues related to matter and energy.  In particular, students will be creating solutions related to the Large Hadron Collider beauty experiment at CERN. The competition will require students to process particle information using modelling techniques.  The two-part contest will be similar to Kaggle machine learning competitions and take place in a co-learning environment, encouraging students to work together to solve challenges.

Lecturers from the Faculty of Computer Science at HSE, a department Yandex co-founded, will teach most of the sessions.  As we’re always eager to promote an atmosphere of collaboration in our education initiatives, we’re excited to announce they will be joined by several guest lecturers from Facebook, Oracle, Caltech, and more, who will be teaching sessions on causal inference, probabilistic programming, and other machine learning topics.

The MLHEP summer school is yet another exciting opportunity for Yandex to collaborate with academia and encourage data science learning.  For more information about the program, please visit the website and follow @yandexcom on Twitter to get updates during the summer school!

Welcome Learners: Yandex Hosts NLP Week 2019

Education is something we feel passionate about, and as STEM skills become essential in today’s work environments, we are committed to providing avenues for people to improve their data science expertise. As one of Europe’s largest internet companies, we have a responsibility to help educate future generations in data science, artificial intelligence, and machine learning. The Yandex School of Data Analysis (YSDA), a free master’s-level program in computer science, is the centerpiece of these efforts.

Through the MA program, YSDA provides students with the opportunity to take courses in many different data science fields, and participate in internships, online courses, and additional extended educational opportunities.  As part of our commitment to education and our support of data science learning, we’re excited to kick off a new extended education effort this week - the first Natural Language Processing (NLP) Week at Yandex’s headquarters in Moscow.

NLP Week 2019 offers students a research-oriented intensive program on NLP in English. Each course will be three hours long, for a total of twelve hours of in-person teaching throughout the week. NLP Week acts as an extension to the NLP course taught at YSDA, but any student or professional with a sufficient background in NLP and a strong command of English can also register. By the end of the four-day program, students at NLP Week will have learned the NLP applications of latent variables, deep generative models and semantic parsing.

Students of the Yandex School of Data Analysis

We are thrilled to welcome two respected experts in the field to teach at NLP Week! Wilker Aziz of the Institute for Logic, Language and Computation at the University of Amsterdam will teach courses on latent variable models, deep generative models, and advanced topics. Mirella Lapata of the School of Informatics at the University of Edinburgh will teach semantic parsing.

Wilker Aziz
Mirella Lapata

We look forward to all the exciting classes ahead this week.  We will be sharing more on our social channels and welcome our students to join the conversation!

Yandex Introduces the Ilya Segalovich Award in Computer Science

Yandex is thrilled to announce a new annual award for students and faculty in computer science and related fields, named after Ilya Segalovich, Yandex co-founder and creator of Yandex search.  This award honors Ilya’s commitment to supporting education and his philanthropic pursuits and introduces a new Yandex education initiative to encourage the study of computer science.

The Ilya Segalovich Award recognizes academic achievement and research contributing to technological advancements in areas relevant to Yandex.  These fields include speech recognition and speech synthesis, information search and data analysis, machine learning, computer vision, and natural language processing and machine translation.

The award is open to graduate or postgraduate students and academic advisors in computer science fields at institutions in Russia, Belarus or Kazakhstan.  Students can directly apply for the award, while academic advisors must be nominated.  An award committee composed of members of the Yandex management team and top machine learning experts will consider the quality of candidates’ published work to select winners.

"Yandex has always strongly valued education in computer science," says Arkady Volozh, CEO and co-founder of Yandex.  "We believe education in the field will continue to be central to the advancement of AI and delivering intelligent products and services to users everywhere.  With this award, we want to support researchers who, like us, are engaged in computer science and are inspired to build the technologies of the future.  We named the award after Ilya to honor his commitment to progress and his achievements supporting the IT community."

Ilya Segalovich

The Ilya Segalovich Award follows another Yandex initiative to recognize Ilya’s passion for education, the Ilya Segalovich Scholarship.  Established in 2014, this scholarship supports computer science students at the National Research University Higher School of Economics in Moscow.  Yandex has also partnered with HSE to establish the Faculty of Computer Science, which trains developers, analysts, and researchers in data analysis and software engineering.  

These two academic awards represent just part of Yandex’s commitment to education, which also includes Yandex.Lyceum for secondary students and the Yandex School of Data Analysis (YSDA). YSDA is a Master’s-level program in computer science and data analysis that Ilya helped establish in 2007 together with Arkady Volozh and pattern recognition specialist Ilya Muchnik. YSDA graduates and Yandex professionals regularly advance the computer science field through their published articles, and their expertise is key to powering Yandex’s intelligent products and services.

Students at the first Ilya Segalovich Scholarship ceremony in 2015

The Ilya Segalovich Award Committee will award a total of up to 15 million rubles (about $230,000) to thirteen winners.  Student awardees will receive 350,000 rubles ($5,300), a grant to travel to an international conference on artificial intelligence, and an internship opportunity at Yandex that includes a professional mentorship.  Academic advisors will receive 700,000 rubles ($10,600).  The application deadline is the end of February, and the award ceremony will take place in Moscow this spring.

Yandex Partners with Tel Aviv University to Enhance AI Education

Yandex is thrilled to partner with Tel Aviv University (TAU) to create the Yandex Machine Learning Initiative and open the sixth branch of the Yandex School of Data Analysis (YSDA), where a new career advancement program will be offered.

From digital learning platforms to our master’s program, Yandex has led several major education initiatives to improve the development and impact of AI in both academia and the private sector over the last decade. Our new partnership with TAU marks a key step in expanding our machine learning education and innovation programs to one of the world’s leading tech centers.

Today the tech community in Israel hosts many of the top startups creating innovative products and services that are shaping the future of AI globally. As the largest university in Israel, TAU plays a critical role in the development of both the local and global tech community. The team at TAU provides a world-class computer science education and shares our vision to enhance the global AI ecosystem by creating more experts and innovators in the field. Yandex CEO Arkady Volozh, who spearheaded the partnership, explains, “It’s our goal to not only ensure education in AI continues to grow the global community but also to keep challenging and advancing our ability to shape the future of AI.”

The Yandex Machine Learning Initiative will be run through TAU’s Blavatnik School of Computer Science as part of the BSc Program in Computer Science. The program will introduce a cluster of courses that will focus on machine learning, deep learning, natural language processing, computer vision and robotics.

The majority of the initiative will be dedicated to expanding opportunities for students through Yandex Fellowships, which will award scholarships to the brightest students at the master’s, doctoral, and post-doctoral level.  As part of the initiative, the Yandex International Distinguished Lecture Series in Machine Learning will enhance collaboration by bringing experts from around the world to TAU to give lectures and conduct research with TAU faculty.  Furthermore, the Yandex Initiative will also support new faculty recruitment and the acquisition of state-of-the-art equipment to enhance the program.

In addition to the Yandex Machine Learning Initiative at the university, TAU will become the sixth campus location of the Yandex School of Data Analysis (YSDA), where YSDA will launch a one-year career advancement program for the first time.  For over ten years now, YSDA has been offering a master’s level program in Computer Science and Data Analysis that challenges students with two vigorous years of training from top experts in the most advanced data science topics.  To date, over 600 YSDA graduates have brought their understanding of theoretical foundations and hands-on experience at YSDA to academia and leading technology companies around the world. 

While the majority of YSDA graduates have gone on to work at companies like Yandex, Microsoft, and Apple, the future of AI is not limited to tech companies and academia but will continue to expand into a wide range of industries and businesses where machine learning experts are in growing demand.  In recognition of the increasing need for skilled AI professionals, the YSDA faculty is designing a new one-year machine learning career advancement program for students and professionals that will be offered at the new TAU location.  The new program aims at serving students with diverse academic and professional backgrounds who can apply their machine learning expertise across many industries.

We are excited about the work ahead in 2018 and the opportunity to help educate the next generation of AI experts who will be innovating for years to come. Last week the Yandex-TAU partnership kicked off with a ceremony in which Yandex CEO Arkady Volozh signed the agreement for the initiative with TAU President Joseph Klafter.

Celebrating 10 Years of the Yandex School of Data Analysis

Over the years Yandex has launched several education initiatives ranging from learning platforms to high school and master’s level courses. 2017 marks the 10-year anniversary of one of our most impactful educational initiatives, the Yandex School of Data Analysis (YSDA), a free Master's-level program in Computer Science and Data Analysis. In Russia, Yandex is privileged to have access to some of the most talented math and science minds in the world but 10 years ago, Yandex co-founders Arkady Volozh and Ilya Segalovich realized there was a real need to foster these talents and offer students a program for advanced data science.  

“Together with well-known pattern recognition specialist, Ilya Muchnik, our co-founders considered the need beyond programmers, to programmers with advanced knowledge of the most modern machine learning practices. There was a serious demand for people to lead Yandex and the entire industry down a new path,” says Yandex Director of Human Resources and Director of Computer Science Department at YSDA, Lena Bunina. “And it was clear that we needed to undertake this ourselves.”

Since 2007, the YSDA has been offering Russian students two vigorous years of training from top experts in the most advanced data science topics. Students leave with a profound understanding of theoretical foundations and hands-on experience in applications such as computer vision and machine translation. “At YSDA, we focus less on theories and more on developing well-rounded students,” says Bunina. “It’s important for students to work in labs, solve problems and receive practical experience.” YSDA students have a unique opportunity to put their data processing and research skills to use as part of YSDA’s collaboration with the LHCb experiment at CERN, the European Organization for Nuclear Research.

YSDA students also have internship opportunities at Yandex that allow them to further expand their training. Interns have the opportunity to work side-by-side with Yandex data scientists helping to deliver our users exceptional customer experiences.

The school started in Moscow with a class of 80 students in the Department of Data Analysis. Our academic reputation, coupled with the growing need for data scientists in today’s AI-centric world, has created huge demand for YSDA courses. Last year we received over 4,000 applications from students at top universities and welcomed 211 students who passed our rigorous entrance exams.

Today, the program has campus branches in Moscow, Yekaterinburg, Novosibirsk and Minsk, plus online offerings in both English and Russian on Coursera and partnerships with some of Russia's leading research institutions and universities. The Yandex School of Data Analysis has a Department of Data Analysis and a Department of Computer Science, along with a specialization in Big Data.

At the YSDA, it’s our mission to prepare students to succeed far beyond the walls of our classrooms. “YSDA’s unique blend of science and practice prepares graduates for a wide range of professional options in science, research, product development, analytics and more,” says Misha Levin, Chief Data Scientist at Yandex.Market and YSDA lecturer.

In 2009, we graduated our first 36 students and this past year proudly graduated 123 students.  After ten years, over 600 YSDA graduates are changing the way technology impacts our lives. One of them is Ruslan Mavlyutov, a 2011 YSDA graduate who now works as a Machine Learning Engineer at Apple. “YSDA elevated my machine learning career and I feel privileged to have had the opportunity to study there,” says Mavlyutov. “My peers and I had the unique opportunity to be early ML ‘adopters’ at YSDA enabling us to start and lead ML endeavours at Yandex and other tech giants like Google, Facebook, Microsoft and Apple.” 

“YSDA has created a network of bright minds that pumps ideas across countries, industries and companies. In almost every well-known IT company I can find someone who has studied at the YSDA,” Mavlyutov adds. “Those are the people whom you can trust and rely on their expertise.”

We’re excited to be at the forefront of educating the next generation of data scientists over the next 10 years. Thank you to all the past and present students and instructors of the Yandex School of Data Analysis!

Now We’re Looking for Lepton Flavour Violation

Wouldn’t we all like to think that the world that we’re living in is more or less stable? Isn’t there a certain pleasure in being sure that our feet will be pulled to the ground as firmly tomorrow as they are today? Isn’t it reassuring to know that the cup of tea we’ve just put on our desk won’t disappear instantly and reappear on the bottom of the sea on the other side of the planet, having traveled its diameter in a straight line? In classical physics, Newton’s laws give us this reassurance. These laws bestow predictability on objects or events as they exist or happen in our reality - on a macroscopic level. On a microscopic level - in particle physics - Fermi’s interaction theory, for instance, postulates that the laws of physics remain the same even after a particle undergoes substantial transformation.

In 1964, however, it became apparent that this isn’t always the case. James Cronin and Val Fitch showed, by examining the decay of subatomic particles called kaons, that a reaction run in reverse does not necessarily retrace the path of the original reaction. This discovery opened a pathway to the theory of electroweak interaction, which in turn gave rise to the theory we all now know as the Standard Model of particle physics.

Although the Standard Model is currently the most convenient paradigm to live with, it doesn’t explain a number of phenomena, including gravity and dark matter. Other theories compete very actively for the leading role in describing the laws of nature in the most accurate and comprehensive way. To succeed, they have to provide evidence of something that happens outside the limitations of the Standard Model. A promising area to look for this kind of evidence is the decay of a charged lepton (tau lepton) into three lighter leptons (muons), which happen to have a certain characteristic - flavour - that is different from the same characteristic of their ‘mother’ particle. According to the Standard Model, the probability of this decay is vanishingly low, but it can be much higher in other theories.

One experiment at CERN, LHCb, aims at finding this τ → 3μ decay. How are they going to find it? By searching for statistically significant anomalies in an unthinkably large amount of data. How can they find statistically significant anomalies in an unthinkably large amount of data? By using algorithms. These can be trained to separate signal (lepton decays) from background (anything else, really) better than humans. The problem here, however, is not only to find these lepton decays, but also find them in statistically significant numbers. If the Standard Model is correct, the τ → 3μ decays are so rare that their observations are below experimental sensitivity.
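As a rough illustration of why separating signal from background matters so much, the sketch below computes two textbook significance estimates for a simple counting experiment: the rule-of-thumb ratio s/sqrt(b) and the Asimov approximation. The event counts are invented for illustration and are not LHCb numbers, the function names are hypothetical, and none of this is meant to represent the actual procedure used in the LHCb analysis.

```python
import math


def simple_significance(signal: float, background: float) -> float:
    """Rule-of-thumb significance s / sqrt(b) for a counting experiment."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / math.sqrt(background)


def asimov_significance(signal: float, background: float) -> float:
    """Asimov approximation: sqrt(2 * ((s + b) * ln(1 + s / b) - s))."""
    if background <= 0:
        raise ValueError("background must be positive")
    if signal <= 0:
        return 0.0
    return math.sqrt(
        2.0 * ((signal + background) * math.log(1.0 + signal / background) - signal)
    )


# Hypothetical event counts (illustrative only, not real LHCb data).
# Without any selection, the few signal events drown in the background.
before = {"signal": 50.0, "background": 10_000.0}
# A selection that discards 99% of the background while keeping 60% of the
# signal leaves far fewer events, but a much cleaner sample.
after = {"signal": 30.0, "background": 100.0}

for label, counts in (("before selection", before), ("after selection", after)):
    z_simple = simple_significance(counts["signal"], counts["background"])
    z_asimov = asimov_significance(counts["signal"], counts["background"])
    print(f"{label}: s/sqrt(b) = {z_simple:.2f}, Asimov Z = {z_asimov:.2f}")
```

With these made-up numbers, the simple estimate rises from 0.5 to 3.0 once the selection is applied, even though fewer signal events survive. That is the sense in which a better algorithm can bring a rare decay closer to being observable.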

To come up with a more sensitive and scale-appropriate solution that would help physicists find evidence of the tau lepton decay into three muons at a statistically significant level, Yandex and CERN’s LHCb experiment have launched a contest for a perfect algorithm. The contest, called ‘Flavours of Physics’, starts on July 20th, with the deadline for code submissions on October 12th. It is co-organised with an associated member of the LHCb collaboration, the Yandex School of Data Analysis, and Yandex Data Factory - a big data analytics division of Yandex - and is hosted on Kaggle, a website for predictive modeling and analytics competitions. The winning team or participant will claim a cash prize of $7,000, with $5,000 and $3,000 awarded to the first and second runners-up. An additional prize, in the form of an opportunity to participate in an LHCb workshop at the University of Zurich and $2,000 provided by Intel, will be given to the creator of the algorithm that proves to be the most useful to the LHCb experiment. The data used in this contest will consist of both simulated and real data, acquired in 2011 and 2012, that was used for the τ → 3μ decay analysis in the LHCb experiment.

Contest participants can build on the algorithm provided by the Yandex School of Data Analysis and Yandex Data Factory to make an algorithm of their own.

The metric used to evaluate the algorithms submitted for this contest is very similar to the one physicists use to evaluate the significance of their results, but it is much simpler and more robust. This is thanks to the collective effort of the Yandex School of Data Analysis and LHCb specialists, who have adapted procedures routinely used in the LHCb experiment specifically for this contest. We expect that this metric will help scientists choose algorithms that they could use on the data to be collected in the LHCb experiment in 2015, and in a wide range of other experiments.

Finding the tau lepton decay might take us out of the comfort zone of the Standard Model, but it may just as well open the door to extra dimensions, shed light on dark matter, and finally explain how gravity works on a quantum level.


Collisions as seen within the LHCb experiment's detector (Image: LHCb/CERN)

Yandex’s School of Data Analysis Joins LHCb Collaboration

The Yandex School of Data Analysis has joined the collaboration behind CERN’s Large Hadron Collider beauty (LHCb) experiment. The experiment is one of four large particle detector experiments at the Large Hadron Collider, and collects data to study the interactions of heavy particles called b-hadrons.

As a result of this collaboration, the LHCb researchers will receive continuous support from existing applications (EventIndex, EventFilter) and the development of new services designed for the LHCb by the Yandex School of Data Analysis. YSDA will contribute its data processing skills and capabilities, and perform interdisciplinary research and development on the edge of physics and data science that will serve the aims and needs of the LHCb experiment.

LHCb experiment. Photo by Tim Parchikov.

The researchers at the LHCb experiment are seeking, among other things, to explain the imbalance of matter and antimatter in the observable universe. This programme requires collecting, processing and analysing a very large amount of data. Yandex has already been contributing its search technologies, computing capabilities and machine-learning methods to the LHCb experiment since 2011, helping the physicists gain quick access to the data they need. Since January 2013, Yandex has been providing its core machine-learning technology MatrixNet for the needs of particle physics as an associate member of CERN openlab, CERN’s collaboration with industrial partners.

The Yandex School of Data Analysis is now part of the game, with its exceptional talent, a strong tradition in hard-core mathematics, and proven experience of converting new theoretical knowledge into practical solutions. The YSDA is the only member of the LHCb collaboration that does not specialise in physics. Other collaborators in the project include such prestigious institutions as MIT (USA), EPFL (Switzerland), University of Oxford and Imperial College, London (UK).

The Yandex School of Data Analysis is a free Master’s-level program in computer science and data analysis, offered by Yandex since 2007 to graduates in engineering, mathematics, computer science or related fields. It trains specialists in data analysis and information retrieval. The school’s program includes courses in machine learning, data structures and algorithms, computational linguistics and other related subjects. It runs a number of joint programs, at both Master’s and PhD levels, with leading education and research institutions including the Moscow Institute of Physics and Technology, the National Research University Higher School of Economics (HSE), and the Department of Mechanics and Mathematics of Moscow State University. In seven years, the Yandex School of Data Analysis has prepared more than 320 specialists.