Jürgen Schmidhuber

Jürgen Schmidhuber (born 17 January 1963) is a computer scientist best known for his work in artificial intelligence, deep learning, and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artificial Intelligence Research in Lugano, in Ticino in southern Switzerland. According to Google Scholar, he received more than 100,000 scientific citations between 2016 and 2021. He has been referred to as the “father of modern AI” and the “father of deep learning.” (Schmidhuber himself, however, has called Alexey Grigorevich Ivakhnenko the “father of deep learning.”)

Schmidhuber completed his studies at the Technical University of Munich in Munich, Germany. He taught there from 2004 until 2009, when he became a professor of artificial intelligence at the Università della Svizzera Italiana in Lugano, Switzerland.

With his students Sepp Hochreiter, Felix Gers, Fred Cummins, Alex Graves, and others, Schmidhuber published increasingly sophisticated versions of a type of recurrent neural network called the long short-term memory (LSTM). The first results were reported in Hochreiter’s diploma thesis (1991), which analyzed and overcame the vanishing gradient problem. The name LSTM was introduced in a 1995 technical report, which led to the most-cited LSTM publication in 1997.
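The core idea can be illustrated with a short sketch: one forward step of a standard LSTM cell, written in NumPy. This is an illustrative implementation under common conventions, not Schmidhuber’s original notation; the variable names and the stacked weight layout are assumptions. The additive cell-state update (sometimes called the “constant error carousel”) is what lets error signals flow across many time steps without vanishing:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    x:       input vector of size D
    h_prev:  previous hidden state of size H
    c_prev:  previous cell state of size H
    W, b:    stacked gate weights (4*H, D+H) and biases (4*H,)
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2*H])        # forget gate
    o = sigmoid(z[2*H:3*H])      # output gate
    g = np.tanh(z[3*H:4*H])      # candidate cell update
    c = f * c_prev + i * g       # additive update: gradients can pass through
    h = o * np.tanh(c)           # gated output
    return h, c

# Run the cell over a short random sequence (hypothetical sizes).
D, H = 3, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * H, D + H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(D), h, c, W, b)
```

Training unrolls this step over the whole sequence and backpropagates through it; the gates learn when to store, keep, and emit information.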

The standard LSTM architecture, used in almost all current applications, was introduced in 2000. Today’s “vanilla LSTM,” trained with backpropagation through time, was published in 2005, and its connectionist temporal classification (CTC) training algorithm in 2006. CTC enabled end-to-end speech recognition with LSTM. In 2015, CTC-trained LSTM was used in a new implementation of speech recognition in Google’s software for smartphones. Google also used LSTM for the smart assistant Allo and for Google Translate. Apple used LSTM for the “Quicktype” function on the iPhone and for Siri, and Amazon used LSTM for Alexa. In 2017, Facebook performed some 4.5 billion automatic translations every day using LSTM networks. Bloomberg Businessweek wrote: “These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music.”
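CTC makes end-to-end training possible by summing over all frame-level label paths that collapse to the same output string, so the network need not know where in the audio each label occurs. The collapse rule itself (merge consecutive repeats, then drop blanks) is simple enough to sketch; this is an illustrative snippet, not a full CTC loss:

```python
def ctc_collapse(path, blank="-"):
    """CTC's many-to-one mapping from a frame-level label path to a
    label sequence: merge consecutive repeats, then remove blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return "".join(out)

# Several distinct frame-level paths collapse to the same output "aab":
assert ctc_collapse("a-ab-") == "aab"
assert ctc_collapse("aa-ab") == "aab"
assert ctc_collapse("a-abb") == "aab"
```

The CTC loss is the total probability the network assigns to all such paths for the correct transcription, computed efficiently with dynamic programming.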

In 2011, Schmidhuber’s team at IDSIA, with his postdoc Dan Ciresan, achieved dramatic speedups of convolutional neural networks (CNNs) on fast parallel processors known as GPUs. An earlier GPU implementation of CNNs by Chellapilla et al. (2006) was four times faster than an equivalent CPU implementation; the deep CNN of Dan Ciresan et al. (2011) at IDSIA was already 60 times faster and achieved the first superhuman performance in a computer vision contest in August 2011. Between 15 May 2011 and 10 September 2012, their fast and deep CNNs won no fewer than four image competitions and significantly improved on the best published results for multiple image databases. The approach has become central to the field of computer vision. It builds on CNN designs introduced much earlier by Yann LeCun et al. (1989), who applied the backpropagation algorithm to a variant of Kunihiko Fukushima’s original CNN architecture, the neocognitron, later modified by J. Weng’s max-pooling method.
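The two building blocks named above, convolution and max-pooling, can be sketched in a few lines. This is a minimal single-channel NumPy illustration with assumed sizes; the GPU implementations discussed here are, of course, far more optimized:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation of a single-channel image:
    slide the kernel over the image and take weighted sums."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max-pooling: downsample by keeping the
    maximum activation in each size-by-size window."""
    H2, W2 = x.shape[0] // size, x.shape[1] // size
    return x[:H2 * size, :W2 * size].reshape(H2, size, W2, size).max(axis=(1, 3))

# A 6x6 image convolved with a 3x3 averaging kernel gives a 4x4
# feature map; 2x2 max-pooling then reduces it to 2x2.
img = np.arange(36, dtype=float).reshape(6, 6)
feat = conv2d(img, np.ones((3, 3)) / 9.0)
pooled = max_pool(feat)
```

A deep CNN stacks many such convolution and pooling stages, with learned kernels, before a final classifier; pooling makes the features progressively more translation-invariant while shrinking the computation.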

In 2014, Schmidhuber formed a company, Nnaisense, to work on commercial applications of artificial intelligence in fields such as finance, heavy industry, and self-driving cars. Sepp Hochreiter, Jaan Tallinn, and Marcus Hutter are advisers to the company. Sales were under US$11 million in 2016; however, Schmidhuber has said that the current emphasis is on research rather than revenue. Nnaisense raised its first round of capital funding in January 2017. Schmidhuber’s overall goal is to create an all-purpose AI by training a single AI in sequence on a variety of narrow tasks.

Schmidhuber received the Helmholtz Award of the International Neural Network Society in 2013, and the Neural Networks Pioneer Award of the IEEE Computational Intelligence Society in 2016 for “pioneering contributions to deep learning and neural networks.” He is a member of the European Academy of Sciences and Arts.