Artificial intelligence(AI) has been an atypical technology trend. In a traditional technology cycle, innovation typically begins with startups trying to disrupt industry incumbents. In the case of AI, most of the innovation in the space has been coming from the big corporate labs of companies like Google, Facebook, Uber or Microsoft. Those companies are not only leading impressive tracks of research but also regularly open sourcing new frameworks and tools that streamline the adoption of AI technologies. In that context, Uber has emerged as one of the most active contributors to open source AI technologies in the current ecosystems. In just a few years, Uber has regularly open sourced projects across different areas of the AI lifecycle. Today, I would like to review a few of my favorites.
Uber is a near-perfect playground for AI technologies. The company combines all the traditional AI requirements of a large scale tech company with a front row seat to AI-first transportation scenarios. As a result, Uber has been building machine/deep learning applications across largely diverse scenarios ranging from customer classifications to self-driving vehicles. Many of the technologies used by Uber teams have been open sourced and received accolades from the machine learning community. Let’s look at some of my favorites:
Note: I am not covering technologies like Michelangelo or PyML, as they are well documented having been open sourced.
Ludwig is a TensorFlow based toolbox that allows to train and test deep learning models without the need to write code. Conceptually, Ludwig was created under five fundamental principles:
- No coding required: no coding skills are required to train a model and use it for obtaining predictions.
- Generality: a new data type-based approach to deep learning model design that makes the tool usable across many different use cases.
- Flexibility: experienced users have extensive control over model building and training, while newcomers will find it easy to use.
- Extensibility: easy to add new model architecture and new feature data types.
- Understandability: deep learning model internals are often considered black boxes, but we provide standard visualizations to understand their performance and compare their predictions.
Using Ludwig, a data scientist can train a deep learning model by simply providing a CSV file that contains the training data as well as a YAML file with the inputs and outputs of the model. Using those two data points, Ludwig performs a multi-task learning routine to predict all outputs simultaneously and evaluate the results. Under the covers, Ludwig provides a series of deep learning models that are constantly evaluated and can be combined in a final architecture. The Uber engineering team explains this process by using the following analogy: “if deep learning libraries provide the building blocks to make your building, Ludwig provides the buildings to make your city, and you can choose among the available buildings or add your own building to the set of available ones.”
Pyro is a deep probabilistic programming language(PPL) released by Uber AI Labs. Pyro is built on top of PyTorch and is based on four fundamental principles:
- Universal: Pyro is a universal PPL — it can represent any computable probability distribution. How? By starting from a universal language with iteration and recursion (arbitrary Python code), and then adding random sampling, observation, and inference.
- Scalable: Pyro scales to large data sets with little overhead above hand-written code. How? By building modern black box optimization techniques, which use mini-batches of data, to approximate inference.
- Minimal: Pyro is agile and maintainable. How? Pyro is implemented with a small core of powerful, composable abstractions. Wherever possible, the heavy lifting is delegated to PyTorch and other libraries.
- Flexible: Pyro aims for automation when you want it and control when you need it. How? Pyro uses high-level abstractions to express generative and inference models, while allowing experts to easily customize inference.
These principles often pull Pyro’s implementation in opposite directions. Being universal, for instance, requires allowing arbitrary control structure within Pyro programs, but this generality makes it difficult to scale. However, in general, Pyro achieves a brilliant balance between these capabilities making one of the best PPLs for real world applications.
Manifold is Uber technologies for debugging and interpreting machine learning models at scale. With Manifold, the Uber engineering team wanted to accomplish some very tangible goals:
· Debug code errors in a machine learning model.
· Understand the strengths and weaknesses of one model both in isolation and in comparison, with other models.
· Compare and ensemble different models.
· Incorporate insights gathered through inspection and performance analysis into model iterations.
To accomplish those goals, Manifold segments the machine learning analysis process into three main phases: Inspection, Explanation and Refinement.
· Inspection: In the first part of the analysis process, the user designs a model and attempts to investigate and compare the model outcome with other existing ones. During this phase, the user compares typical performance metrics, such as accuracy, precision/recall, and receiver operating characteristic curve (ROC), to have coarse-grained information of whether the new model outperforms the existing ones.
· Explanation: This phase of the analysis process attempts to explain the different hypotheses formulated in the previous phase. This phase relies on comparative analysis to explain some of the symptoms of the specific models.
· Refinement: In this phase, the user attempts to verify the explanations generated from the previous phase by encoding the knowledge extracted from the explanation into the model and testing the performance.
Uber built the Plato Research Dialogue System(PRDS) to address the challenges of building large scale conversational applications. Conceptually, PRDS is a framework to create, train and evaluate conversational AI agents on diverse environments. From a functional standpoint, PRDS includes the following building blocks:
- Speech recognition (transcribe speech to text)
- Language understanding (extract meaning from that text)
- State tracking (aggregate information about what has been said and done so far)
- API call (search a database, query an API, etc.)
- Dialogue policy (generate abstract meaning of agent’s response)
- Language generation (convert abstract meaning into text)
- Speech synthesis (convert text into speech)
PRDS was designed with modularity in mind in order to incorporate state-of-the-art research in conversational systems as well as continuously evolve every component of the platform. In PRDS, each component can be trained either online (from interactions) or offline and incorporate into the core engine. From the training standpoint, PRDS supports interactions with human and simulated users. The latter are common to jumpstart conversational AI agents in research scenarios while the former is more representative of live interactions.
Horovod is one of the Uber ML stacks that has become extremely popular within the community and has been adopted by research teams at AI-powerhouses like DeepMind or OpenAI. Conceptually, Horovod is a framework for running distributed deep learning training jobs at scale.
Horovod leverages message passing interface stacks such as OpenMPI to enable a training job to run on a highly parallel and distributed infrastructure without any modifications. Running a distributed TensorFlow training job in Horovod is accomplished in four simple steps:
- hvd.init() initializes Horovod.
- config.gpu_options.visible_device_list = str(hvd.local_rank())assigns a GPU to each of the TensorFlow processes.
- opt=hvd.DistributedOptimizer(opt)wraps any regular TensorFlow optimizer with Horovod optimizer which takes care of averaging gradients using ring-allreduce.
- hvd.BroadcastGlobalVariablesHook(0) broadcasts variables from the first process to all other processes to ensure consistent initialization.
Last by not least, we should mention Uber’s active contributions to AI research. Many of Uber’s open source releases are inspired by their research efforts. Uber AI Research website is a phenomenal catalog of papers that highlight Uber’s latest effort in AI research.
These are some of the contributions of the Uber engineering team that have seen regular adoption by the AI research and development community. As Uber continues implementing AI solutions at scale, we should see new and innovated frameworks that simplify the adoption of machine learning by data scientists and researchers.
“Uber Has Been Quietly Assembling One of the Most Impressive Open Source Deep Learning Stacks in the Market”– Jesus Rodriguez Tweet