Research Spotlight

3X: A Workbench for eXecuting eXploratory eXperiments

3X is an open-source software tool to ease the burden of conducting computational experiments and managing data analytics. 3X provides a standard yet configurable structure to execute a wide variety of experiments in a systematic way, avoiding repeated creation of ad-hoc scripts and directory hierarchies. 3X organizes the code, inputs, and outputs for an experiment. The tool submits arbitrary numbers of computational runs to a variety of different compute platforms, and supervises their execution. It records the returning results, and lets the experimenter immediately visualize the data in a variety of ways. Aggregated result data shown by the tool can be drilled down to individual runs, and further runs of the experiment can be driven interactively. Our ultimate goal is to make 3X a “smart assistant” that runs experiments and analyzes results semi-automatically, so experimenters and analysts can focus their time on deeper analysis. Two features toward this end are under development: visualization recommendations and automatic selection of promising runs.

Brain-Inspired Computing

This research develops circuits to emulate the functions of the synapses and neurons of the brain. The goal is to use nanoscale electronic devices to do information processing using algorithms and methods inspired by how the brain works. Currently, we are using phase change memory and metal oxide RRAM to perform gray-scale analog programming of the resistance values. These electronic emulations of the synapse are then connected in a neural network to process information and achieve simple learning behavior. In the past few years, we have been able to emulate a variety of spike-timing dependent plasticity (STDP) behaviors of the biological synapse using these nanoscale electronic devices. Using larger arrays of electronic synapses, we study how device variations affect system performance. The stochastic nature of the switching process of these devices has a rich set of properties that may be utilized for many applications. In the future, it may be possible to use these nanoscale electronic devices to study how the brain works, by interfacing these devices directly with biological entities.
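
To make the STDP idea above concrete, here is a minimal sketch of a pair-based STDP rule with an exponential timing window, a form commonly used in the modeling literature. It is not the group's device model: the function names, amplitudes, and time constants are illustrative assumptions, and the weight merely stands in for a programmable device conductance.

```python
import math


def stdp_weight_change(delta_t_ms, a_plus=0.01, a_minus=0.012,
                       tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Weight change for one pre/post spike pair (illustrative constants).

    delta_t_ms = t_post - t_pre, so a positive value means the
    presynaptic spike arrived before the postsynaptic spike.
    """
    if delta_t_ms > 0:
        # Pre before post: strengthen the synapse, less so for larger gaps.
        return a_plus * math.exp(-delta_t_ms / tau_plus_ms)
    if delta_t_ms < 0:
        # Post before pre: weaken the synapse, less so for larger gaps.
        return -a_minus * math.exp(delta_t_ms / tau_minus_ms)
    return 0.0


def apply_stdp(weight, delta_t_ms, w_min=0.0, w_max=1.0):
    """Apply one STDP update and clip to an assumed programmable range."""
    new_weight = weight + stdp_weight_change(delta_t_ms)
    return max(w_min, min(w_max, new_weight))
```

Calling `apply_stdp(0.5, 10.0)` nudges the weight up, while `apply_stdp(0.5, -10.0)` nudges it down. The clipping step is one simple way to reflect that a physical device has a bounded resistance range.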

ChucK: A Music Programming Language

ChucK is a programming language for audio and music creation. The language is designed around a unique time-based, concurrent programming model that's precise and expressive (we call this strongly-timed), and the ability to add and modify code on-the-fly. It offers composers, researchers, and performers a powerful programming tool for building and experimenting with complex audio synthesis/analysis programs, and real-time interactive music.
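
The time-based, concurrent model described above can be approximated outside ChucK. The Python sketch below is not ChucK code and not how ChucK is implemented. It only illustrates the idea that concurrent tasks share one logical clock and that time advances only when a task explicitly waits. The names `Scheduler`, `spork`, `emit`, and `pulse` are hypothetical (`spork` loosely borrows ChucK's term for launching a concurrent task).

```python
import heapq


class Scheduler:
    """Runs generator-based tasks on a single shared logical clock."""

    def __init__(self):
        self.now = 0.0      # logical time in seconds, not wall-clock time
        self.events = []    # (time, message) pairs recorded by tasks
        self._queue = []    # heap of (wake_time, sequence_number, generator)
        self._seq = 0       # tie-breaker that keeps ordering deterministic

    def spork(self, task_fn, *args):
        """Start a task; it first runs at the current logical time."""
        heapq.heappush(self._queue, (self.now, self._seq, task_fn(self, *args)))
        self._seq += 1

    def emit(self, message):
        """Record a message stamped with the current logical time."""
        self.events.append((self.now, message))

    def run(self, until):
        """Advance logical time, waking tasks in timestamp order."""
        while self._queue and self._queue[0][0] <= until:
            wake_time, _, task = heapq.heappop(self._queue)
            self.now = wake_time
            try:
                wait = next(task)  # task runs until it yields a duration
            except StopIteration:
                continue           # task finished; do not reschedule it
            heapq.heappush(self._queue, (self.now + wait, self._seq, task))
            self._seq += 1


def pulse(sched, label, period, count):
    """Emit `label` every `period` seconds, `count` times."""
    for _ in range(count):
        sched.emit(label)
        yield period  # hand control back and wait in logical time


sched = Scheduler()
sched.spork(pulse, "kick", 0.5, 4)
sched.spork(pulse, "hat", 0.25, 4)
sched.run(until=2.0)
for time, message in sched.events:
    print(f"{time:.2f}s {message}")
```

Because the clock is logical rather than wall-clock, the two tasks interleave identically on every run, which is the kind of precision the strongly-timed model is meant to provide.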

What kind of tools would you need to make a functional interactive prototype of a media player in 30 minutes? This hardware and software system enables designers to rapidly prototype the bits (the form) and the atoms (the interaction model) of physical user interfaces in concert. It was built to support design thinking rather than implementation tinkering. With it, designers place physical controllers (e.g., buttons, sliders), sensors (e.g., accelerometers), and output devices (e.g., LEDs, LCD screens) directly onto form prototypes, and author their behavior visually in our software workbench.

Deep Learning

Deep learning is a rapidly growing area of machine learning that is becoming widely adopted within academia and industry. Although machine learning is a very successful technology, applying it today still often requires substantial effort hand-designing features to feed to the algorithm. This is true for applications in vision, audio, and text/NLP. To address this, Ng's group and others are working on "deep learning" algorithms, which can automatically learn feature representations (often from unlabeled data), thus bypassing most of this time-consuming engineering. These algorithms are based on building massive artificial neural networks that were loosely inspired by cortical (brain) computations. One of our group's most celebrated results in deep learning is a highly distributed neural network with over 1 billion parameters, trained on 16,000 CPU cores (at Google), that learned by itself to discover high-level concepts, such as "cats", from watching unlabeled YouTube video. For a high-level overview of the field, see the following video.

Ed Feigenbaum's Search for A.I.

This video production documents the life and career of Ed Feigenbaum, "Father of Expert Systems," through archival photographs, a Computer History Museum oral history, and the recollections of his collaborators and students. These recollections were videotaped at the Feigenbaum 70th Birthday Symposium, held on March 25-26, 2006 and co-sponsored by the Stanford Computer Forum.


ImageNet is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, the majority of which are nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. Upon its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.
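
As a rough illustration of the organization described above, the sketch below models a tiny synset hierarchy in Python. It is a toy data structure, not ImageNet's actual storage format or API. The `Synset` class, its methods, the example concepts, and the file names are all hypothetical, and collecting a concept's images from its whole subtree is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Synset:
    """One concept: a set of synonymous words plus its own images."""
    words: List[str]
    images: List[str] = field(default_factory=list)
    children: List["Synset"] = field(default_factory=list)

    def add_child(self, child: "Synset") -> "Synset":
        """Attach a more specific concept beneath this one."""
        self.children.append(child)
        return child

    def all_images(self) -> List[str]:
        """Images for this concept and every more specific concept below it."""
        collected = list(self.images)
        for child in self.children:
            collected.extend(child.all_images())
        return collected

    def count_synsets(self) -> int:
        """Number of concepts in this subtree, including this one."""
        return 1 + sum(child.count_synsets() for child in self.children)


# Hypothetical toy hierarchy; real synsets and images would come from the dataset.
animal = Synset(words=["animal"])
dog = animal.add_child(Synset(words=["dog", "domestic dog"],
                              images=["dog_001.jpg", "dog_002.jpg"]))
cat = animal.add_child(Synset(words=["cat"], images=["cat_001.jpg"]))

print(animal.count_synsets(), len(animal.all_images()))
```

A synset holding several words, as in the `dog` entry, mirrors the "synonym set" notion: one concept, possibly many names.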


For transistors, it is important to have an atomically thin channel that enables gate length scaling while maintaining the good carrier transport required for a high current drive. At the same time, parasitic resistance from the contacts and parasitic capacitance from the device structure must be minimized. Currently, we are working on the use of carbon nanotubes (CNTs) and two-dimensional layered materials (the transition metal dichalcogenide family of materials) as the atomically thin channel. We are also working on techniques to minimize the contact resistance and the parasitic capacitance. By building practical systems of these emerging technologies, we learn how to solve device and materials problems that have system-level impact.


An artist might spend weeks fretting over questions of depth, scale and perspective in a landscape painting, but once it is done, what's left is a two-dimensional image with a fixed point of view. But the Make3d algorithm, developed by Stanford computer scientists, can take any two-dimensional image and create a three-dimensional "fly around" model of its content, giving viewers access to the scene's depth and a range of points of view.


I started doing research on memory devices around 2003. Research on memory had been rather “predictable” for many years until recently. It was predictable because the major advances for memory devices involved scaling down the physical dimensions of essentially the same device structure using basically the same materials. The situation has changed in the last decade. Memory devices are becoming difficult to scale down. But perhaps the most important change is that new applications and products (e.g. mobile phones, tablets, enterprise-scale disk storage) in the last decade are often enabled by advances in memory technology, in particular solid-state non-volatile memories. Our research on memory devices focuses on phase change memory (PCM) and metal oxide resistive switching memory (RRAM). We work on understanding the fundamental physics of these devices and developing models of how they work. We explore the use of various materials and device structures (e.g. 3D vertical RRAM) to achieve desired characteristics. We often utilize the unique properties of nanoscale materials such as carbon nanotubes, graphene, and nanoparticles to help us gain understanding of the physics and scaling properties of memory devices.

Neuromorphics: Compiling Code by Configuring Connections

Energy-efficient computing platforms are sorely needed to control autonomous robots and to decode neural signals in brain-machine interfaces. Inspired by the brain’s energy efficiency, we are exploring a hybrid analog-digital approach that uses subthreshold analog circuits to emulate graded dendritic activity and asynchronous digital circuits to emulate all-or-none axonal activity. We have used this approach to build Neurogrid, a sixteen-chip neuromorphic system that can simulate biophysically detailed cortical models with up to a million neurons and six billion synaptic connections in real time while consuming a few watts. We are now using this approach, together with a formal method that maps arbitrary nonlinear dynamical systems onto spiking neural networks, to develop a new breed of neuromorphic chips that can be programmed to perform arbitrary computations. Our goal is to develop a programmable neuromorphic chip with a million neurons and a billion synaptic connections that consumes tens of milliwatts, making it a hundred times more energy-efficient than Neurogrid. When seamlessly interconnected by on-chip routers to build spiking neural networks with millions of silicon neurons and billions of synaptic connections, these chips offer a promising alternative for robotic and prosthetic applications.

Ocarina: Designing the iPhone's Magic Flute

Ocarina, created in 2008 for the iPhone, is one of the first musical artifacts in the age of pervasive, app-based mobile computing. It presents a flute-like physical interaction using microphone input, multitouch, and accelerometers, along with a social dimension that allows users to listen in on each other around the world. To date, Ocarina has over 10 million users worldwide, and was a first-class inductee into Apple's Hall of Fame Apps.

Opportunistic Programming

Who will be writing software in the future and how will they be doing it? As computing becomes increasingly important in people's work and hobbies, a much broader range of people are engaging in programming. Understanding and building tools for professional software developers has a long history, but there has been relatively little research on how to support amateur, opportunistic programmers. Professor Scott R. Klemmer's NSF-funded research group at Stanford University is currently studying this problem. So far, they have done fieldwork with exhibit designers at San Francisco's Exploratorium Museum, and conducted several empirical studies on how these programmers use information resources while building software. Most notably, the Web has revolutionized the way these individuals write software. They build entire applications by iteratively searching for, understanding, and integrating pieces of functionality embodied in 15-line chunks of code! Right now, Professor Klemmer's research group is building a number of tools for amateur programmers that embody and support this reliance on Web resources. The broad goal of this work is to make software development faster, easier, and less error-prone for a much larger population.

Salisbury BioRobotics Laboratory

The Salisbury Lab conducts research in the areas of robotics, medical robotics, haptic devices and haptic rendering algorithms. One project is developing a virtual environment that enables surgeons to plan and practice surgical procedures by interacting visually and haptically with patient-specific data derived from CAT and MRI scans. Our lab developed the first version of the personal robot (PR-1), which eventually was licensed to Willow Garage and was the genesis of the PR-2 personal robot. We continue to develop robot hands, addressing design, control and perceptual issues. Our team is typically a mix of students from computer science, mechanical engineering and mathematics as well as practicing surgeons.

SLOrk: Stanford Laptop Orchestra

The Stanford Laptop Orchestra (SLOrk) is a large-scale, computer-mediated ensemble and classroom that explores cutting-edge technology in combination with conventional musical contexts - while radically transforming both. Founded in 2008 by director Ge Wang and students, faculty, and staff at Stanford University's Center for Computer Research in Music and Acoustics (CCRMA), this unique ensemble comprises more than 20 laptops, human performers, controllers, and custom multi-channel speaker arrays designed to provide each computer meta-instrument with its own identity and presence. The orchestra fuses a powerful sea of sound with the immediacy of human music-making, capturing the irreplaceable energy of a live ensemble performance as well as its sonic intimacy and grandeur. At the same time, it leverages the computer's precision, possibilities for new sounds, and potential for fantastical automation to provide a boundary-less sonic canvas on which to experiment, create, and perform music. Offstage, the ensemble serves as a one-of-a-kind learning environment that explores music, human-computer interaction, design, composition, and live performance in a naturally interdisciplinary way (it's also a cross-listed course in Music and Computer Science). SLOrk uses the ChucK programming language as its primary software platform for sound synthesis/analysis, instrument design, performance, and education.

The Red Sea Robotics Exploratorium

The Red Sea Robotics Research Exploratorium was created in April 2012 through a generous research award from the King Abdullah University of Science and Technology (KAUST). As a part of the KAUST Global Collaborative Research Program, Stanford University is part of a team of universities working to build a major science and technology university along a marshy peninsula on Saudi Arabia’s western coast. Meka Robotics joined the collaboration and provides the hardware for the development of dexterous underwater robot arms.

Total Scene Understanding

Given an image, we propose a hierarchical generative model that classifies the overall scene, recognizes and segments each object component, and annotates the image with a list of tags. To our knowledge, this is the first model that performs all three tasks in one coherent framework. For instance, a scene of a ‘polo game’ consists of several visual objects such as ‘human’, ‘horse’, ‘grass’, etc. In addition, it can be further annotated with a list of more abstract (e.g. ‘dusk’) or visually less salient (e.g. ‘saddle’) tags. Our generative model jointly explains images through a visual model and a textual model. Visually relevant objects are represented by regions and patches, while visually irrelevant textual annotations are influenced directly by the overall scene class. We propose a fully automatic learning framework that is able to learn robust scene models from noisy web data such as images and user tags. We demonstrate the effectiveness of our framework by automatically classifying, annotating and segmenting images from eight classes depicting sport scenes. In all three tasks, our model significantly outperforms state-of-the-art algorithms.

UltraFlow: A Hybrid Future Internet Architecture

The PIs of this NSF-sponsored project are Prof. Leonid Kazovsky (Stanford), Prof. Vincent Chan (MIT), and Prof. Andrea Fumagalli (UT-Dallas). UltraFlow is a secure, agile and cost-effective architecture that will replace legacy Electronic Packet Switching (EPS), specifically for its ability to enable very large file transfers (terabits of data) in a fast and efficient manner. At Stanford, our mandate is to design and experimentally demonstrate UltraFlow Access, a novel last-mile network architecture that offers dual-mode Internet access to end users: IP and optical flow. The new hybrid Internet architecture is designed to be secure, dynamic (both agile and adaptive), and significantly more cost effective for future growth in data volumes and number of users. UltraFlow relies on a novel optical network architecture comprising new transport mechanisms and a new comprehensive control plane including network protocols from the physical layer up to the application layer. It also integrates the foregoing new network modalities with the conventional TCP/IP network architecture and provides multiple service types to suit any user needs.

Voice-Based Social Media for Developing Regions

Social software – email, blogs, wikis, forums, social networks – has revolutionized how people share expertise and collaborate on the web. However, in rural developing regions, many do not have direct access to Internet-connected PCs or the literacy skills to interact with textual content. How might we design a communications platform for these communities? In our research, we are designing voice-based applications for communities in rural India to access agricultural advice and share expertise, using the mobile phone. The key challenges are contending with limited capability speech recognition for regional languages, designing for illiterate users, and methods for search and filtering of user-generated audio content. We have deployed one pilot system for farmers in Gujarat, India, to record agricultural questions and get responses from experts and other farmers. Based on the enthusiastic response, the application will be launched later this year to serve over 500,000 farmers across the state.