Saturday, June 30, 2012

3 Remarkable Feats in 16 days

When surfing the web and catching up on technology news, it is incredibly easy to become accustomed to the rapid pace at which new products are announced, new features are offered, and new capabilities are demonstrated.  Many of us have come to expect new gadgetry to be offered up every few days for our critiquing amusement, and the internet fills itself with commentary and fanaticism, both positive and negative, with each new wave of announcements.  Within a month or so, the attention span of the internet will have drifted onto the next new offerings to be placed before the altar of judgement.  This has become the expected norm.

But I wanted to encourage everyone to take a moment to pause and look back at the last 3 weeks... actually, just 16 days.  In less time than it takes for refrigerated milk to spoil, Apple, Microsoft, and Google each announced a remarkable new piece of technology (in chronological order):

June 11th: Apple MacBook Retina Display


June 18th: Microsoft Surface with Touch Cover


June 27th: Google Glass


Perhaps it's because I happen to be lucky enough to know some of the people involved with all three of these projects, but it absolutely humbles me to see the culmination of all of their work presented to the public at nearly the same time. These are the results of amazing colleagues, both past and present.

I think it is very hard for outside observers to really understand exactly how dramatic these product announcements can be for the people involved in creating them.  Imagine spending 2 or more years of your life working away in complete silence on a project, never quite sure what the world is going to think of it, what people are going to say about it, what people are going to do to you in the press... and then suddenly bringing it on stage to unveil to an audience of literally millions, hoping the demos go okay as you scramble to put on the last-minute touches.  The people who invest their lives in these creations inevitably develop an emotional attachment to the product itself.  And as cheesy as it might sound, it really is a bit like releasing a child out into the world, and all you can do is pray that the world treats them okay.


It is intensely nerve-racking and even surreal to see something that you have helped shape suddenly appear all over international news, to see people talking about you in different languages in countries that you've never even thought about visiting, or to turn on the TV and see people talking about your work in a way that is completely out of your control.  Even though I was just a backstage spectator this time, I still got a bit of a rush witnessing it happen.


It's important for people reading news about these projects to realize that these products are not created by massive armies of engineers.  In some cases, it can be just a few dozen folks.  Imagine gathering 2-3 classrooms' worth of your colleagues and setting out to create a product like one of these in 2 years.  That is no small task.  Think back to what you have done in the past 2 years.  Not that many people can say that they accomplished more.


So before returning to the daily habit of consuming and critiquing tomorrow's offerings of gadgets, put aside any opinions you may or may not have about the companies involved and any predictions about the success of these products... and take a moment.  Take a moment to realize that in the span of just 16 days, you were able to watch as the hard work of 200-300 people was revealed to the world in the form of 3 remarkable feats of engineering and design, pulling the world ever so slightly further into the future.


Congratulations to all the people involved.  Amazing work.  Time to build more stuff.

Friday, June 1, 2012

Yay for learning

See, learning can be cute and fun!



I donated "one truckload of gravel to pave a school yard" because I personally believe the truckload is an underutilized unit of measure, and it's satisfying to be able to say you bought a "truckload" of just about anything.

If you'd like another educational treat, I highly recommend the following video by Vi Hart (who joined Khan Academy earlier this year).  Awesome.



Watch Part 2
Watch Part 3

Wednesday, May 2, 2012

Ceres: solving complex problems using computing muscle

Today, Sameer Agarwal and Keir Mierle (as well as a couple others I'm sure) at Google open sourced the Ceres Non-Linear Least Squares Solver.

This is probably the most interesting code library that I have had a chance to work with since coming to Google. And now, you can use it too.   So, what exactly is a "non-linear least squares solver", and why should you care?

It turns out that a solver like Ceres is at the heart of many modern computer vision and robotics algorithms. Anywhere you have a bunch of observed sensor data about the world and you want to create an explanation of all those observations, a non-linear least squares solver can probably do that for you. For example, suppose you have a bunch of distance sensors and you want to figure out where you are relative to the walls. Like this:



Or if you have a camera, and you want to figure out the position of the camera and objects in view:



Or say you have a quadcopter, and you want to model how it will respond to thrust on different propellers:



or (as in the case of Google Street view) combining vehicle sensors in the cars with GPS data:



or even figure out the best way to position your plant so it gets the most sun (assuming you could accurately measure the amount of sun hitting the leaves):


Non-linear least squares solvers, like Ceres, are a tool for optimizing many variables simultaneously in a complex model/formula to fit some target data. Many advanced engineering problems today come down to this.  It's basically a fancy version of your typical line fitting problem:


This is linear least-squares. The model here is:

y = m*x + b

This is "linear" because it is the simple addition of a scaled variable m*x and a constant b.  It is "least-squares" because it minimizes the square of the distance between the line and each of the data points. In this simple case, that algorithm is simply solving for m and b in the line equation. There are methods for directly computing these values. But, if the equation was non-linear such as:

y = (m*x - cos(x))^2/b

You now need a non-linear least squares solver.  Many real-world problems are non-linear, such as anything that involves rotation, camera projection, multiplicative effects, or compounding/exponential behavior.  You might be able to devise a clever way to calculate the optimal values for m and b directly, or you can use an iterative algorithm and let the computer tweak the values of m and b until the squared error to your data is minimized. While this example only has two variables, Ceres can handle optimizing thousands of variables simultaneously and uses techniques for reaching an error-minimizing solution quickly.

Though, it's important to note that it can only iteratively crawl toward the lowest-error solution starting from the initial values of m and b you provide... like a drop of water sliding down to the bottom of a bowl.   If the bottom of the bowl is very bumpy, it can get stuck in one of the smaller divots and never reach the lowest part of the bowl.  This is known as getting stuck in a "local minimum" and never finding the "global minimum", and the shape of the bowl is called the "cost surface".  When the cost surface of a problem is not very bowl-like, it can lead to problems.
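
To make that concrete, here is a minimal sketch of fitting that non-linear model with Ceres. It roughly follows the automatic-differentiation pattern from the Ceres examples; the data points and the initial guesses for m and b below are made up purely for illustration:

#include <cmath>
#include <iostream>
#include "ceres/ceres.h"

// One residual per data point for the model y = (m*x - cos(x))^2 / b.
// Ceres computes the derivatives with respect to m and b automatically.
struct ExampleResidual {
  ExampleResidual(double x, double y) : x_(x), y_(y) {}

  template <typename T>
  bool operator()(const T* const m, const T* const b, T* residual) const {
    T inner = m[0] * T(x_) - T(std::cos(x_));
    residual[0] = T(y_) - inner * inner / b[0];
    return true;
  }

  double x_, y_;
};

int main() {
  // Made-up observations; in practice these come from your measurements.
  const double xs[] = {0.0, 0.5, 1.0, 1.5, 2.0};
  const double ys[] = {0.9, 0.2, 0.1, 0.8, 2.1};

  // Initial guesses. The solver iteratively crawls downhill from here.
  double m = 1.0, b = 1.0;

  ceres::Problem problem;
  for (int i = 0; i < 5; ++i) {
    problem.AddResidualBlock(
        new ceres::AutoDiffCostFunction<ExampleResidual, 1, 1, 1>(
            new ExampleResidual(xs[i], ys[i])),
        NULL,  // no robust loss function: plain least squares
        &m, &b);
  }

  ceres::Solver::Options options;
  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);

  std::cout << summary.BriefReport() << "\n";
  std::cout << "m = " << m << ", b = " << b << "\n";
  return 0;
}

The structure stays the same whether you have two parameters or thousands; you just add more residual blocks and more parameter blocks.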

Ceres can also handle something called "sparsity" efficiently.  This occurs when you have many, many variables, but only a few of them interact with each other at a time. For example, the current position of a flying quadcopter depends on the previous position and previous velocity. But the current velocity doesn't really depend that much on the previous position.  Imagine if you made a giant table with all of your input variables as the column names and all of your output values as the row names, and then put a check mark in the table wherever an input was used to compute an output.  If most of the table is empty, then you have a "sparse matrix", and Ceres can take advantage of this emptiness (which indicates independence of the variables) to dramatically increase the speed of computation.
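
If your problem has that kind of structure, the main thing you change in code is the choice of linear solver. Continuing the sketch above (this fragment is just illustrative, and it assumes Ceres was built with a sparse linear algebra backend such as SuiteSparse):

// Ask Ceres to exploit sparsity when solving the linearized steps.
ceres::Solver::Options options;
options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
// For bundle-adjustment-style problems with cameras and points,
// ceres::SPARSE_SCHUR is another common choice.
ceres::Solve(options, &problem, &summary);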

Anywhere you have data, and you have a model (which is just a fancy term for a complicated formula) that should be able to generate that data, and you want to tweak the values inside your generative model to best fit your data... a tool like Ceres might do the job.

For many problems, mathematicians and engineers have spent decades devising clever and complex formulas to solve them directly. But in many fields, having a computer perform non-linear optimization on data is becoming the preferred method, because it makes it much easier to tackle very complicated problems with many variables, and the result can often be more robust to noisy input.

The neat thing about using a non-linear solver in a real-time system is that the computer can respond to feedback in much the same way you do.  If an object is too far to the left of a target position, it knows to move it right.  If the wind starts blowing and it drifts backwards, it will automatically respond by pushing forward.  As long as you have an equation that explains how the output will be affected by the controls, it can figure out the best way to fiddle with the controls to minimize the distance from a target value.

If I find the time, I might try to post some tutorials on using Ceres, because I believe this is one of the most powerful tools in modern engineering, and no one ever taught it to me in undergrad or high school.  It's like the difference between doing long division by hand and then being handed a calculator.

Wednesday, April 4, 2012

Projects at Google X

These past couple of weeks, there have been a few videos released from the group I work in at Google. Congratulations to the many people in X whose hard work has gone into each of these.








Wednesday, November 30, 2011

Computer Controlling a Syma Helicopter


Recently, I've been playing with these inexpensive Syma Remote Control Helicopters. At the time, they were only $20 (but they seem to have been price-adjusted for the holidays). They're quite robust to crashes and pretty easy to fly. For $20, they're a blast. The other interesting thing about these copters is that the controller transmits commands using simple infrared LEDs rather than a proper radio. This simplicity makes it tantalizingly appealing to try reverse engineering. So tonight, I decided to do a little procrastineering and see if I could get my helicopter to become computer controlled.

For hardware, I've been liking these Teensy USB boards because they are cheap, small, versatile, and have a push-button bootloader that makes iteration very quick. They can be easily configured to appear as a USB serial port and respond to commands. For the IR protocol, I started with this web page, which got the helicopter responding. But the behavior I was getting was very stuttery and would not be sufficient for reliable autonomous control. So, I decided to take a closer look with an oscilloscope to get accurate timing from the stock remote control. Some of my measured numbers were fairly different from the web tutorial I found. But now the control is fairly solid. So, here is the nitty gritty:

IR Protocol:

- IR signal is modulated at 38kHz.
- Packet header is 2ms on then 2ms off
- Packet payload is 4 bytes in big-endian order:
1. yaw (0-127) default 63
2. pitch (0-127) default 63
3. throttle (0-127 for channel A, 128-255 for channel B) default 0
4. yaw correction (0-127) default 63
- Packet ends with a stop '1' bit

Format of a '1' is 320us on then 680us off (1000us total)
Format of a '0' is 320us on then 280us off (600us total)

Packets are sent from the stock controller every 120ms. You can try to send commands faster, but the helicopter may start to stutter as it misses messages.
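
To make those timings concrete, here is a rough sketch of what a transmit routine for this protocol might look like. This is not the code behind the download link below; the mark()/space() helpers (which drive the 38kHz-modulated IR output on or off for a number of microseconds) are hypothetical placeholders, and the MSB-first bit order is an assumption you should verify against the actual code:

#include <stdint.h>

// Hypothetical helpers: mark(us) drives the IR LED with a 38kHz carrier for
// 'us' microseconds, space(us) leaves it off for 'us' microseconds.
void mark(unsigned int us);
void space(unsigned int us);

void sendBit(bool one) {
  mark(320);                     // every bit starts with 320us of carrier
  space(one ? 680 : 280);        // '1' is 1000us total, '0' is 600us total
}

void sendByte(uint8_t value) {
  for (int i = 7; i >= 0; --i)   // bit order assumed MSB-first
    sendBit((value >> i) & 1);
}

void sendPacket(uint8_t yaw, uint8_t pitch, uint8_t throttle, uint8_t trim) {
  mark(2000);                    // header: 2ms on...
  space(2000);                   // ...then 2ms off
  sendByte(yaw);                 // payload: 4 bytes in the order listed above
  sendByte(pitch);
  sendByte(throttle);            // 0-127 = channel A, 128-255 = channel B
  sendByte(trim);                // yaw correction
  sendBit(true);                 // stop '1' bit
  // Repeat roughly every 120ms while the helicopter is flying.
}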

Download Teensy AVR Code (updated 11/30/2011)

The code is available at the above link. It expects 5-byte packets over the serial port at 9600 baud. The first byte of each packet must be 255, followed by yaw, pitch, throttle, and yaw correction (each ranging from 0-127). It will return a 'k' if 5 bytes are properly read. If it doesn't receive any serial data for 300ms, it will stop transmitting the IR signal.

Unfortunately, I can't help you write a program to communicate over serial, since that will depend on your OS (Windows, Mac, Linux) and varies by language as well. But it is fairly easy, with lots of web tutorials. The harder challenge will be figuring out how to update the 3 analog values to keep it from crashing. =o) The most likely candidate is to use a camera (probably with IR markers) to monitor the position of the helicopter. But getting that to work well is definitely a project unto itself.

Good Luck!

Tuesday, November 22, 2011

Shredder Challenge - Puzzle 2 done! Onto Puzzle 3


Puzzle 2 is now done! Puzzle 3 is a drawing (not text).

As we get to more complicated puzzles, it's clear that loading, rendering, and UI limitations will become a bigger and bigger issue. My colleague Dan Maynes-Aminzade ("monzy" for short) is doing his best to figure out ways to handle that. There are a lot of less-than-ideal solutions.

It's also clear that more computer-aided matching will be necessary to maintain progress. Here are zip files for the pieces of problem 4 and problem 5, if you want to try your hand at analyzing them directly.

Puzzle 4 pieces
Puzzle 5 pieces

If you come up with good ideas that work, post them in the comments.

Puzzle 1 done overnight!


Puzzle 1 was completed overnight! Very cool. They definitely get harder. But the crowd helped UCSD complete puzzles 2 and 3 within a couple of days, so we could easily catch up.

On to puzzle 2