All posts for the year 2012


When developing applications using OpenFlow, a small network of OpenFlow switches is often needed for testing and debugging. An emulated OpenFlow network is ideal for such development, allowing more people to develop efficiently at their own pace. Mininet was developed to meet this demand: it lets you create virtual OpenFlow networks on your very own desktop or laptop.

See Mininet in action in the following tutorial.

OpenFlow is an open standard that enables researchers to run experimental protocols in the campus networks we use every day. OpenFlow is added as a feature to commercial Ethernet switches, routers and wireless access points – and provides a standardized hook to allow researchers to run experiments, without requiring vendors to expose the internal workings of their network devices. OpenFlow is currently being implemented by major vendors, with OpenFlow-enabled switches now commercially available.
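At its core, an OpenFlow switch exposes a flow table of match/action rules that a remote controller can install, which is the "standardized hook" mentioned above. The following toy Python sketch illustrates that abstraction only; it is not real OpenFlow protocol code, and the match fields used here (`dst`, `src`) are illustrative rather than the actual OpenFlow match structure.

```python
# Conceptual sketch of the OpenFlow match/action abstraction (not protocol code).

class FlowTable:
    def __init__(self):
        self.rules = []  # list of (match_dict, action) pairs, in priority order

    def install(self, match, action):
        """Called by the remote controller to add a rule."""
        self.rules.append((match, action))

    def lookup(self, packet):
        """Return the action of the first rule whose fields all match the packet."""
        for match, action in self.rules:
            if all(packet.get(k) == v for k, v in match.items()):
                return action
        return "send_to_controller"  # table miss: punt the packet to the controller

table = FlowTable()
table.install({"dst": "10.0.0.2"}, "output:port2")
print(table.lookup({"src": "10.0.0.1", "dst": "10.0.0.2"}))  # output:port2
print(table.lookup({"src": "10.0.0.1", "dst": "10.0.0.9"}))  # send_to_controller
```

The key point is the split of responsibilities: the vendor's switch only has to implement the table lookup, while all experimental logic lives in the controller that installs the rules.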

See a video about how OpenFlow is changing Networking.

Inside Google, MapReduce is used for 80% of all the data processing needs. That includes indexing web content, running the clustering engine for Google News, generating reports for popular queries (Google Trends), processing satellite imagery, language model processing for statistical machine translation, and even mundane tasks like data backup and restore.

The other 20% is handled by a lesser-known infrastructure called “Pregel”, which is optimized to mine relationships from “graphs”.

While you can calculate something like PageRank with MapReduce very quickly, you need more complex algorithms to mine other kinds of relationships between pages or other sets of data.
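Pregel's model is vertex-centric: each vertex receives messages from its neighbors, updates its own value, and sends messages for the next superstep, with a barrier between supersteps. The following is a toy, single-process Python simulation of that idea applied to PageRank; it mimics the programming model only and is not Google's Pregel API.

```python
# Toy vertex-centric ("think like a vertex") PageRank in the spirit of Pregel.

def pagerank(graph, supersteps=30, damping=0.85):
    """graph: dict mapping each vertex to its list of out-neighbors."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(supersteps):
        # Superstep: every vertex sends rank/out_degree to its out-neighbors.
        incoming = {v: 0.0 for v in graph}
        for v, neighbors in graph.items():
            if neighbors:
                share = rank[v] / len(neighbors)
                for u in neighbors:
                    incoming[u] += share
        # Barrier: all messages delivered, every vertex updates its value.
        rank = {v: (1 - damping) / n + damping * incoming[v] for v in graph}
    return rank

g = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
ranks = pagerank(g)  # vertex "a" ends up with the highest rank
```

In real Pregel the graph is partitioned across machines and the message passing happens over the network, but the per-vertex compute function looks much like the body of the superstep above.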

See the following video tutorial for an introduction to Pregel.

RDMA supports zero-copy networking by enabling the network adapter to transfer data directly to or from application memory, eliminating the need to copy data between application memory and the data buffers in the operating system. Such transfers require no work to be done by CPUs, caches, or context switches, and transfers continue in parallel with other system operations. When an application performs an RDMA Read or Write request, the application data is delivered directly to the network, reducing latency and enabling fast message transfer.

One disadvantage is that the target node is not notified of the completion of the request; the communication is one-sided. The common way to notify it is to change a memory byte when the data has been delivered, but this requires the target to poll on that byte. Not only does the polling consume CPU cycles, but the memory footprint and the latency increase linearly with the number of possible peer nodes, which limits the use of RDMA in High-Performance Computing (HPC).

On the other hand, RDMA reduces network protocol overhead, leading to improvements in communication latency. Reductions in protocol overhead can increase a network’s ability to move data quickly, allowing applications to get the data they need faster, in turn leading to more scalable clusters.

However, one must be aware of the tradeoff between this reduction in network protocol overhead and the additional overhead that may be incurred on each node due to the need to pin virtual memory pages. In particular, zero-copy RDMA protocols require that the memory pages involved in a transaction be pinned, at least for the duration of the transfer. If this is not done, RDMA pages might be paged out to disk and replaced with other data by the operating system, causing the DMA engine (which knows nothing of the virtual memory system maintained by the operating system) to send the wrong data.
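The completion-byte pattern described above can be illustrated with a toy simulation: a plain Python bytearray stands in for a pinned, registered memory region, the initiator performs a "one-sided write" and flips a flag byte last, and the target learns of the transfer only by spinning on that byte. This is a conceptual sketch, not real RDMA code (which would use, e.g., InfiniBand verbs in C).

```python
# Toy simulation of one-sided RDMA write + completion-byte polling.

remote_buffer = bytearray(64)   # stands in for a pinned, RDMA-registered region
FLAG_OFFSET = 63                # last byte of the region used as a completion flag

def rdma_write(payload: bytes):
    """One-sided write: the target CPU is not involved and gets no notification."""
    remote_buffer[:len(payload)] = payload
    remote_buffer[FLAG_OFFSET] = 1   # flag is written last, after the payload

def target_poll():
    """The target must burn CPU cycles checking the flag byte."""
    spins = 0
    while remote_buffer[FLAG_OFFSET] == 0:
        spins += 1  # in a real system this loop competes with application work
    return bytes(remote_buffer[:5])

rdma_write(b"hello")
data = target_poll()  # data == b"hello"
```

With many possible peers, the target needs one such flag (and polling loop) per peer, which is exactly the linear growth in memory footprint and latency noted above.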

By contrast, the Send/Recv model used by other zero-copy HPC interconnects, such as Myrinet or Quadrics, has neither the one-sided-notification problem nor the memory-paging problem described above, yet provides comparable reductions in latency when used in conjunction with HPC communication frameworks that expose the Send/Recv model to the programmer.

See the following talk, “RDMA for Heterogeneous Parallel Computing”, by Joel Scherpelz from Nvidia.

Storm makes it easy to write and scale complex realtime computations on a cluster of computers, doing for realtime processing what Hadoop did for batch processing. Storm guarantees that every message will be processed. And it’s fast — you can process millions of messages per second with a small cluster. Best of all, you can write Storm topologies using any programming language. Storm was open-sourced by Twitter in September of 2011 and has since been adopted by numerous companies around the world.

Storm provides a small set of simple, easy-to-understand primitives. These primitives can be used to solve a stunning number of realtime computation problems, from stream processing to continuous computation to distributed RPC. The following YouTube video shows a talk about Storm in which you may see:

– The concepts of Storm: streams, spouts, bolts, and topologies
– Developing and testing topologies using Storm’s local mode
– Deploying topologies on Storm clusters
– How Storm achieves fault-tolerance and guarantees data processing
– Computing intense functions on the fly in parallel using Distributed RPC
– Making realtime computations idempotent using transactional topologies
– Examples of production usage of Storm
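The first of those concepts can be sketched in a few lines: a spout is a source of tuples, a bolt transforms a stream of tuples, and a topology wires them together. The toy, in-process Python sketch below mimics these concepts only; real Storm topologies are defined through Storm's own API and run distributed across a cluster.

```python
# Toy, in-process sketch of Storm's core abstractions: spouts, bolts, topologies.

class SentenceSpout:
    """Spout: a source of tuples (here, a fixed list of sentences)."""
    def stream(self):
        for s in ["the quick brown fox", "the lazy dog"]:
            yield s

class SplitBolt:
    """Bolt: splits each sentence tuple into word tuples."""
    def process(self, sentence):
        for word in sentence.split():
            yield word

class CountBolt:
    """Bolt: keeps a running count per word (continuous computation)."""
    def __init__(self):
        self.counts = {}
    def process(self, word):
        self.counts[word] = self.counts.get(word, 0) + 1

# "Topology": spout -> split bolt -> count bolt.
spout, splitter, counter = SentenceSpout(), SplitBolt(), CountBolt()
for sentence in spout.stream():
    for word in splitter.process(sentence):
        counter.process(word)
# counter.counts now holds a live word count over the stream
```

In real Storm, each spout and bolt runs as many parallel tasks spread over the cluster, and the framework handles the tuple routing between them; that is what makes the same simple primitives scale to millions of messages per second.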