March 31, 2014
Explore how data moves through the MapReduce architecture and the API calls that do the actual processing, along with customization techniques and function overriding for application-specific needs.
March 18, 2014
Learn how to write MapReduce programs in the language of your choice with Hadoop Streaming.
February 26, 2014
This tutorial explains how to use the REST API and OAuth together to create a secure web service.
January 29, 2014
Apache HBase is a distributed, non-relational, open source database written in Java that runs on top of HDFS. HBase is a suitable candidate when you have hundreds of millions or billions of rows and enough hardware to support it. Learn more about its practical use and architectural concepts.
January 23, 2014
Cassandra is an ideal database for managing a large volume of unstructured, semi-structured and structured data across multiple data centers and the cloud environment. Explore how to get started.
January 10, 2014
Explore a basic vision for a single-core and multicore approach to indexing and querying multiple log file types in Apache Solr.
December 30, 2013
Apache Hive provides a mechanism to manage data in a distributed environment and query it using an SQL-like language called Hive Query Language, or HiveQL. This article discusses Hive scripts and their execution.
December 17, 2013
There's no such thing as failure when innovating. There are only successes and "learning experiences." My guess is that OpenStack will end up being the latter.
December 16, 2013
Kaushik Pal provides some samples and tips on how to use Apache Pig for efficient analysis of large data sets.
December 6, 2013
Integrate the JVM into the OS kernel. Strip out everything in the OS that the hypervisor is taking care of (like network access) or that the JVM doesn’t need. What you end up with is a lean, mean, runtime-specific machine that runs much faster than a normal VM and is more secure as well, simply because there is much less of it to hack.
November 27, 2013
Kaushik Pal explores the basics of the Hadoop Distributed File System (HDFS), the underlying file system of the Apache Hadoop framework.
November 26, 2013
Nobody really wants a Private Cloud
November 22, 2013
Kaushik Pal explores the processing of Big Data using the Apache Hadoop framework and MapReduce programming.
November 15, 2013
The greatest bottleneck in any large-scale Hadoop deployment is the local network
November 11, 2013
Learn more about how Ant, in collaboration with JUnit, helps developers to follow the test-driven development approach.
November 8, 2013
The better Hadoop gets, the less of a Big Data tool it becomes.
November 4, 2013
If you are a developer or a designer, it is very likely that you will work in a team composed of different people with different habits, motivations, and work and coding styles. This article provides a few simple tips that will make working in teams more efficient and productive.
October 20, 2013
The Bazaar supports change better than the Cathedral
October 18, 2013
The Runnable platform is designed to enable developers to discover, run and reuse community-sourced code from their browsers.