Open Source Zone

Composer: Dependency Manager for PHP

A few years ago, installing PHP libraries was a nightmare. The first issue was that you either found framework-agnostic code, or you found just plain old classes all bound

Automate Your Infrastructure with Ansible

Ansible is a tool that allows you to control remote servers from the comfort of your laptop. It works over SSH and doesn’t require any special software or agent to

First Steps with Vagrant

Vagrant allows you to easily manage and control multiple virtual machines. It is built on top of VirtualBox and VMware and it provides many exciting capabilities. You can create isolated

How to Create Your First Hive Script

Overview: Apache Hive is an integral part of the Hadoop ecosystem. Hive can be defined as data warehouse-like software that facilitates querying and managing large data sets on HDFS (Hadoop distributed

Working with MapReduce Design Patterns

Design patterns are common at almost all levels of software development and are nothing more than proven and tested design techniques used to solve business problems. MapReduce is no different

Apache Mahout and Machine Learning

Apache Mahout is an open source project from the Apache Software Foundation (ASF) with the primary goal of creating machine learning algorithms. Introduced by a group of developers from

Intro to Apache MapReduce 2 (YARN)

Since Hadoop version 0.23, MapReduce has changed significantly. It is now known as MapReduce 2.0 or YARN. MapReduce 2.0 is based on the concept of splitting the two major functionalities