The Road to Ruby from C++

If you’re a C++ developer who’s curious about all of the hype surrounding the Ruby programming language, this article is for you. It provides a high-level overview of the key differences between C++ and Ruby, and then presents a small, complete example application implemented with each language.

Be forewarned, however: learning Ruby can be a very frustrating experience! Because once you become familiar with this powerfully concise language, you might find returning to C++ a bitter pill to swallow.

A High-Level Language Comparison and a Running Example
C++ is a statically typed, compiled language that has hybrid object orientation. Its static typing means that the type of every expression and variable is known at compile-time, allowing significant correctness checking before the program executes. Its hybrid object orientation means that it defines non-object primitive types such as int and float, and functions can exist outside of objects.

The Ruby programming language is designed to let you write code quickly and concisely. Unlike C++, it is a very dynamic, interpreted language that includes a powerful set of libraries. While it is often referred to as a scripting language, it is a pure object-oriented language that has sufficient expressiveness for general-purpose applications.

In Ruby, variables do not need to be declared and are free to change type from statement to statement. So the following code, where the variable x changes from a Fixnum (an integer that fits within a native machine word) to a String to an Array, is a perfectly legal sequence of Ruby code:

    x = 10
    x += 4
    x = "My String"
    x = [1, "My String", ]

A significant downside of Ruby’s dynamism is its use of an interpreter. Ruby’s runtime performance just can’t compare to a compiled language like C++. So even if you find yourself in love with the features of Ruby, you’re likely better off sticking with C++ if you really need runtime efficiency.

Having digested some of the key differences between C++ and Ruby, you’re now ready to examine the small, complete example application implemented with each language. The application calculates the total number of occurrences for each word found in a set of files within a given directory, and generates an XML file that summarizes these occurrences as output. Listing 1 shows the C++ implementation, and Listing 2 shows the Ruby implementation. (Download the listings here.)

The Basics of Classes and Variables
Both versions define the following three classes:

  1. A class that represents the total number of occurrences of a word across all files
  2. A class that is derived from this class and extended to also maintain the occurrences of each word by file
  3. A counter class that reads and parses the files, creates and updates the word counts, and outputs the XML file

The first class is defined in Listing 1 at line 22 and in Listing 2 at line 4. Both implementations maintain a string word and a total_count variable. In Ruby, instance variables are preceded by a “@” symbol, and therefore Listing 2 has @word and @total_count. Local variables have no prefix and global variables have a “$” symbol as a prefix.

The C++ code uses a struct to declare this class. Therefore, the variables word and total_count are public by default. Ruby, however, does not allow access to instance variables from outside of the object; all instance variables are private. You’ll learn more about access control later, but for now focus on adding the needed accessor methods. Luckily, as the statement at line 7 of Listing 2 demonstrates, adding these methods is no chore. You can automatically generate the needed get accessor methods by listing the variables after attr_reader.

Both implementations of this class also define a constructor that takes the word string, a method add that increments the counters, and a method file_occurrences that returns a data structure that holds per-file information. As shown on line 9 of Listing 2, the class constructors in Ruby are named initialize.
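To make the pieces above concrete, here is a minimal sketch of such a class. It is not the code from Listing 2: the class name WordTally and the no-argument add method are assumptions made for illustration. It shows attr_reader generating the get accessors and initialize acting as the constructor.

```ruby
# Hypothetical sketch; WordTally and its method signatures are illustrative,
# not taken from Listing 2.
class WordTally
  # Generates the reader methods #word and #total_count (no writers).
  attr_reader :word, :total_count

  # Ruby calls initialize whenever WordTally.new is invoked.
  def initialize(word)
    @word = word
    @total_count = 0
  end

  # Records one more occurrence of the word.
  def add
    @total_count += 1
  end
end

tally = WordTally.new("ruby")
tally.add
tally.add
puts "#{tally.word}: #{tally.total_count}"
```

Because only attr_reader is used, code outside the object can read word and total_count but cannot assign to them.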

If you ignore the “include Comparable” in the Ruby code until later, the remainder of the implementation for this base class is then fairly straightforward for both languages.

Inheritance and Polymorphism
The next class defined in both files inherits from this simple base class, extending it to also track the total number of occurrences for the word in each file. The C++ implementation uses “: public word_count” to indicate the inheritance. The Ruby implementation uses “<” followed by the name of the base class. In both cases, the extended class adds a hash map to store the occurrence count associated with each processed file. The method add is extended to update the hash map, and the method file_occurrences returns this information.
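The following sketch shows what this kind of extension can look like in Ruby. The class names BasicTally and PerFileTally, and the choice to pass a file name to add, are assumptions for illustration and may differ from Listing 2.

```ruby
# Hypothetical base class: tracks only the overall total.
class BasicTally
  attr_reader :total_count

  def initialize(word)
    @word = word
    @total_count = 0
  end

  def add(_filename)
    @total_count += 1
  end
end

# "<" declares that PerFileTally inherits from BasicTally.
class PerFileTally < BasicTally
  def initialize(word)
    super                     # forwards word to BasicTally#initialize
    @file_hash = Hash.new(0)  # missing keys default to a count of 0
  end

  # Extends the inherited add: update the total, then the per-file count.
  def add(filename)
    super
    @file_hash[filename] += 1
  end

  def file_occurrences
    @file_hash
  end
end

t = PerFileTally.new("ruby")
t.add("a.txt")
t.add("a.txt")
t.add("b.txt")
puts t.total_count
puts t.file_occurrences.inspect
```

Calling super with no argument list passes along the same arguments the current method received, which is why neither override has to repeat them.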

There are a few key differences between inheritance in C++ and inheritance in Ruby. Ruby, unlike C++, does not support multiple inheritance but does support mixins. The “include Comparable” found on line 16 of Listing 2 is an example. A Module is a set of function definitions. You can’t create an instance of Module; you can only include it into class definitions. In this case, Module Comparable defines the comparison operators (<, >, ==) in terms of the <=> operator. So by defining <=> and including Module Comparable, you get the other comparison operators for free.
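A small sketch of the mixin at work follows. The Release class is hypothetical and unrelated to the listings; the only method it defines for comparison is <=>, and everything else comes from Comparable.

```ruby
# Hypothetical class used only to demonstrate the Comparable mixin.
class Release
  include Comparable

  attr_reader :number

  def initialize(number)
    @number = number
  end

  # The single comparison method the class has to supply.
  def <=>(other)
    number <=> other.number
  end
end

older = Release.new(1)
newer = Release.new(2)

puts older < newer            # provided by Comparable
puts older == Release.new(1)  # also derived from <=>
puts [newer, older].min.number
```

Methods such as min and sort also rely on <=>, so defining that one operator makes the class usable throughout the standard collection methods.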

In C++, you sometimes rely on inheritance combined with virtual functions to enable polymorphism. A pointer x of type T * can point to an object of type T or any object with a type below T in the class hierarchy. A virtual method invoked through x is resolved by walking up the class hierarchy, starting from the type of the object pointed to by x.

Ruby, on the other hand, uses duck typing: if something looks like a duck, swims like a duck, and quacks like a duck, then it’s a duck. Take the following code for example:

    def my_method(x)
      x.print_hello
    end

For Ruby, it doesn’t matter what type x is. If the object x has a method print_hello, the code will work. So unlike C++, which would require the objects to inherit from a common base type, you can pass objects of unrelated types to my_method, as long as they all implement print_hello.
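Here is a runnable sketch of that idea. The Robot and Person classes are made up for illustration; they share no base class other than Object, yet both work with my_method. An object without print_hello fails only at the moment of the call.

```ruby
# Two unrelated, hypothetical classes that happen to share a method name.
class Robot
  def print_hello
    puts "BEEP. HELLO."
  end
end

class Person
  def print_hello
    puts "Hi there!"
  end
end

def my_method(x)
  x.print_hello
end

my_method(Robot.new)
my_method(Person.new)

# An Integer has no print_hello, so the call fails at run time.
begin
  my_method(42)
rescue NoMethodError => e
  puts "Not a duck: #{e.class}"
end
```

The trade-off is that the mistake a C++ compiler would reject up front surfaces here as a NoMethodError while the program is running.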

Visibility and Access Control
The final class in the example applications is implemented starting at line 67 in Listing 1 and line 54 in Listing 2. This class iterates through all of the files in the provided directory, breaking the file into tokens and counting the occurrences of each word. It also defines a method to dump the XML output.

Both C++ and Ruby support public, protected, and private members. In the example, the method count_words_in_file is declared as private in both implementations.

In both C++ and Ruby, public methods can be called by anyone, and protected methods can be called only by objects of the same class or objects that inherit from the defining class. The semantics of private differ between C++ and Ruby, however. In C++, methods are private to the class, while in Ruby they are private to the instance. In other words, you can never explicitly specify the receiver for a private method call in Ruby.
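The difference is easiest to see side by side. In this sketch the Account class and its methods are hypothetical: a protected method can be called on another instance of the same class, while a private method cannot be called on another instance at all.

```ruby
# Hypothetical class contrasting protected and private in Ruby.
class Account
  def initialize(balance)
    @balance = balance
  end

  # Allowed: balance is protected and other is also an Account.
  def richer_than?(other)
    balance > other.balance
  end

  # Not allowed: pin is private, so it cannot be called on another object,
  # even one of the same class. This raises NoMethodError.
  def peek_pin_of(other)
    other.pin
  end

  protected

  def balance
    @balance
  end

  private

  def pin
    1234
  end
end

a = Account.new(100)
b = Account.new(50)
puts a.richer_than?(b)

begin
  a.peek_pin_of(b)
rescue NoMethodError => e
  puts "private call rejected: #{e.class}"
end
```

The equivalent C++ code would compile and run, because a C++ member function may touch the private members of any object of its own class.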

Blocks and Closures
One feature of Ruby for which C++ has no good counterpart is its support of blocks. The most common use for blocks is iteration. You’ll find examples of iteration with blocks in Listing 2’s implementation of the WordCounter class at lines 65, 73, and 76.

Consider the code at line 65 for example:

    Dir.foreach(".") { |filename|
      count_words_in_file filename
    }

The class Dir is used to inspect directories. The method foreach is passed two arguments: the string “.” and the block of code in braces, which in turn specifies an argument filename within the vertical bars. The method foreach iteratively invokes the block, passing in the name of each file found in the current working directory “.”. This simple feature can save significant keystrokes and also leads to very readable code.
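You can write your own block-accepting methods in the same style. The sketch below uses a hypothetical method, each_even, to show the mechanism: the method calls yield once per item, and each yield runs the caller’s block with the given argument.

```ruby
# Hypothetical iterator: hands every even number from 0 to limit to the block.
def each_even(limit)
  n = 0
  while n <= limit
    yield n   # runs the caller's block with n as its argument
    n += 2
  end
end

seen = []
each_even(6) { |n| seen << n }
puts seen.inspect
```

The caller supplies only the body of the loop; the traversal logic stays inside the method, much as Dir.foreach hides the details of reading a directory.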

Blocks also are useful for more than just iteration. Line 48 uses a block to specify the comparison function to use for sorting an array of pairs:

    def file_occurrences
      return @file_hash.sort { |x,y| -(x[1] <=> y[1]) }
    end

The expression x <=> y is -1 if x < y, 0 if x == y, and 1 if x > y. The above code returns an array of pairs, where each pair consists of a String and a Fixnum. The block specifies that the second element of each pair (the Fixnum) should be compared using the negation of the <=> operator. This method therefore returns the occurrence pairs in decreasing order of occurrences.
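The following standalone sketch applies the same comparison block to a small hash. The file names and counts are made-up sample data, not output from the example application.

```ruby
# Sample data (hypothetical): file name => number of occurrences.
file_hash = { "a.txt" => 3, "b.txt" => 7, "c.txt" => 5 }

# Hash#sort yields [key, value] pairs; negating <=> reverses the order.
sorted = file_hash.sort { |x, y| -(x[1] <=> y[1]) }

sorted.each { |name, count| puts "#{name}: #{count}" }
```

Swapping the operands, as in y[1] <=> x[1], would produce the same descending order without the negation.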

At line 15 of Listing 1, the C++ example code also specifies a comparison function to be used for ordering pairs. However, lacking support for anonymous functions, the comparison is implemented as a function object, later used to define the STL multiset at line 25.

The blocks in Ruby are so useful in part because they are closures. A closure captures the context in which it is defined, so it can refer to local variables found within the scope of the definition. Take for example the code at line 76 of Listing 2:

    wc.file_occurrences.each { |pair|
      f = e.add_element "file", {"occurrences"=>"#{pair[1]}"}
      f.add_text pair[0]
    }

The expression wc.file_occurrences returns an array of pairs. The array’s method each is then invoked with the subsequent block as an argument. It’s important to note that the block will be invoked from within the method each. However, because the block is a closure, it can still access the object e (which represents an XML element) that was in the local scope of the method where the block was defined.
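A stripped-down sketch of the same closure behavior, without the XML library, might look like this. The method total_occurrences and the sample pairs are hypothetical. The block runs inside each, yet it reads and updates the local variable total from the enclosing method.

```ruby
# Hypothetical helper: sums the counts in an array of [name, count] pairs.
def total_occurrences(pairs)
  total = 0
  # The block executes inside Array#each but still sees the local total.
  pairs.each { |pair| total += pair[1] }
  total
end

result = total_occurrences([["a.txt", 3], ["b.txt", 7]])
puts result
```

No extra state object has to be created to carry total into the loop body, which is the work a C++ function object would need to do by hand.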

While you can use C++ function objects to implement much of the functionality described above, I believe that the elegance and readability of blocks speak for themselves.

A Wide Range of Libraries, Regular Expressions
Another clear advantage that Ruby has over C++ is the vast collection of libraries that come with the standard distribution, as well as its support for regular expressions. For example, compare the ad-hoc implementation of an XML generator in method dump_results in Listing 1 to the use of the REXML library in Listing 2. Next, compare the use of the pseudo-standard dirent.h for working with directories in C++ to that of the class Dir in the Ruby implementation. Finally, compare the ad-hoc parsing of the files at line 77 of Listing 1 to the much more concise code at line 90 of Listing 2.

While many available C++ libraries, such as Boost, provide a wide range of utilities, Ruby provides many of these features as part of the standard distribution. So out-of-the-box, the Ruby programming language and its standard libraries simplify many of the more common programming tasks when compared to C++.
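To give a feel for how little code regular-expression parsing needs, here is a minimal word-counting sketch. It is not the parsing code from Listing 2; the sample text and the choice of the pattern \w+ to define a word are assumptions made for illustration.

```ruby
# Sample input (hypothetical).
text = "The quick fox. The lazy dog; the end!"

counts = Hash.new(0)  # unseen words start with a count of 0

# String#scan yields every substring that matches the pattern.
text.downcase.scan(/\w+/) { |word| counts[word] += 1 }

counts.each { |word, count| puts "#{word}: #{count}" }
```

Both the regular expression support and the default-valued hash are part of the core language, so nothing has to be required or installed.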

A Summary of Language Features
To summarize, Table 1 presents some of the key features of C++ and Ruby discussed in this article (with Java included for comparison).

| Language Features | C++ | Java | Ruby |
|---|---|---|---|
| Type System | Static | Mostly static | Dynamic |
| Object Orientation | Hybrid | Pure (with primitive types) | Pure |
| Inheritance | Multiple | Single and interfaces | Single and mixins |
| Overloading | Method and Operator | Method | Operator |
| Polymorphism | Yes | Yes | Yes |
| Visibility/Access Controls | Public, protected, private, and friends | Public, protected, package, and private | Public, protected, private (instance) |
| Garbage Collection | No | Yes | Yes |
| Generics | Yes (templates) | Yes (generics) | No |
| Closures | No (function objects) | Yes, with some limitations (inner classes) | Yes (blocks) |
| Exceptions | Yes | Yes | Yes |

Table 1. A Comparison of Features in C++, Java, and Ruby

Beyond the Basics
Ruby provides many more features and libraries than one article can cover, but this small sampling should give you some idea about where in your next programming project it would be useful. There are many excellent resources available for learning more about this powerful, concise programming language. The best places to start are at and

