DevX HomePage

Pulling it All Together with IBM Rational Data Architect

IBM's new Rational Data Architect gives database administrators powerful new tools for visually modeling and documenting databases, as well as integrating federated data sources by creating mappings, discovering relationships, and building transformations. This article gives new users a top-level view of RDA and touches on some of its core features with a couple of quick walkthroughs.

Problem: your company is going through yet another "process re-alignment" and you've been given the unenviable task of merging multiple databases into one Damn Big Mega System (DBMS). To make the project even more unappealing, your sales group has been using Informix, billing is in SQL Server, the new company you just acquired is married to Oracle, and some R&D guys have been keeping everything in Derby. Oh, and the Powers That Be want it all converted to DB2.

Before you pull out the whiteboard and order cases of Red Bull, take a moment to assess the situation. The hardest, probably the most important part of a project like this is how you start it. How will you get a birds-eye view of all of the schemas? How are you going to map the fields of all these different systems? Most of all, how long will all of this take?

Got a few minutes?

Ok, a few minutes might be a slight exaggeration. But once you discover what IBM has in store for you next, you can start planning a few extra "work from home" days after the project launches ahead of schedule. Rational Data Architect is the tool you've been waiting on for years. IBM, what took you so long?

Introducing IBM Rational Data Architect (RDA)
RDA is a data modeling tool designed for discovering, mapping, and integrating your databases. With RDA, you can create fresh schemas or connect to data sources and automatically generate models of existing schemas. Use RDA's intelligent mapping technology to discover how data is related between heterogeneous databases. This way, you can quickly model a solution based on integrated information or even design a federated data solution.

First, RDA uses JDBC to connect to a variety of data sources, then communicates with them using native queries. Probably the friendliest database tool on the market, it'll hook up with just about anything:

DB2 Universal Database for Linux, UNIX, and Windows, Enterprise Edition and Workgroup Edition

  • Version 8.1
  • Version 8.2


  • DB2 Universal Database for iSeries
  • Version 5 Release 2
  • Version 5 Release 3


  • DB2 Universal Database for z/OS
  • Version 7.1
  • Version 8.0


  • Derby
  • Version 10.0


  • Informix Dynamic Server
  • Version 9.2
  • Version 9.3
  • Version 9.4


  • Microsoft SQL Server Enterprise
  • 2000 SP2


  • Oracle 8i
  • Release 3 8.1.7


  • Oracle 8i Enterprise Edition
  • Release 3 8.1.7


  • Oracle Enterprise Edition 9i
  • Version 9.2


  • Oracle 10g

    Sybase Adaptive Server Enterprise
  • Version 12.0
  • Version 12.5

  • Second, after your data sources are connected, you can create data models based on your existing databases, then create mapping models that analyze your schemas and discover possible relationships. The mapping model provides a number of algorithms that can be used to do the discover function.

    Third, with the mappings in place, you can add some transformations using a prebuilt set of functions. Generating SQL scripts that can be deployed in the platform of your choice completes the integration.

    This is just one type of process that RDA gives the harried database developer. When designing a new database solution, the bulk of the work goes into creating and agreeing on the architecture. Use RDA to quickly draw and review your schema. Automatically generate PDF or HTML reports. Fiddle with it until you get it right, then pop it into SQL and make it happen. RDA has a way with all the little tasks that go into planning and executing these types of projects.

    Figure 1. RSDP Welcome Page

    Eclipse Lovers Rejoice
    RDA is part of a larger suite of tools called the IBM Rational Software Development Platform (RSDP). Based on Eclipse, RSDP makes good use of the incredibly rich UI of this open source favorite. While viewing your Workbench, or working environment, all the information you need is organized into Views, discreet windows of metadata and interactivity. Stick these views just about anywhere on the screen, shuffle them around, then save the entire Perspective, or layout, for use later. RSDP ships with multiple Perspectives that load automatically based on what kind of project you start. Swap or tweak your layout on the fly or customize the entire Workbench to your heart's content.

    New users will especially appreciate RSDP's use of Capabilities. Essentially, the more advanced features are hidden but accessible when needed. RSDP remembers what's been used and offers those options later, but without the all-or-nothing tact that a lot of IDEs take. Think of it as progressive robustness, an IDE that keeps pace with your level of knowledge.

    Handholding Done Right
    As one of its most impressive, and rare, characteristics, RSDP seems to embrace new users with open and loving arms. With a friendliness that is, quite frankly, very un-IBM, RSDP provides newbies with huge amounts of resources, starting with its Welcome Page, seen in Figure 1.

    Figure 2. Data Capabilities

    The clean, elegant Welcome screen gives you the following options:

    Unlike most developer products on the market, including a lot of other IBM tools, Rational Data Architect makes no assumptions about your level of knowledge, but rather welcomes everyone. New users can start from page one and go through the tutorials and resources they need to understand the tool. More advanced users can skip right to the features they need without wading through newbie stuff.

    A particularly nice treat: RDA installs with an instance of Derby, along with a sample database. It's like this—you're going through one of the hands-on tutorials and get to the point where you need to connect with a database. Anyone who has gone through the task of installing and configuring an open source database, heck any database, will be shocked to discover that they already have a fully functioning sample ready to play with. Just follow the tutorial step by step. Believe it or not, it works beautifully right out of the box.

    The only problem I encountered while testing the beta came when I was trying to switch between the tutorial and the workbench while I had a wizard or dialog box open. Sometimes the IDE was reluctant to give up control of my screen. Usually this wasn't a problem, but when I had to scroll down in the documentation but couldn't, I sometimes had to make an educated guess as to which button to push next in the wizard. Overall, I would sometimes run into a minor UI problems like this, but for the most part this was a surprisingly strong beta.

    Quickie Walkthrough: Model and Create a DB2 Database
    Enough chit chat. Let's take a look at this thing. If you download RDA from IBM's software site then you can follow along. But keep in mind the following breezethrough is intended just to give you an idea of what the product looks like and how it works. The tutorials included with the product will give you a much more extensive hands-on experience of RDA's features.


    Figure 3. Default Database Vendor
     
    Figure 4. Data Design Project

    A reader recently asked me if I knew of some software that would visually model a database and then generate creation scripts for DB2. That pretty much sounds like a description of RDA, so that's where I'll start. RDA lets you visually model databases using Logical Data Modeling, which is not based on a particular database, or Physical Data Modeling, which is. Here are the basic steps to creating, for example, a DB2 database. In this example, I want to create a Physical Model, then export the SQL creation scripts, which I can run in DB2.

    Step 1: Start by Setting Workbench Preferences and Creating a New Data Modeling Project

    Step 2: Add a Physical Data Model to the Project

    Step 3: Create a Visual Model

    We don't want to get too fancy, so this example represents an over-simplified client management system, with a Clients table linked to Contacts, Campaigns, and Invoices.


    Figure 6. Simple Diagram
     
    Figure 7. Generate DDL

    Step 4: Create the SQL Scripts

    In the Data Project Explorer, if you expand New Database, you'll see all the objects you just created.

    For extra credit, be sure to explore the advanced features of the Physical Data Model, particularly when modeling a DB2 database. This includes using the Properties view to alter a number of different aspects of the database and its objects, such as:

    Quickie Walkthrough: Map to a Derby Table

    Figure 8. New Connection Wizard
    RDA allows you to map the relationships between data sources by generating models of existing tables and mapping them to one another. Mapping helps you get a quick understanding of the relationships involved and potential complications. In addition, by visually building transformations, you can get a jumpstart on some tricky SQL code when it comes time to merge your data sources.

    As mentioned, RDA comes pre-installed with a working Derby sample database, called SAMP. No configuration necessary. So for this next trick, I'm going to map to an existing table called PROJECT, which I'll pretend is the equivalent of the Campaigns table created in the previous walkthrough. In a multi-database scenario like the one at the beginning of this article, you can use this same basic process once you add connections to your various databases.

    Step 1: Generate a Model for PROJECT

    Figure 11. Mapping Summary

    Step 2: Create a New Mapping Between Campaigns and PROJECT

    Step 3: Draw Some Mappings

    Figure 12. Discover Relationships

    If you were working on a more complex project, you would have many more fields and thus many more potential parallels between your tables. For example, if merging an Employees table from your original company with an EmployeeList table from a recently acquired company, you would have several natural matches between fields like LastName, Address, etc. One of the best features of RDA is its ability to discover these relationships and give you a big head start on your mappings. To do this, you would just right-click the Mappings space between the two tables and select Discover Relationships, as in Figure 12.

    You can learn more about this process by using the tutorials on Integration Modeling included with RSDP. For now, since the tables in question are so light, you can just create the mappings manually.

    Step 4: Create Transformations

    If you look closely at the data types, you may notice that right off, you have a problem. Your original key is CHAR(6) while your new schema specifies INTEGER. A quick way to solve this problem is to Transform the PROJNO field en route to its new identity as CampaignID.


    Figure 13. Mappings
     
    Figure 14. Transformation Expression

    Step 5: Pull It All Together

    Now that you have a mapping, what do you do with it? For a start, right-click on the mapping you created in the Data Project Explorer, in this case ProjectMap.msl. Then choose Generate Script. You'll see that the transformation you just created can be described as SQL code, which can be integrated into your migration solution. In addition, now that you have a solid visual for your inter-database relationships, you can communicate this to other team members or generate additional SELECT and INSERT scripts to assist with data migration.

    Now, imagine this same process but with four or five databases, each with dozens of tables, each of those with dozens of fields. You can see how much pain RDA will save you. You can take that Informix database from Sales, that SQL Server db from billing, the new Oracle system, and even the little Access tool from Marketing Guru, and automatically generate data models, automatically discover relationships, and quickly create transformations. When you're done, generate SQL script and you have the beginnings of a solution that will impress the boss and make you a hero in the eyes of adoring and relieved co-workers.

    Where to Go Next
    You just got a birds-eye view of RDA. This article touched on some of the core features, but touched on them very lightly. For a more detailed explanation of the product and its potential, download it from here (415MB download). Also, be sure to read Davor Gornik's article, "Use Rational Data Architect to Integrate Data Sources."



    Justin Whitney is a regular contributor to DevX.com and Jupitermedia. He currently lives in San Francisco, where he consults for leading high-tech firms and writes about emerging technologies.