Posted by Sandeep Chanda on April 28, 2016

Alberto Brandolini has been evangelizing the idea of EventStorming for a while now. It is a powerful workshop format for breaking down complex business problems drawn from real-world scenarios. The idea took shape from the Event Sourcing implementation style laid out by Domain-Driven Design (DDD). The workshop produces a model that aligns well with DDD and lets you identify aggregate and context boundaries fairly quickly. The approach also relies on easy-to-use notation and doesn't require UML, which can itself deter workshop participants who are not familiar with it.

The core idea of EventStorming is to make the workshop more engaging and to evoke thought-provoking responses from the participants. Discovery is often superficial, with the details deferred until later. By contrast, EventStorming lets participants ask the deep questions about the business problem that were likely already sitting in their subconscious minds. It creates an atmosphere where the right questions can arise.

A core theme of this approach is unlimited modeling space. Modeling complex business problems is often constrained by space (usually the size of a whiteboard), but this approach lets you use any surface as the platform on which to model the problem. Pick whatever is at hand that removes the space limitation.

Another core theme of this approach is the focus on Domain Events. Domain Events represent meaningful actions in the domain, each with suitable predecessors and successors. Placing events on a timeline across a surface lets people visualize upstream and downstream activities and model the flow easily in their minds. Domain Events are further annotated with the user actions that cause them, represented as Commands. You can also color-code the notes to distinguish between user actions and system commands.

The next aspect to investigate is Aggregates. Aggregates here should represent a section of the system that receives the commands and decides on their execution. Aggregates produce Domain Events.
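
To make the relationship between Commands, Aggregates and Domain Events concrete, here is a minimal Python sketch. The ordering domain and every name in it (`PlaceOrder`, `OrderPlaced`, `OrderRejected`, `InventoryAggregate`) are hypothetical, chosen only for illustration; they are not part of EventStorming itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PlaceOrder:
    """Command: a request that something should happen (hypothetical)."""
    order_id: str
    quantity: int


@dataclass(frozen=True)
class OrderPlaced:
    """Domain Event: a fact that has happened in the domain (hypothetical)."""
    order_id: str
    quantity: int


@dataclass(frozen=True)
class OrderRejected:
    """Domain Event recording that the command was refused (hypothetical)."""
    order_id: str
    reason: str


@dataclass
class InventoryAggregate:
    """Aggregate: receives commands, decides on their execution, emits events."""
    stock: int
    events: List[object] = field(default_factory=list)

    def handle(self, command: PlaceOrder) -> object:
        # The aggregate owns the decision; callers only see the resulting event.
        if command.quantity <= 0:
            event = OrderRejected(command.order_id, "quantity must be positive")
        elif command.quantity > self.stock:
            event = OrderRejected(command.order_id, "insufficient stock")
        else:
            self.stock -= command.quantity
            event = OrderPlaced(command.order_id, command.quantity)
        self.events.append(event)  # events accumulate in timeline order
        return event


inventory = InventoryAggregate(stock=5)
inventory.handle(PlaceOrder("A-1", 3))
inventory.handle(PlaceOrder("A-2", 4))
for event in inventory.events:
    print(event)
```

The list of events plays the role of the timeline on the wall: each entry is the outcome of a command that the aggregate accepted or refused.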

While Domain Events are key to this exploration technique, along the way you are also encouraged to explore Subdomains, Bounded Contexts and User Personas. Subsequently, you also look at Acceptance Tests to remove any ambiguity arising from edge-case scenarios.

Posted by Gigi Sayfan on April 27, 2016

In recent years the database scene has been boiling. New special-purpose databases seem to emerge every day, and for good reason. Data processing needs have become more and more specialized, and at scale you often need to store your data in a way that reflects its structure in order to support proper queries. The default of storing everything in a relational database is often not the right solution. Graphs, which are a superset of trees and hierarchies, are a very natural way to represent many real-world concepts.

Facebook and its social graph brought graphs to center stage, but graphs, trees and hierarchies were always important. It is possible to model graphs pretty easily with a relational database because a graph is really just vertices connected by edges, which map very well to the entity-relationship model. But performance is a different story at scale and with deep hierarchies.
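
As a rough illustration of that mapping, the sketch below stores a tiny graph in two relational tables and walks it with a recursive query, using Python's built-in `sqlite3` module. The table names, column names and sample data are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vertex (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE edge (
        src INTEGER NOT NULL REFERENCES vertex(id),
        dst INTEGER NOT NULL REFERENCES vertex(id)
    );
""")
conn.executemany("INSERT INTO vertex VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")])
conn.executemany("INSERT INTO edge VALUES (?, ?)", [(1, 2), (2, 3), (3, 4)])

# Recursive CTE: start from the direct neighbours of one vertex, then keep
# joining back onto the edge table. UNION (rather than UNION ALL) drops
# duplicate rows, so the traversal also terminates on cyclic graphs.
REACHABLE = """
    WITH RECURSIVE reach(id) AS (
        SELECT dst FROM edge WHERE src = ?
        UNION
        SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.id
    )
    SELECT vertex.name FROM reach JOIN vertex ON vertex.id = reach.id
    ORDER BY vertex.name
"""
reachable = [row[0] for row in conn.execute(REACHABLE, (1,))]
print(reachable)
```

Every additional hop is another join against the edge table, which hints at why deep traversals over large graphs become expensive in a relational store.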

Many graph databases exist today with different levels of maturity. One of the most interesting developments in this domain is TinkerPop, an Apache incubator project, which is a graph computing framework with active industry contribution. Check out the project's getting started tutorial to familiarize yourself with the terrain.

Other interesting developments include DataStax (the company behind Cassandra) acquiring TitanDB and incorporating it, and of course Facebook's GraphQL. And there is always Neo4j, which has a mature enterprise-ready solution and an impressive list of customers.

The exciting part is how to combine graph databases with other data systems, and how to construct viable data-intensive systems that scale well to address new data challenges, such as the coming IoT era, in which sensors will generate unprecedented amounts of data.

Posted by Sandeep Chanda on April 20, 2016

At the recently concluded Build Conference, the Microsoft Azure team announced the preview of Functions, an app service that allows you to execute code on demand. Azure Functions is event-driven and allows serverless execution of code triggered by an event or a coordinated set of events. It allows you to extend the application platform capabilities of Azure without having to worry about compute or storage. The events do not have to occur in Azure; they can originate in virtually any environment, including on-premises systems. You can also connect Functions into a data processing or messaging pipeline, so that code executes asynchronously as part of a workflow. In addition, Azure Function apps scale on demand, so you pay only for what you use.

Azure Functions supports a range of languages, such as PHP and C#, as well as most of the scripting languages (Bash and PowerShell, for example). You can also upload and trigger pre-compiled code and manage dependencies using NuGet and NPM. Azure Functions can also run in a secure setting when made part of an App Service Environment configured for a private network. Out of the box, it supports OAuth providers such as Azure Active Directory, Facebook, Google, etc.

To create a Function app, log in to your Azure portal and search for Functions in the Marketplace search bar.

Select the Function App service, provide a name, select a resource group and create the app.

When you create a Function app, Azure also creates an AppInsights monitoring app that attaches to it and monitors its health. After the Function app is deployed, you can navigate to the Function App console from the dashboard. The console has two windows. The Code window (an editor with a code highlighter) is where you put your script, and the Logs window is where you verify the outcome.

You can then integrate the Function into an event-driven workflow as illustrated in the figure below:

You can configure an input and an output for the trigger in the Integrate tab. In this example, a storage queue is set as the default trigger but you can set it to be a Schedule, a Webhook, or Push Notifications amongst others.

There are also a number of predefined templates that you can use as a starting point for your function.

Additionally, there are several advanced settings available for you to configure, such as setting up authentication and authorization, integrating with source control for continuous integration, enabling CORS, and providing API definitions so that clients can easily call your functions.

In a world heavily dominated by the likes of Amazon, Google, and IBM, it is a good move for Microsoft to release Azure Functions, providing more options for developers and enterprises to choose and extend their PaaS implementation.

Posted by Gigi Sayfan on April 18, 2016

Many organizations have a hard time making decisions about adopting new technologies. There are two prominent camps. The early adopters will jump at any new technology and push towards integrating it or replacing last week's new tech. Then you have the careful crowd that just doesn't want to rock the boat. As long as everything works, they are not interested in switching to something better. Between those two extremes of risk taking and risk aversion are the rest of us, trying to get by and take advantage of new developments, but not at the cost of bringing the whole system down. So, what is a rational process for adopting new technologies?

There are several principles that will serve you well. Newer technologies, even if demonstrably better than incumbent alternatives, take time to reach maturity. The more complicated and crucial the technology, the longer it takes to mature. For example, a new library for parsing text may be fine to switch to after playing with it a little and verifying that it works well. A totally new distributed NoSQL database is a different story and will probably need several years before you should trust it with your company's fate. The crucial element, then, is timing.

If a new technology has been battle-tested long enough, has an active community and demonstrates clear benefits over a more mature alternative, you may consider switching to it even for important projects. A good example is the Flask vs. Django debate. At this point, Flask has crossed the critical threshold as far as I'm concerned. When you do decide to adopt a new technology, you should do it gradually (maybe start with a small, non-critical project) and have a plan B (probably sticking with what you have) in case you discover unforeseen issues.

Posted by Sandeep Chanda on April 11, 2016

The Azure Batch service is now generally available. It is a fully managed service hosted in Azure that lets you configure scheduled jobs and handles compute resource management for other cloud-based services. It is a turnkey solution for running large-scale High Performance Computing (HPC) applications in parallel at cloud scale. Note that it is a platform service: it lets you run resource-intensive operations on a managed collection of virtual machines, and it can scale automatically depending on need.

There are several use cases for Azure Batch, including scientific computations such as Monte Carlo simulations, financial modelling, media transcoding and, more commonly, automated testing of applications. The Azure Batch service works very well with scenarios that are intrinsically parallel in nature. Workloads that can be broken into multiple tasks that run in parallel are the best fit for the service. Not only can the managed service run multiple workloads, it can also be configured for parallel calculations with a reduce step at the end.
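
The shape of such a workload, independent tasks followed by a reduce step, can be sketched locally in plain Python. This is not the Azure Batch API: the function names are hypothetical, and a thread pool merely stands in for the managed pool of virtual machines. The example estimates pi with a Monte Carlo simulation.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_hits(seed: int, samples: int) -> int:
    """One independent task: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # per-task generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(tasks: int = 8, samples_per_task: int = 50_000) -> float:
    # Fan out: the tasks do not depend on each other, which is what makes
    # the workload intrinsically parallel. Threads stand in for compute nodes.
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        partials = list(
            pool.map(count_hits, range(tasks), [samples_per_task] * tasks)
        )
    # Reduce step: combine the partial counts into a single result.
    total_hits = reduce(lambda a, b: a + b, partials, 0)
    return 4.0 * total_hits / (tasks * samples_per_task)


pi_estimate = estimate_pi()
print(f"pi is roughly {pi_estimate:.3f}")
```

Because each task carries its own seed and returns only a count, the tasks could run on separate machines in any order, and only the small partial results need to be gathered for the final step.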

To configure a new Batch service, log in to your Azure portal and find the Batch managed service in the Azure Marketplace by typing Batch in the search window.

Specify a name for the Batch service and configure the resource group and the storage account:

Once the service is deployed you will see the dashboard to configure the applications and jobs as illustrated in the following figure:

Now that you have successfully created the Batch account, the two most common ways to use the Batch service are:

  1. Use the Batch .NET API to schedule the job programmatically
  2. Use it as part of a larger workflow, such as Azure Data Factory

In addition, there is an Azure Batch Explorer sample application available on GitHub that you can run to browse and manage the resources in your Batch account.

You can use the Batch API to perform tasks such as creating a Job, provisioning a Schedule and adding Tasks to a Job. For example, you can create a console application that reads from a file and performs multiple parallel operations based on the content of the file. You can store the application in Azure Blob Storage and then configure the Job to run the application at a regular interval.

Posted by Gigi Sayfan on April 5, 2016

When building, developing and troubleshooting complex systems, everybody agrees that modularity is the way to go. Different parts or components should interact only through well-defined interfaces. This way, the complexity of each component can be reduced to its inputs and outputs. That's the theory. In practice, this is extremely hard to achieve. Implementation details are sometimes hard to contain. This is where black box and white box testing can come in handy, depending on the situation.

Consider an interface that expects a JSON file as input. If you don't specify the format exactly in a schema, the format of the file can change and break interactions that previously worked. But even if you put in the work and discipline, properly separate all the concerns and rigorously define all the interfaces, you're still not in the clear. There are two big problems:

  1. If your system is complicated enough then new development will often require changing interfaces and contracts. When that happens you still need to dive in and understand the internal implementation.
  2. When something goes wrong, you'll have to troubleshoot the components and follow the breadcrumbs. There is no escaping the complexity when debugging. Under some circumstances, very well factored systems that abstract as much as possible are more difficult to debug because you don't have access to a lot of context.
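
For the JSON input example above, pinning the contract down can be as simple as validating the payload at the boundary. The following is a minimal sketch using only the Python standard library; the field names and types in `REQUIRED_FIELDS` are a hypothetical contract invented for illustration, and a real system would more likely use a full schema language.

```python
import json

# Hypothetical contract for the interface's JSON input: field name -> type.
REQUIRED_FIELDS = {"id": int, "name": str, "tags": list}


def parse_input(raw: str) -> dict:
    """Parse the JSON payload and reject anything that breaks the contract."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"field {key!r} must be {expected.__name__}")
    return data


good = parse_input('{"id": 7, "name": "widget", "tags": ["a"]}')
print(good["name"])

try:
    # "id" arrives as a string here, so the contract check fails loudly
    # at the boundary instead of deep inside the component.
    parse_input('{"id": "7", "name": "widget", "tags": []}')
except ValueError as err:
    print(err)
```

A black box test exercises `parse_input` purely through payloads like these, while a white box test would also reason about which branch of the validation each payload reaches.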

But the black box versus white box distinction is not just about the system. It is also a property of the people working with the system. Some people thrive on the black box view and think of the components abstractly, as black boxes and their interactions. Other people must see a concrete implementation and understand how a component ticks before they can climb up the abstraction ladder and consider the whole system.

There are good arguments for both views and while working with most complicated systems you should be able to wear both hats at different times.

Posted by Sandeep Chanda on March 24, 2016

Azure Data Factory offers capabilities to orchestrate data movement services that scale using Azure infrastructure. Beyond that, you can visualize the data lineage across both on-premises and cloud data sources and monitor the health of the pipeline. A few weeks back the Azure team published a code-free Copy tool for Azure Data Factory: a declarative designer that allows hassle-free configuration and management of data pipelines without writing any script. A simple wizard lets you explore data sources across various cloud offerings, such as SQL Azure, Azure Storage and Azure SQL Data Warehouse, as well as your local SQL Server database. You can also preview the data, apply expressions to validate it, and perform schema mapping for simple transformations. You get the scalability benefits of Azure, so you can move hundreds and thousands of files and rows of data efficiently.

To start, log in to your Azure portal and search for Data Factory under the Data + Analytics marketplace segment. Create an instance of Azure Data Factory as shown in the figure below.

After creating the Data Factory instance, you will now see an option called Copy Data (Preview). Click on it to launch the Copy Data wizard.

The first step in the wizard is to set properties such as the name and the schedule configuration: whether you want to run the copy just once or create a job that runs at a regular interval.

After configuring the schedule the next step is to define the data source. Pick a connection from the available list of stores. In this example we selected the Azure Blob Storage.

After selecting the data source connection, the wizard will direct you to the connector specific steps. In this example, it will prompt you to select the folders / files from where you need the data copied.

You can also then provide additional properties to select specific content from the folder / file like the text format, delimiter etc.

You can preview the data and then set the destination data source to which the data will be copied at a regular interval. In the destination, too, you can specify properties to merge or append content. Once that is set, review the summary and save to complete the wizard. The copy will then be triggered according to the schedule.

Posted by Sandeep Chanda on March 11, 2016

Still in preview, Microsoft Azure Logic Apps is offered as part of the Azure App Service infrastructure and provides declarative integration services for automating your business processes and simplifying enterprise-grade integration. Last month the Azure team released another refresh for Logic Apps, which included some major designer upgrades. The designer now follows a more intuitive top-down declarative workflow style, a nice change from the rather cumbersome left-to-right approach of the first preview. Another nice change is the intuitive search for connectors built into the designer itself. Earlier, the connectors were available in a pane on the right-hand side of the designer, and there was some confusion around using them.

In my first few tries, I attempted to drag them onto the designer surface, which didn't work; only after a few attempts did I realize that you have to click a connector to position it on the designer. Not particularly intuitive. Other major enhancements include support for Swagger (Open API) and Managed API connections. A set of secure API connections is already deployed, and you can search for and add them while adding a step to the workflow. Deployments are also simpler with this enhancement. In addition, Logic Apps will now have native Webhook support, allowing you to subscribe to events in a Webhook. You can also trigger a workflow on receipt of an HTTP POST and wait for a Webhook in the middle of a workflow.

To create a Logic App, log in to your Azure portal and go to Marketplace from the dashboard.

Under Marketplace, search for Logic App as shown in the figure below:

Click on Logic Apps to create a new Logic App. Specify the name and the App Service Plan. It will take a while to generate the Logic App, after which you will be redirected to the declarative designer to start creating the workflow.

Not only can you search Managed APIs to kickstart your workflow, but you can also add condition steps to branch the workflow.

You also have the option to choose Open API endpoints using the Http + Swagger shape as illustrated below.

To look at the generated code, you can switch to the Code Window, where you will see the JSON that forms the workflow definition.

Posted by Gigi Sayfan on March 4, 2016

We live in exciting times. The rate of technological progress has always accelerated, but only very recently, in the 21st century, have the advances become tangible within a single generation. The computer and the Internet have transformed the world and now we're on the doorstep of multiple concurrent interwoven revolutions. Artificial intelligence finally emerged out of its AI winter. The game of Go was finally conquered by a program.

These feats were aided by massive developments in hardware, which provided brute computing power as a foundation for software improvements. The Internet of Things is gathering a lot of momentum, with many companies coming out with platforms and products that try to integrate and weave together the physical and virtual worlds. Speaking of virtual worlds, virtual reality is finally a reality (see what I did there?), with Facebook's acquisition of Oculus, the maker of the Oculus Rift, and subsequent big plans for the technology. Similar to AI, virtual reality was a pipe dream for a long time. Twenty-five years ago, when I studied computer science, VRML was the buzzword. It never took off because the hardware was unable to support the technology economically. With today's much more powerful and cheaper hardware it has become possible.

The robots are getting better and better too. The Boston Dynamics video of its latest robot that was recently circulated makes one feel empathetic towards a robot that is being harassed by a person pushing it around. Transportation is also on the cusp of a huge change with electric and self-driving cars already here. Don't forget the Hyperloop. Diverse clean energy solutions are getting better, slowly but surely. This is always complicated to reason about with powerful oil interests stirring the pot, but the trend is clear. Biotechnology is getting a lot of attention and produces miracles on a daily basis. I will not try to make specific predictions, but I believe that in 5 to 10 years all these areas will become mainstream and possibly dominate all incumbents. The social changes will be fascinating to watch as well. Don't blink!

Posted by Sandeep Chanda on February 29, 2016

Kylin is an open source analytics platform from eBay, designed to run analytics on Hadoop using SQL-compatible tools, so that developers and power users with prior knowledge of analytical services can run multi-dimensional analytics.

Kylin is an important bridge for power users who want to keep using their favorite tools, such as Excel, Power BI or Tableau, to create and execute predictive analytical models based on the principles of OLAP, while also leveraging the performance of Hadoop to process large volumes of data at sub-second latency. The platform has since become an Apache Incubator project, and various groups within eBay are running it in production to perform analytics on large-scale datasets. It is designed to provide better performance than Hive queries for the same dataset and works seamlessly with well-known data visualization tools such as Tableau and Excel, as well as third-party applications. Most ANSI SQL queries are also supported.

Users can run multi-dimensional OLAP queries on 10+ billion rows of data in less than a second. From a security perspective, Kylin supports integration with LDAP, allowing seamless access for your Active Directory users, and also supports access control at the cube or project level.
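
One way to build an intuition for how a query over billions of rows can return so quickly is pre-aggregation: a cube stores the measure already summed for every combination of dimensions, so answering a query becomes a lookup rather than a scan. The toy Python sketch below illustrates that general idea only. It is not Kylin's implementation, and the dimensions, rows and function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales).
ROWS = [
    ("EU", "book", 2015, 10),
    ("EU", "book", 2016, 15),
    ("EU", "game", 2016, 20),
    ("US", "book", 2016, 5),
]
DIMENSIONS = ("region", "product", "year")


def build_cube(rows):
    """Pre-aggregate the sales measure for every subset of the dimensions."""
    cube = defaultdict(int)
    positions = range(len(DIMENSIONS))
    for *dims, sales in rows:
        for size in range(len(DIMENSIONS) + 1):
            for kept in combinations(positions, size):
                # None marks a dimension that has been rolled up ("all values").
                key = tuple(dims[i] if i in kept else None for i in positions)
                cube[key] += sales
    return cube


cube = build_cube(ROWS)
# Each query is now a single dictionary lookup instead of a scan over ROWS.
print(cube[("EU", None, None)])    # total sales in the EU
print(cube[(None, "book", 2016)])  # book sales in 2016 across all regions
```

The trade-off is build time and storage: the number of pre-aggregated combinations grows quickly with the number of dimensions, which is why cubes are built ahead of time rather than per query.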

The Kylin platform stack is built around the Kylin Core, which comprises the fundamental framework components needed to run the stack: the query engine, the metadata engine, the job engine and storage. The Core also provides the RESTful services for interacting with the platform. In addition to the Core, Kylin has an integration layer that supports ETL jobs and provides monitoring and alerting, an extension layer that allows plugins to extend the security model for single sign-on, storage and other functions, and a UI layer that allows developers to create custom interfaces on top of the Kylin Core.
