
Posted by Gigi Sayfan on February 11, 2016

Visual programming is all about throwing away the syntax and the source files and programming by dragging boxes around and using a graphical environment to represent the program. This is a trend that gets some hype every few years. There are some great successes in specific domains. Scratch from MIT is one of the most popular. It is a game creation environment that allows you to create fairly sophisticated programs and has direct support for graphics, animations and sound. It is also a community with people sharing the games and programs they create.

Scratch is targeted mostly at kids. My son created some nice games when he was 8 or 9 years old. There are other such environments for various domains, particularly where the end product is visual, so direct visual manipulation matches the user's mental model and makes a lot of sense. Some examples are visual web design, game creation and circuit board design. Check out this blog for many examples.

The big question is whether or not visual programming can adapt to more abstract domains and on a larger scale. My opinion is that there will have to be some serious breakthroughs and some paradigm shifts before that happens, if at all. The main reason is that the flat screen is not a great medium to represent complex systems.

Something like a visual dependency graph is cool (especially if you make it 3D and rotatable), but it is not very functional. The same goes for call graphs, database diagrams or flow charts. They can all represent complex relationships and processes visually, but most hard-core developers reach for other tools to explore, search and filter. I'm not necessarily talking about low-level textual tools such as grep.

Modern IDEs have language, framework and source control awareness built in. They provide a lot of information and support, but are not visual. My sense is that visual programming is still far from being suitable for general-purpose programming.

Posted by Sandeep Chanda on February 8, 2016

Power BI online presents powerful analytical capabilities backed by the scale and performance of a cloud infrastructure. The ability to host visualizations that can be connected to a multitude of data sources has made Power BI quite popular and resulted in third-party hosted services being developed to provide data from a large variety of cloud platforms. Power BI now has its own marketplace for such services. Not so long ago, the Visual Studio team also published a connector for fetching data from Visual Studio Online accounts and created a default dashboard for analysing the progress of a team. Together with the connector for Application Insights, the platform for synthetic monitoring of an application, Power BI can uncover deep insights in Application Lifecycle Management (ALM) that weren't possible before.

To create an analytical dashboard on the project metrics, first log in to your Power BI account and then click on "Get Data". It provides several options to fetch data from different sources, including an option to connect to online services.

Click "Get" under the Services tab. You will be redirected to the marketplace services page that contains various options for connecting to different providers. This is where you will find the option to connect to Visual Studio Online using the Visual Studio Online connector. Click on it.

You will be prompted to provide the Visual Studio Online account name and the project name. You may specify an "*" as the project name to fetch and analyse data from all projects, but that is not ideal. You should specify the name of a project to fetch data specific to that project only. Next you will be prompted to authenticate. If Visual Studio Online and Power BI are not part of the same subscription, you can authenticate using OAuth 2.0.

Currently OAuth 2.0 is the only supported authentication protocol. Once connected, Power BI will fetch project data and create a default dashboard to showcase some of the project metrics such as burn down by story points, active bug count, etc., that you can analyse to determine the health of the project. A connector is also provided for Application Insights, and you can use it to fetch synthetic monitoring data around the application being watched. Together with the project metrics, you can drive powerful ALM insights on the application under development.

Posted by Sandeep Chanda on January 29, 2016

The @WalmartLabs team was founded by Walmart eCommerce to provide a sustainable rate of innovation, given the competition. The team adopted a DevOps culture and migrated the ecommerce platform to the cloud. With continuous application lifecycle management (ALM) in mind, the group acquired OneOps, a platform that accelerates DevOps through continuous ALM of cloud workloads.

Most recently, the group took a step forward by open sourcing the platform for the benefit of the community at large. This is a huge step, given that OneOps has integration hooks with most of the major cloud providers such as OpenStack, RackSpace, Azure, and Amazon. It is not surprising, though, given that @WalmartLabs is not new to open source. They have contributed some wonderful technologies to the community, such as Mupd8 and hapi, and have been actively contributing to React.js as well.

OneOps not only has integration hooks for all major cloud providers, but also allows developers to code deployments in a hybrid or multi-cloud environment. With OneOps, the Walmart eCommerce team is able to run close to 1,000 deployments a day. Developer communities can now look forward to automatically managing the lifecycle of an application after deployment. OneOps can take care of scaling and repairing as needed. It is also a one-stop shop for porting applications from one environment to another. Applications or environments built in one environment (Azure, for example) can be easily ported to another (such as AWS).

Setting up OneOps is easy. If you have an AWS account, it is available as a public AMI. Alternatively, there is a Vagrant image to set up OneOps. The Vagrant project can be checked out and started with the following commands:

$ git clone https://github.com/oneops/setup
$ cd setup/vagrant
$ vagrant up 

Once it is set up, you can monitor the build process in Jenkins at http://localhost:3003.

After installing OneOps, you can log in to the console using your account, and then create an Organization profile. The organization profile bootstraps with suitable default parameters. After creating an organization, you can go to the Cloud tab to select environments to configure and deploy. The following figure illustrates selecting an Azure environment.

OneOps provides a three-phase configuration. You start with the Design phase, where you create a Platform to configure the building blocks for your deployment from existing packs. Then you move to the Transition phase, where you define the environment variables for the targeted deployment. Finally you move to the Operate phase, where actual instances are created after a successful deployment, and you can monitor the run.

Posted by Gigi Sayfan on January 27, 2016

It is common advice to learn from other people's successes and mistakes. Go through testimonies, post-mortem analyses, books and articles and they all say the same. But in the dynamic environment of software development you have to be very careful. It is often easy to observe that, all other things being equal, A is clearly better than B. The only problem is that all other things are never equal.

This applies on multiple levels. Maybe you want to choose a programming language, a web framework or a configuration management tool. Maybe you want to incorporate a new development process or performance review. Maybe you are trying to figure out how many people you need to hire and what skills and experience you should shoot for. All of these decisions will have to take into account the current state of affairs and your specific situation. Suppose you read that some startup started using the Rust programming language and within two weeks improved performance 20X. That means nothing. There are so many variables. How bad was their original code? Was the performance issue isolated to a single spot? Was there a Rust wizard on the team? Or maybe you read about a company about your size that tried to switch from waterfall to an agile process and failed miserably. Does that mean your company will fail too? What's the culture of the other company? How was the new process introduced? Was higher management committed?

What's the answer then? How can you decide what to do if you can't learn from other people? Very often, the outcome doesn't depend so much on the decision as on the commitment and hard work going into the execution. Gather a reasonable amount of information about the different options (don't start a three-month study to decide which spell checker you should use). Consult with people you trust who know something both about the subject matter and about your situation, ideally people inside your organization.

Make sure to involve all the relevant stakeholders and secure their support. But form your own opinion; don't just trust some supposed expert. Then just make a decision and run with it. The bigger the decision or its impact, the more seriously you should consider the risk of making the wrong decision and the cost of pivoting later. If it turns out your decision was wrong, you're now an expert and should know exactly what went wrong and how to fix it.

Posted by Sandeep Chanda on January 22, 2016

R is the most widely used programming language in the world of data science and heavily used for statistical modelling and predictive analytics. The popularity of R is driving many commercial big data and analytics providers to not only provide first class support for R, but also create software and services around R. Microsoft is not far behind. Months after its acquisition of Revolution Analytics, the company leading the commercial software and services development around R, Microsoft is now ready with R Server. Microsoft R Server is an enterprise scale analytics platform supporting a range of machine learning capabilities based on the R language. It supports all stages of analytics viz. explore, analyse, model and visualize. It can run R scripts and CRAN packages.

In addition, it overcomes the limitations of open source R by supporting parallel processing, thereby allowing a multi-fold increase in analytical capabilities. Microsoft R Server has support for Hadoop, allowing developers to distribute processing of R data models across Hadoop clusters. It also has support for Teradata. Cloud scenarios are covered as well. The Data Science Virtual Machine will now come pre-built with R Server Developer Edition, so you can leverage the scale of Azure to run your R data models. For Windows, R Server ships as R Services in SQL Server 2016. While SQL Server 2016 is currently in CTP, you can install the advanced analytics extensions during its installation to use a new service called the SQL Server Launchpad and integrate with Microsoft R Open using standard T-SQL statements. To enable R integration, run the sp_configure command and give a user permission to run R scripts:

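-- Enable external scripts; RECONFIGURE applies the setting (a service restart may also be needed)
exec sp_configure 'external scripts enabled', 1;
reconfigure with override;
alter role db_rrerole add member [name];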

You can then connect using an IDE such as RStudio to develop and run R code. Microsoft will also shortly launch R Tools for Visual Studio (RTVS), and you will be able to run R from within Visual Studio.

With enterprises embracing R and providing solutions for commercial use, it is only a matter of time before developers fully embrace this language for enterprise scale data analysis.

Posted by Gigi Sayfan on January 21, 2016

Agile methodologies have been used successfully in many big companies, but adopting them is often a challenge. There are many reasons: lack of project sponsorship, prolonged user validation, existing policies, legacy systems with no tests and, most importantly, culture and inertia. Given all these obstacles, how do you scale Agile processes in a big organization? Very carefully. If you're interested in introducing Agile development practices into a large organization, you can try some of these techniques:

  1. Show don't tell - Work on a project using Agile methods. Get it done on time and on budget using Agile methods.
  2. Grow organically and incrementally - If you're a manager it's easy. Start with your team. Try to gain mindshare with your peer managers - for example, when collaborating on a project, suggest the use of Agile methods to coordinate deliverables and handoffs. If you're a developer, try to convince your team members and manager to give it a try.
  3. Utilize the organizational structure - Treat each team or department as a small Agile entity. If you can, establish well-defined interfaces.
  4. Be flexible - Be willing to compromise and acknowledge other people's concerns. Try to accommodate as much as possible even if it means you start with a hybrid Agile process. Changing people and their habits is hard. Changing the mindset of veteran people in big companies with established culture is extremely difficult.

Finally, if you are really passionate about Agile practices and everything you've tried has failed, you can always join a company that already follows agile practices, including many companies from the Fortune 2000.

Posted by Gigi Sayfan on January 13, 2016

Two Meanings?

  1. The form of internal documentation appropriate for an organization following agile practices.
  2. Generating external documentation as an artifact/user story.

The first meaning is typically a combination of code comments and auto-generated documentation. A very common assertion in Agile circles is that unit tests serve as live documentation. Python, for example, has a module called doctest, with which the documentation of a function may contain live code examples and their expected outputs. These examples can be executed as tests that verify the documented behavior is still correct.
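To make the doctest idea concrete, here is a minimal sketch. The function `mean` and the helper `run_doc_examples` are hypothetical names invented for this illustration; only the `doctest` module itself comes from the Python standard library.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    The examples below are documentation and tests at the same time.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    >>> mean([])
    Traceback (most recent call last):
        ...
    ValueError: mean() requires at least one value
    """
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def run_doc_examples():
    """Run the docstring examples of mean() and return (failures, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Each docstring that contains examples becomes one DocTest object.
    for test in finder.find(mean, "mean", globs={"mean": mean}):
        runner.run(test)
    return runner.failures, runner.tries


if __name__ == "__main__":
    failures, attempted = run_doc_examples()
    print(f"{attempted} documented examples run, {failures} failed")
```

If someone later changes `mean` without updating its docstring, the examples fail, which is what keeps this kind of documentation "live".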

Behavior Driven Development

BDD puts a lot of emphasis on specifying even the requirements in an executable form via special DSLs (domain specific languages), so the requirements can serve as both tests and live, human-readable documentation. Auto-generated documentation for public APIs is very common. Public APIs are designed to be used by third party developers who are not familiar with the code (even if it's open source). The documentation must be accurate and in sync with the code.

The second meaning can be considered as just another artifact. But there are some differences. Typically, external documentation for a system is centralized. You have to consider the structure and organization and then address the content as a collection of user stories. Unlike code artifacts, external documentation doesn't have automated tests. Documentation testing is an often neglected practice, which is fairly typical because the documentation itself is often neglected. However, some types of external documentation are critical and must serve contractual or regulatory requirements. In these cases, you must verify that the documentation is correct.

Posted by Sandeep Chanda on January 12, 2016

In the previous two posts (see Part 1 and Part 2), we compared the two most popular cloud platforms, Microsoft's Azure and Amazon's AWS for their offerings in the end-to-end ecosystem of data analytics, both large scale and real time.

In this final post, we will compare Azure's Data Factory and an equivalent offering from AWS in the form of AWS Data Pipeline. Both are fairly similar in their abilities and offerings. However, while AWS pitches Data Pipeline as a platform for data migration between different AWS compute and storage services, and also between on-premises and AWS instances, Azure's pitch for Data Factory is more as an integration service for orchestrating and automating the movement and transformation of data.

In terms of quality attributes, both services are very capable when it comes to scalability, reliability, flexibility, and of course, cost of operations. Data Pipeline is backed by the highly available and fault tolerant infrastructure of AWS and hence is extremely reliable. It is also very easy to create a pipeline using the drag and drop console in AWS. It offers a host of features, such as scheduling, dependency tracking, and error handling. Pipelines can be run not only serially, but also in parallel. It also gives you transparent control over the computational resources assigned to execute the business logic. Azure Data Factory, on the other hand, provides features such as visualizing the data lineage.

In terms of pricing, Azure charges by the frequency of activities and where they run. A low frequency activity in the cloud is charged at $0.60 and the same activity on premises is charged at $1.50. Similarly, high frequency activities have higher charges. Note that you are also charged for data movement, separately for cloud and on premises. In addition, pipelines that are left inactive are also charged.
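As a rough worked example of the activity pricing described above, the following sketch adds up the charge for low frequency activities only. The two rates are the ones quoted in this post; treating them as a charge per activity per billing period is an assumption, and the function name is hypothetical. High frequency activities, data movement and inactive pipelines are deliberately left out because the post gives no figures for them.

```python
from decimal import Decimal

# Low frequency rates quoted in the post (January 2016). The billing unit
# (per activity, per billing period) is an assumption made for this sketch.
LOW_FREQUENCY_RATE = {
    "cloud": Decimal("0.60"),
    "on_premises": Decimal("1.50"),
}


def low_frequency_activity_cost(cloud_activities, on_premises_activities):
    """Estimate the charge for low frequency activities only."""
    return (
        LOW_FREQUENCY_RATE["cloud"] * cloud_activities
        + LOW_FREQUENCY_RATE["on_premises"] * on_premises_activities
    )


if __name__ == "__main__":
    # 10 cloud activities and 4 on-premises activities:
    # 10 * 0.60 + 4 * 1.50 = 6.00 + 6.00
    print(low_frequency_activity_cost(10, 4))  # prints 12.00
```

Even this simplified estimate shows how quickly on-premises activities dominate the bill: four of them cost as much as ten cloud activities.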

Posted by Gigi Sayfan on January 4, 2016

Agile practices have proven themselves time and again for development and evolution of software systems. But it's not clear if the same agile approach can benefit user-facing aspects such as public APIs, user interface design and user experience. If you change your API constantly, no sane developer will use it. If your user interface design or experience keeps shifting, users will get confused and angry that they have to face a new learning curve whenever you decide to make a change. Sometimes, users will be upset even if the changes are demonstrably beneficial, just because of switching costs. Remember, users didn't subscribe to your agile thinking and are just interested in using your API/product.

What's the answer then? Do you have to be prescient and come up with the ultimate API and user interface right at the beginning? Not at all. There are several solutions that will allow you to iterate here as well. But you have to realize that iteration on these aspects should and will be slower and more disciplined.

Possible approaches include A/B testing, keeping the old API/interface available, deprecating previous APIs, maintaining backward compatibility, and testing rapid changes on groups of users that sign up for a beta. In general, the more successful you are, the less free you are to get rid of legacy. Probably the best example is Microsoft, which still allows you to run DOS programs on the latest Windows versions and has used a variety of approaches to iterate on the Windows desktop experience, including handling the frustration from users whenever a new version of Windows comes out. Windows 10 is a fine response to the harsh criticism Windows 8 endured.
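One way to combine "keeping the old API available" with "deprecating previous APIs" is a thin compatibility shim. The following is a minimal sketch, not a prescription: `fetch_user`, `fetch_user_v2` and the `deprecated` decorator are hypothetical names invented for this example, and the user data is made up. Only `warnings` and `functools` come from the Python standard library.

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated while keeping it fully working."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line,
            # which is where the fix needs to happen.
            warnings.warn(
                f"{func.__name__}() is deprecated; use {replacement}() instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def fetch_user_v2(user_id, include_email=False):
    """Current API (hypothetical): new options are added here."""
    user = {"id": user_id, "name": f"user-{user_id}"}
    if include_email:
        user["email"] = f"user-{user_id}@example.com"
    return user


@deprecated("fetch_user_v2")
def fetch_user(user_id):
    """Old API kept for backward compatibility; delegates to the new one."""
    return fetch_user_v2(user_id)


if __name__ == "__main__":
    warnings.simplefilter("always")
    print(fetch_user(7))  # still works, but emits a DeprecationWarning
```

Existing callers keep working unchanged, new callers get the richer function, and the warning gives everyone a long, explicit runway before the old entry point is eventually removed.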

Posted by Sandeep Chanda on January 1, 2016

In the first part of this series comparing the competing analytics platform offerings from Microsoft Azure and Amazon AWS, we explored Azure Analytics Platform System and AWS Redshift. In this post, we will talk about comparing some of the other products in the ecosystem of analytics.

Microsoft Azure also offers Stream Analytics, another turnkey proprietary solution from Microsoft, for cost effective real-time processing of events. With Stream Analytics, you can easily set up a variety of devices, sensors, web applications, social media and infrastructure to stream data and then perform real-time analytical computations on them. Stream Analytics is a powerful and effective platform for designing IoT solutions. It allows streaming millions of events per second and provides mission critical reliability. It also provides a familiar SQL-based language for rapid development using your existing SQL knowledge.

A competing offering from AWS is Kinesis Streams; however, it is geared more towards application insights than devices and sensors. Stream Analytics actually seems to be competing against Apache Storm on Azure, hosted as HDInsight. Both are offered as PaaS and support processing of virtually millions of events per second. A key difference, however, is that Stream Analytics deploys as monitoring jobs, while Storm on HDInsight deploys as clusters that can host multiple stream jobs or other workloads. Another aspect to consider is that Stream Analytics is turnkey, whereas Storm on HDInsight allows a lot of custom connectors and is extensible.

There are pricing considerations as well when choosing between these platforms. In Stream Analytics, pricing is by the volume of data processed and the number of streaming units, while HDInsight is charged by the cluster, irrespective of jobs that may or may not be running. This post by Jeff Stokes details the differences.

(See also, Part 3 of this series)
