Azure Data Factory orchestrates data movement services that scale on Azure infrastructure. You can also visualize data lineage across both on-premises and cloud data sources and monitor the health of the pipeline. A few weeks back, the Azure team published a code-free Copy tool for Azure Data Factory that lets you configure and manage data pipelines through a declarative designer, without writing any script. A simple wizard lets you explore data sources across cloud offerings such as SQL Azure, Azure Storage, and Azure SQL Data Warehouse, as well as your local SQL Server database. You can also preview the data, apply expressions to validate it, and perform schema mapping for simple transformations. Because the copy runs on Azure, you get its scalability benefits and can move hundreds or thousands of files and rows of data efficiently.
To start, log in to the Azure portal and search for Data Factory under the Data + Analytics marketplace segment. Create an instance of Azure Data Factory as shown in the figure below.
After creating the Data Factory instance, you will see an option called Copy Data (Preview). Click it to launch the Copy Data wizard.
The first step in the wizard is to set properties such as the name and the schedule configuration: whether to run the copy just once or to create a job that runs at a regular interval.
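To make the two scheduling choices concrete, here is a minimal Python sketch of the difference between a one-time run and a recurring job. It is only an illustration of the concept: the `CopySchedule` class and its fields are hypothetical names invented for this example and are not part of Azure Data Factory or its SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CopySchedule:
    """Hypothetical model of the wizard's schedule choice (not an Azure API)."""
    start: datetime
    # None means "run once"; a timedelta means "repeat at this interval".
    interval: Optional[timedelta] = None

    def __post_init__(self) -> None:
        if self.interval is not None and self.interval <= timedelta(0):
            raise ValueError("interval must be positive")

    def run_times(self, end: datetime) -> List[datetime]:
        """Return every run time from start up to and including end."""
        if self.interval is None:
            return [self.start] if self.start <= end else []
        times = []
        current = self.start
        while current <= end:
            times.append(current)
            current += self.interval
        return times


if __name__ == "__main__":
    once = CopySchedule(start=datetime(2016, 1, 1))
    hourly = CopySchedule(start=datetime(2016, 1, 1), interval=timedelta(hours=1))
    window_end = datetime(2016, 1, 1, 3)
    print(len(once.run_times(window_end)), "one-time run")
    print(len(hourly.run_times(window_end)), "hourly runs")
```

A one-time copy yields a single run, while the recurring option yields one run per interval inside the chosen window.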
After configuring the schedule, the next step is to define the data source. Pick a connection from the available list of stores. In this example we selected Azure Blob Storage.
After you select the data source connection, the wizard directs you to the connector-specific steps. In this example, it prompts you to select the folders or files from which the data should be copied.
You can then provide additional properties, such as the text format and the delimiter, to select specific content from the folder or file.
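If it helps to see what a delimiter setting does to the data, the following Python sketch parses a small delimited text sample and returns the first few rows, roughly the kind of check a preview gives you. It uses only the standard `csv` module; the `preview_rows` function and the sample data are assumptions made up for this illustration and do not reflect how the wizard is implemented.

```python
import csv
import io
from itertools import islice
from typing import Dict, List


def preview_rows(text: str, delimiter: str = ",", limit: int = 5) -> List[Dict[str, str]]:
    """Parse delimited text whose first line is a header; return up to `limit` rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # islice stops early, so only the rows needed for the preview are parsed.
    return [dict(row) for row in islice(reader, limit)]


if __name__ == "__main__":
    # Hypothetical pipe-delimited file content.
    sample = "id|name\n1|Ada\n2|Grace\n3|Edsger\n"
    for row in preview_rows(sample, delimiter="|", limit=2):
        print(row)
```

Choosing the wrong delimiter would leave each line as a single unsplit column, which is why previewing the parsed rows before copying is worthwhile.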
You can preview the data and then set the destination data store to which the data will be copied at a regular interval. For the destination as well, you can specify properties to merge or append content. Once everything is set, review the summary and save to complete the wizard; the copy will then be triggered according to the schedule.
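For readers curious about what the wizard saves you from writing by hand, here is a rough Python sketch that assembles a copy pipeline definition as a dictionary and prints it as JSON. Treat every detail as an assumption: the field names are modeled loosely on Data Factory's JSON pipeline format and should be checked against the official documentation, the `build_copy_pipeline` helper is hypothetical, and the dataset names and the SQL sink are placeholders rather than anything produced in this walkthrough.

```python
import json
from typing import Any, Dict


def build_copy_pipeline(
    name: str,
    source_dataset: str,
    sink_dataset: str,
    frequency: str = "Hour",
    interval: int = 1,
) -> Dict[str, Any]:
    """Assemble an illustrative copy pipeline definition (field names are assumptions)."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": f"{name}-copy",
                    "type": "Copy",
                    "inputs": [{"name": source_dataset}],
                    "outputs": [{"name": sink_dataset}],
                    # Source and sink types are placeholders for a blob-to-SQL copy.
                    "typeProperties": {
                        "source": {"type": "BlobSource"},
                        "sink": {"type": "SqlSink"},
                    },
                    "scheduler": {"frequency": frequency, "interval": interval},
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical names, chosen only for the example.
    pipeline = build_copy_pipeline("DailyBlobCopy", "InputBlobDataset", "OutputSqlDataset",
                                   frequency="Day", interval=1)
    print(json.dumps(pipeline, indent=2))
```

The wizard collects the same pieces of information (name, schedule, source, destination) through its pages, which is why the summary step is worth reviewing before you save.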