Azure Data Factory Room for Improvement

GaryM - PeerSpot reviewer
Data Architect at World Vision

The list of issues and gaps in this tool is extensive.  It includes:

1) Missing email/SMTP activity.

2) Mapping data flows incur significant lag while Spark clusters spin up.

3) Performance lags SSIS. Expect a copy activity to take ten times as long as SSIS for a simple data flow between tables in the same database.

4) There's no way to debug a single activity.

5) OAuth 2.0 adapters lack automated support for refresh tokens.

6) Copy activity errors provide no guidance as to which column is causing a failure.

7) There's no built-in pipeline exit activity when encountering an error.

8) AutoResolveIntegrationRuntime should never pick a region that you're not using (it should use the default for your tenant).

9) IR queue time lag needs to be resolved. For example, a small table copy activity I just ran spent 95 seconds queuing and 12 seconds actually copying the data.

10) The copy activity's upsert option is great if you think watching trees grow is an action movie.
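To put the queue lag in item 9 in proportion, here is a minimal sketch of the arithmetic using only the two figures quoted there (95 seconds queuing, 12 seconds copying). The function name `queue_overhead` is hypothetical and is not part of Azure Data Factory or any SDK.

```python
def queue_overhead(queue_seconds: float, copy_seconds: float) -> float:
    """Return the fraction of total elapsed time spent waiting in the queue.

    Hypothetical helper for illustration only; not an Azure Data Factory API.
    """
    total = queue_seconds + copy_seconds
    if total <= 0:
        raise ValueError("total elapsed time must be positive")
    return queue_seconds / total


# Figures quoted in the review: 95 s queued, 12 s copying.
share = queue_overhead(95, 12)
print(f"{share:.1%} of the run was queue time")  # prints "88.8% of the run was queue time"
```

On those numbers, roughly eight of every nine seconds of the run went to queuing rather than to moving data.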

They need to fix the bugs, for example:

1) Debug sometimes stops picking up saved changes for a period of time, rendering this essential tool useless during that time.

2) Enabling interactive authoring (a critical tool for development) often doesn't take effect without going into another part of the tool to enable it. Then you have to wait several minutes before it's enabled, which is time you're blocked from development. And then it only stays active for up to 120 minutes before you have to go through this all over again. I think Microsoft is trying to torture developers.

3) Exiting an activity that contains other activities always causes the screen to jump to the beginning of the pipeline, which forces you to navigate back to where you were (greatly slowing development productivity).

4) AutoResolveIntegrationRuntime (using default settings) often picks remote regions to operate in, which causes either an unnecessary slowdown or an error message saying it's unable to transfer the volume of data across regions.

5) Copy activity often fails with the error "mapping source is empty" for no apparent reason. If you play with the activity, such as by importing new metadata, then it's happy again. This sort of thing makes you want to just change careers. Or tools.

Richard Domikis - PeerSpot reviewer
Chief Technology Officer at cornerstone defense

There is always room to improve. There should be good examples of use, which, of course, customers aren't always willing to share. It is a Catch-22. It would help the user base if everybody had really good examples of deployments that worked, but when you ask people to put out their good deployments, and that includes me, you usually get, "No, I'm not going to do that." So there aren't enough good examples. Microsoft probably just needs to pay one of its partners to build 20 or 30 examples of functional Data Factories and then share them with the user base.

Brian Sullivan - PeerSpot reviewer
Chief Analytics Officer at Idiro Analytics

Data Factory has so many features that it can be a little difficult or confusing to find some settings and configurations. I'm sure there's a way to make it a little easier to navigate.

In the main ADF web portal, there's a section for monitoring jobs that are currently running so you can see if recent jobs have failed. There's an app for working with Azure in general where you can look at some segs in your account. It would be nice if Azure had an app that lets you access the monitoring layer of Data Factory from your phone or a tablet, so you could do a quick check-in on the status of certain jobs. That could be useful.

Buyer's Guide
Azure Data Factory
August 2022
Learn what your peers think about Azure Data Factory. Get advice and tips from experienced pros sharing their opinions. Updated: August 2022.
621,327 professionals have used our research since 2012.
Dan_McCormick - PeerSpot reviewer
Chief Strategist & CTO at a consultancy with 11-50 employees

The documentation could be improved. They require more detailed error reporting, data normalization tools, easier connectivity to other services, more data services, and greater compatibility with other commonly used schemas.

I would like to see a better understanding of other common schemas, as well as a simplification of some of the more complex data normalization and standardization issues.

It would be helpful to have visibility, or better debugging, and see parts of the process as they cycle through, to get a better sense of what is and isn't working.

It's essentially just a black box. There is some monitoring that can be done, but when something goes wrong, even simple fixes are difficult to troubleshoot.

Senior Manager at a tech services company with 51-200 employees

My only problem is the seamless connectivity with various other databases, for example, SAP. Our transaction data there, all the maintenance data, is maintained in SAP. That seamless connectivity is not there. 

Basically, it could have some specific APIs that allow it to connect to the traditional ERP systems. That'll make it more powerful. With Oracle, it's pretty good at this already. However, when it comes to SAP, SAP has its native applications, which are the way it is written. It's very much AWS with SAP Cloud, so when it comes to Azure, it's difficult to fetch data from SAP.

The initial setup is a bit complex. It's likely a company may need to enlist assistance.

Technical support is lacking in terms of responsiveness.

Kamlesh Sancheti - PeerSpot reviewer
Director at a tech services company with 1-10 employees

We are too early into the entire cycle for us to really comment on what problems we face. We're mostly using it for transformations, like ETL tasks. I think we are comfortable with the facts or the facts setting. But for other parts, it is too early to comment on.

We are still in the development phase, testing it on a very small set of data, maybe then the neatest four or bigger set of data. Then, you might get some pain points once we put it in place and run it. That's when it will be more effective for me to answer that.

Data Analytics Specialist at a pharma/biotech company with 10,001+ employees

Data Factory could be improved by eliminating the need for a physical data area. We have to extract data using Data Factory, then create a staging database for it with Azure SQL, which is very, very expensive. Another improvement would be lowering the licensing cost. 

Manoj Kukreja - PeerSpot reviewer
Technical Director, Senior Cloud Solutions Architect (Big Data Engineering & Data Science) at NorthBay Solutions

Data Factory is embedded in the new Synapse Analytics. The problem is if you're using the core Data Factory, you can't call a notebook within Synapse. It's possible to call Databricks from Data Factory, but not the Spark notebook and I don't understand the reason for that restriction. To my mind, the solution needs to be more connectable to its own services.

There is a list of features I'd like to see in the next release, most of them related to oversight and security. AWS has a lake builder, which basically enforces the whole oversight concept from the start of your pipeline but unfortunately Microsoft hasn't yet implemented a similar feature.

User with 5,001-10,000 employees

I was planning to switch to Synapse and was just looking into Synapse options.

I wanted to plug things in and then put them into Power BI. Basically, I'm planning to shift some data, leveraging the skills I wanted to use Synapse for performance.

I am not a frequent user, and I am not an Azure Data Factory engineer or data engineer. I work as an enterprise architect. Data Factory, in essence, becomes a component of my solution. I see the fitment and plan on using it. It could be Azure Data Factory or Data Lake, but I'm not sure what enhancements it would require.

User-friendliness and user effectiveness are unquestionably important, and it may be a good option here to improve the user experience. However, I believe that more and more sophisticated monitoring would be beneficial.

Anand Kumar Singh - PeerSpot reviewer
Enterprise Architect at TechnipEnergies

They need to work more on developing out-of-the-box connectors for other products like Oracle, AWS, and others.

Business Unit Manager Data Migration and Integration at a tech services company with 201-500 employees

The number of standard adaptors could be extended further. What we find is that if we develop data integration solutions with Data Factory, there's still quite a bit of coding involved, whereas we'd like to move in a direction with less coding and more select-and-click.

Charles Nordine - PeerSpot reviewer
Senior Partner at Collective Intelligence

I couldn't quite grasp it at first because it has a Microsoft footprint on it. Some of the nomenclature around sync and other things is based on how SSRS or SSIS works, which works fine if you know these products. I didn't know them. So, some of the language and some of the settings were obtuse for me to use. It could be a little difficult if you're coming from the Java or AWS platform, but if you are coming from a Microsoft background, it would be very familiar.

For some of the data, there were some issues with data mapping. Some of the error messages were a little bit foggy. There could be more of a quick start guide or some inline examples. The documentation could be better.

There were some latency and performance issues. The processing time took slightly longer than I was hoping for. I wasn't sure if that was a licensing issue or construction of how we did the product. It wasn't super clear to me why and how those occurred. There was think time between steps. I am not sure if they can reduce the latency there. 

Lead BI&A Consultant at a computer software company with 10,001+ employees

We didn't have a very good experience. The first steps were very easy but it turned out that we used Europe for a Microsoft data center, also partly abroad for our alpha notes. As soon as we started using Azure Data Factory, the bills got higher and higher. At first we couldn't understand why, but it is very expensive to put data into a data center abroad. So instead, we decided to use only Northern Europe, which worked out for a while in the beginning. And then we had nothing to show for it. They gave me a really hard time for this.

It should be cheaper to use Azure Data Factory to move data to a data center abroad as a safeguard in case of disasters.

What I really miss is the integration of Microsoft TED quality services and Microsoft Data services. If they were to combine those features in Data Factory, I think they would have a very strong proposition. They promise something like that on Microsoft Congress. That was years ago and it's still not here.

Thomas Fuchs - PeerSpot reviewer
General Manager Data & Analytics at a tech services company with 1,001-5,000 employees

I'm more of a general manager. I don't have any insights in terms of missing features or items of that nature.

Integration of data lineage would be a nice feature in terms of DevOps integration. It would make implementation for a company much easier. I'm not sure if that's already available or not. However, that would be a great feature to add if it isn't already there.

Richard Griffin - PeerSpot reviewer
Manager Data & Analytics at Fletcher Building

It does not appear to be as rich as other ETL tools. It has very limited capabilities. It simply moves data around. It's not very good at what comes after that: taking the data to the next level and modeling it.

IT Analyst at a tech vendor with 10,001+ employees

There should be a way that it can do switches, so if at any point in time I want to do some hybrid mode of making any data collections or ingestions, I can just click on a button. I can change a switch and make sure a batch can be a streaming process.

Principal Engineer at a computer software company with 501-1,000 employees

Snowflake connectivity was recently added and if the vendor provided some videos on how to create data then that would be helpful. I think that everything is there, but we need more tutorials.

Data engineer at INICON

Some things could be better; overall, however, it's fine.

The performance and stability are touch and go.

The deployment should be easier.

We’d like the management of the solution to run a little more smoothly.

CTO at a construction company with 1,001-5,000 employees

The pricing scheme is very complex and difficult to understand. Analyzing it upfront is impossible, so we just decided to start using it and figure out the costs on a weekly or monthly basis.

PRESIDENT at a computer software company with 51-200 employees

Azure Data Factory can improve by having support in the drivers for change data capture.

Anil Jha - PeerSpot reviewer
Director D&A at Iris Software Inc.

I find that Azure Data Factory is still maturing, so there are issues. For example, there are many features missing that you can find in other products.

You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats. For example, there are problems dealing with data that is comma-delimited.

Gyanendu Rai - PeerSpot reviewer
Senior Tech Consultant at Crowe

Microsoft is constantly upgrading its product. Changes can happen every week. Every time you open Data Factory you see something new and need to study what it is. I would like to be informed about the changes ahead of time, so we are aware of what's coming.

In future releases, I would like to see Azure Data Factory simplify how the information of logs is presented. Currently, you need to do a lot of clicks and go through steps to find out what happened. It takes too much time. The log needs to be more user-friendly.

Andres Felipe Naranjo - PeerSpot reviewer
.NET Architect at a computer software company with 10,001+ employees

It would be better if it had machine learning capabilities. For example, at the moment, we're working with Databricks and Azure Data Factory. But Databricks is very complex to do the different data flows. It could be great to have more functionalities to do that in Azure Data Factory.

Data engineer at Target

Areas for improvement would be the product's performance and its mapping of data flow. In addition, some of the optimization techniques are not scalable, some naming connections are not supported, and automated testing is not supported in all cases. In the next release, I would like to see support so we can enhance based on the next-level pipelines, writing from scratch, flexible scheduling, and pipeline activity.

IT Functional Analyst at an energy/utilities company with 1,001-5,000 employees

One area for improvement is documentation. At present, there isn't enough documentation on how to use Azure Data Factory in certain conditions. It would be good to have documentation on the various use cases.

Sometimes, it's really difficult to find the answers to very technical questions regarding certain conditions.

Balan Murugan - PeerSpot reviewer
Azure Technical Architect at a computer software company with 10,001+ employees

The pricing model for Data Factory is quite complex. It needs to be simplified and made easier to understand.

We have experienced some issues with the integration. This is an area that needs improvement.

Soumyojit Sen - PeerSpot reviewer
Director of Product Management at EIM solutions

The speed and performance can be further improved.

This solution should be able to connect with custom APIs.

Senior Data Engineer at a real estate/law firm with 201-500 employees

The only thing I wish it had was real-time replication when replicating data over, rather than just allowing you to drop all the data and replace it. It would be beneficial if you could replicate it.

Real-time replication is required, and this is not a simple task.

Head of Product at Tata Consultancy

We use this solution within a limited context, specifically for extracting data and moving it to the Azure Cloud to develop our BI solution. Based on our usage, we have not found any challenges using the solution but have not explored every feature. The one element of the solution that we have used and could be improved is the user interface. 

Sunil Singh - PeerSpot reviewer
Lead Engineering at GlobalLogic

The performance could be better. It would be better if Azure Data Factory could handle a higher load. I have heard that it can get overloaded, and it can't handle it.

Enterprise Data Architect at a financial services firm with 201-500 employees

It can improve from the perspective of active logging. It can provide active logging information.

It should provide support for change data capture on several other platforms.

PedroNavarro - PeerSpot reviewer
BI Development & Validation Manager at JT International SA

Occasionally, there are problems within Microsoft itself that impact the Data Factory and cause it to fail.

Senior Systems Analyst at a non-profit with 201-500 employees

There is no built-in function for automatically adding notifications concerning the progress or outline of a pipeline run. I believe this requires a bit of development through power ups, but it would be nice to have a task for firing off an email through the GUI.
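As a rough illustration of the "bit of development" mentioned above, the sketch below builds a plain-text notification message for a pipeline run using only Python's standard library. Everything in it is an assumption made for illustration: the function name `build_run_notification`, the pipeline name, the run ID, and the addresses are hypothetical. It only constructs the message; how the run status is obtained and how the email is sent are left out.

```python
from email.message import EmailMessage


def build_run_notification(pipeline_name: str, run_id: str, status: str,
                           sender: str, recipient: str) -> EmailMessage:
    """Build (but do not send) a notification email for one pipeline run.

    Hypothetical helper; the inputs would have to come from whatever
    mechanism reports pipeline run results in your environment.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Pipeline {pipeline_name} run {status}"
    msg.set_content(
        f"Pipeline: {pipeline_name}\n"
        f"Run ID: {run_id}\n"
        f"Status: {status}\n"
    )
    return msg


# Example values are placeholders, not real resources.
note = build_run_notification(
    "CopySalesData", "run-0001", "Failed",
    "adf-alerts@example.com", "data-team@example.com",
)
print(note["Subject"])
```

Sending the message would need an SMTP relay or a mail API, which is the piece the reviewer would like to see offered as a task in the GUI.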
