08 July 20
Rony_Sklar - PeerSpot reviewer
Community Manager at PeerSpot (formerly IT Central Station)

How do you select the right cloud ETL tool?

What features are most important to compare when selecting a cloud ETL tool?

2 Answers
Haja Najumudeen - PeerSpot reviewer
IT Linux Administrator and Cloud Architect at Gateway Gulf
Real User
Top 5
04 November 20

AWS Glue and Azure Data Factory are the best-performing cloud services for ELT.

Rony_Sklar - PeerSpot reviewer
Community Manager at PeerSpot (formerly IT Central Station)
05 November 20

@Haja Najumudeen thanks. What features are most important to keep in mind when choosing a tool?

EY
Solution Development Director at Komtaş Bilgi Yönetimi
User
12 October 21

I think you have to focus on your workload, your human resources, and your data integration strategy.


For example, if you would like to load data incrementally from an RDBMS and manage data lineage, you have to review the data management features of the tools. If that feature is not important and throughput matters most to you, you have to focus on the data processing power of the tool.
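To make the incremental-load point concrete, here is a minimal sketch of the high-water-mark pattern that most ETL tools implement for incremental extraction: each run pulls only rows changed since the last saved watermark. The SQLite database and `orders` table are hypothetical, purely for illustration.

```python
import sqlite3

# Hypothetical source table, stood up in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2022-01-01"), (2, "2022-02-01"), (3, "2022-03-01")],
)

# High-water mark saved from the previous run.
last_watermark = "2022-01-15"

# Incremental load: extract only rows changed since the last run.
rows = conn.execute(
    "SELECT id, updated_at FROM orders "
    "WHERE updated_at > ? ORDER BY updated_at",
    (last_watermark,),
).fetchall()

# Advance the watermark so the next run skips these rows.
new_watermark = rows[-1][1] if rows else last_watermark
```

A real pipeline would persist `new_watermark` somewhere durable (a state table, a file, the tool's own metadata store) so that each scheduled run resumes where the last one stopped.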


There is no single best solution on the market. You have to define your requirements before selecting your criteria.


If you would like a design that is independent of cloud vendors like GCP, AWS, or Azure, I recommend looking at the Informatica Cloud product family. It has really good solutions for data integration.

Related Questions
Janet Staver - PeerSpot reviewer
Tech Blogger
Jul 25, 2022
What are the benefits?
Beth Safire - PeerSpot reviewer
Tech Blogger
06 July 22
Streaming analytics enables real-time logging, processing, and analyzing of data and events from streaming sources. This is essential in industries that consistently rely on real-time information, such as news corporations, financial companies, investment houses, and utility management companies. Streaming analytics tools continuously run queries to gain information and perform actions on real-time data. When choosing a streaming analytics tool, here are some features to look out for:

Machine learning capabilities: Employ machine learning training models on real-time data streams to make continuous and accurate forecasts. Machine learning is also used to filter and enhance streamed data, and to help visualize collected data and generated predictions. In addition, you can utilize machine learning to help detect anomalies in your current data by comparing it with previously logged data.

SQL support: Continuous SQL queries perform powerful transformations on streaming data, saving data analysts a substantial amount of time and effort. SQL queries also enable users to detect events in real time by running queries from current streams against queries from past event streams.

User-friendly dashboards: Your streaming analytics tool should allow you to create operational dashboards with interactive charts and graphs that provide real-time monitoring and enable data transformation and alert configuration. You should also be able to create role-based user interfaces for different groups within your organization. It is important that you have a graphical view of all data streams and data relationships, with information on the streamed data's origin, destination, and transformation history.

Support for multiple programming languages: Programmers and data analysts should be able to use code to automate the data streaming processes and execute various functions on the streamed data in any popular programming language, such as Python, Java, C#, Ruby, or JavaScript.

Rapid scaling: The ability to instantly scale your applications is essential for handling sudden increases in generated data. If you are using a cloud-based service, you should be charged only for the virtual machines and the associated storage and resources your company consumes.

High performance: In real time, data is ingested from thousands of sources. A powerful streaming analytics tool should be able to process millions of events every second.

Security and compliance: Extended data usage within your company increases the urgency of following regulations and the importance of data security. To uphold strict security and compliance requirements, your team should be able to define and enforce universal regulation standards, and should be provided with a complete set of data security management tools, from encryption of all incoming and outgoing communications and in-memory processing to private networking.

Integration with external systems: Flexible integration options can add value to your existing streamed data. For example, integrating with external data sources that live-stream data from events, or with sites that provide real-time market information, can help inform your own business choices.
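Several of the features above (continuous queries, rapid scaling, high performance) come down to one core operation: computing aggregates over a moving window of events as they arrive. Here is a minimal, tool-agnostic sketch of a sliding-window average over a stream of (timestamp, value) events; the window size and the sample events are illustrative assumptions, not any particular product's API.

```python
from collections import deque

def windowed_averages(events, window=10):
    """Yield (timestamp, running average) over a sliding time window.

    `events` is an iterable of (timestamp, value) pairs in arrival order;
    `window` is the window length in the same time units as the timestamps.
    """
    buf = deque()
    out = []
    for ts, value in events:
        buf.append((ts, value))
        # Evict events that have fallen out of the window.
        while buf and buf[0][0] <= ts - window:
            buf.popleft()
        out.append((ts, sum(v for _, v in buf) / len(buf)))
    return out

# Hypothetical sensor readings: (seconds, reading).
events = [(1, 10.0), (5, 20.0), (12, 30.0), (14, 40.0)]
averages = windowed_averages(events)
```

Production streaming engines express the same idea declaratively (e.g. a windowed `GROUP BY` in streaming SQL) and distribute the window state across workers, but the eviction-and-aggregate loop is the underlying mechanic.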
Evgeny Belenky - PeerSpot reviewer
Director of Community at PeerSpot (formerly IT Central Station)
25 July 22
Hi @Anirban Bhattacharya, @reviewer1438992, @reviewer1582965, @Ertugrul Akbas, @Wallace Hugh and @reviewer1488372, please share your expert opinion. Thanks.
Netanya Carmi - PeerSpot reviewer
Content Manager at PeerSpot (formerly IT Central Station)
Jun 12, 2022
What makes it the best?
KR
Associate Director - Head of Client Custom Technology Solutions at Arcesium
20 May 22
There are many of them; the choice largely depends on your specific use case or problem statement. Could you elaborate on what you are looking to solve?
Dovid Gelber - PeerSpot reviewer
Tech blogger
12 June 22
Our organization was searching for a streaming analytics tool that would be able to adequately meet our needs. During our search, we compared many different streaming analytics tools and solutions. We found Databricks and Amazon Kinesis to be the most effective ones.

One of the things that I initially noticed about Azure Databricks was the high level of flexibility that it affords my team. Databricks is designed to enable us to customize both the way that our data processing solution functions and the platform where it runs. It makes it possible for us to ensure that we have the tools to conduct our data analysis in the way that best addresses our needs. Databricks accomplishes this by providing us with a number of extremely useful features. These capabilities include:

Many common programming languages: Databricks is designed so that we can make use of the programming languages that are most appropriate for the project that we are undertaking.

Integration with Microsoft Azure: Databricks allows us to integrate our system with any Microsoft Azure solution that we want. We can tailor our system to best fit our analytical and data processing needs. Our team never needs to worry that some crucial feature is missing: the Microsoft Azure solution suite provides a wide variety of options, so if something is missing, the Azure suite is likely to have the function that we seek.

Cloud-native nature: Databricks is a cloud-native solution, so we can run all of our operations out of the cloud. Databricks is compatible with every major cloud provider, which means we can use the cloud environment of our choice to host our data processing activities.

A major benefit that Databricks provides us is its flexibility. It enables us to handle workloads of many different sizes: its cloud architecture can handle both large loads of data and much smaller tasks. This solution makes other data analysis and processing solutions unnecessary.

One of the aspects of Amazon Kinesis that I appreciate is the way that it enables us to take in, store, and process data in real time. Amazon Kinesis leverages machine learning and gives us the ability to quickly turn raw data into valuable insights. We do not need to devote hours or days to data processing; Kinesis's ML capabilities provide us with immense value while saving us time and other resources.

We can also use this solution to scale up our data processing capabilities. Amazon Kinesis can be set to scale up our data stream processing as the need for growth increases, elastically expanding our capacity as the stream of data running through our system grows. This lets us make sure that our system keeps up with our analysis and processing demands.

After trying out both Databricks and Amazon Kinesis, we found that they both empowered us to take complete control of every aspect of our data analysis and processing.
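On the elastic-scaling point: stream services such as Kinesis spread load by routing each record to a shard based on a hash of its partition key (Kinesis itself uses an MD5 hash of the key mapped onto per-shard hash ranges). The sketch below is a deliberately simplified illustration of that routing idea, not the actual Kinesis algorithm or API; the modulo mapping and the sample keys are assumptions for demonstration.

```python
import hashlib

def shard_for(partition_key: str, num_shards: int) -> int:
    """Route a record to a shard by hashing its partition key.

    Simplified illustration: real services map the hash onto contiguous
    key ranges per shard rather than taking a plain modulo.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Records with the same key always land on the same shard, preserving
# per-key ordering; adding shards spreads keys across more workers.
keys = ["user-1", "user-2", "user-3", "user-4"]
routing = {k: shard_for(k, num_shards=2) for k in keys}
```

The consequence for capacity planning is that throughput scales with the number of shards only if partition keys are well distributed; a single hot key still maps to a single shard.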