Nov 25, 2021

What are the main storage requirements to support Artificial Intelligence and Deep Learning applications?

What should a storage vendor implement to support them? 

What should a customer be aware of?

Evgeny Belenky (EB)
Director of Community at PeerSpot (formerly IT Central Station)
2 Answers
TS
CEO at Rufusforyou LLC
Real User
Top 5
Nov 29, 2021

Storage requirements depend on several factors. 


For example, the programming language you will use, the kind and volume of data you will receive, and whether you use RAID, among other things. 
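
To make the RAID point a bit more concrete, here is a rough Python sketch of how the chosen RAID level changes the raw capacity you have to buy (the 12-drive, 8 TB configuration is just an assumed example, not a recommendation):

```python
# Rough sketch: approximate usable capacity for a few common RAID levels.
# Drive size and count are assumed values for illustration only.

def usable_tb(drives: int, drive_tb: float, raid: str) -> float:
    """Return approximate usable capacity in TB, ignoring formatting overhead."""
    if raid == "RAID 0":
        return drives * drive_tb              # striping, no redundancy
    if raid == "RAID 5":
        return (drives - 1) * drive_tb        # one drive's worth of parity
    if raid == "RAID 6":
        return (drives - 2) * drive_tb        # two drives' worth of parity
    if raid == "RAID 10":
        return drives * drive_tb / 2          # mirrored pairs
    raise ValueError(f"unknown RAID level: {raid}")

for level in ("RAID 0", "RAID 5", "RAID 6", "RAID 10"):
    print(level, usable_tb(drives=12, drive_tb=8.0, raid=level), "TB usable")
```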

Shibu Babuchandran - PeerSpot reviewer
Regional Manager/ Service Delivery Manager at a tech services company with 201-500 employees
Real User
ExpertModerator
Nov 26, 2021

Hi @Evgeny Belenky,


HERE ARE THE STORAGE REQUIREMENTS FOR DEEP LEARNING


Deep learning workloads are a special kind of beast: all DL data is effectively hot data, which rules out the usual tiered storage management approach. The SSD tier that normally serves hot data under conventional conditions simply cannot move data fast enough for the millions, billions, or even trillions of small data and metadata transfers an ML training model needs in order to learn to classify an unknown item from a limited number of examples.
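
To make the "all data is hot" point concrete, here is a minimal Python sketch of a generic shuffled-epoch access pattern (the sample and epoch counts are made-up, illustrative values):

```python
# Minimal sketch of why a training data set has no "cold" subset to tier away:
# every epoch shuffles and touches every sample, so access frequency is uniform.
# The sample and epoch counts are assumed, illustrative values.
import random
from collections import Counter

num_samples = 100_000
num_epochs = 5
access_counts = Counter()

for _ in range(num_epochs):
    order = list(range(num_samples))
    random.shuffle(order)                 # typical shuffled-epoch access pattern
    for sample_id in order:
        access_counts[sample_id] += 1     # in real training: a small random read

print("least-read sample:", min(access_counts.values()))
print("most-read sample: ", max(access_counts.values()))
# Both print the epoch count: nothing goes cold, so tiering has nothing to demote.
```

Because every sample is read the same number of times, a tiering engine never finds a cold subset to push down to a cheaper tier.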


Below are a few of the storage requirements needed to avoid the dreaded curse of dimensionality.


COST EFFICIENCY


Enormous AI data sets become an even bigger burden if they don't fit within the budget set aside for storage. Anyone who has managed enterprise data for any amount of time knows that highly scalable systems have always commanded a premium on a cost-per-capacity basis. To make sense, the ultimate deep learning storage system must be both affordable and scalable.
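
As a rough back-of-the-envelope sketch of that capacity-versus-cost trade-off, in Python (every price below is an assumed placeholder, not a real quote; substitute actual vendor pricing before drawing conclusions):

```python
# Back-of-the-envelope sketch: total storage cost for an assumed data set size.
# All prices are made-up placeholders for illustration only.

dataset_tb = 500                              # assumed training-data footprint
price_per_tb = {
    "all-flash scale-out NAS": 400.0,         # assumed $/TB
    "hybrid flash + HDD":      150.0,         # assumed $/TB
    "HDD object store":         60.0,         # assumed $/TB
}

for tier, dollars_per_tb in price_per_tb.items():
    print(f"{tier:28s} ~${dataset_tb * dollars_per_tb:>10,.0f}")
```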


PARALLEL ARCHITECTURE


To avoid the dreaded choke points that stunt a deep learning machine's ability to learn, it's essential that the storage serving the data sets have a parallel-access architecture.
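
As a client-side illustration of what parallel access buys you, here is a small, self-contained Python sketch comparing one serial reader with many concurrent readers (the file count and sizes are tiny, synthetic values; on a local cached file system the timings will be unremarkable, but against a parallel file system this concurrent pattern is what keeps GPUs fed):

```python
# Sketch of parallel data access from the client side: many workers issuing
# concurrent reads instead of one serial stream. The files are small, synthetic
# samples so the example is self-contained and runnable.
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def make_sample_files(directory, count, size_bytes):
    paths = []
    for i in range(count):
        p = directory / f"sample_{i}.bin"
        p.write_bytes(b"\0" * size_bytes)     # placeholder "training sample"
        paths.append(p)
    return paths

def read_file(path):
    return len(path.read_bytes())

with tempfile.TemporaryDirectory() as tmp:
    files = make_sample_files(Path(tmp), count=200, size_bytes=256 * 1024)

    start = time.perf_counter()
    sum(read_file(f) for f in files)                   # serial reads
    serial = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=16) as pool:   # concurrent reads
        sum(pool.map(read_file, files))
    parallel = time.perf_counter() - start

    print(f"serial:   {serial:.3f}s")
    print(f"parallel: {parallel:.3f}s")
```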


DATA LOCALITY


While many organizations may opt to keep some of their data in the cloud, most of it should remain on-site in the data center. There are at least three reasons for this: regulatory compliance, cost efficiency, and performance. For this reason, on-site storage must rival the cost of keeping the data in the cloud.
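
One hedged illustration of the cost-efficiency angle: training re-reads the full data set every epoch, so if that data has to cross a cloud or network boundary, transfer charges multiply. All the numbers below are assumed placeholders:

```python
# Rough sketch of why repeated epoch reads make data locality matter.
# Every figure here is an assumed placeholder, not a quoted price, and the
# charge only applies when the data actually leaves the provider or region.

dataset_tb = 100
epochs = 50
egress_per_gb = 0.09                          # assumed $/GB transfer charge

tb_read = dataset_tb * epochs                 # training re-reads the set each epoch
egress_cost = tb_read * 1024 * egress_per_gb

print(f"data read over the training run: {tb_read} TB")
print(f"potential transfer cost:         ${egress_cost:,.0f}")
```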


HYBRID ARCHITECTURE


As touched on above, different types of data have unique performance requirements. Thus, a storage solution should offer the right mixture of storage technologies rather than a lopsided strategy that will eventually fail. It's all about meeting ML storage performance and scalability requirements at the same time.
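
As a toy illustration of the hybrid idea, here is a hypothetical placement policy in Python that routes active training data to flash and colder copies to HDD (the pool names and threshold are invented for the example, not any vendor's API):

```python
# Toy sketch of a hybrid placement policy: recently trained-on data lands on
# flash, colder archival copies land on HDD. Names and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    size_tb: float
    days_since_last_training_run: int

def place(ds: DataSet, hot_days: int = 30) -> str:
    """Route recently used data to the flash pool, everything else to HDD."""
    if ds.days_since_last_training_run <= hot_days:
        return "nvme-flash-pool"
    return "hdd-capacity-pool"

for ds in [
    DataSet("images-current", 80.0, 2),
    DataSet("images-2019-archive", 300.0, 400),
]:
    print(ds.name, "->", place(ds))
```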


SOFTWARE-DEFINED STORAGE


Not all huge data sets are the same, especially in terms of DL and ML. While some can get by with the simplicity of pre-configured machines, others need hyperscale data centers featuring purpose-built server architectures set in place beforehand. This is what makes software-defined storage solutions the best option.
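
One hedged way to picture the software-defined idea is that the storage policy becomes data, decoupled from any particular box; the field names in this short Python sketch are hypothetical, not a real product's configuration schema:

```python
# Illustrative sketch of the software-defined idea: the policy is described in
# software and can be applied to pre-configured appliances or to commodity
# servers alike. All field names are hypothetical.
from dataclasses import dataclass

@dataclass
class PoolPolicy:
    media: str            # "nvme" or "hdd"
    replication: int      # copies kept across nodes
    stripe_width: int     # how many nodes a file is spread over

training_pool = PoolPolicy(media="nvme", replication=2, stripe_width=8)
archive_pool = PoolPolicy(media="hdd", replication=3, stripe_width=4)

print(training_pool)
print(archive_pool)
```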


Our X-AI Accelerated is an any-scale DL and ML solution that offers unmatched versatility for any organization's needs. X-AI Accelerated was engineered from the ground up and optimized for "ingest, training, data transformations, replication, metadata, and small data transfers." On top of that, RAID Inc. offers platforms that meet all the aforementioned requirements, such as the all-flash NVMe X2-AI/X4-AI and the X5-AI, a hybrid flash and hard drive storage platform.


Both the NVMe X2-AI/X4-AI and the X5-AI support parallel access to flash and deeply expandable HDD storage as well. Furthermore, the X-AI Accelerated storage platform permits one to scale out from only a few TBs to tens of PBs.

Shibu Babuchandran - PeerSpot reviewer
Regional Manager/ Service Delivery Manager at a tech services company with 201-500 employees
Real User
ExpertModerator
Jan 19, 2022

@Ariful Mondal,
Thanks for your response.
