What is our primary use case?
This is mainly for the edge. For example, a customer may need to process data at the edge rather than in an on-site data center. That can be at a contracting site, on a ship at sea, or in a retail shop.
What is most valuable?
Dell PowerEdge XR-Series offers multiple configurations that can run outside data center environments without the need for standard cooling and power. The modular design includes a small chassis with multiple slots for compute sleds, which helps customers with expandability.
Modularity means everything is compartmentalized, with a chassis containing the shared parts. If there is a problem on a certain compute sled, such as a power issue or a failed disk, it only impacts that sled and does not affect the full cluster or data center.
Redundant power supply is very important. In these kinds of setups, the power supplies need to be configured as redundant so that if one goes down, the other can take the full workload.
Security is built with a zero trust approach, starting from the BIOS and firmware. This ensures that no unauthorized access to the device takes place.
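The redundancy condition described above can be sketched as a simple check: after losing one power supply, the remaining ones must still cover the load. The function name and the wattage figures in the usage note are illustrative assumptions, not XR-Series specifications.

```python
def survives_single_psu_failure(load_watts, psu_watts, psu_count):
    """Return True if the remaining PSUs can carry the load after one fails.

    Hypothetical helper for illustration; all values are supplied by the caller.
    """
    if psu_count < 2:
        # A single PSU has nothing to fail over to.
        return False
    remaining_capacity = psu_watts * (psu_count - 1)
    return load_watts <= remaining_capacity
```

For example, with two assumed 1100 W supplies, a 900 W load passes the check, while a 1300 W load does not, because a single remaining supply could not carry it.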
What needs improvement?
RAID is very important because it protects your storage through redundancy. When writing data, you need to ensure that if one disk fails, you can restore your data. Within this kind of chassis, the servers use SSDs, NVMe drives, and hard drives, so PERCs that support different types of drives are also very important. Normally NVMe is meant to run without a PERC, directly on the bus, which gives you something like a software RAID.
The RAID level depends on how critical the data is. It can be single drive failure protection, double drive failure protection, or full redundancy through mirroring. In a highly critical environment, where data must stay available even when a disk fails, full mirroring is the approach. However, this impacts capacity utilization. For example, with 10 TB of raw capacity, full mirroring leaves 5 TB usable for customers to write data, because the other 5 TB is used only for protection. For less critical data, single drive failure protection like RAID 5 can provide about 75% usable space (for example, with a four-drive group), which means 7.5 TB usable compared to 5 TB with mirroring.
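The capacity trade-off above can be worked through with a small calculation. This is a minimal sketch using the standard capacity formulas for mirroring, RAID 5, and RAID 6. The function name and the choice of four 2.5 TB drives (10 TB raw) are assumptions made to match the numbers in the example, and real usable space will be somewhat lower after formatting overhead.

```python
def usable_capacity_tb(drive_count, drive_tb, level):
    """Approximate usable capacity for a group of equal-sized drives.

    Hypothetical helper for illustration.
    level: "mirror" (full mirroring), "raid5" (single drive failure
    protection), or "raid6" (double drive failure protection).
    """
    if level == "mirror":
        if drive_count < 2 or drive_count % 2 != 0:
            raise ValueError("mirroring needs an even number of drives")
        return drive_count * drive_tb / 2
    if level == "raid5":
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drive_count - 1) * drive_tb
    if level == "raid6":
        if drive_count < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drive_count - 2) * drive_tb
    raise ValueError(f"unknown RAID level: {level}")


# Four 2.5 TB drives give 10 TB raw.
for level in ("mirror", "raid5", "raid6"):
    print(level, usable_capacity_tb(4, 2.5, level), "TB usable")
```

With these assumed drives, mirroring gives 5 TB and RAID 5 gives 7.5 TB, matching the figures above. The RAID 5 ratio changes with the number of drives in the group, so 75% only holds for a four-drive group.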
What do I think about the scalability of the solution?
Scalability is possible through both scale-up and scale-out approaches. It is very important to work on scalability from the design point of view. We start from the customer's requirements and then design with an extra 30 to 50% of headroom to cover three to five years, so that customers can scale within five years and add components easily. Additionally, you can scale out by adding new nodes to the cluster.
Scalability is not an issue in terms of the platform itself. It mainly depends on whether the customer's data centers have adequate power, cooling, and space. It can also be an issue if customers have hosted their environment with a managed services provider. In that case, they do not want to consume more power or more virtual machines, so they need to ensure that whatever server they have can be scaled internally by adding more memory DIMMs, more drives, or more GPUs, provided there are enough PCIe slots.
When it comes to the CPU, we prefer not to scale the CPU itself. Scaling CPU means scaling out by adding nodes, which is a more visible change.
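The headroom rule above comes down to simple arithmetic. The sketch below assumes the 30 to 50% is total headroom over the planning period rather than annual growth, and the function name and the 64-core starting requirement are hypothetical values chosen for illustration.

```python
def sized_capacity(current_requirement, headroom_pct):
    """Return the design target: today's requirement plus a headroom percentage.

    Hypothetical helper; headroom_pct is the total headroom for the whole
    planning period (for example 30 to 50 for three to five years).
    """
    if headroom_pct < 0:
        raise ValueError("headroom_pct must not be negative")
    return current_requirement * (100 + headroom_pct) / 100


# Assumed requirement of 64 cores today, sized at both ends of the range.
low = sized_capacity(64, 30)
high = sized_capacity(64, 50)
print(f"design for between {low} and {high} cores")
```

The same calculation applies to memory, storage, or GPU count. If the customer expects growth to compound year over year, the target would need to be worked out differently.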
How are customer service and support?
On the point of technical support, I am not saying this is a disadvantage, even though it was highlighted. The technical team is capable of managing support. The point is about ensuring that we sell the right support contract to our customers and set the right expectations from day one. This could be improved.
What other advice do I have?
Dell PowerEdge XR-Series is very popular and we sell it as a standard offering. Dell has a smart approach these days, selling the XR-Series as a purpose-built design. Normally servers are highly customizable, which gives customers and partners the freedom to configure them to customer requirements. However, this flexibility needs to be controlled by a compatibility matrix to ensure best practices, and that control should come from the architect. Sometimes, for price or design reasons, these guidelines are not followed. A purpose-built device is an appliance that ships with the best-practice configuration, so there are fewer performance issues.
Dell PowerEdge XR-Series follows the market trend. Dell is moving with what is happening in the compute industry. Customers no longer come with requirements for blade servers. Blades are still in Dell's portfolio but may be going end of sale, because compute is now moving to GPUs, and blades in a chassis are limited in power and cooling. I feel Dell is going where the compute industry is going, which is mainly towards rack-mounted servers.
All Dell PowerEdge servers now have a price impact due to inflation and shortages of memory, SSDs, NVMe drives, and even some CPUs. Intel CPUs have longer lead times these days, particularly some of the 16th generation CPUs. This is driving the increase in price, and in discussions with customers, they now negotiate more on availability than on price. If a customer needs something very soon, they do not ask about target prices.
The latest Dell PowerEdge models support many more cores per platform. The higher core count comes with support for more DIMMs and better PCIe cards, and we are working with some customers on consolidation. For example, if customers have 10 to 15 servers that are five years old, we consolidate them into four or five servers because more powerful platforms are now available. The power and cooling will be less in terms of quantity, but we need to use more power supplies to support the environment.
We compete with other vendors such as Supermicro, Lenovo, and HPE, especially around the AI trend. Dell has a complete story behind AI. The main building block of AI is GPU servers, where Dell plays a big role in the market through its go-to-market strategy. It is not about selling a server with a GPU but about selling customers a solution.
Across everything we discussed, including zero trust, RAID, power, memory, and CPUs, the redundancy happens by design. For example, the power supplies can be designed as redundant so that if one fails, the full server runs on a single power supply until a replacement is made. Beyond RAID, when you create virtual machines inside the server, you duplicate them across different nodes so that the environment runs fully redundant. The same applies to network cards: you can add multiple network cards to avoid a single point of failure.
I would rate this solution overall as a nine out of ten.
Which deployment model are you using for this solution?
On-premises