Xelera Silva provides best-in-class throughput and latency for XGBoost, LightGBM and CatBoost inference.
Machine learning based on gradient-boosting frameworks such as XGBoost and LightGBM is increasingly used across application domains, including algorithmic trading systems, recommender systems, bioscience, and ransomware and DDoS detection systems.
Xelera Silva overcomes the latency and throughput limitations of machine learning inference: it enables users to take advantage of in-loop machine learning inference at ultra-low latency and to eliminate throughput bottlenecks.
Sub-microsecond inference latency
Bring your own model
Concurrent model execution
Model hot swapping
High-frequency traders use decision algorithms to automate trading instructions, and these automated decisions are increasingly made by AI models for which low latency is key. Silva overcomes the latency disadvantage of machine learning: inference of XGBoost, LightGBM and CatBoost models is performed with a latency of a few microseconds, enabling our clients to make better, more sophisticated trading decisions and win speed races.
The turn-key accelerator connects to the software-based trading system and offloads the XGBoost / LightGBM / CatBoost inference to a PCIe-attached hardware accelerator card (the ultra-low-latency PCIe transfer is included in the round-trip latency).
Benchmark:
Test model: XGBoost Regression
Number of features: 100
Number of trees: 1000
Number of levels: 8
Batch size: 1
Test setup:
Xelera Silva on hardware accelerator (AMD Alveo U55C)
In addition to the turnkey version, Xelera Silva is also available as an IP core. The inline machine learning accelerator is inserted into the fast path of network-bound hardware accelerators and receives its input directly from the card's network port. This way, no data needs to cross the PCIe bus, and the corresponding transfer latency is eliminated. This product is aimed at customers with their own FPGA teams and offers the lowest latency.
Developers of HFT hardware accelerators integrate the IP core into their FPGA design to benefit from AI inference at the lowest latency.
Benchmark:
Test model: XGBoost Regression
Number of features: 100
Number of trees: 1000
Number of levels: 8
Batch size: 1
Test setup:
Xelera Silva on hardware accelerator (AMD Alveo UL3524)
Xelera Silva is a turnkey full-stack solution designed to jumpstart best-in-class AI inference acceleration.
DEB / RPM packages and FPGA bitstreams for AMD Alveo U50, U55C, U200 and U250 accelerator cards and Azure NP-series virtual machines.
API: C/C++, Python, C#
Host library to load the model onto the FPGA and run inference
Jumpstart AI inference acceleration with the provided example design
Integration and full lifecycle maintenance support
Periodic software updates
We understand that integrating a solution requires not only exceptional functionality but also transparent pricing models and reliable support. As technology evolves, so do we: we are committed to continuous innovation, ensuring that our software remains at the forefront of machine learning acceleration. With regular updates and feature enhancements, you can trust that you are always leveraging the latest advancements in the field, staying ahead of the competition and unlocking new possibilities for your projects.

Contact us today to learn more about our pricing plans and support services, and unlock the full potential of your projects.
Do you have any questions?