Releases: aws-neuron/aws-neuron-sdk
Neuron SDK Release - August 19, 2020
Bug fix for an error reporting issue in the Neuron Runtime. Previous versions of the runtime reported uncorrectable errors on only half of the DRAM per Inferentia. Other Neuron packages are unchanged.
Neuron SDK Release - August 08, 2020
This release of the Neuron SDK delivers performance enhancements for the BERT Base model. Sequence lengths of 128, 256 and 512 were found to perform best at batch sizes of 6, 3 and 1 respectively, using publicly available versions of both PyTorch (1.5.x) and TensorFlow-based (1.15.x) models. The compiler option "-O2" was used in all cases.
A new Kubernetes scheduler extension is included in this release to improve pod scheduling on inf1.6xlarge and inf1.24xlarge instance sizes. Details on how the scheduler works and how to apply it can be found here. Check the Neuron K8 release notes for detailed changes to k8 components going forward.
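The sequence-length to batch-size pairing reported for BERT Base can be captured as a small lookup. This is only a sketch: the function name `best_batch_size` is hypothetical, and the values are simply the three measurements quoted in this release, which were taken with the "-O2" compiler option.

```shell
# Prints the batch size reported as best for BERT Base at a given sequence length.
# Only the three sequence lengths measured in this release are known.
best_batch_size() {
    case "$1" in
        128) echo 6 ;;
        256) echo 3 ;;
        512) echo 1 ;;
        *)   echo "no measured batch size for sequence length $1" >&2
             return 1 ;;
    esac
}

# Example: look up the batch size to try first for sequence length 128.
best_batch_size 128
```

Other sequence lengths were not measured in this release, so the function reports an error for them rather than guessing.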
Neuron SDK Release - August 5, 2020
Bug fix for a latent issue caused by a race condition in the Neuron Runtime that could lead to crashes. The crash was observed under stress load conditions. All customers are encouraged to update to the latest Neuron Runtime package (aws-neuron-runtime), version 1.0.8813.0 or newer. Other Neuron packages are being updated as well, but those are considered non-critical updates.
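One way to check whether an installed runtime already meets the 1.0.8813.0 minimum is a version-aware comparison. The sketch below assumes GNU `sort -V` is available; the helper names `version_at_least` and `runtime_update_status` are hypothetical. The installed version string would typically come from the package manager, for example `dpkg-query -W -f='${Version}' aws-neuron-runtime` on Ubuntu or `rpm -q --queryformat '%{VERSION}' aws-neuron-runtime` on Amazon Linux 2, and it is an assumption that the string those commands return has the same four-part form used in these notes.

```shell
# Succeeds (exit status 0) when version $1 is the same as or newer than version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum aws-neuron-runtime version named in this release.
MIN_RUNTIME_VERSION="1.0.8813.0"

# Hypothetical helper: reports whether the given installed version needs the update.
runtime_update_status() {
    if version_at_least "$1" "$MIN_RUNTIME_VERSION"; then
        echo "ok"
    else
        echo "update needed"
    fi
}

# Example with an older runtime version string.
runtime_update_status "1.0.6905.0"
```

A plain string comparison would not be reliable here, because the third field has to be compared numerically rather than character by character.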
Neuron SDK Release - July 16, 2020
This release of the Neuron SDK adds support for the OpenPose (posenet) neural network. An example of using OpenPose for end-to-end inference is available here.
A new PyTorch auto-partitioner feature now automatically builds a Neuron-specific graph representation of PyTorch models. The key benefit of this feature is automatic partitioning of the model graph so that supported operators run on the NeuronCores and the rest run on the host. The PyTorch auto-partitioner is enabled by default and can be disabled if a manual partition is needed. More details here. The release also includes various bug fixes and increased operator support.
Important to know:
- This update moves the supported version for PyTorch to the current release (PyTorch 1.5.1)
- This release supports Python 3.7 Conda packages in addition to Python 3.6 Conda packages
Neuron SDK Release - June 18, 2020
Point fix for an error related to yum downgrade/update of Neuron Runtime packages. The prior release fails to downgrade/update the Neuron Runtime Base package and the Neuron Runtime package when using yum on Amazon Linux 2.
Please remove and then install both packages on AL2 using these commands:
# Amazon Linux 2
sudo yum remove aws-neuron-runtime-base
sudo yum remove aws-neuron-runtime
sudo yum install aws-neuron-runtime-base
sudo yum install aws-neuron-runtime
Neuron SDK Release - June 11, 2020
This Neuron release provides support for the recent launch of EKS for Inf1 instance types and numerous other improvements. More details about how to use EKS with the Neuron SDK can be found in AWS documentation here.
This release adds initial support for OpenPose PoseNet for images with resolutions up to 400x400.
This release also adds a '-O2' option to the Neuron Compiler. '-O2' can help with handling of large tensor inputs.
In addition, the Neuron Compiler increments the version of the compiled artifacts, called "NEFF", to version 1.0. Neuron Runtime versions earlier than the 1.0.6905.0 release in May 2020 will not be able to execute NEFFs compiled from this release forward. Please see the Neuron Runtime Release Notes for compatibility.
Stay up to date on future improvements and new features by following the Neuron SDK Roadmap.
Refer to the detailed release notes for more information on each Neuron component.
Important to know:
- Size of neural network. The current Neuron compiler release is limited in the size of neural network it can effectively optimize. The size of a neural network is influenced by a number of factors including: a) type of neural network (CNN, LSTM, MLP), b) number of layers, c) sizes of input (dimensions of the tensors, batch size, ...). Using the Neuron Compiler '-O2' option can help with handling of large tensor inputs for some models. If it is not used, Neuron limits CNN models like ResNet to an input size of 480x480 fp16/32, batch size=4; LSTM models like GNMT to a time step limit of 900; and MLP models like BERT to an input size limit of sequence length=128, batch=8.
- INT8 data type is not currently supported by the Neuron compiler.
- Neuron does not support TensorFlow 2 or PyTorch 1.4.0.
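The default size limits listed above can be turned into a quick pre-compilation check. This is a rough sketch only: the function name `within_default_limits` is hypothetical, and it assumes each stated figure acts as an independent upper bound, which the notes do not spell out.

```shell
# Succeeds when a model configuration stays within the default (no '-O2') limits.
# Assumption: each figure from the release notes is an independent upper bound.
# Usage: within_default_limits cnn  <height> <width> <batch>
#        within_default_limits lstm <time_steps>
#        within_default_limits mlp  <sequence_length> <batch>
within_default_limits() {
    case "$1" in
        cnn)  [ "$2" -le 480 ] && [ "$3" -le 480 ] && [ "$4" -le 4 ] ;;
        lstm) [ "$2" -le 900 ] ;;
        mlp)  [ "$2" -le 128 ] && [ "$3" -le 8 ] ;;
        *)    echo "unknown model type: $1" >&2
              return 2 ;;
    esac
}

# Example: a BERT-style model with sequence length 512 and batch size 1.
if within_default_limits mlp 512 1; then
    echo "within default limits"
else
    echo "exceeds default limits; consider the '-O2' compiler option"
fi
```

Exceeding a limit here only suggests trying '-O2'; the notes say the option helps for some models, not all.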
Neuron SDK Release - May 15, 2020
Point fix for an error related to installation of the Neuron Runtime Base package. The prior release fails to start Neuron Discovery when the Neuron Runtime package is not also installed. This scenario of running Neuron Discovery alone is critical to users of Neuron in container environments.
Please update the aws-neuron-runtime-base package:
# Ubuntu 18 or 16:
sudo apt-get update
sudo apt-get install aws-neuron-runtime-base
# Amazon Linux, CentOS, RHEL
sudo yum update
sudo yum install aws-neuron-runtime-base
Neuron SDK Release - May 11, 2020
This release provides additional throughput improvements for inference on a variety of models. For example, BERT-Large throughput has improved by an additional 35% compared to the previous release, with peak throughput of 360 seq/second on inf1.xlarge (more details here).
In addition to the performance boost, this release adds PyTorch and MXNet framework support for BERT models, and expands container support in preparation for an upcoming EKS launch.
We continue to work on new features and on improving performance further. To stay up to date, follow this repository and our Neuron roadmap.
Refer to the detailed release notes for more information for each Neuron component.
Important to know:
- Size of neural network. The current Neuron compiler release is limited in the size of neural network it can effectively optimize. The size of a neural network is influenced by a number of factors including: a) type of neural network (CNN, LSTM, MLP), b) number of layers, c) sizes of input (dimensions of the tensors, batch size, ...). As a result, we limit CNN models like ResNet to an input size of 480x480 fp16/32, batch size=4; LSTM models like GNMT to a time step limit of 900; and MLP models like BERT to an input size limit of sequence length=128, batch=8.
- INT8 data type is not currently supported by the Neuron compiler.
- Neuron does not support TensorFlow 2 or PyTorch 1.4.0.
Neuron SDK Release - March 26, 2020
This release supports a variant of the SSD object detection network. An SSD inference demo is available here.
This release also enhances our TensorBoard support to enable CPU-node visibility.
Refer to the detailed release notes for more information for each Neuron component.
Important to know:
- Size of neural network. The current Neuron compiler release is limited in the size of neural network it can effectively optimize. The size of a neural network is influenced by a number of factors including: a) type of neural network (CNN, LSTM, MLP), b) number of layers, c) sizes of input (dimensions of the tensors, batch size, ...). As a result, we limit CNN models like ResNet to an input size of 480x480 fp16/32, batch size=4; LSTM models like GNMT to a time step limit of 900; and MLP models like BERT to an input size limit of sequence length=128, batch=8.
- INT8 data type is not currently supported by the Neuron compiler.
- Neuron does not support TensorFlow 2 or PyTorch 1.4.0.
Neuron SDK Release - February 27, 2020
This release improves throughput by up to 10%; for example, ResNet-50 on inf1.xlarge has increased from 1800 img/sec to 2040 img/sec. Neuron logs now include more detailed messages, and the release contains various bug fixes. Refer to the detailed release notes for more details.
We continue to work on new features and on improving performance further. To stay up to date, follow this repository and watch the AWS Neuron developer forum.
Important to know:
- Size of neural network. The current Neuron compiler release is limited in the size of neural network it can effectively optimize. The size of a neural network is influenced by a number of factors including: a) type of neural network (CNN, LSTM, MLP), b) number of layers, c) sizes of input (dimensions of the tensors, batch size, ...). As a result, we limit CNN models like ResNet to an input size of 480x480 fp16/32, batch size=4; LSTM models like GNMT to a time step limit of 900; and MLP models like BERT to an input size limit of sequence length=128, batch=8.
- Computer-vision object detection and segmentation models are not yet supported.
- INT8 data type is not currently supported by the Neuron compiler.
- Neuron does not support TensorFlow 2 or PyTorch 1.4.0.