Neuron SDK Release - October 26, 2023
What’s New
This release adds support for PyTorch 2.0 (Beta), improves performance for both training and inference workloads, and adds the ability to train models like Llama-2-70B using neuronx-distributed. With this release, we are also adding pipeline parallelism support to neuronx-distributed, enabling full 3D parallelism to easily scale training to large model sizes. Neuron 2.15 also introduces support for training resnet50, milesial/Pytorch-UNet and deepmind/vision-perceiver-conv models using torch-neuronx, as well as new sample code for flan-t5-xl model inference using neuronx-distributed, in addition to other performance optimizations, minor enhancements and bug fixes.
What’s New | Details | Instances |
---|---|---|
Neuron Distributed (neuronx-distributed) for Training | Pipeline parallelism support; see API Reference Guide (neuronx-distributed), pp_developer_guide and pipeline_parallelism_overview. Llama-2-70B model training script (sample script) (tutorial). Mixed precision support; see pp_developer_guide. Serialized checkpoint saving and loading via the save_xser and load_xser parameters (see the usage sketch after the table); see API Reference Guide (neuronx-distributed). See more at Neuron Distributed Release Notes (neuronx-distributed). | Trn1/Trn1n |
Neuron Distributed (neuronx-distributed) for Inference | flan-t5-xl model inference script (tutorial). See more at Neuron Distributed Release Notes (neuronx-distributed) and API Reference Guide (neuronx-distributed). | Inf2, Trn1/Trn1n |
Transformers Neuron (transformers-neuronx) for Inference | Serialization support for Llama, Llama-2, GPT2 and BLOOM models; see developer guide and tutorial. See more at Transformers Neuron (transformers-neuronx) release notes. | Inf2, Trn1/Trn1n |
PyTorch Neuron (torch-neuronx) | Introducing PyTorch 2.0 Beta support (see the sketch after the table); see Introducing PyTorch 2.0 Support (Beta) and the llama-2-7b training, bert training and t5-3b inference samples. Scripts for training resnet50 [Beta], milesial/Pytorch-UNet [Beta] and deepmind/vision-perceiver-conv [Beta] models. | Trn1/Trn1n, Inf2 |
AWS Neuron Reference for Nemo Megatron library (neuronx-nemo-megatron) | Llama-2-70B model training sample using pipeline parallelism and tensor parallelism (tutorial). GPT-NeoX-20B model training using pipeline parallelism and tensor parallelism. See more at AWS Neuron Reference for Nemo Megatron (neuronx-nemo-megatron) Release Notes and the neuronx-nemo-megatron GitHub repo. | Trn1/Trn1n |
Neuron Compiler (neuronx-cc) | New llm-training argument for the --distribution_strategy compiler option, enabling optimizations related to distributed training (see the example after the table); see Neuron Compiler CLI Reference Guide (neuronx-cc) and Neuron Compiler (neuronx-cc) release notes. | Inf2, Trn1/Trn1n |
Neuron Tools | The alltoall Collective Communication operation, previously released in Neuron Collectives v2.15.13, was added as a testable operation in nccom-test; see NCCOM-TEST User Guide. See more at Neuron System Tools. | Inf1, Inf2, Trn1/Trn1n |
Documentation Updates | New App Note and Developer Guide on activation memory reduction using sequence parallelism and activation recomputation in neuronx-distributed. Added a new Model Samples and Tutorials summary page; see Model Samples and Tutorials. Added a Neuron SDK Classification guide; see Neuron Software Classification. See more at Neuron Documentation Release Notes. | Inf1, Inf2, Trn1/Trn1n |
Minor enhancements and bug fixes | See Neuron Components Release Notes. | Trn1/Trn1n, Inf2, Inf1 |
Release Artifacts | See Release Artifacts. | Trn1/Trn1n, Inf2, Inf1 |
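The serialized checkpointing support in neuronx-distributed for training can be exercised roughly as follows. This is a minimal sketch, not the documented API surface: the save_xser and load_xser parameters are named in these release notes, while the neuronx_distributed.parallel_layers.save/load entry points, their argument order, and the toy model are assumptions to be checked against the API Reference Guide (neuronx-distributed).

```python
import torch
import neuronx_distributed as nxd

# Toy stand-ins; a real script would use tensor/pipeline-parallel modules
# built with neuronx-distributed and an initialized distributed environment.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

state = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}

# save_xser / load_xser (named in these release notes) select the serialized
# checkpoint format; the parallel_layers.save/load entry points and their
# exact signatures are assumptions; confirm them in the API Reference Guide.
nxd.parallel_layers.save(state, "ckpt", save_xser=True)

loaded = nxd.parallel_layers.load("ckpt", load_xser=True)
model.load_state_dict(loaded["model"])
```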
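With the PyTorch 2.0 (Beta) support in torch-neuronx, training scripts keep the familiar XLA-device flow. Below is a minimal, generic sketch; the toy model, data and hyperparameters are illustrative only, and the referenced samples (llama-2-7b, bert, t5-3b) remain the authoritative starting points.

```python
import torch
import torch_xla.core.xla_model as xm  # provided via torch-neuronx's torch-xla dependency

# Place an illustrative model on the Neuron (XLA) device.
device = xm.xla_device()
model = torch.nn.Linear(128, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# Random batch standing in for a real dataloader.
inputs = torch.randn(8, 128).to(device)
labels = torch.randint(0, 2, (8,)).to(device)

optimizer.zero_grad()
loss = loss_fn(model(inputs), labels)
loss.backward()
optimizer.step()
xm.mark_step()  # flush the accumulated XLA graph so it is compiled and executed on Neuron
```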
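The new llm-training argument to the compiler's --distribution_strategy option is typically passed through the NEURON_CC_FLAGS environment variable before the first graph compilation in a torch-neuronx training script. This is a hedged sketch: the option name comes from these release notes, and the full set of accepted values and how they are passed is documented in the Neuron Compiler CLI Reference Guide (neuronx-cc).

```python
import os

# NEURON_CC_FLAGS is forwarded to neuronx-cc when graphs are compiled.
# Set it before the first compilation so the llm-training distribution
# strategy (distributed-training optimizations) takes effect.
os.environ["NEURON_CC_FLAGS"] = (
    os.environ.get("NEURON_CC_FLAGS", "") + " --distribution_strategy=llm-training"
).strip()
```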
For more detailed release notes of the new features and resolved issues, see Neuron Components Release Notes.
To learn about the model architectures currently supported on Inf1, Inf2, Trn1 and Trn1n instances, please see Model Architecture Fit Guidelines.