The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs. CNTK allows users to easily realize and combine popular model types such as feed-forward DNNs, convolutional nets (CNNs), and recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation and parallelization across multiple GPUs and servers. CNTK has been available under an open-source license since April 2015. It is our hope that the community will take advantage of CNTK to share ideas more quickly through the exchange of open source working code.
If you prefer to use the latest CNTK bits from master, use one of the CNTK nightly packages:
You can learn more about using and contributing to CNTK with the following resources:
Dear community,
With our ongoing contributions to ONNX and the ONNX Runtime, we have made it easier to interoperate within the AI framework ecosystem and to access high performance, cross-platform inferencing capabilities for both traditional ML models and deep neural networks. Over the last few years we have been privileged to develop such key open-source machine learning projects, including the Microsoft Cognitive Toolkit, which has enabled its users to leverage industry-wide advancements in deep learning at scale.
Today’s 2.7 release will be the last main release of CNTK. We may have some subsequent minor releases for bug fixes, but these will be evaluated on a case-by-case basis. There are no plans for new feature development post this release.
The CNTK 2.7 release has full support for ONNX 1.4.1, and we encourage those seeking to operationalize their CNTK models to take advantage of ONNX and the ONNX Runtime. Moving forward, users can continue to leverage evolving ONNX innovations via the number of frameworks that support it. For example, users can natively export ONNX models from PyTorch or convert TensorFlow models to ONNX with the TensorFlow-ONNX converter.
We are incredibly grateful for all the support we have received from contributors and users over the years since the initial open-source release of CNTK. CNTK has enabled both Microsoft teams and external users to execute complex and large-scale workloads in all manner of deep learning applications, such as historical breakthroughs in speech recognition achieved by Microsoft Speech researchers, the originators of the framework.
As ONNX is increasingly employed in serving models used across Microsoft products such as Bing and Office, we are dedicated to synthesizing innovations from research with the rigorous demands of production to progress the ecosystem forward.
Above all, our goal is to make innovations in deep learning across the software and hardware stacks as open and accessible as possible. We will be working hard to bring both the existing strengths of CNTK and new state-of-the-art research into other open-source projects to truly broaden the reach of such technologies.
With gratitude,
-- The CNTK Team
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
You can find more news on the official project feed
2019-03-29. CNTK 2.7.0
To set up the build and runtime environment on Windows:
To set up the build and runtime environment on Linux using Docker, please build the Ubuntu 16.04 Docker image using the Dockerfiles here. For other Linux systems, please refer to the Dockerfiles to set up the dependent libraries for CNTK.
CNTK models with recursive loops can be exported to ONNX models with scan ops.
To export models larger than 2GB in ONNX format, use the cntk.Function API save(self, filename, format=ModelFormat.CNTKv2, use_external_files_to_store_parameters=False) with format set to ModelFormat.ONNX and use_external_files_to_store_parameters set to True. In this case, model parameters are saved in external files. Exported models must be accompanied by these external parameter files when they are evaluated with onnxruntime.
2018-11-26.
Netron now supports visualizing CNTK v1 and CNTK v2 .model files.
2018-09-17. CNTK 2.6.0
The implementation of group convolution in CNTK has been updated. The updated implementation moves away from creating a sub-graph for group convolution (using slicing and splicing), and instead uses cuDNN7 and MKL2017 APIs directly. This improves the experience both in terms of performance and model size.
As an example, for a single group convolution op with the following attributes:
The comparison numbers for this single node are as follows:
Implementation | GPU exec. time (ms, 1000-run avg.) | CPU exec. time (ms, 1000-run avg.) | Model size (KB, CNTK format) |
---|---|---|---|
Old implementation | 9.349 | 41.921 | 38 |
New implementation | 6.581 | 9.963 | 5 |
Speedup/savings | Approx. 30% | Approx. 65-75% | Approx. 87% |
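The savings row can be rechecked from the two measurement rows. The snippet below is a minimal sketch in plain Python; the helper name relative_saving is hypothetical and not part of CNTK. It computes the relative reduction for each column of the table.

```python
def relative_saving(old, new):
    """Return the reduction from old to new as a percentage of old."""
    return (old - new) / old * 100.0

# Figures copied from the table above: (old implementation, new implementation).
measurements = {
    "gpu_ms": (9.349, 6.581),
    "cpu_ms": (41.921, 9.963),
    "model_size_kb": (38, 5),
}

# Relative reduction per column, rounded to one decimal place.
savings = {name: round(relative_saving(old, new), 1)
           for name, (old, new) in measurements.items()}

for name, percent in savings.items():
    print(f"{name}: {percent}% lower")
```

For this single node the reductions come to roughly 30% (GPU time), 76% (CPU time), and 87% (model size).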
The implementation of sequential convolution in CNTK has been updated. The updated implementation creates a separate sequential convolution layer. Unlike the regular convolution layer, this operation also convolves over the dynamic (sequence) axis, and filter_shape[0] is applied to that axis. The updated implementation supports a broader range of cases, such as stride > 1 on the sequence axis.
For example, consider a sequential convolution over a batch of one-channel black-and-white images. The images all have the same fixed height of 640, but their widths vary. The width is therefore represented by the sequence axis. Padding is enabled, and the strides for both width and height are 2.
>>> import cntk as C
>>> from cntk.layers import SequentialConvolution
>>> from cntk.layers.typing import Sequence, Tensor
>>> f = SequentialConvolution((3,3), reduction_rank=0, pad=True, strides=(2,2), activation=C.relu)
>>> x = C.input_variable(**Sequence[Tensor[640]])
>>> x.shape
(640,)
>>> h = f(x)
>>> h.shape
(320,)
>>> f.W.shape
(1, 1, 3, 3)
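The halving of the static axis from 640 to 320 is what one would expect from a padded convolution with stride 2. The sketch below assumes the common convention that, with padding enabled, the output length is ceil(input_length / stride). The function name is hypothetical and this is not CNTK's own code.

```python
import math

def padded_conv_output_length(input_length, stride):
    """Output length of a padded ("same"-style) convolution.

    Assumption: with padding enabled, the kernel size does not affect the
    output length, which is ceil(input_length / stride).
    """
    if input_length <= 0 or stride <= 0:
        raise ValueError("input_length and stride must be positive")
    return math.ceil(input_length / stride)

# Static (height) axis from the example above: 640 with stride 2.
print(padded_conv_output_length(640, 2))

# Under the same assumption, a sequence of 5 steps with stride 2 on the
# sequence (width) axis would yield 3 output steps.
print(padded_conv_output_length(5, 2))
```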
There is a breaking change in the depth_to_space and space_to_depth operators. These have been updated to match the ONNX specification; specifically, the permutation that determines how the depth dimension is placed as blocks in the spatial dimensions, and vice versa, has changed. Please refer to the updated doc examples for these two ops to see the change.
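To make the permutation concrete, here is a minimal pure-Python sketch of a depth-to-space rearrangement for a single image stored as nested lists. It assumes the ONNX DepthToSpace ordering in which the block offsets vary slowest along the input depth; the function is illustrative only and is not CNTK's implementation, so the CNTK doc examples remain the reference.

```python
def depth_to_space(x, block_size):
    """Rearrange a [C][H][W] nested list into [C / b^2][H * b][W * b].

    Assumption: ONNX DepthToSpace ordering, where input channel
    (i * b + j) * C_out + c feeds output position (c, h * b + i, w * b + j).
    """
    b = block_size
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if channels % (b * b) != 0:
        raise ValueError("channel count must be divisible by block_size ** 2")
    c_out = channels // (b * b)
    y = [[[None] * (width * b) for _ in range(height * b)] for _ in range(c_out)]
    for c in range(c_out):
        for h in range(height):
            for w in range(width):
                for i in range(b):          # row offset inside the block
                    for j in range(b):      # column offset inside the block
                        y[c][h * b + i][w * b + j] = x[(i * b + j) * c_out + c][h][w]
    return y

# Eight 1x1 channels holding the values 0..7, rearranged with 2x2 blocks.
x = [[[float(v)]] for v in range(8)]
y = depth_to_space(x, 2)
print(y)
```

With this ordering the two output channels interleave the input depth: the first output channel collects input channels 0, 2, 4, 6 and the second collects 1, 3, 5, 7.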
Added support for trigonometric ops Tan and Atan.
Added support for alpha attribute in ELU op.
Updated auto padding algorithms of Convolution to produce symmetric padding at best effort on CPU, without affecting the final convolution output values. This update increases the range of cases that can be covered by the MKL API and improves performance, e.g. for ResNet50.
There is a breaking change in the arguments property in CNTK python API. The default behavior has been updated to return arguments in python order instead of in C++ order. This way it will return arguments in the same order as they are fed into ops. If you wish to still get arguments in C++ order, you can simply override the global option. This change should only affect the following ops: Times, TransposeTimes, and Gemm(internal).
- LogSoftMax to use more numerically stable implementation.
- BatchNormalization op export/import to latest spec.
- DepthToSpace and SpaceToDepth ops to match ONNX spec on the permutation for how the depth dimension is placed as block dimension.
- alpha attribute in ELU ONNX op.
- Convolution and Pooling export. Unlike before, these ops do not export an explicit Pad op in any situation.
- ConvolutionTranspose export and import. Attributes such as output_shape, output_padding, and pads are fully supported.
- StopGradient as a no-op.
- Hardmax/Softmax/LogSoftmax import/export.
- Select op export.
- MatMul op.
- Gemm op.
- MeanVarianceNormalization op export/import to latest spec.
- LayerNormalization op export/import to latest spec.
- PRelu op export/import to latest spec.
- Gather op export/import to latest spec.
- ImageScaler op export/import to latest spec.
- Reduce ops export/import to latest spec.
- Flatten op export/import to latest spec.
- Unsqueeze op.
- size attribute has the semantics of diameter, not radius. Added validation if LRN kernel size is larger than channel size.
- Min/Max import implementation to handle variadic inputs.

The Cntk.Core.Managed library has officially been converted to .Net Standard and supports .Net Core and .Net Framework applications on both Windows and Linux. Starting from this release, .Net developers should be able to restore CNTK Nuget packages using the new .Net SDK-style project file with the package management format set to PackageReference.
The following C# code now works on both Windows and Linux:
>>> using CNTK;
>>>
>>> // Device on which the parameters are created (CPU assumed here).
>>> var device = DeviceDescriptor.CPUDevice;
>>> var weightParameterName = "weight";
>>> var biasParameterName = "bias";
>>> var inputName = "input";
>>> var outputDim = 2;
>>> var inputDim = 3;
>>> Variable inputVariable = Variable.InputVariable(new int[] { inputDim }, DataType.Float, inputName);
>>> var weightParameter = new Parameter(new int[] { outputDim, inputDim }, DataType.Float, 1, device, weightParameterName);
>>> var biasParameter = new Parameter(new int[] { outputDim }, DataType.Float, 0, device, biasParameterName);
>>>
>>> // Affine model: weight * input + bias.
>>> Function modelFunc = CNTKLib.Times(weightParameter, inputVariable) + biasParameter;
For example, simply adding an ItemGroup clause with a PackageReference to the CNTK package in the .csproj file of a .Net Core application is sufficient; the sample project targets netcoreapp2.1 on the x64 platform.
2018-04-16. CNTK 2.5.1
Repack CNTK 2.5 with third party libraries included in the bundles (Python wheel packages)
2018-03-15. CNTK 2.5
Change profiler details output format to be chrome://tracing
Enable per-node timing. Working example here
import cntk as C
C.debugging.debug.set_node_timing(True)
C.debugging.start_profiler() # optional
C.debugging.enable_profiler() # optional
#<trainer|evaluator|function> executions
<trainer|evaluator|function>.print_node_timing()
C.debugging.stop_profiler()
Example profiler details view in chrome://tracing
CPU inference performance improvements using MKL
cntk.cntk_py.enable_cpueval_optimization()/cntk.cntk_py.disable_cpueval_optimization()
1BitSGD incorporated into CNTK
- 1BitSGD source code is now available with CNTK license (MIT license) under Source/1BitSGD/
- 1bitsgd build target was merged into existing gpu target
New loss function: hierarchical softmax
Distributed Training with Multiple Learners
Operators
- MeanVarianceNormalization operator.
Bug fixes
- CNTKBinaryFormat deserializer when crossing sweep boundary
- mpi=no
- cntk.convert API in misc.converter.py, which prevents converting complex networks.
ONNX
- ONNX.checker compliant.
- OptimizedRNNStack operator (LSTM only).
- MeanVarianceNormalization.
- Identity.
- LayerNormalization layer using ONNX MeanVarianceNormalization op.
- Concat operator.
- LeakyReLu (argument ‘alpha’ reverted to type double).
Misc
- find_by_uid() under cntk.logging.graph.
2018-02-28. CNTK supports nightly build
If you prefer to use the latest CNTK bits from master, use one of the CNTK nightly packages.
Alternatively, you can click the corresponding build badge to go to the nightly build page.
2018-01-31. CNTK 2.4
Highlights:
OPs
- top_k operation: in the forward pass it computes the top (largest) k values and corresponding indices along the specified axis. In the backward pass the gradient is scattered to the top k elements (an element not in the top k gets a zero gradient).
- gather operation now supports an axis argument
- squeeze and expand_dims operations for easily removing and adding singleton axes
- zeros_like and ones_like operations. In many situations you can just rely on CNTK correctly broadcasting a simple 0 or 1 but sometimes you need the actual tensor.
- depth_to_space: Rearranges elements in the input tensor from the depth dimension into spatial blocks. A typical use of this operation is implementing sub-pixel convolution for some image super-resolution models.
- space_to_depth: Rearranges elements in the input tensor from the spatial dimensions to the depth dimension. It is largely the inverse of DepthToSpace.
- sum operation: Create a new Function instance that computes the element-wise sum of input tensors.
- softsign operation: Create a new Function instance that computes the element-wise softsign of an input tensor.
- asinh operation: Create a new Function instance that computes the element-wise asinh of an input tensor.
- log_softmax operation: Create a new Function instance that computes the logsoftmax normalized values of an input tensor.
- hard_sigmoid operation: Create a new Function instance that computes the hard_sigmoid normalized values of an input tensor.
- element_and, element_not, element_or, element_xor element-wise logic operations
- reduce_l1 operation: Computes the L1 norm of the input tensor's elements along the provided axes.
- reduce_l2 operation: Computes the L2 norm of the input tensor's elements along the provided axes.
- reduce_sum_square operation: Computes the sum of squares of the input tensor's elements along the provided axes.
- image_scaler operation: Alteration of image by scaling its individual values.
ONNX
- Reshape op to handle InferredDimension.
- producer_name and producer_version fields to ONNX models.
- auto_pad nor pads attribute is specified in ONNX Conv op.
- Pooling op serialization
- InputVariable with only one batch axis.
- Transpose op to match updated spec.
- Conv, ConvTranspose, and Pooling ops to match updated spec.
Operators
- Convolution op will change for groups > 1. More optimized implementation of group convolution is expected in the next release.
- Convolution layer.
Halide Binary Convolution
- Cntk.BinaryConvolution.so/dll library that can be used with the netopt module. The library contains optimized binary convolution operators that perform better than the Python-based binarized convolution operators. To enable Halide in the build, please download the Halide release and set the HALIDE_PATH environment variable before starting a build. On Linux, you can use ./configure --with-halide[=directory] to enable it. For more information on how to use this feature, please refer to How_to_use_network_optimization.
See more in the Release Notes. Get the Release from the CNTK Releases page.
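As an illustration of the top_k behaviour described in the CNTK 2.4 highlights, the following minimal sketch works on a plain Python list along a single axis. The function names are hypothetical and this is not CNTK's implementation; it only mirrors the stated semantics: the forward pass returns the k largest values with their indices, and the backward pass scatters the incoming gradient to those positions while every other element receives zero.

```python
def top_k_forward(values, k):
    """Return the k largest values and their indices, largest first."""
    if not 0 < k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    indices = order[:k]
    return [values[i] for i in indices], indices

def top_k_backward(output_grad, indices, input_length):
    """Scatter the gradient of the top-k values back to the input positions."""
    input_grad = [0.0] * input_length   # elements outside the top k get zero
    for grad, index in zip(output_grad, indices):
        input_grad[index] += grad
    return input_grad

values = [1.0, 5.0, 3.0, 4.0]
top_values, top_indices = top_k_forward(values, k=2)
input_grad = top_k_backward([10.0, 20.0], top_indices, len(values))
print(top_values, top_indices, input_grad)
```

Here the two largest entries sit at positions 1 and 3, so only those positions receive a non-zero gradient.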