We are aware of this issue and are working on a fix. With large numbers of boxes, the inference engine may fail due to memory constraints. When that happens, the inference endpoint returns a `CHECK` fail. We have observed that inference on the following code snippet may fail in this way:

    >>> boxes = list(range(4))
    >>> boxes[0]
    tensorflow.python.ops.tensor.DenseTensor of size 4 with shape=(5, 1))>

As you can see, the code above is a simple feed of numbers. This failure can be very problematic for production systems, as it may cause the inference process to restart, potentially leading to another failure when the CPU is already overloaded.

To work around this issue, we have implemented a quick fix. When TensorFlow encounters a `CHECK` fail due to memory constraints, it will instead return an `ERROR` fail, which will be logged to the console.
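
The difference between a fatal `CHECK` fail and a recoverable `ERROR` fail can be illustrated with a small sketch that does not use TensorFlow at all. Everything in it is hypothetical: `MAX_BOXES`, `InferenceError`, `run_inference`, and `safe_inference` are illustrative names, not part of the TensorFlow API, and the limit is an arbitrary stand-in for a real memory budget. It only shows the general idea of rejecting an oversized request with a logged error instead of taking the whole process down.

```python
import logging

logger = logging.getLogger("inference_sketch")

# Hypothetical cap standing in for a real memory budget.
MAX_BOXES = 1000


class InferenceError(Exception):
    """Recoverable failure, analogous to an ERROR rather than a CHECK abort."""


def run_inference(boxes):
    """Stand-in for an inference call; rejects inputs that exceed the budget."""
    if len(boxes) > MAX_BOXES:
        raise InferenceError(
            f"{len(boxes)} boxes exceeds the limit of {MAX_BOXES}"
        )
    # Placeholder "model": report how many boxes were processed.
    return len(boxes)


def safe_inference(boxes):
    """Return (result, error); log the error instead of crashing the process."""
    try:
        return run_inference(boxes), None
    except InferenceError as exc:
        logger.error("inference failed: %s", exc)
        return None, str(exc)
```

With this sketch, a small input such as `list(range(4))` yields a result and no error, while an input longer than `MAX_BOXES` yields no result and an error message that the caller can handle.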

TensorFlow API Issues

The TensorFlow API is still under development, so the behavior of existing code may change in future releases. This includes x86-32 and x86-64 code running under Python 2.7, 3.4, or 3.5 with TensorFlow on Linux or OS X (Windows is not currently supported).

Install TensorFlow with GPU support

To make the workaround described above easier for users to adopt, we have also created a pip package for TensorFlow. Two points apply:
- The latest release includes GPU support by default and installs the necessary packages.
- The latest release is available on PyPI: https://pypi.python.org/pypi/tensorflow-gpu
If you would like to manually install TensorFlow with GPU support, see our documentation for more information:
- https://github.com/tensorflow/tensorflow/blob/master/README.md#how-to-install-with-gpu

Build Variables

We have also implemented a fix to better detect when the inference engine is failing due to memory constraints and abort the inference process. If an `ERROR` fail occurs while doing inference, then TensorFlow will stop the inference process, which will not restart until more memory becomes available.
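
The stop-and-wait behavior described above can be sketched as a small gate around the inference call. This is an assumption-laden illustration, not TensorFlow's implementation: `InferenceGate`, `free_memory_fn`, and `min_free_bytes` are hypothetical names, the memory probe is supplied by the caller, and Python's built-in `MemoryError` stands in for the `ERROR` fail.

```python
class InferenceGate:
    """Blocks inference after a failure until enough memory is reported free."""

    def __init__(self, free_memory_fn, min_free_bytes):
        self._free_memory_fn = free_memory_fn  # hypothetical memory probe
        self._min_free_bytes = min_free_bytes
        self._stopped = False

    @property
    def stopped(self):
        return self._stopped

    def run(self, infer_fn, batch):
        # While stopped, resume only once the probe reports enough free memory.
        if self._stopped:
            if self._free_memory_fn() < self._min_free_bytes:
                return None
            self._stopped = False
        try:
            return infer_fn(batch)
        except MemoryError:
            # Stop instead of retrying immediately, to avoid a restart loop.
            self._stopped = True
            return None
```

The design choice here is that a failed call flips the gate closed, and later calls are skipped rather than retried until the probe reports at least `min_free_bytes`, which avoids repeated failures on an already overloaded machine.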

Timeline

Published on: 09/16/2022 23:15:00 UTC
Last modified on: 09/20/2022 14:43:00 UTC

References