How to Integrate a Custom TLT Semantic Segmentation Model into NVIDIA® Jetson™ Modules?
WHAT YOU WILL LEARN
1- How to Generate an Engine Using tlt-converter?
2- How to Deploy the Model in DeepStream?
ENVIRONMENT
Hardware: DSBOX-NX2
OS: JetPack 4.5
In this blog post, we will show how to integrate a custom segmentation model, previously trained with the NVIDIA® Transfer Learning Toolkit, into Jetson™ modules with DeepStream. To learn how to train a custom segmentation model with the Transfer Learning Toolkit, click here.
Before we get started, you need to download tlt-converter to generate an engine and ds-tlt-segmentation to test the model.
To install tlt-converter, go to the following page, then download and unzip the archive for JetPack 4.5:
https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/tensorrt.html#tlt-converter-matrix
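Once the archive is downloaded, unzip it and make the binary executable. A minimal sketch, assuming the archive was saved to your home directory (the actual file name will differ depending on the release you downloaded):
cd ~
unzip tlt_converter.zip        # replace with the actual archive name you downloaded
chmod +x tlt-converter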
You need to install the OpenSSL development package as well.
sudo apt-get install libssl-dev
Export the following environment variables.
export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"
export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"
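These exports only last for the current shell session. If you want them to persist across reboots, you could append them to your ~/.bashrc, for example:
echo 'export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"' >> ~/.bashrc
echo 'export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"' >> ~/.bashrc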
To install ds-tlt, go to the following GitHub repository and build the deepstream_tlt_apps sample applications according to the instructions. You do not need to download the models, since we will use our own model.
https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps#2-build-sample-application
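If you prefer the command line, a minimal sketch of fetching the sources (check the repository README for the branch matching your DeepStream version):
git clone https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps.git
cd deepstream_tlt_apps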
How to Generate an Engine Using tlt-converter?
Make sure you have the exported models (.etlt files) that we previously trained.
Now, we will generate an engine using tlt-converter. You can find the usage of tlt-converter by running the following command.
./tlt-converter -h
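The commands below use the following options; this summary is based on the tlt-converter help output, so double-check it against -h on your version:
-k : the encryption key used when the model was exported (here "forecr")
-p : optimization profile for the dynamic input, given as <input_name>,<min_shape>,<opt_shape>,<max_shape>
-e : path of the TensorRT engine file to generate
-t : engine data type (fp32, fp16 or int8)
-c : INT8 calibration cache file (only needed with -t int8)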
We will show the steps for each exported model (FP16, FP32 and INT8).
FP16 Model:
cd exported_models/fp16/
/home/nvidia/tlt-converter -k forecr -p input_1,1x3x320x320,2x3x320x320,3x3x320x320 segmentation_model_fp16.etlt -e fp16.engine -t fp16
FP32 Model:
cd exported_models/fp32/
/home/nvidia/tlt-converter -k forecr -p input_1,1x3x320x320,2x3x320x320,3x3x320x320 segmentation_model_fp32.etlt -e fp32.engine -t fp32
INT8 Model (do not forget to add the calibration cache file):
cd exported_models/int8/
/home/nvidia/tlt-converter -k forecr -p input_1,1x3x320x320,2x3x320x320,3x3x320x320 segmentation_model_int8.etlt -e int8.engine -t int8 -c cal.bin
How to Deploy the Model in DeepStream?
Before running ds-tlt, we need to make changes to the Makefiles.
Set NVDS_VERSION:=5.1 in apps/Makefile and post_processor/Makefile.
Now, you can set the CUDA version and run make.
export CUDA_VER=xy.z   # xy.z is the CUDA version, e.g. 10.2 for JetPack 4.5
make
Also, go to the DeepStream sample application's source code and change the input/output dimensions according to your model. For example, for the ResNet18 3-channel model, set the following:
#define MODEL_OUTPUT_WIDTH 320
#define MODEL_OUTPUT_HEIGHT 320
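The nvinfer configuration file passed to the app below (configs/unet_tlt/pgie_unet_tlt_config.txt) must also point at your own model instead of the sample one. A minimal sketch of the relevant keys, assuming the FP16 model generated above; the paths are placeholders you should adapt to where your files actually live:
tlt-model-key=forecr
tlt-encoded-model=<path-to>/segmentation_model_fp16.etlt
model-engine-file=<path-to>/fp16.engine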
Finally, you can run ds-tlt-segmentation to test the model. Supported formats for the test input are JPEG and H.264. You can use the sample videos and pictures that come with DeepStream, under /opt/nvidia/deepstream/deepstream-5.0/samples/streams.
./apps/tlt_segmentation/ds-tlt-segmentation -c configs/unet_tlt/pgie_unet_tlt_config.txt -i streams/sample_720p.jpeg
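Similarly, for an H.264 video input you can point -i at one of the DeepStream sample streams (the path below assumes the default DeepStream 5.0 install location):
./apps/tlt_segmentation/ds-tlt-segmentation -c configs/unet_tlt/pgie_unet_tlt_config.txt -i /opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_720p.h264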
Thank you for reading our blog post.