December 18, 2020 —
Posted by Wei Wei, TensorFlow Developer Advocate
The task of recovering a high resolution (HR) image from its low resolution counterpart is commonly referred to as Single Image Super Resolution (SISR). While interpolation methods, such as bilinear or bicubic interpolation, can be used to upsample low resolution images, the quality of the resulting images is generally less appealing. Deep learning, especially Generative Adversarial Networks, has been applied successfully to generate more photo-realistic images, for example in SRGAN and ESRGAN. In this blog, we are going to use a pre-trained ESRGAN model from TensorFlow Hub and generate super resolution images with TensorFlow Lite in an Android app. The final app is shown below, and the complete code has been released in the TensorFlow examples repository for reference.
First, we can conveniently load the ESRGAN model from TFHub and convert it to a TFLite model. Note that here we are using dynamic range quantization and fixing the input image dimensions to 50x50. The converted model has been uploaded to TFHub, but we want to demonstrate how the conversion is done in case you want to do it yourself (for example, to try a different input size in your own app):
import tensorflow as tf
import tensorflow_hub as hub

# Load the pre-trained ESRGAN model from TensorFlow Hub.
model = hub.load("https://tfhub.dev/captain-pool/esrgan-tf2/1")
concrete_func = model.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
# Fix the input shape to a single 50x50 RGB image.
concrete_func.inputs[0].set_shape([1, 50, 50, 3])
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
# Apply dynamic range quantization.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Save the TFLite model.
with tf.io.gfile.GFile('ESRGAN.tflite', 'wb') as f:
  f.write(tflite_model)
esrgan_model_path = './ESRGAN.tflite'
You can also convert the model without hardcoding the input dimensions at conversion time and later resize the input tensor at runtime, as TFLite now supports inputs of dynamic shapes. Please refer to this example for more information.
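For instance, the snippet below is a minimal sketch of resizing the input tensor at runtime, assuming the model was converted with dynamic input dimensions (the 100x100 size is just an illustration):
# Sketch: resize the input tensor at runtime instead of fixing it at
# conversion time (assumes a model converted with dynamic dimensions).
interpreter = tf.lite.Interpreter(model_path=esrgan_model_path)
input_index = interpreter.get_input_details()[0]['index']
interpreter.resize_tensor_input(input_index, [1, 100, 100, 3])
interpreter.allocate_tensors()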
Once the model is converted, we can quickly verify that the ESRGAN TFLite model does generate a much better image than bicubic interpolation. We also have another tutorial on ESRGAN if you want to better understand the model.
import cv2
import tensorflow as tf

# Load the test image and convert OpenCV's BGR channel order to RGB.
lr = cv2.imread(test_img_path)
lr = cv2.cvtColor(lr, cv2.COLOR_BGR2RGB)
# Add a batch dimension and cast to float32, as the model expects.
lr = tf.expand_dims(lr, axis=0)
lr = tf.cast(lr, tf.float32)

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path=esrgan_model_path)
interpreter.allocate_tensors()

# Get input and output tensor details.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run the model.
interpreter.set_tensor(input_details[0]['index'], lr)
interpreter.invoke()

# Extract the output and post-process it into a displayable uint8 image.
output_data = interpreter.get_tensor(output_details[0]['index'])
sr = tf.squeeze(output_data, axis=0)
sr = tf.clip_by_value(sr, 0, 255)
sr = tf.round(sr)
sr = tf.cast(sr, tf.uint8)
LR: low resolution input image cropped from a butterfly image in the DIV2K dataset. ESRGAN (x4): super resolution output image generated using the ESRGAN model with upscale_ratio=4. Bicubic: output image generated using bicubic interpolation. As can be seen here, the image generated by bicubic interpolation is much blurrier than the ESRGAN-generated one, and PSNR is also higher on the ESRGAN-generated image.
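If you want to reproduce the PSNR comparison yourself, tf.image.psnr makes it straightforward. The snippet below is a sketch that reuses lr and sr from above; hr is an assumed ground-truth high resolution crop of the same size as sr:
# Upsample the low resolution input with bicubic interpolation for comparison.
bicubic = tf.squeeze(tf.image.resize(lr, [200, 200], method='bicubic'), axis=0)
bicubic = tf.cast(tf.clip_by_value(bicubic, 0, 255), tf.uint8)
# PSNR against an assumed ground truth crop 'hr' (uint8, 200x200x3).
psnr_esrgan = tf.image.psnr(sr, hr, max_val=255)
psnr_bicubic = tf.image.psnr(bicubic, hr, max_val=255)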
As you may already know, TensorFlow Lite is the official framework for running inference with TensorFlow models on edge devices; it is deployed on more than 4 billion edge devices worldwide and supports Android, iOS, Linux-based IoT devices and microcontrollers. You can use TFLite in Java, C/C++ or other languages to build Android apps. In this blog we will use the TFLite C API, since many developers have asked for such an example.
We distribute TFLite C header files and libraries in prebuilt AAR files (the core library and the GPU library). To set up the Android build correctly, the first thing we need to do is download the AAR files and extract the header files and shared libraries. Have a look at how this is done in the download.gradle file.
Since we are using the Android NDK to build the app (NDK r20 is confirmed to work), we need to let Android Studio know how the native files should be handled. This is done in the CMakeLists.txt file.
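The snippet below is a minimal sketch of what such a CMakeLists.txt typically contains; the target name and the TFLITE_HEADER_DIR/TFLITE_LIB_DIR variables are illustrative, so please refer to the actual file in the repository:
cmake_minimum_required(VERSION 3.4.1)

# Build the native JNI sources into a shared library (names are illustrative).
add_library(SuperResolution SHARED
            super_resolution.cc
            super_resolution_jni.cc)

# Point the compiler at the TFLite C headers extracted from the AAR.
include_directories(${TFLITE_HEADER_DIR})

# Import the TFLite shared library extracted from the AAR.
add_library(tensorflowlite_jni SHARED IMPORTED)
set_target_properties(tensorflowlite_jni PROPERTIES IMPORTED_LOCATION
                      ${TFLITE_LIB_DIR}/libtensorflowlite_jni.so)

# Link against TFLite and the Android logging library.
find_library(log-lib log)
target_link_libraries(SuperResolution tensorflowlite_jni ${log-lib})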
We included 3 sample images in the app, so a user may easily run the same model multiple times, which means we need to cache the interpreter for better efficiency. This is done by passing the interpreter pointer from C++ code to Java code, after the interpreter is successfully created:
extern "C" JNIEXPORT jlong JNICALL
Java_org_tensorflow_lite_examples_superresolution_MainActivity_initWithByteBufferFromJNI(JNIEnv *env, jobject thiz, jobject model_buffer, jboolean use_gpu) {
const void *model_data = static_cast<void *>(env->GetDirectBufferAddress(model_buffer));
jlong model_size_bytes = env->GetDirectBufferCapacity(model_buffer);
SuperResolution *super_resolution = new SuperResolution(model_data, static_cast<size_t>(model_size_bytes), use_gpu);
if (super_resolution->IsInterpreterCreated()) {
LOGI("Interpreter is created successfully");
return reinterpret_cast<jlong>(super_resolution);
} else {
delete super_resolution;
return 0;
}
}
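On the Java side, the app only needs a matching native method and a long field to cache the returned handle. Here is a minimal sketch; apart from initWithByteBufferFromJNI, whose name is dictated by the JNI symbol above, the names are illustrative:
// Inside MainActivity (requires java.nio.ByteBuffer).
// Cached pointer to the C++ SuperResolution object (0 means not created yet).
private long nativeHandle = 0;

private native long initWithByteBufferFromJNI(ByteBuffer modelBuffer, boolean useGpu);

void initializeInterpreterIfNeeded(ByteBuffer modelBuffer, boolean useGpu) {
  if (nativeHandle == 0) {
    // Create the interpreter once and reuse it for subsequent runs.
    nativeHandle = initWithByteBufferFromJNI(modelBuffer, useGpu);
  }
}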
Once the interpreter is created, running the model is fairly straightforward, as we can follow the TFLite C API documentation. We first need to carefully extract the RGB values from each pixel (a sketch of this conversion, along with the reverse packing step, follows the snippet below). Now we can run the interpreter:
// Feed the input into the model.
status = TfLiteTensorCopyFromBuffer(input_tensor, input_buffer,
                                    kNumberOfInputPixels * kImageChannels * sizeof(float));
...
// Run the interpreter.
status = TfLiteInterpreterInvoke(interpreter_);
...
// Extract the output tensor data.
const TfLiteTensor* output_tensor = TfLiteInterpreterGetOutputTensor(interpreter_, 0);
float output_buffer[kNumberOfOutputPixels * kImageChannels];
status = TfLiteTensorCopyToBuffer(output_tensor, output_buffer,
                                  kNumberOfOutputPixels * kImageChannels * sizeof(float));
...
With the model's results at hand, we can pack the RGB values back into each pixel.
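The following is a sketch of those two pixel-conversion steps. The buffer and constant names follow the snippet above (the constants assume the 50x50 input and x4 upscaling used here), and the loop bodies are illustrative rather than the exact code from the repository:
#include <algorithm>
#include <cstdint>

// Constants mirroring the snippet above (50x50 input, x4 upscale -> 200x200).
constexpr int kImageChannels = 3;
constexpr int kNumberOfInputPixels = 50 * 50;
constexpr int kNumberOfOutputPixels = 200 * 200;

// Extract float RGB values from ARGB_8888 pixels before inference.
void PixelsToFloatRGB(const uint32_t* pixels, float* input_buffer) {
  for (int i = 0; i < kNumberOfInputPixels; ++i) {
    input_buffer[i * kImageChannels] = static_cast<float>((pixels[i] >> 16) & 0xFF);     // R
    input_buffer[i * kImageChannels + 1] = static_cast<float>((pixels[i] >> 8) & 0xFF);  // G
    input_buffer[i * kImageChannels + 2] = static_cast<float>(pixels[i] & 0xFF);         // B
  }
}

// Pack the model's float RGB output back into ARGB_8888 pixels.
void FloatRGBToPixels(const float* output_buffer, uint32_t* pixels) {
  auto clamp = [](float v) {
    return static_cast<uint32_t>(std::max(0.0f, std::min(255.0f, v)));
  };
  for (int i = 0; i < kNumberOfOutputPixels; ++i) {
    uint32_t r = clamp(output_buffer[i * kImageChannels]);
    uint32_t g = clamp(output_buffer[i * kImageChannels + 1]);
    uint32_t b = clamp(output_buffer[i * kImageChannels + 2]);
    pixels[i] = 0xFF000000u | (r << 16) | (g << 8) | b;  // Opaque alpha.
  }
}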
There you have it: a reference Android app using TFLite to generate super resolution images on device. More details can be found in the code repository. Hopefully this is useful as a reference for Android developers who are getting started with TFLite in C/C++ to build amazing ML apps.
We are looking forward to seeing what you have built with TensorFlow Lite, as well as hearing your feedback. Share your use cases with us directly or on Twitter with hashtags #TFLite and #PoweredByTF. To report bugs and issues, please reach out to us on GitHub.
The author would like to thank @captain__pool for uploading the ESRGAN model to TFHub, and Tian Lin and Jared Duke for the helpful feedback.
[1] Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi. 2016. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network.
[2] Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Chen Change Loy, Yu Qiao, Xiaoou Tang. 2018. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.
[3] TensorFlow 2.x based implementation of EDSR, WDSR and SRGAN for single image super-resolution
[4] @captain__pool's ESRGAN code implementation
[5] Eirikur Agustsson, Radu Timofte. 2017. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study.