Customized model using DevKit with KL520 and SDK
Hi,
We used a custom CNN model (the input is a 120x120 image; the output is a single float for a regression problem) and successfully converted it from ONNX to NEF.
The test with the dongle is OK (RMSE = 3-4 for RGBA8888, RMSE = 5-6 for RGB565).
We have two questions:
- Is it possible to use the RGBA8888 format with the development kit? We want to increase accuracy, so RGBA8888 is a better choice than RGB565, but the "tiny_yolo_v3" example only provides the RGB565 format for camera and display. Is there any way to use RGBA8888 on the dev kit?
- Our debug messages printed over UART are shown below. The model output should be printed as roast degree:[xx.xx]. However, the number is always 0.00. No error seems to occur while the program is running. I don't know why the result is incorrect (maybe a data size problem?). Could you help us find the solution?
Thanks!!!!!
/////The settings are shown here://///
#define IMG_W RGB_IMG_SOURCE_W
#define IMG_H RGB_IMG_SOURCE_H
#define IMG_CH 3
#define IMG_SIZE (IMG_W * IMG_H * IMG_CH)
#define IMG_FORMAT IMAGE_FORMAT_RGB565
#define DISPLAY_FORMAT V2K_PIX_FMT_RGB565
#ifdef MODEL_COMPILATION_WITH_ADD_NORM
#define INF_IMG_FORMAT (IMAGE_FORMAT_SUB128 | NPU_FORMAT_RGB565 | IMAGE_FORMAT_PARALLEL_PROC)
/////-----------------------------------------------------/////
The UART msg is shown below:
BOOT MODE: Manual
1. SPI
2. UART(Xmodem)
Please select boot mode[1-2]: 1
[0.000]
#########################################
[0.000] ## With key ##
[0.000] #########################################
[0.000] -> Bootup Status: 0x10000 <-
[0.000] : Power Button Reset
[0.000] [kmdw_cam_mipi_init] init 270548244
[0.000] [kmdw_cam_mipi_init] init 270548252
ncpu: Ready!
[0.000] === Versions (host mode) ===
[0.000] SPL: 1.1.1.0-build.2
[0.000] FW: 1.7.0.0-build.1229
[0.000]
=== Menu ===
[0.000] ( 1) Start Tiny Yolo
[0.000] ( 2) Stop Tiny Yolo
[0.000] ( 3) Toggle pipeline mode
[0.000] ( 4) Quit
[0.000] command >> 1
[4.327] cam 0: frame buf[0] : 0x63c12820
[4.327] cam 0: frame buf[1] : 0x63b31820
[4.327] cam 0: frame buf[2] : 0x63a50820
[4.328] cam 0: frame buf[3] : 0x6396f820
[4.328] cam 0: frame buf[4] : 0x6388e820
[4.328] cam 0: frame buf[5] : 0x637ad820
[4.328] cam 0: frame buf[6] : 0x636cc820
[4.329] cam 0: frame_info buf: 0x636a6e70, size: 7*60
[4.329] command >> [4.539]
=[00004539]= [0] camera # 1 (pre 00004334)
[4.539]
XXX Model predict from here!!!! XXX
[4.684] roast degree:[-0.00]
[4.684] Round trip 936941207: pre/npu/post: 3/25/0 ms
[4.685] -[00004685]- (inf out)
[4.690]
=[00004690]= [1] camera # 2 (pre 00004690)
[4.690]
XXX Model predict from here!!!! XXX
[4.718] roast degree:[-0.00]
[4.718] Round trip 943769139: pre/npu/post: 3/26/0 ms
[4.719] --> FPS: 29.41 (34 ms)
[4.719] -[00004719]- (inf out)
[4.724]
=[00004724]= [2] camera # 3 (pre 00004724)
[4.724]
...
Comments
Hi Dereck, thanks for the reply!
I tried modifying the post_process function and got a result, but it is not correct.
I printed some messages and found that the returned status is not right.
The code is shown below:
int preprocess_rgb565_bgr_crop_dennis(int model_id, struct kdp_image_s *image_p)
{
int in_row, in_col, in_ch, top, bottom, left, right, channel;
int input_radix, bit_shift;
uint8_t *src_p, *dst_p;
in_row = DIM_INPUT_ROW(image_p);
in_col = DIM_INPUT_COL(image_p);
in_ch = DIM_INPUT_CH(image_p);
top = RAW_CROP_TOP(image_p);
bottom = RAW_CROP_BOTTOM(image_p);
left = RAW_CROP_LEFT(image_p);
right = RAW_CROP_RIGHT(image_p);
channel = 2;
src_p = (uint8_t *)RAW_IMAGE_MEM_ADDR(image_p);
dst_p = (uint8_t *)PREPROC_INPUT_MEM_ADDR(image_p);
int data_size = sizeof(uint8_t)*in_ch;
int len_row = data_size*in_row;
int out_data_size = sizeof(uint8_t)*channel;
int out_len_row = out_data_size*(right-left);
for (int y = 0; y < in_col; y++)
{
for (int x = 0; x < in_row; x++)
{
if ( (x > left) && (x <= right) && (y > top) && (y <= bottom) )
{
uint16_t rgb565 = (uint16_t)*(src_p+len_row*y+data_size*x);
uint8_t r = (uint8_t)(rgb565 >> 11) << 3;
uint8_t g = (uint8_t)(rgb565 >> 5) << 2;
uint8_t b = (uint8_t)(rgb565 << 3);
uint16_t bgr565 = ((r-128)>>3) | ((g-128)<<3) | ((b-128) << 8);
*(dst_p+out_len_row*(y-top-1)+out_data_size*(x-left-1)) = bgr565;
}
}
}
return 0;
}
int post_dennis(int model_id, struct kdp_image_s *image_p)
{
struct dennis_post_globals_s *gp = get_dennis_gp();
uint8_t *result_p;
int div;
float scale;
int8_t *src_p = (int8_t *)POSTPROC_OUTPUT_MEM_ADDR(image_p);
/* Convert to float */
scale = *(float *)&POSTPROC_OUT_NODE_SCALE(image_p);
div = 1 << POSTPROC_OUT_NODE_RADIX(image_p);
gp->temp.roast_degree = (float)*src_p;
gp->temp.roast_degree = do_div_scale(gp->temp.roast_degree, div, scale);
DSG("model result:[%.2f]\n",(gp->temp.roast_degree));
result_p = (uint8_t *)(POSTPROC_RESULT_MEM_ADDR(image_p));
memcpy(result_p, &(gp->temp), sizeof(struct dennis_post_globals_s));
return sizeof(struct dennis_post_globals_s);
}
[122.240] Round trip -1321960007: pre/npu/post: 1/27/2 ms
[122.240] --> FPS: 12.35 (81 ms)
PreProcessing 0!
Run NPU!!
[122.288]
=[00122288]= [2] camera # 1471 (pre 00122288)
[122.288]
XXX Model predict from here!!!! XXX
Run Post Process!
model result:[10.02]
[122.318] [INFO] IMAGE STATE error, status:1
[122.318] -[00122318]- (inf out)
[122.320] roast degree:[10.02]
[122.321] Round trip -1305760728: pre/npu/post: 0/28/2 ms
[122.321] --> FPS: 12.35 (81 ms)
PreProcessing 0!
Run NPU!!
[122.369]
=[00122369]= [3] camera # 1472 (pre 00122369)
[122.369]
XXX Model predict from here!!!! XXX
Run Post Process!
model result:[10.02]
[122.399] [INFO] IMAGE STATE error, status:1
[122.399] -[00122399]- (inf out)
[122.401] roast degree:[10.02]
[122.402] Round trip -1289560365: pre/npu/post: 1/27/2 ms
[122.402] --> FPS: 12.35 (81 ms)
PreProcessing 0!
Run NPU!!
[122.450]
=[00122450]= [0] camera # 1473 (pre 00122450)
[122.450]
XXX Model predict from here!!!! XXX
Run Post Process!
model result:[10.02]
[122.480] [INFO] IMAGE STATE error, status:1
[122.480] -[00122480]- (inf out)
[122.482] roast degree:[10.02]
[122.483] Round trip -1273363274: pre/npu/post: 1/27/2 ms
[122.483] --> FPS: 12.35 (81 ms)
PreProcessing 0!
Run NPU!!
[122.531]
=[00122531]= [1] camera # 1474 (pre 00122531)
[122.531]
XXX Model predict from here!!!! XXX
Run Post Process!
model result:[10.02]
[122.561] [INFO] IMAGE STATE error, status:1
[122.561] -[00122561]- (inf out)
[122.563] roast degree:[10.02]
[122.564] Round trip -1257160903: pre/npu/post: 0/27/3 ms
[122.564] --> FPS: 12.35 (81 ms)
PreProcessing 0!
Run NPU!!
[122.612]
=[00122612]= [2] camera # 1475 (pre 00122612)
[122.612]
XXX Model predict from here!!!! XXX
Run Post Process!
model result:[10.02]
[122.642] [INFO] IMAGE STATE error, status:1
[122.642] -[00122642]- (inf out)
[122.644] roast degree:[10.02]
static int tiny_yolo_run_image(uint32_t app_id, struct kapp_img_run_s *img_run_p)
{
int status;
bool is_dme = false;
if (img_run_p == NULL)
return -1;
//check where the model is stored
//is_dme = kmdw_model_get_location();
kmdw_model_config_img(&img_run_p->img_cfg, &img_run_p->crop_box, &img_run_p->pad_values, img_run_p->ext_param);
kmdw_model_config_result(img_run_p->evt_id, img_run_p->evt_flag);
//status = kmdw_model_run("kapp_tiny_yolo", img_run_p->out, TINY_YOLO_V3_224_224_3, is_dme);
status = kmdw_model_run("dennis_CNN", img_run_p->out, CUSTOMER_MODEL_1, is_dme);
if (status == KMDW_MODEL_RUN_RC_ABORT) {
info_msg("[INFO] Got abort request\n");
return KAPP_ABORT;
} else if(status == KMDW_MODEL_RUN_RC_ERROR) {
info_msg("[INFO] Run Model error\n");
return KAPP_ERR;
} else if(status != IMAGE_STATE_DONE) {
info_msg("[INFO] IMAGE STATE error, status:%d\n",status);
return KAPP_ERR;
}
return KAPP_OP_OK;
}
Did you check that these parameters are correct (in_row, in_col, in_ch, top, bottom, left, right, channel...)?
The KL520 model format order is wch, and w is aligned to 16 (see FAQ 4 at the link below for reference).
End to End Simulator - Document Center (kneron.com)
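As a rough illustration of what "w aligned to 16" means for buffer indexing, here is a minimal sketch. It assumes one byte per element and the (height, channel, aligned width) order that the postprocess comment later in this thread uses; `align16` and `hcw_offset` are hypothetical helper names, not SDK functions, so please check the linked FAQ for the authoritative layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a width up to the next multiple of 16. */
static inline uint32_t align16(uint32_t w)
{
    return (w + 15u) & ~15u;
}

/* Byte offset of element (h, c, w) in a buffer laid out as
 * [height][channel][width_aligned], one byte per element.
 * ASSUMPTION: this dimension order is taken from the thread, not from the SDK. */
static inline size_t hcw_offset(uint32_t h, uint32_t c, uint32_t w,
                                uint32_t channels, uint32_t width)
{
    size_t row_bytes = align16(width); /* padded row length */
    return ((size_t)h * channels + c) * row_bytes + w;
}
```

With a width of 120, each row occupies 128 bytes, so code that steps by 120 bytes per row reads shifted data after the first row.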
Hi Dereck, thanks for your hint and link!!
I've figured out the solution!
I also changed the application. It now shows the camera video with a red rectangle (the measurement range).
When the user enters "3" at the command prompt, the dev kit measures the value (predicted by the model).
While debugging, my guess was that the image format is the most important thing.
I printed every parameter and finally found that the raw format and the input format are different!!!
(I feed the data in B, G, R order because I trained the model with BGR888 input.)
The correct preprocess, postprocess, and UART output are shown below:
(If there is any mistake, please let me know. Thank you!!!)
///////////////////////////////////////////////////////////////////////////// Preprocess
int preprocess_rgb565_bgr_crop_dennis(int model_id, struct kdp_image_s *image_p)
{
int in_row, in_col, in_ch, top, bottom, left, right, channel, raw_row, raw_col, raw_ch;
int input_radix, bit_shift;
uint8_t *src_p, *dst_p;
int32_t *len_p;
raw_row = RAW_INPUT_ROW(image_p);
raw_col = RAW_INPUT_COL(image_p);
top = RAW_CROP_TOP(image_p);
bottom = RAW_CROP_BOTTOM(image_p);
left = RAW_CROP_LEFT(image_p);
right = RAW_CROP_RIGHT(image_p);
raw_ch = 2;
in_row = DIM_INPUT_ROW(image_p);
in_col = DIM_INPUT_COL(image_p);
in_ch = DIM_INPUT_CH(image_p);
//DSG("data:%d,%d,%d,%d,%d,%d,%d\n",top,bottom,left,right,raw_row,raw_col,in_ch);
src_p = (uint8_t *)RAW_IMAGE_MEM_ADDR(image_p);
dst_p = (uint8_t *)PREPROC_INPUT_MEM_ADDR(image_p);
len_p = (int32_t *)&PREPROC_INPUT_MEM_LEN(image_p);
int data_size = sizeof(uint8_t)*raw_ch;
int len_row = data_size*raw_col;
int out_data_size = sizeof(uint8_t)*in_ch;
int out_len_row = out_data_size*in_col;
int num = 0;
for (int y = 0; y < raw_row; y++)
{
for (int x = 0; x < raw_col; x++)
{
if ( (x > left) && (x <= right) && (y > top) && (y <= bottom) )
{
uint16_t rgb565 = *(uint16_t *)(src_p+len_row*y+data_size*x);
uint8_t r = (uint8_t)(rgb565 >> 11) << 3;
uint8_t g = (uint8_t)(rgb565 >> 5) << 2;
uint8_t b = (uint8_t)(rgb565 << 3);
//uint16_t bgr565 = ((r-128)>>3) | ((g-128)<<3) | ((b-128) << 8);
//*(dst_p+out_len_row*(y-top-1)+out_data_size*(x-left-1)) = bgr565;
*(dst_p+out_len_row*(y-top-1)+out_data_size*(x-left-1)) = b-128;
*(dst_p+out_len_row*(y-top-1)+out_data_size*(x-left-1)+1) = g-128;
*(dst_p+out_len_row*(y-top-1)+out_data_size*(x-left-1)+2) = r-128;
num += 3;
}
}
}
//DSG("num:%d\n",num);
*len_p = in_row*in_col*in_ch;
//DSG("pre_len,%d\n",PREPROC_INPUT_MEM_LEN(image_p));
return 0;
}
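For readers following the pixel conversion in the preprocess above, the per-pixel step can be isolated as below. This is only a sketch with explicit bit masks; `bgr_s8_t` and `rgb565_to_bgr_sub128` are hypothetical names introduced for illustration, and the bit layout assumed is the usual RGB565 one (red in the top 5 bits, green in the middle 6, blue in the low 5).

```c
#include <stdint.h>

/* Hypothetical helper type: one pixel as signed B, G, R after subtracting 128. */
typedef struct {
    int8_t b;
    int8_t g;
    int8_t r;
} bgr_s8_t;

/* Unpack one RGB565 pixel to 8-bit channels (low bits left as zero),
 * then shift each channel from [0, 255] to [-128, 127]. */
static inline bgr_s8_t rgb565_to_bgr_sub128(uint16_t px)
{
    uint8_t r = (uint8_t)(((px >> 11) & 0x1Fu) << 3); /* 5 bits of red   */
    uint8_t g = (uint8_t)(((px >> 5) & 0x3Fu) << 2);  /* 6 bits of green */
    uint8_t b = (uint8_t)((px & 0x1Fu) << 3);         /* 5 bits of blue  */
    bgr_s8_t out;
    out.b = (int8_t)(b - 128);
    out.g = (int8_t)(g - 128);
    out.r = (int8_t)(r - 128);
    return out;
}
```

Masking each field before shifting makes it obvious that no bits from a neighbouring channel leak in, which is easier to verify than relying on truncation from the casts.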
///////////////////////////////////////////////////////////////////////////// Postprocess
int post_dennis(int model_id, struct kdp_image_s *image_p)
{
// model output dim:(H, C, W_aligned)
struct dennis_post_globals_s *gp = get_dennis_gp();
float *result_p;
int div;
float scale;
int8_t *src_p = (int8_t *)MODEL_OUTPUT_MEM_ADDR(image_p);
// Convert to float
scale = *(float *)&POSTPROC_OUT_NODE_SCALE(image_p);
div = 1 << POSTPROC_OUT_NODE_RADIX(image_p);
gp->temp.roast_degree = (float)*src_p;
gp->temp.roast_degree = do_div_scale(gp->temp.roast_degree, div, scale);
//DSG("output:[%.2f]\n",gp->temp.roast_degree);
//DSG("scale:[%.2f]\n",scale);
//DSG("div:[%d]\n",div);
result_p = (float *)(POSTPROC_RESULT_MEM_ADDR(image_p));
*result_p = gp->temp.roast_degree;
int32_t model_len = MODEL_OUTPUT_MEM_LEN(image_p);
//DSG("model_len:[%d]\n",model_len); // w aligned to 16 bytes
//DSG("result:[%.2f]\n",*result_p);
return 0;
}
///////////////////////////////////////////////////////////////////////////// UART result
BOOT MODE: Manual
1. SPI
2. UART(Xmodem)
Please select boot mode[1-2]: 1
[0.000]
#########################################
[0.000] ## With key ##
[0.000] #########################################
[0.000] -> Bootup Status: 0x10000 <-
[0.000] : Power Button Reset
[0.000] [kmdw_cam_mipi_init] init 270547368
[0.000] [kmdw_cam_mipi_init] init 270547376
ncpu: Ready!
[0.000] === Versions (host mode) ===
[0.000] SPL: 1.1.1.0-build.2
[0.000] FW: 1.7.0.0-build.1229
[0.000]
=== Menu ===
[0.000] ( 1) Start Camera
[0.000] ( 2) Stop Camera
[0.000] ( 3) measure coffee degree
[0.000] ( 4) Quit
[0.000] command >> 1
[3.272] cam 0: frame buf[0] : 0x63c12850
[3.272] cam 0: frame buf[1] : 0x63b31850
[3.272] cam 0: frame buf[2] : 0x63a50850
[3.272] cam 0: frame buf[3] : 0x6396f850
[3.273] cam 0: frame buf[4] : 0x6388e850
[3.273] cam 0: frame buf[5] : 0x637ad850
[3.273] cam 0: frame buf[6] : 0x636cc850
[3.273] cam 0: frame_info buf: 0x636a6ea0, size: 7*60
[3.274] command >> 3
[8.773] command >> [8.810] do measure!
[8.940] roast degree:[52.60]
[8.941] Round trip 1788148522: pre/npu/post: 25/25/0 ms
3
[13.286] command >> [13.327] do measure!
[13.457] roast degree:[28.39]
[13.458] Round trip -1603416481: pre/npu/post: 25/25/0 ms
3
[17.999] do measure!
[17.999] command >> [18.128] roast degree:[53.44]
[18.129] Round trip -669224694: pre/npu/post: 24/26/0 ms
3
[21.592] command >> [21.612] do measure!
[21.742] roast degree:[52.60]
[21.743] Round trip 53568456: pre/npu/post: 24/26/0 ms
3
[28.859] do measure!
[28.860] command >> [28.989] roast degree:[45.09]
[28.990] Round trip 1502966045: pre/npu/post: 24/26/0 ms
2
[46.051] command >> 4
I found that the preprocess function was wrong, but the SDK has no detailed comment on the raw data format.
Therefore, I debugged and changed the code as below. I use NPU_FORMAT_RGBA8888, but my model
only uses B, G, R (three channels), so I set A to 0. Is that correct? What exactly is the raw image format? (In the code I use H, C, W.)
After many wrong answers, I finally have a version that gives "some" right answers (from the UART msg), as shown below.
The correct answer should be 45.225, but I get 10.02 many times. Sometimes it's 45.
Is the result always available only after the NPU has finished processing? Or should I give the NPU some time to process?
////////////////////////////////////////// preprocess:
src_p = (uint8_t *)RAW_IMAGE_MEM_ADDR(image_p);
dst_p = (uint8_t *)PREPROC_INPUT_MEM_ADDR(image_p);
int len_row = sizeof(uint8_t)*raw_col;
int data_row = len_row*raw_ch;
int out_len_row = sizeof(uint8_t)*in_col;
int out_data_row = out_len_row*in_ch;
for (int y = 0; y < raw_row; y++)
{
for (int x = 0; x < raw_col; x++)
{
if ( (x >= left) && (x < right) && (y >= top) && (y < bottom) )
{
uint8_t a = *(uint8_t *)(src_p+data_row*y+len_row*0+x);
uint8_t b = *(uint8_t *)(src_p+data_row*y+len_row*1+x);
uint8_t g = *(uint8_t *)(src_p+data_row*y+len_row*2+x);
uint8_t r = *(uint8_t *)(src_p+data_row*y+len_row*3+x);
*(dst_p+out_data_row*(y-top)+out_len_row*0+(x-left)) = 0;
*(dst_p+out_data_row*(y-top)+out_len_row*1+(x-left)) = r;
*(dst_p+out_data_row*(y-top)+out_len_row*2+(x-left)) = g;
*(dst_p+out_data_row*(y-top)+out_len_row*3+(x-left)) = b;
}
}
}
return 0;
//////////////////////////////////////////////// UART output to PC
[505.309] command >> [505.341] do measure!
[505.425] roast degree:[10.02]
[505.426] Round trip -1994555333: pre/npu/post: 28/26/0 ms
3
[506.675] command >> [506.706] do measure!
[506.786] roast degree:[52.60]
[506.786] Round trip -1722543648: pre/npu/post: 28/26/0 ms
3
[507.584] command >> [507.585] do measure!
[507.664] roast degree:[10.02]
[507.665] Round trip -1546757264: pre/npu/post: 28/26/0 ms
3
[508.318] command >> [508.386] do measure!
[508.466] roast degree:[45.09]
[508.466] Round trip -1386545015: pre/npu/post: 29/25/0 ms
3
[509.070] command >> [509.108] do measure!
[509.188] roast degree:[10.02]
[509.188] Round trip -1242145063: pre/npu/post: 29/25/0 ms
3
[509.823] command >> [509.825] do measure!
[509.909] roast degree:[10.02]
[509.910] Round trip -1097759773: pre/npu/post: 28/26/0 ms
Hello,
IMAGE_FORMAT_PARALLEL_PROC and IMAGE_FORMAT_RAW_OUTPUT are two different flags.
You can set or clear the IMAGE_FORMAT_PARALLEL_PROC bit to enable or disable parallel mode, for instance:
img_run[i].img_cfg.image_format |= IMAGE_FORMAT_PARALLEL_PROC; // enable
img_run[i].img_cfg.image_format &= ~IMAGE_FORMAT_PARALLEL_PROC; // disable
The IMAGE_FORMAT_RAW_OUTPUT bit makes the NPU output the raw feature map (fixed point) without running the postprocessing function.
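The set/clear pattern above is ordinary bit-flag handling. A minimal standalone sketch follows; the `EX_FLAG_*` values are placeholders made up for illustration and are NOT the real SDK values, so use the definitions from the SDK headers in actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* PLACEHOLDER bit positions for illustration only (not the SDK values). */
#define EX_FLAG_PARALLEL_PROC (1u << 0)
#define EX_FLAG_RAW_OUTPUT    (1u << 1)

/* Return fmt with the given flag bit set. */
static inline uint32_t flag_set(uint32_t fmt, uint32_t flag)
{
    return fmt | flag;
}

/* Return fmt with the given flag bit cleared. */
static inline uint32_t flag_clear(uint32_t fmt, uint32_t flag)
{
    return fmt & ~flag;
}

/* True if the given flag bit is set in fmt. */
static inline bool flag_is_set(uint32_t fmt, uint32_t flag)
{
    return (fmt & flag) != 0u;
}
```

Because the two flags are independent bits, clearing one leaves the other untouched, so it is worth printing the final format word to confirm which combination is actually in effect.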
Hi Ethon,
Thanks for your advice!
I found the comment of raw output format:
/* raw output format:
* ([output_num][height_outnode1][channel_outnode1][width_outnode1][radix_outnode1][scale_outnode1][h2][c2][w2][r2][s2][...]
* [h_n][c_n][w_n][r_n][s_n][fixed_point_datanode1][fixed_point_datanode2][...][fixed_point_datanodeN])
* 1 byte for each fixed-point data. 4 bytes for each of other data.
* fixed-point data is converted to float data with formula of fp_value / (scale * (2 ^ radix)).
*/
This helps me get the model result on the SCPU. (With raw output I don't use postprocessing. Instead, I convert the fixed-point number to float on the SCPU.)
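For anyone else reading the raw output, the comment above can be turned into a small parser along these lines. This is a sketch under several assumptions that the comment does not spell out: the 4-byte header fields are native-endian, the integer fields are int32_t, scale is a 32-bit float, and each fixed-point value is a signed byte. The function names are hypothetical, not SDK functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RAW_FIELD_BYTES 4u /* every header field is 4 bytes */
#define RAW_NODE_FIELDS 5u /* height, channel, width, radix, scale */

/* 2^e for a small integer exponent (positive or negative). */
static float pow2_int(int32_t e)
{
    float v = 1.0f;
    for (; e > 0; e--) v *= 2.0f;
    for (; e < 0; e++) v *= 0.5f;
    return v;
}

/* Formula from the SDK comment: fp_value / (scale * (2 ^ radix)). */
static float dequantize(int8_t fp_value, int32_t radix, float scale)
{
    return (float)fp_value / (scale * pow2_int(radix));
}

/* Read the first fixed-point value of output node 1 and convert it to float.
 * ASSUMPTION: field types and byte order as described in the lead-in. */
static float raw_output_first_value(const uint8_t *buf)
{
    int32_t output_num;
    int32_t radix;
    float scale;
    int8_t fp_value;
    const uint8_t *node1 = buf + RAW_FIELD_BYTES; /* skip [output_num] */
    const uint8_t *data;

    memcpy(&output_num, buf, sizeof output_num);
    memcpy(&radix, node1 + 3u * RAW_FIELD_BYTES, sizeof radix);
    memcpy(&scale, node1 + 4u * RAW_FIELD_BYTES, sizeof scale);

    /* Fixed-point data starts after the headers of all output nodes. */
    data = node1 + (size_t)output_num * RAW_NODE_FIELDS * RAW_FIELD_BYTES;
    memcpy(&fp_value, data, sizeof fp_value);
    return dequantize(fp_value, radix, scale);
}
```

Using memcpy for each field avoids unaligned reads and pointer-cast aliasing problems when the buffer is a plain byte array.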
However, there is still a problem. My UART output is shown below. The right answer should be 5x.xx, but the result 14.11 keeps coming out. Is the result always produced by the NPU, or am I reading a wrong result before the NPU has finished? How should I solve this problem (for example, by waiting for the NPU to complete)?
Thank you for your patience!!!
[127.180] do measure!
[127.180] command >> [127.314] roast degree:[14.11]
[127.314] Round trip -307135142: pre/npu/post: 28/26/0 ms
3
[127.905] do measure!
[127.905] command >> [128.039] roast degree:[55.59]
[128.039] Round trip -162115811: pre/npu/post: 29/26/0 ms
3
[128.555] command >> [128.575] do measure!
[128.709] roast degree:[14.11]
[128.709] Round trip -28135037: pre/npu/post: 28/26/0 ms
3
[129.254] do measure!
[129.254] command >> [129.388] roast degree:[14.11]
[129.388] Round trip 107680164: pre/npu/post: 29/26/0 ms
3
[130.013] command >> [130.058] do measure!
[130.192] roast degree:[53.10]
[130.192] Round trip 268483120: pre/npu/post: 28/26/0 ms
3
[130.765] do measure!
[130.765] command >> [130.899] roast degree:[53.10]
[130.899] Round trip 409875537: pre/npu/post: 29/26/0 ms
3
[131.460] do measure!
[131.460] command >> [131.594] roast degree:[53.10]
[131.594] Round trip 548881982: pre/npu/post: 29/25/0 ms
3
[132.110] command >> [132.130] do measure!
[132.264] roast degree:[14.11]
[132.264] Round trip 682880716: pre/npu/post: 29/26/0 ms
3
[132.758] command >> [132.800] do measure!
[132.934] roast degree:[54.76]
[132.934] Round trip 816861546: pre/npu/post: 28/26/0 ms
3
[133.404] do measure!
[133.404] command >> [133.538] roast degree:[14.11]
[133.538] Round trip 937659278: pre/npu/post: 29/26/0 ms
3
[134.196] command >> [134.208] do measure!
[134.342] roast degree:[14.11]
[134.342] Round trip 1098473014: pre/npu/post: 28/26/0 ms
3
[134.932] do measure!
[134.932] command >> [135.066] roast degree:[14.11]
[135.066] Round trip 1243278742: pre/npu/post: 28/26/0 ms
3
[136.128] command >> [136.138] do measure!
[136.272] roast degree:[55.59]
[136.272] Round trip 1484456600: pre/npu/post: 29/26/0 ms
3
[136.968] do measure!
[136.968] command >> [137.102] roast degree:[53.93]
[137.102] Round trip 1650467550: pre/npu/post: 29/25/0 ms
Hello,
You mentioned that you get the correct result with the dongle but abnormal output in host mode on the 96 board. Is my understanding right?
I wonder whether the two results use a different input image format on each platform (MIPI with rgb565 in host mode, but rgba8888 with the dongle on PLUS). Maybe you can try rgb565 on the dongle to check the accuracy instead.
Hello Ethon,
Yes, you are right. I get the correct result with the dongle (RGB565 and RGBA8888 are both correct), but sometimes a wrong result on the 96 board.
I don't know whether I understand the SDK code correctly. Maybe this is the point.
In the settings below, IMG_CH is 3 for RGB565. Does this mean the preprocess function on the NCPU sees the raw data as three bytes per pixel (R, G, B)?
Or does it mean the preprocess function on the NCPU sees two bytes (RGB565) plus one byte (A) per pixel?
Or should I set IMG_CH to 2 for RGB565?
And here is the code for the format: #define INF_IMG_FORMAT (NPU_FORMAT_RGBA8888 | IMAGE_FORMAT_RAW_OUTPUT)
Does this mean the preprocess function sees the raw data in RGBA8888 format (4 bytes per pixel)?
Or does it mean the output of the preprocess function for the NPU is RGBA8888? (Actually, I already feed the model with this format.)
#define IMG_W RGB_IMG_SOURCE_W
#define IMG_H RGB_IMG_SOURCE_H
#define IMG_CH 3
#define IMG_SIZE (IMG_W * IMG_H * IMG_CH)
#define IMG_FORMAT IMAGE_FORMAT_RGB565
#define DISPLAY_FORMAT V2K_PIX_FMT_RGB565
#ifdef MODEL_COMPILATION_WITH_ADD_NORM
#define INF_IMG_FORMAT (IMAGE_FORMAT_SUB128 | NPU_FORMAT_RGB565 | IMAGE_FORMAT_PARALLEL_PROC)
#else
#ifdef MODEL_COMPILATION_WITH_ADD_NORM_DENNIS
#define INF_IMG_FORMAT (NPU_FORMAT_RGBA8888 | IMAGE_FORMAT_RAW_OUTPUT)
//#define INF_IMG_FORMAT (NPU_FORMAT_RGB565 | IMAGE_FORMAT_RAW_OUTPUT)
#else
#define INF_IMG_FORMAT (IMAGE_FORMAT_RIGHT_SHIFT_ONE_BIT | NPU_FORMAT_RGB565 | IMAGE_FORMAT_PARALLEL_PROC)
#endif
#endif
Thank you for the reply!!
I finally have a version that gives normal results. The error is a little larger than with the dongle, but that is OK because the camera is different; a better-designed model should fix that. At least the result is stable and reasonable now!
Here are some notes for myself and for anyone who runs into the same problem when using the SDK:
1. The crop box is defined not by coordinates but by the distance to each edge:
img_run[i].crop_box.top = 180; // pixels to top
img_run[i].crop_box.bottom = 180; // pixels to bottom
img_run[i].crop_box.left = 260; // pixels to left
img_run[i].crop_box.right = 260; // pixels to right
2. Preprocessing can use the default settings. No need to write code for the NCPU!!
#define INF_IMG_FORMAT (IMAGE_FORMAT_SUB128 | NPU_FORMAT_RGB565 | IMAGE_FORMAT_PARALLEL_PROC)
IMAGE_FORMAT_SUB128 applies r-128, g-128, b-128.
NPU_FORMAT_RGB565 tells the preprocess to convert RGB565 into RGB, so there is no need to do the conversion again on the NCPU.
Therefore, the settings are:
#define IMG_W RGB_IMG_SOURCE_W
#define IMG_H RGB_IMG_SOURCE_H
#define IMG_CH 3
#define IMG_SIZE (IMG_W * IMG_H * 2)
#define IMG_FORMAT IMAGE_FORMAT_RGB565
#define DISPLAY_FORMAT V2K_PIX_FMT_RGB565
3. If IMAGE_FORMAT_PARALLEL_PROC is used with do_parallel=1,
the output must be converted by the NCPU postprocess (see the code inside).
If IMAGE_FORMAT_RAW_OUTPUT is used with do_parallel=0,
the raw output follows this format:
/* raw output format:
* ([output_num][height_outnode1][channel_outnode1][width_outnode1][radix_outnode1][scale_outnode1][h2][c2][w2][r2][s2][...]
* [h_n][c_n][w_n][r_n][s_n][fixed_point_datanode1][fixed_point_datanode2][...][fixed_point_datanodeN])
* 1 byte for each fixed-point data. 4 bytes for each of other data.
* fixed-point data is converted to float data with formula of fp_value / (scale * (2 ^ radix)).
*/
4. [important] Without parallel processing, there is no need to write NCPU code!!!!!
5. In this version of the project, the preprocess is SUB128 and the model itself applies [/255] and [+0.5], so the final input is in the range 0-1.
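Notes 1, 2 and 5 can be checked with a little arithmetic, sketched below. The helper names and the `crop_margins_t` type are hypothetical, not SDK definitions. The thread does not state the camera frame size; if it were 640x480 (an assumption), the margins in note 1 would leave a 120x120 region, which matches the model input size.

```c
#include <stdint.h>

/* Hypothetical type: crop box given as distances to each edge (note 1). */
typedef struct {
    uint32_t top;
    uint32_t bottom;
    uint32_t left;
    uint32_t right;
} crop_margins_t;

/* Width of the region left after cropping; caller must ensure the
 * margins do not exceed the frame width. */
static inline uint32_t crop_width(uint32_t img_w, crop_margins_t m)
{
    return img_w - m.left - m.right;
}

/* Height of the region left after cropping; same precondition. */
static inline uint32_t crop_height(uint32_t img_h, crop_margins_t m)
{
    return img_h - m.top - m.bottom;
}

/* RGB565 uses 2 bytes per pixel, hence IMG_W * IMG_H * 2 in note 2. */
static inline uint32_t rgb565_frame_bytes(uint32_t w, uint32_t h)
{
    return w * h * 2u;
}

/* Note 5: SUB128 in the preprocess, then /255 and +0.5 inside the model. */
static inline float model_input_value(uint8_t x)
{
    return (float)((int)x - 128) / 255.0f + 0.5f;
}
```

Note that the mapping in `model_input_value` lands only approximately on 0-1: an input of 0 comes out slightly below 0, and 255 slightly below 1.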
Congrats on solving the problem!! and thanks for organizing these notes.