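1 Introduction

  • The previous parts walked through the overall flow and functions of the driver. The low-level details could not be analyzed, because the documentation for the underlying registers is not available.

  • For application work, the key is therefore to study DMA reads and writes carefully. Once they are understood, the data can be obtained and handed to the upper-layer application.

  • The DMA engine itself is not bound to DSMC. When the DMA driver is used, however, the DMA is configured with the DSMC memory address, so DMA reads and writes access the memory-mapped DSMC region block by block.

  • For application use, go straight to "3 Turning the example into a library for external callers" below, where the example is converted into a library file plus a header file that expose an interface to other programs.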


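2 Linux DMA read/write analysis

2.1 Outline of the DMA driver (dma_memcpy.c for Tronlong, dsmc_lb_device.c for Forlinx)

  • A low-level DMA read/write driver has to exist first. It provides a high-speed data path between user space and the FPGA (local bus). Either the DMA read/write interface is added to dsmc-lb-device.c (the Forlinx approach), or a separate DMA read/write driver is written (the Tronlong approach). See the SDK for the source.

  • The transfer_direction field of the DMA configuration structure dma_setup_params_stu decides the direction: the DMA either writes memory to the FPGA or reads from the FPGA into memory.

  • At the application level only the DMA reads and writes matter, because the low-level driver has already set up DMA for DSMC. For a read, once the DMA is configured, DSMC completes its read and then notifies the DMA to fetch the data. For a write, the DMA sends the data and DSMC completes the write.

  • The driver creates a device file, usually /dev/dsmc, and DMA parameter setup, start, status query, and shutdown all go through the ioctl interface (function: dma_memcpy_ioctl).

  • mmap exposes the DMA buffer to user space (function: dma_memcpy_mmap).

  • Optionally, an input event (GPIO interrupt) notifies user space that the DMA has finished (function: dma_irq_handler, which reports the input event; this is the Tronlong approach).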
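As a minimal sketch of how the direction field selects the transfer, the helper below fills the user-space setup structure. The struct layout and the "0: write, 1: read" convention are taken from the example code later in this document; the enum names and make_setup() are hypothetical helpers introduced only for illustration, not part of any SDK.

```c
#include <stdint.h>
#include <string.h>

/* Layout copied from the example's user-space struct DMASetupParams. */
struct DMASetupParams {
    unsigned int dev_phy_addr;
    unsigned int dst_phy_addr;
    unsigned int transfer_direction;
    unsigned int transfer_single_size;
    unsigned int transfer_total_size;
};

/* Hypothetical names for the values the example documents as "0: write, 1: read". */
enum dma_dir { DMA_DIR_WRITE = 0, DMA_DIR_READ = 1 };

/* Hypothetical helper: describe one transfer of `size` bytes at device address `addr`. */
static struct DMASetupParams make_setup(uint32_t addr, uint32_t size, enum dma_dir dir)
{
    struct DMASetupParams p;

    memset(&p, 0, sizeof(p));
    p.dev_phy_addr = addr;
    p.transfer_total_size = size;
    p.transfer_single_size = size;   /* the example moves everything in one chunk */
    p.transfer_direction = (unsigned int)dir;
    return p;
}
```

The resulting structure is what the example passes to ioctl(fd, DMA_MEMCPY_SETUP, &setup); only transfer_direction differs between a write and a read.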

ioctl commands (the core control interface) reach dma_memcpy_ioctl through the driver's file operations (ops)

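dma_memcpy_ioctl uses a switch statement to handle the different stages of a DMA transfer:
DMA_MEMCPY_SETUP: configures the DMA parameters (case DMA_MEMCPY_SETUP:)
DMA_MEMCPY_START: starts a DMA transfer; one transfer runs with the current configuration (case DMA_MEMCPY_START:)
DMA_MEMCPY_GET_TIME: gets the duration of the most recent DMA transfer, in microseconds (case DMA_MEMCPY_GET_TIME:)
DMA_MEMCPY_SHUTDOWN: shuts the DMA down and releases resources; call it after every transfer (case DMA_MEMCPY_SHUTDOWN:)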

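The application must go through setup, start, and shutdown in that order for every read or write; this sequence is what carries a DMA transfer to or from DSMC.
For a detailed analysis of the driver, see the earlier part, "DSMC code analysis: Linux".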
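Since DMA_MEMCPY_GET_TIME reports microseconds, the throughput of a transfer follows directly from the size. The sketch below uses the same arithmetic as the example code in this document (bytes divided by seconds, then by 1024 twice); dma_rate_mib_s() is a hypothetical helper name, and the zero check is an added guard that the example does not have.

```c
#include <stdint.h>

/* Hypothetical helper: bytes moved in `usec` microseconds, expressed in MiB/s.
 * Returns 0.0 when usec is 0 so that the division never sees a zero divisor. */
static double dma_rate_mib_s(uint32_t size_bytes, uint32_t usec)
{
    if (usec == 0)
        return 0.0;
    return ((double)size_bytes / ((double)usec / 1000000.0)) / 1024.0 / 1024.0;
}
```

For instance, 65536 bytes in 1000 us works out to about 62.5 MiB/s.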


2.2 DMA usage flow in user space

Every DMA operation needs the complete setup/start/shutdown sequence.

  • Before the transfer:
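1 Allocate the read and write buffers: malloc or similar
2 Open the device node and, optionally, the input event device (this is what dma_init does): open("/dev/dma_memcpy", O_RDWR), open("/dev/input/eventX", O_RDONLY)
3 Fill in the DMAInfo structure: target address, size, and so on
  • Write (memory → FPGA): dma_write()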
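Fill the source buffer src_buf (for example, with an array of values)

Configure the DMA parameters (ioctl SETUP, direction = write)
Call: ioctl(fd, DMA_MEMCPY_SETUP, &setup)

Map the DMA buffer with mmap
Call: mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0)

Copy the data into the DMA buffer
Call: memcpy(dma_buf, src_buf, size)

Start the DMA (ioctl START)
Call: ioctl(fd, DMA_MEMCPY_START, 0)

Wait for the input event (Tronlong approach; the Forlinx code goes straight to the next step)
Call: select/read(input_fd, &event, sizeof(event))

Get the DMA duration (ioctl GET_TIME)
Call: ioctl(fd, DMA_MEMCPY_GET_TIME, 0)

Shut the DMA down (ioctl SHUTDOWN) and release the mapping
Call: ioctl(fd, DMA_MEMCPY_SHUTDOWN, 0), then munmap(dma_buf, size)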
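The write steps above can be sketched as one function that always shuts the DMA down and unmaps the buffer once setup has succeeded. The command codes and the struct layout are copied from the example header in this document. dma_write_once() is a hypothetical helper; the bounded poll on GET_TIME and the SHUTDOWN call after a failed mmap are assumptions about reasonable error handling, not behavior confirmed by the driver source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Command codes and struct layout as defined in the example header. */
#define DMA_MEMCPY_SETUP    0x01000000
#define DMA_MEMCPY_START    0x02000000
#define DMA_MEMCPY_GET_TIME 0x03000000
#define DMA_MEMCPY_SHUTDOWN 0x04000000

struct DMASetupParams {
    unsigned int dev_phy_addr;
    unsigned int dst_phy_addr;
    unsigned int transfer_direction;
    unsigned int transfer_single_size;
    unsigned int transfer_total_size;
};

/* Hypothetical helper: one memory-to-FPGA transfer.
 * Returns the transfer time in microseconds, or -1 on any error. */
static long dma_write_once(int fd, uint32_t dev_addr, const void *src, size_t size)
{
    struct DMASetupParams setup;
    void *dma_buf;
    long usec = 0;
    int tries;

    memset(&setup, 0, sizeof(setup));
    setup.dev_phy_addr = dev_addr;
    setup.transfer_total_size = (unsigned int)size;
    setup.transfer_single_size = (unsigned int)size;
    setup.transfer_direction = 0;              /* 0: write, 1: read */

    if (ioctl(fd, DMA_MEMCPY_SETUP, &setup) != 0)
        return -1;                             /* nothing to undo yet */

    dma_buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dma_buf == MAP_FAILED) {
        ioctl(fd, DMA_MEMCPY_SHUTDOWN, NULL);  /* assumed safe after SETUP */
        return -1;
    }
    memcpy(dma_buf, src, size);

    if (ioctl(fd, DMA_MEMCPY_START, NULL) != 0) {
        usec = -1;
    } else {
        /* The example spins until GET_TIME is non-zero; this poll is bounded. */
        for (tries = 0; tries < 1000000 && usec == 0; tries++)
            usec = ioctl(fd, DMA_MEMCPY_GET_TIME, NULL);
        if (usec <= 0)
            usec = -1;
    }

    if (ioctl(fd, DMA_MEMCPY_SHUTDOWN, NULL) < 0)
        usec = -1;
    munmap(dma_buf, size);
    return usec;
}
```

A caller would pass the descriptor returned by open() on the device node. With an invalid descriptor the function returns -1 at the SETUP step without touching anything else.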

  • Read (FPGA → memory): dma_read()
Configure the DMA parameters (ioctl SETUP, direction = read)
Call: ioctl(fd, DMA_MEMCPY_SETUP, &setup)

Start the DMA (ioctl START)
Call: ioctl(fd, DMA_MEMCPY_START, 0)

Map the DMA buffer with mmap
Call: mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0)

Wait for the input event (select/read)
Call: select/read(input_fd, &event, sizeof(event))

Copy from the DMA buffer into the read buffer
Call: memcpy(dst_buf, dma_buf, size)

Verify the data
Call: memcmp or a custom check function

Get the DMA duration (ioctl GET_TIME)
Call: ioctl(fd, DMA_MEMCPY_GET_TIME, 0)

Shut the DMA down (ioctl SHUTDOWN) and release the mapping
Call: ioctl(fd, DMA_MEMCPY_SHUTDOWN, 0), then munmap(dma_buf, size)
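The "wait for the input event" step in the flows above can be sketched with select() and a timeout, so that a missing interrupt does not block the caller forever. wait_dma_event() and its return codes are hypothetical; which event type and code the driver actually reports is not stated in this document, so the helper only reads one event and leaves its interpretation to the caller.

```c
#include <linux/input.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for one input event on input_fd.
 * Returns 0 when an event was read into *ev, 1 on timeout, -1 on error. */
static int wait_dma_event(int input_fd, int timeout_ms, struct input_event *ev)
{
    fd_set rfds;
    struct timeval tv;
    int ret;

    if (input_fd < 0 || input_fd >= FD_SETSIZE)
        return -1;                       /* FD_SET is undefined outside this range */

    FD_ZERO(&rfds);
    FD_SET(input_fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ret = select(input_fd + 1, &rfds, NULL, NULL, &tv);
    if (ret < 0)
        return -1;
    if (ret == 0)
        return 1;                        /* nothing arrived before the timeout */

    if (read(input_fd, ev, sizeof(*ev)) != (ssize_t)sizeof(*ev))
        return -1;
    return 0;
}
```

In the Tronlong-style flow this would be called on the descriptor of /dev/input/eventX after DMA_MEMCPY_START and before the data is copied out of the DMA buffer.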

  • Pseudocode flow
main()
├─ parse_args(argc, argv)
├─ malloc w_buffer, r_buffer
├─ open("/dev/dma_memcpy")
├─ open("/dev/input/eventX")
├─ for (each cycle)
│ ├─ fill w_buffer
│ ├─ ioctl(fd, DMA_MEMCPY_SETUP, &setup)
│ ├─ dma_buf = mmap(..., fd, ...)
│ ├─ memcpy(dma_buf, w_buffer, size)
│ ├─ ioctl(fd, DMA_MEMCPY_START, 0)
│ ├─ select/read(input_fd, ...)
│ ├─ ioctl(fd, DMA_MEMCPY_GET_TIME, 0)
│ ├─ ioctl(fd, DMA_MEMCPY_SHUTDOWN, 0)
│ ├─ munmap(dma_buf, size)
│ ├─ read: same pattern
│ └─ verify data, collect statistics
└─ close(fd), close(input_fd), free()
  • In short: open() opens the DMA node, then each ioctl command is routed through the file operations to dma_memcpy_ioctl, which runs the setup/start/shutdown sequence that configures, performs, and closes a DMA read or write.

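3 Turning the example into a library for external callers

3.1 Calling the library from an external file

  • The example code already implements reads and writes completely. It only needs to be turned into a library file so that other programs can use it.

  • First, comment out the following block inside dma_write. The test example uses it to fill in the test data: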

for (i = 0; i < dma_info->params.size; i++)
{
w_buffer[i] = i + 1;
}
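Because w_buffer is a uint8_t array, the value i + 1 wraps around every 256 bytes, so the test pattern repeats 1, 2, ..., 255, 0. A minimal sketch of the fill and of a byte-wise comparison the caller can run after reading back; fill_pattern() and count_mismatches() are hypothetical helper names, not part of the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Same pattern as the example: byte i holds (i + 1) truncated to 8 bits. */
static void fill_pattern(uint8_t *buf, size_t size)
{
    for (size_t i = 0; i < size; i++)
        buf[i] = (uint8_t)(i + 1);
}

/* Hypothetical helper: number of bytes that differ between the two buffers. */
static size_t count_mismatches(const uint8_t *expect, const uint8_t *actual, size_t size)
{
    size_t errors = 0;

    for (size_t i = 0; i < size; i++)
        if (expect[i] != actual[i])
            errors++;
    return errors;
}
```

Once the fill moves out of dma_write, the caller decides what goes into w_buffer, and the comparison no longer depends on a fixed pattern.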
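  • The caller's basic framework can stay unchanged. Put the source file in the same directory, include the dsmc_lib.h header, comment out the dma_test call, and call dma_write and dma_read instead.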
/* External caller of the library.
 * parse_parameter() and struct CmdLineParams are not declared in dsmc_lib.h;
 * they are carried over from the original example in section 4. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "dsmc_lib.h"

int main(int argc, char *argv[])
{
    struct CmdLineParams params;
    struct DMAInfo dma_info;
    int ret = -1;

    if (parse_parameter(&params, argc, argv) == false)
    {
        printf("Please try --help to see usage.\n");
        exit(1);
    }

    w_buffer = (uint8_t *)malloc(params.size);
    if (w_buffer == NULL)
    {
        printf("malloc w_buffer failed!\n");
        exit(-1);
    }
    r_buffer = (uint8_t *)malloc(params.size);
    if (r_buffer == NULL)
    {
        free(w_buffer);
        printf("malloc r_buffer failed!\n");
        exit(-1);
    }

    memset(&dma_info, 0, sizeof(struct DMAInfo));
    dma_info.params.address = DSMC_MEM;
    dma_info.params.size = params.size;
    dma_info.params.cycles = params.cycles;

    ret = dma_init(&dma_info);
    if (ret < 0)
    {
        dma_destroy(&dma_info);
        return -1;
    }

    // dma_test(&dma_info);

    /* The caller fills the write buffer now that dma_write() no longer does.
     * dma_info is a struct here, not a pointer, so members use '.'. */
    for (uint32_t i = 0; i < dma_info.params.size; i++)
    {
        w_buffer[i] = i + 1;
    }

    dma_write(&dma_info);

    dma_read(&dma_info);

    printf("Data written: ");
    for (uint32_t i = 0; i < dma_info.params.size; i++)
    {
        printf("%02x ", w_buffer[i]);
    }
    printf("\n");

    printf("Data read back: ");
    for (uint32_t i = 0; i < dma_info.params.size; i++)
    {
        printf("%02x ", r_buffer[i]);
    }
    printf("\n");
    dma_destroy(&dma_info);

    if (w_buffer != NULL)
    {
        free(w_buffer);
    }
    if (r_buffer != NULL)
    {
        free(r_buffer);
    }
    return 0;
}

3.2 The library file converted from the example

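dsmc_lib.c

#include "dsmc_lib.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/select.h>  /* fd_set */
#include <sys/time.h>    /* struct timeval */
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/input.h> /* struct input_event */
#include <errno.h>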


uint8_t *w_buffer = NULL;
uint8_t *r_buffer = NULL;


static void neon_copy(void *dst, void *src, int sz)
{
if (sz & 63)
{
sz = (sz & -64) + 64;
}
memcpy(dst, src, sz);
}


int dma_init(struct DMAInfo *dma_info)
{
if (dma_info == NULL)
{
return -1;
}
dma_info->dma_fd = open(DMA_MEMCPY_PATH, O_RDWR);
if (dma_info->dma_fd < 0)
{
printf("can't open %s\n", DMA_MEMCPY_PATH);
return -1;
}
return 0;
}


void dma_destroy(struct DMAInfo *dma_info)
{
if (dma_info->map_base_addr != NULL)
{
munmap(dma_info->map_base_addr, dma_info->params.size);
dma_info->map_base_addr = NULL;
}
if (dma_info->dma_fd != -1)
close(dma_info->dma_fd);
}


/* Not static: dsmc_lib.h declares dma_write() as part of the public API. */
int dma_write(struct DMAInfo *dma_info)
{
struct DMASetupParams dma_setup_params;
unsigned int dev_addr = 0;
int i = 0, ret = -1;
struct input_event input_buf;
int status = 0;
struct timeval start, end, time_diff;
fd_set input;
struct timeval timeout;
unsigned time2;
float rate;


/* Populate the write buffer */
// for (i = 0; i < dma_info->params.size; i++)
// {
// w_buffer[i] = i + 1;
// }


/* Setup DMA */
dma_setup_params.dev_phy_addr = dma_info->params.address;
dma_setup_params.transfer_total_size = dma_info->params.size;
dma_setup_params.transfer_single_size = dma_info->params.size;
/* 0: write, 1: read */
dma_setup_params.transfer_direction = 0;
dev_addr = ioctl(dma_info->dma_fd, DMA_MEMCPY_SETUP, &dma_setup_params);
if (dev_addr != 0)
{
printf("can't get device addr write: 0x%x\n", dev_addr);
return -1;
}


dma_info->vir_addr = mmap(NULL, dma_info->params.size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_info->dma_fd, 0);
if (dma_info->vir_addr == MAP_FAILED)
{
printf("mmap failed\n");
return -1;
}


neon_copy((unsigned char *)dma_info->vir_addr, (unsigned char *)w_buffer, dma_info->params.size);


/* Start DMA */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_START, 0);
if (ret != 0)
{
return -1;
}


time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
while (!time2)
{
time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
}
rate = ((dma_info->params.size * 1.0 / (time2 / 1000000.0)) / 1024 / 1024);


dma_info->dma_trans_info.dma_w_cost_time = time2;
dma_info->dma_trans_info.dma_w_rate = ((dma_info->params.size * 1.0 / (dma_info->dma_trans_info.dma_w_cost_time / 1000000.0)) / 1024 / 1024);
error:
/* Shutdown DMA transfer */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_SHUTDOWN, 0);
if (ret < 0)
{
status = -1;
}
munmap(dma_info->vir_addr, dma_info->params.size);


return status;
}


/* Not static: dsmc_lib.h declares dma_read() as part of the public API. */
int dma_read(struct DMAInfo *dma_info)
{
struct DMASetupParams dma_setup_params;
unsigned int dev_addr = 0;
int status = 0;
int ret = -1, i = 0;
struct input_event input_buf;
struct timeval start, end, time_diff;
fd_set input;
unsigned time2;
float rate;


/* Setup DMA */
dma_setup_params.dev_phy_addr = dma_info->params.address;
dma_setup_params.transfer_total_size = dma_info->params.size;
dma_setup_params.transfer_single_size = dma_info->params.size;
/* 0: write, 1: read */
dma_setup_params.transfer_direction = 1;
dev_addr = ioctl(dma_info->dma_fd, DMA_MEMCPY_SETUP, &dma_setup_params);
if (dev_addr != 0)
{
printf("can't get device addr read: 0x%x\n", dev_addr);
return -1;
}


/* Start DMA */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_START, 0);
if (ret != 0)
{
return -1;
}


dma_info->vir_addr = mmap(NULL, dma_info->params.size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_info->dma_fd, 0);
if (dma_info->vir_addr == MAP_FAILED)
{
printf("mmap failed\n");
return -1;
}


time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
while (!time2)
{
time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
}
rate = ((dma_info->params.size * 1.0 / (time2 / 1000000.0)) / 1024 / 1024);


neon_copy((unsigned char *)r_buffer, (unsigned char *)dma_info->vir_addr, dma_info->params.size);


/* Verify data */
for (i = 0; i < dma_info->params.size; i++)
{
if (r_buffer[i] != w_buffer[i])
{
dma_info->dma_trans_info.error_count++;
if (dma_info->dma_trans_info.error_count == 1)
{
printf("\nData verify error at offset %d (0x%x), expect: 0x%x, actual: 0x%x\n",
i, i * 2, w_buffer[i], r_buffer[i]);
}
}
}


dma_info->dma_trans_info.dma_r_cost_time = time2;
dma_info->dma_trans_info.dma_r_rate = ((dma_info->params.size * 1.0 / (dma_info->dma_trans_info.dma_r_cost_time / 1000000.0)) / 1024 / 1024);


error:
/* Shutdown DMA transfer */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_SHUTDOWN, 0);
if (ret < 0)
{
status = -1;
}


munmap(dma_info->vir_addr, dma_info->params.size);


return status;
}


void dma_test(struct DMAInfo *dma_info)
{
int ret = -1;
unsigned int num = 0;
float total_dma_w_rate = 0, total_dma_r_rate = 0;
float max_dma_w_rate = 0, max_dma_r_rate = 0;
float min_dma_w_rate = -1, min_dma_r_rate = -1;
uint32_t total_dma_w_time = 0, total_dma_r_time = 0;
uint32_t max_dma_w_time = 0, max_dma_r_time = 0;
uint32_t min_dma_w_time = -1, min_dma_r_time = -1;
const uint32_t total_cycles = dma_info->params.cycles;


printf("\nStarting DMA test with %d cycles...\n\n", total_cycles);


uint32_t total_error_count = 0;


while (dma_info->params.cycles-- > 0)
{
// Fill the write buffer
for (uint32_t i = 0; i < dma_info->params.size; i++)
{
w_buffer[i] = i + 1;
}
ret = dma_write(dma_info);
if (ret < 0)
{
printf("dma_write return error code: %d\r\n\n\n", ret);
break;
}


ret = dma_read(dma_info);
if (ret < 0)
{
printf("dma_read return error code: %d\r\n\n\n", ret);
break;
}
num++;


float error_rate = (dma_info->dma_trans_info.error_count / (float)(dma_info->params.size)) * 100;
total_error_count += dma_info->dma_trans_info.error_count;
dma_info->dma_trans_info.error_count = 0;


printf("\rCycle %d/%d - Data Size: %d bytes | Write: %.2f MB/s | Read: %.2f MB/s | Write Time: %d us | Read Time: %d us | Error Rate: %.2f%%",
num, total_cycles,
dma_info->params.size,
dma_info->dma_trans_info.dma_w_rate,
dma_info->dma_trans_info.dma_r_rate,
dma_info->dma_trans_info.dma_w_cost_time,
dma_info->dma_trans_info.dma_r_cost_time,
error_rate);
fflush(stdout);


total_dma_w_rate += dma_info->dma_trans_info.dma_w_rate;
if (dma_info->dma_trans_info.dma_w_rate > max_dma_w_rate)
max_dma_w_rate = dma_info->dma_trans_info.dma_w_rate;
if (min_dma_w_rate == -1.0 || dma_info->dma_trans_info.dma_w_rate < min_dma_w_rate)
min_dma_w_rate = dma_info->dma_trans_info.dma_w_rate;
total_dma_r_rate += dma_info->dma_trans_info.dma_r_rate;
if (dma_info->dma_trans_info.dma_r_rate > max_dma_r_rate)
max_dma_r_rate = dma_info->dma_trans_info.dma_r_rate;
if (min_dma_r_rate == -1.0 || dma_info->dma_trans_info.dma_r_rate < min_dma_r_rate)
min_dma_r_rate = dma_info->dma_trans_info.dma_r_rate;


total_dma_w_time += dma_info->dma_trans_info.dma_w_cost_time;
if (dma_info->dma_trans_info.dma_w_cost_time > max_dma_w_time)
max_dma_w_time = dma_info->dma_trans_info.dma_w_cost_time;
if (min_dma_w_time == -1.0 || dma_info->dma_trans_info.dma_w_cost_time < min_dma_w_time)
min_dma_w_time = dma_info->dma_trans_info.dma_w_cost_time;


total_dma_r_time += dma_info->dma_trans_info.dma_r_cost_time;
if (dma_info->dma_trans_info.dma_r_cost_time > max_dma_r_time)
max_dma_r_time = dma_info->dma_trans_info.dma_r_cost_time;
if (min_dma_r_time == -1.0 || dma_info->dma_trans_info.dma_r_cost_time < min_dma_r_time)
min_dma_r_time = dma_info->dma_trans_info.dma_r_cost_time;
}


if (num > 0)
{
/* Verification is per byte, so the denominator is the byte count (same as the per-cycle rate). */
float average_error_rate = (total_error_count / (float)(num * dma_info->params.size)) * 100;


// Print final statistics
printf("\n\nTest Complete! Final Statistics:\n");
printf("Completed Cycles: %d/%d\n", num, total_cycles);
printf("Data Size: %d bytes\n", dma_info->params.size);
printf("Write Performance:\n");
printf(" Rate: %.2f MB/s (avg)\n", total_dma_w_rate / num);
printf(" Time: %d us (avg)\n", total_dma_w_time / num);
printf(" Rate: %.2f MB/s (max)\n", max_dma_w_rate);
printf(" Time: %d us (min)\n", min_dma_w_time);
printf(" Rate: %.2f MB/s (min)\n", min_dma_w_rate);
printf(" Time: %d us (max)\n", max_dma_w_time);


printf("\nRead Performance:\n");
printf(" Rate: %.2f MB/s (avg)\n", total_dma_r_rate / num);
printf(" Time: %d us (avg)\n", total_dma_r_time / num);
printf(" Rate: %.2f MB/s (max)\n", max_dma_r_rate);
printf(" Time: %d us (min)\n", min_dma_r_time);
printf(" Rate: %.2f MB/s (min)\n", min_dma_r_rate);
printf(" Time: %d us (max)\n", max_dma_r_time);


printf("\nAverage Error Rate: %.2f%%\n", average_error_rate);
}


return;
}
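One detail in the listing above is worth a note: neon_copy rounds the length up to the next multiple of 64 before calling memcpy, while the caller allocates w_buffer and r_buffer with exactly params.size bytes. For sizes that are not a multiple of 64, the copy therefore reads or writes past the end of those buffers. A minimal sketch of an allocation that matches the rounding, assuming the rounding itself stays in place; round_up_64() and alloc_copy_buffer() are hypothetical helpers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Same rounding as neon_copy: next multiple of 64, unchanged if already aligned. */
static size_t round_up_64(size_t sz)
{
    return (sz + 63u) & ~(size_t)63u;
}

/* Hypothetical helper: zeroed buffer large enough for a rounded-up copy. */
static uint8_t *alloc_copy_buffer(size_t size)
{
    return (uint8_t *)calloc(1, round_up_64(size));
}
```

Using alloc_copy_buffer(params.size) in place of malloc(params.size) keeps the rounded copy inside the allocation. The simpler alternative is to require that the transfer size is always a multiple of 64.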





3.3 The header file converted from the example

dsmc_lib.h

#ifndef DSMC_LIB_H
#define DSMC_LIB_H


#include <stdint.h>
#include <stdbool.h>


#ifdef __cplusplus
extern "C"
{
#endif


extern uint8_t *w_buffer;
extern uint8_t *r_buffer;


#define DMA_MEMCPY_PATH "/dev/dsmc/cs0/region0"
#define DMA_MEMCPY_SETUP 0x01000000
#define DMA_MEMCPY_START 0x02000000
#define DMA_MEMCPY_GET_TIME 0x03000000
#define DMA_MEMCPY_SHUTDOWN 0x04000000
#define DSMC_MEM 0xc0000000


struct DMASetupParams
{
unsigned int dev_phy_addr;
unsigned int dst_phy_addr;
unsigned int transfer_direction;
unsigned int transfer_single_size;
unsigned int transfer_total_size;
};


struct DMATransferInfo
{
float dma_r_rate;
float dma_w_rate;
uint32_t dma_r_cost_time;
uint32_t dma_w_cost_time;
uint32_t error_count;
};


struct DmaParams
{
uint32_t address;
uint32_t size;
uint32_t cycles;
char dev[64];
};


struct DMAInfo
{
int dma_fd;
int input_fd;


uint16_t *map_base_addr;
uint16_t *vir_addr;


struct DmaParams params;
struct DMATransferInfo dma_trans_info;
};


// Public API
int dma_init(struct DMAInfo *dma_info);
void dma_destroy(struct DMAInfo *dma_info);
int dma_write(struct DMAInfo *dma_info);
int dma_read(struct DMAInfo *dma_info);
void dma_test(struct DMAInfo *dma_info);


#ifdef __cplusplus
}
#endif


#endif // DSMC_LIB_H





4 Linux DMA example



#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <linux/input.h>
#include <errno.h>
#include <string.h>
#include <getopt.h>
#include <stdbool.h>
#include <libgen.h>
#include <sys/select.h>
#include <sys/ioctl.h>   /* ioctl() */
#include <signal.h>


const char * const VERSION = "1.0";


#define DMA_MEMCPY_PATH "/dev/dsmc/cs0/region0"


/* commands codes */
#define DMA_MEMCPY_SETUP 0x01000000
#define DMA_MEMCPY_START 0x02000000
#define DMA_MEMCPY_GET_TIME 0x03000000
#define DMA_MEMCPY_SHUTDOWN 0x04000000
#define DSMC_MEM 0xc0000000


uint8_t *w_buffer = NULL;
uint8_t *r_buffer = NULL;


struct DMASetupParams {
unsigned int dev_phy_addr;
unsigned int dst_phy_addr;
unsigned int transfer_direction;
unsigned int transfer_single_size;
unsigned int transfer_total_size;
};


struct DMATransferInfo {
float dma_r_rate;
float dma_w_rate;
uint32_t dma_r_cost_time;
uint32_t dma_w_cost_time;
uint32_t error_count;
};


struct DmaParams {
uint32_t address;
uint32_t size;
uint32_t cycles;
char dev[64];
};


struct DMAInfo {
int dma_fd;
int input_fd;


uint16_t *map_base_addr;
uint16_t *vir_addr;


struct DmaParams params;
struct DMATransferInfo dma_trans_info;
};


struct CmdLineParams {
uint32_t address;
uint32_t size;
uint32_t cycles;
char dev[64];
};


/* Short option names */
static const char g_shortopts [] = ":s:c:vh";
/* Option names */
static const struct option g_longopts [] = {
{ "size", required_argument, NULL, 's' },
{ "cycles", required_argument, NULL, 'c' },
{ "version", no_argument, NULL, 'v' },
{ "help", no_argument, NULL, 'h' },
{ 0, 0, 0, 0 }
};


static void usage(char **argv)
{
fprintf(stdout,
"Usage: %s [options]\n\n"
"Options:\n"
" -s | --size Data size (Unit: Byte) \n"
" -c | --cycles Loop read and write times \n"
" -v | --version Display version information\n"
" -h | --help Show help content\n\n"
"Example:\n"
" # ./%s -s 65536 -c 1000 \n"
"", basename(argv[0]),
basename(argv[0]));
}


static bool parse_parameter(struct CmdLineParams *params, int argc, char **argv)
{
int opt;


memset(params, 0, sizeof(struct CmdLineParams));


while ((opt = getopt_long(argc, argv, g_shortopts, g_longopts, NULL)) != -1) {
switch (opt) {
case 's':
params->size = atoi(optarg);
break;


case 'c':
params->cycles = atoi(optarg);
break;


case 'h':
usage(argv);
exit(0);


case 'v':
printf("version : %s\n", VERSION);
exit(0);


default:
fprintf(stderr, "Unknown option %c\n", optopt);
break;
}
}
return true;
}


//static void neon_copy(volatile void *dst, volatile void *src, int sz)
static void neon_copy(void *dst, void *src, int sz)
{
if (sz & 63) {
sz = (sz & -64) + 64;
}


memcpy(dst, src, sz);
}


static int dma_init(struct DMAInfo *dma_info)
{
if (dma_info == NULL) {
return -1;
}


dma_info->dma_fd = open(DMA_MEMCPY_PATH, O_RDWR);
if(dma_info->dma_fd < 0) {
printf("can't open %s\n", DMA_MEMCPY_PATH);
return -1;
}
return 0;
}


static void dma_destroy(struct DMAInfo *dma_info)
{
if (dma_info->map_base_addr != NULL) {
munmap(dma_info->map_base_addr, dma_info->params.size);
dma_info->map_base_addr = NULL;
}


if (dma_info->dma_fd != -1) close(dma_info->dma_fd);
}


static int dma_write(struct DMAInfo *dma_info)
{
struct DMASetupParams dma_setup_params;
unsigned int dev_addr = 0;
int i = 0, ret = -1;
struct input_event input_buf;
int status = 0;
struct timeval start, end, time_diff;
fd_set input;
struct timeval timeout;
unsigned time2;
float rate;

/* Populate the write buffer */
for (i = 0; i < dma_info->params.size ; i++) {
w_buffer[i] = i+1;
}


/* Setup DMA */
dma_setup_params.dev_phy_addr = dma_info->params.address;
dma_setup_params.transfer_total_size = dma_info->params.size;
dma_setup_params.transfer_single_size = dma_info->params.size;
/* 0: write, 1: read */
dma_setup_params.transfer_direction = 0;
dev_addr = ioctl(dma_info->dma_fd, DMA_MEMCPY_SETUP, &dma_setup_params);
if (dev_addr != 0) {
printf("can't get device addr write: 0x%x\n", dev_addr);
return -1;
}


dma_info->vir_addr = mmap(NULL, dma_info->params.size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_info->dma_fd, 0);
if (dma_info->vir_addr == MAP_FAILED) {
printf("mmap failed\n");
return -1;
}


neon_copy((unsigned char *)dma_info->vir_addr, (unsigned char *)w_buffer, dma_info->params.size);



/* Start DMA */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_START, 0);
if (ret != 0) {
return -1;
}


time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
while(!time2){
time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
}
rate = ((dma_info->params.size * 1.0 / (time2 / 1000000.0)) / 1024 / 1024);


dma_info->dma_trans_info.dma_w_cost_time = time2;
dma_info->dma_trans_info.dma_w_rate = ((dma_info->params.size * 1.0 / (dma_info->dma_trans_info.dma_w_cost_time / 1000000.0)) / 1024 / 1024);
error:
/* Shutdown DMA transfer */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_SHUTDOWN, 0);
if (ret < 0) {
status = -1;
}
munmap(dma_info->vir_addr, dma_info->params.size);


return status;
}


static int dma_read(struct DMAInfo *dma_info)
{
struct DMASetupParams dma_setup_params;
unsigned int dev_addr = 0;
int status = 0;
int ret = -1, i = 0;
struct input_event input_buf;
struct timeval start, end, time_diff;
fd_set input;
unsigned time2;
float rate;


/* Setup DMA */
dma_setup_params.dev_phy_addr = dma_info->params.address;
dma_setup_params.transfer_total_size = dma_info->params.size;
dma_setup_params.transfer_single_size = dma_info->params.size;
/* 0: write, 1: read */
dma_setup_params.transfer_direction = 1;
dev_addr = ioctl(dma_info->dma_fd, DMA_MEMCPY_SETUP, &dma_setup_params);
if (dev_addr != 0) {
printf("can't get device addr read: 0x%x\n", dev_addr);
return -1;
}


/* Start DMA */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_START, 0);
if (ret != 0) {
return -1;
}


dma_info->vir_addr = mmap(NULL, dma_info->params.size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_info->dma_fd, 0);
if (dma_info->vir_addr == MAP_FAILED) {
printf("mmap failed\n");
return -1;
}


time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
while(!time2){
time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
}
rate = ((dma_info->params.size * 1.0 / (time2 / 1000000.0)) / 1024 / 1024);


neon_copy((unsigned char *)r_buffer, (unsigned char *)dma_info->vir_addr, dma_info->params.size);


/* Verify data */
for(i = 0; i < dma_info->params.size ; i ++) {
if(r_buffer[i] != w_buffer[i]) {
dma_info->dma_trans_info.error_count++;
if(dma_info->dma_trans_info.error_count == 1) {
printf("\nData verify error at offset %d (0x%x), expect: 0x%x, actual: 0x%x\n",
i, i * 2, w_buffer[i], r_buffer[i]);
}
}
}


dma_info->dma_trans_info.dma_r_cost_time = time2;
dma_info->dma_trans_info.dma_r_rate = ((dma_info->params.size * 1.0 / (dma_info->dma_trans_info.dma_r_cost_time / 1000000.0)) / 1024 / 1024);


error:
/* Shutdown DMA transfer */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_SHUTDOWN, 0);
if (ret < 0) {
status = -1;
}


munmap(dma_info->vir_addr, dma_info->params.size);


return status;
}


static void dma_test(struct DMAInfo *dma_info)
{
int ret = -1;
unsigned int num = 0;
float total_dma_w_rate = 0, total_dma_r_rate = 0;
float max_dma_w_rate = 0, max_dma_r_rate = 0;
float min_dma_w_rate = -1, min_dma_r_rate = -1;
uint32_t total_dma_w_time = 0, total_dma_r_time = 0;
uint32_t max_dma_w_time = 0, max_dma_r_time = 0;
uint32_t min_dma_w_time = -1, min_dma_r_time = -1;
const uint32_t total_cycles = dma_info->params.cycles;


printf("\nStarting DMA test with %d cycles...\n\n", total_cycles);


// Initialize total error count
uint32_t total_error_count = 0;


while (dma_info->params.cycles-- > 0) {
ret = dma_write(dma_info);
if (ret < 0) {
printf("dma_write return error code: %d\r\n\n\n", ret);
break;
}
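#if 1 /* write-only debug mode: the read-back and statistics below are skipped; set to 0 to enable them */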
printf("write end ---------\n");
continue;
#endif
printf("start read----\n");
ret = dma_read(dma_info);
if (ret < 0) {
printf("dma_read return error code: %d\r\n\n\n", ret);
break;
}


num++;


// Calculate error rate for this cycle
float error_rate = (dma_info->dma_trans_info.error_count / (float)(dma_info->params.size / 2)) * 100;


// Accumulate total error count
total_error_count += dma_info->dma_trans_info.error_count;
dma_info->dma_trans_info.error_count = 0;


// Print current cycle statistics
printf("\rCycle %d/%d - Data Size: %d bytes | Write: %.2f MB/s | Read: %.2f MB/s | Write Time: %d us | Read Time: %d us | Error Rate: %.2f%%",
num, total_cycles,
dma_info->params.size,
dma_info->dma_trans_info.dma_w_rate,
dma_info->dma_trans_info.dma_r_rate,
dma_info->dma_trans_info.dma_w_cost_time,
dma_info->dma_trans_info.dma_r_cost_time,
error_rate);
fflush(stdout);


// Accumulate for write average calculation
total_dma_w_rate += dma_info->dma_trans_info.dma_w_rate;


// Calculate the write maximum value
if (dma_info->dma_trans_info.dma_w_rate > max_dma_w_rate) {
max_dma_w_rate = dma_info->dma_trans_info.dma_w_rate;
}


// Calculate the write minimum value
if (min_dma_w_rate == -1.0 || dma_info->dma_trans_info.dma_w_rate < min_dma_w_rate) {
min_dma_w_rate = dma_info->dma_trans_info.dma_w_rate;
}


// Accumulate for read average calculation
total_dma_r_rate += dma_info->dma_trans_info.dma_r_rate;


// Calculate the read maximum value
if (dma_info->dma_trans_info.dma_r_rate > max_dma_r_rate) {
max_dma_r_rate = dma_info->dma_trans_info.dma_r_rate;
}


// Calculate the read minimum value
if (min_dma_r_rate == -1.0 || dma_info->dma_trans_info.dma_r_rate < min_dma_r_rate) {
min_dma_r_rate = dma_info->dma_trans_info.dma_r_rate;
}


// Accumulated is used to calculate the average time to write
total_dma_w_time += dma_info->dma_trans_info.dma_w_cost_time;


// Write maximum time calculation
if (dma_info->dma_trans_info.dma_w_cost_time > max_dma_w_time) {
max_dma_w_time = dma_info->dma_trans_info.dma_w_cost_time;
}


// Write minimum time calculation
if (min_dma_w_time == -1.0 || dma_info->dma_trans_info.dma_w_cost_time < min_dma_w_time) {
min_dma_w_time = dma_info->dma_trans_info.dma_w_cost_time;
}


// Accumulated is used to calculate the average time to read
total_dma_r_time += dma_info->dma_trans_info.dma_r_cost_time;


// Read maximum time calculation
if (dma_info->dma_trans_info.dma_r_cost_time > max_dma_r_time) {
max_dma_r_time = dma_info->dma_trans_info.dma_r_cost_time;
}


// Read minimum time calculation
if (min_dma_r_time == -1.0 || dma_info->dma_trans_info.dma_r_cost_time < min_dma_r_time) {
min_dma_r_time = dma_info->dma_trans_info.dma_r_cost_time;
}
}


if (num > 0) {
// Calculate average error rate
float average_error_rate = (total_error_count / (float)(num * (dma_info->params.size / 2))) * 100;


// Print final statistics
printf("\n\nTest Complete! Final Statistics:\n");
printf("Completed Cycles: %d/%d\n", num, total_cycles);
printf("Data Size: %d bytes\n", dma_info->params.size);
printf("Write Performance:\n");
printf(" Rate: %.2f MB/s (avg)\n", total_dma_w_rate / num);
printf(" Time: %d us (avg)\n", total_dma_w_time / num);
printf(" Rate: %.2f MB/s (max)\n", max_dma_w_rate);
printf(" Time: %d us (min)\n", min_dma_w_time);
printf(" Rate: %.2f MB/s (min)\n", min_dma_w_rate);
printf(" Time: %d us (max)\n", max_dma_w_time);

printf("\nRead Performance:\n");
printf(" Rate: %.2f MB/s (avg)\n", total_dma_r_rate / num);
printf(" Time: %d us (avg)\n", total_dma_r_time / num);
printf(" Rate: %.2f MB/s (max)\n", max_dma_r_rate);
printf(" Time: %d us (min)\n", min_dma_r_time);
printf(" Rate: %.2f MB/s (min)\n", min_dma_r_rate);
printf(" Time: %d us (max)\n", max_dma_r_time);


printf("\nAverage Error Rate: %.2f%%\n", average_error_rate);
}


return;
}


int main(int argc, char *argv[])
{
struct CmdLineParams params;
struct DMAInfo dma_info;
int ret = -1;


if (parse_parameter(&params, argc, argv) == false) {
printf("Please try --help to see usage.\n");
exit(1);
}


w_buffer = (uint8_t *)malloc(params.size);
if (w_buffer == NULL) {
printf("malloc w_buffer failed!\n");
exit(-1);
}


r_buffer = (uint8_t *)malloc(params.size);
if (r_buffer == NULL) {
free(w_buffer);
printf("malloc r_buffer failed!\n");
exit(-1);
}


memset(&dma_info, 0, sizeof(struct DMAInfo));
dma_info.params.address = DSMC_MEM;
dma_info.params.size = params.size;
dma_info.params.cycles = params.cycles;


ret = dma_init(&dma_info);
if (ret < 0) {
dma_destroy(&dma_info);
return -1;
}


dma_test(&dma_info);


dma_destroy(&dma_info);


if (w_buffer != NULL) {
free(w_buffer);
}


if (r_buffer != NULL) {
free(r_buffer);
}


return 0;
}

5 RTOS DMA example

/**
  * Copyright (c) 2024 Rockchip Electronics Co., Ltd
  *
  * SPDX-License-Identifier: Apache-2.0
  ******************************************************************************
  * @file    dsmc_test.c
  * @version V1.0
  * @brief   dsmc test
  *
  * Change Logs:
  * Date           Author          Notes
  * 2024-10-10     Zhihuan He      the first version
  *
  ******************************************************************************
  */

#include <stdint.h>

#include <rthw.h>
#include <rtthread.h>

#ifdef RT_USING_COMMON_TEST_DSMC
#include "hal_base.h"
#include "dma.h"
#include "drv_dsmc_host.h"

#define IO_BW_32 0
#define IO_BW_16 1
#define IO_BW_8 2
#define IO_TYPE_0 0

#define DMA_SIZE (0x10000)
#define COUNTS 100

static struct rt_device *dma;
static rt_sem_t mem_sem;

static void m2m_complete(void *param)
{
    /* DMA completion callback: wake the thread waiting in dma_memcpy(). */
    rt_sem_release(mem_sem);
}

void dma_memcpy(int *dst_mem, int *src_mem, int size)
{
    struct rt_dma_transfer *m2m_xfer;
    rt_err_t ret;
    int i, len;
    uint32_t tick_s, tick_e, elapsed; /* ms */

    m2m_xfer = (struct rt_dma_transfer *)rt_malloc(sizeof(*m2m_xfer));
    mem_sem = rt_sem_create("memSem", 0, RT_IPC_FLAG_FIFO);
    RT_ASSERT(m2m_xfer != RT_NULL);
    RT_ASSERT(src_mem != RT_NULL);
    RT_ASSERT(dst_mem != RT_NULL);
    RT_ASSERT(mem_sem != RT_NULL);

    len = size / sizeof(int);
    for (i = 0; i < len; i++)
        src_mem[i] = len - i;
    rt_memset(dst_mem, 0x0, size);
    /* Note: this memset overwrites the ramp written by the loop above, so the
     * data actually transferred and compared is a buffer of 0x06 bytes. */
    rt_memset(src_mem, 0x6, size);
    /* Push CPU writes out to memory and drop stale lines before the DMA runs. */
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)src_mem, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)dst_mem, size);

    dma = rt_device_find("dmac0");
    RT_ASSERT(dma != RT_NULL);

    /* memcpy test */
    rt_memset(m2m_xfer, 0x0, sizeof(*m2m_xfer));
    m2m_xfer->direction = RT_DMA_MEM_TO_MEM;
    m2m_xfer->dst_addr = (rt_uint32_t)dst_mem;
    m2m_xfer->src_addr = (rt_uint32_t)src_mem;
    m2m_xfer->len = size;
    m2m_xfer->callback = m2m_complete;
    m2m_xfer->cparam = m2m_xfer;
    ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_REQUEST_CHANNEL, m2m_xfer);
    RT_ASSERT(ret == RT_EOK);

    tick_s = HAL_GetTick();

    for (i = 0; i < COUNTS; i++)
    {
        /* dma copy start */
        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_SINGLE_PREPARE, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);
        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_START, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);

        /* wait for complete */
        ret = rt_sem_take(mem_sem, RT_WAITING_FOREVER);
        RT_ASSERT(ret == RT_EOK);

        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_STOP, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);
    }

    tick_e = HAL_GetTick();
    elapsed = tick_e - tick_s;
    if (elapsed == 0)
        elapsed = 1; /* avoid dividing by zero on very short runs */

    ret = rt_memcmp(src_mem, dst_mem, size);

    /* bytes / ms equals kB/s; dividing by 1000 again gives MB/s */
    rt_kprintf("dma memcpy [%s]: avg: %7u MB/S with src: 0x%x dst: 0x%x len: %u counts: %u\n",
               ret ? "FAIL" : "PASS", size * COUNTS / elapsed / 1000,
               (uint32_t)src_mem, (uint32_t)dst_mem, size, COUNTS);

    ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_RELEASE_CHANNEL, m2m_xfer);
    RT_ASSERT(ret == RT_EOK);

    /* Same copy done by the CPU, as a baseline for the DMA figure above. */
    tick_s = HAL_GetTick();
    for (i = 0; i < COUNTS; i++)
        rt_memcpy(dst_mem, src_mem, size);
    tick_e = HAL_GetTick();
    elapsed = tick_e - tick_s;
    if (elapsed == 0)
        elapsed = 1; /* avoid dividing by zero on very short runs */

    ret = rt_memcmp(dst_mem, src_mem, size);

    rt_kprintf("cpu memcpy [%s]: avg: %7u MB/S with src: 0x%x dst: 0x%x len: %u counts: %u\n",
               ret ? "FAIL" : "PASS", size * COUNTS / elapsed / 1000,
               (uint32_t)src_mem, (uint32_t)dst_mem, size, COUNTS);

    rt_sem_delete(mem_sem);
    rt_free(m2m_xfer);
}

static int dma_test_psram(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    rt_kprintf("dma test1: copy from ddr to psram\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    dst = (int *)map->phys;

    dma_memcpy(dst, src, size);
    rt_dma_free(src);

    rt_kprintf("dma test2: copy from psram to ddr\n");
    src = (int *)map->phys;

    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    dma_memcpy(dst, src, size);
    rt_dma_free(dst);

    rt_kprintf("dma test3: copy from psram to psram\n");
    src = (int *)map->phys;
    dst = (int *)(map->phys + size);

    dma_memcpy(dst, src, size);
    rt_kprintf("DMA test done\n");

    return 0;
}

void psram_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i, j;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t test_cap;
    uint32_t read, write;
    uint32_t test_data[] = {0x5aa5f00f, 0x0, 0xffffffff, 0x3cc3d22d};

    rt_kprintf("start cs%d simple test\n", cs);

    for (j = 0; j < 4; j++)
    {
        test_cap = map->size;
        rt_kprintf("write, test_cap = 0x%x\n", test_cap);
        rt_kprintf(" write\n");
        for (i = 0; i < test_cap; i += 4)
        {
            ops->write(dsmc_host, cs, 0, i, test_data[j]);
        }
        rt_kprintf("read\n");
        test_cap = map->size;
        rt_kprintf(" read\n");
        for (i = 0; i < test_cap; i += 4)
        {
            ops->read(dsmc_host, cs, 0, i, &read);
            if (read != test_data[j])
            {
                rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                           i, read, test_data[j], test_data[j] ^ read);
            }
        }
    }

    rt_kprintf("phase 1: write address to data\n");

    test_cap = map->size;
    for (i = 0; i < test_cap; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, map->phys + i);
    }
    for (i = 0; i < test_cap; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
        write = map->phys + i;
        if (read != write)
        {
            rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                       i, read, write, write ^ read);
        }
    }

    rt_kprintf("phase 2: write 0x0f0f5aa5 + address\n");
    test_cap = map->size;
    for (i = 0; i < test_cap; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, map->phys + i + 0x0f0f5aa5);
    }

    test_cap = map->size;
    for (i = 0; i < test_cap; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
        write = map->phys + i + 0x0f0f5aa5;
        if (read != write)
        {
            rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                       i, read, write, write ^ read);
        }
    }

    rt_kprintf("phase 3: dma copy by software\n");
    dma_test_psram(dsmc_host, cs);

    rt_kprintf("cs%d simple test done.\n", cs);
}

static int plc_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    int *src = NULL, *dst = NULL;
    uint32_t size = DMA_SIZE;
    int ret = 0;
    uint32_t timeout;

    rt_kprintf("phase 1: dma memcopy from host DDR to slave memory\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    /* init src address */
    rt_memset(src, 0x66, size);

    HAL_DCACHE_CleanInvalidateByRange((uint32_t)src, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, size);

    timeout = 10000; /* polled with a 1 ms delay per iteration */
    if (!ops->copy_to(dsmc_host, cs, 0, (uint32_t)src, 0x0, size))
    {
        while (ops->copy_to_state(dsmc_host))
        {
            rt_thread_mdelay(1);
            timeout--;
            if (!timeout)
            {
                rt_kprintf("DSMC: wait copy to complete timeout\n");
                ret = -1;
                goto err;
            }
        }
        /* The original discarded the memcmp result; report a mismatch instead. */
        if (rt_memcmp(src, (int *)map->phys, size))
        {
            rt_kprintf("phase 1: data mismatch\n");
        }
    }
    else
    {
        rt_kprintf("err: phase 1: test skip!!!\n");
    }

    rt_kprintf("phase 2: dma memcopy from slave memory to host DDR\n");
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    rt_memset(dst, 0x88, size);
    /* init plc src address */
    for (i = 0; i < size; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, 0x10100000 + i);
    }
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)dst, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, size);

    timeout = 10000;
    if (!ops->copy_from(dsmc_host, cs, 0x0, 0, (uint32_t)dst, size))
    {
        while (ops->copy_from_state(dsmc_host))
        {
            rt_thread_mdelay(1);
            timeout--;
            if (!timeout)
            {
                rt_kprintf("DSMC: wait copy from complete timeout\n");
                ret = -1;
                goto err;
            }
        }
        /* The original discarded the memcmp result; report a mismatch instead. */
        if (rt_memcmp((int *)map->phys, dst, size))
        {
            rt_kprintf("phase 2: data mismatch\n");
        }
    }
    else
    {
        /* The original message said "phase 1" here. */
        rt_kprintf("err: phase 2: test skip!!!\n");
    }

    rt_dma_free(src);
    rt_dma_free(dst);

    rt_kprintf("plc_simple_test test done\n");
    return 0;

err:
    /* Release the buffers on the timeout path too; dst is still NULL if
     * phase 1 timed out. */
    if (src)
        rt_dma_free(src);
    if (dst)
        rt_dma_free(dst);
    rt_kprintf("plc_simple_test test error!\n");
    return ret;
}

void dsmc_slave_latency_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    uint32_t read, counter;
    uint32_t kt[2], diff;

    rt_kprintf("cpu access dsmc latency test\n");

    counter = 1000000;

    kt[0] = HAL_GetTick();
    for (i = 0; i < counter; i++)
    {
        ops->read(dsmc_host, cs, 3, 0x100, &read);
    }
    kt[1] = HAL_GetTick();
    diff = kt[1] - kt[0];
    /* diff is in ms; a 32-bit diff * 1000000 overflows once diff exceeds
     * about 4294 ms, so the product is computed in 64 bits. */
    rt_kprintf("counter %u, cost %ums, read latency %uns\n",
               counter, diff, (uint32_t)((uint64_t)diff * 1000000 / counter));
    rt_kprintf("read = 0x%x\n", read);

    kt[0] = HAL_GetTick();
    for (i = 0; i < counter; i++)
    {
        ops->write(dsmc_host, cs, 3, 0x100, 0x6);
    }
    kt[1] = HAL_GetTick();
    diff = kt[1] - kt[0];
    rt_kprintf("counter %u, cost %ums, write latency %uns\n",
               counter, diff, (uint32_t)((uint64_t)diff * 1000000 / counter));
}

void dsmc_slave_speed_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t kt[2], rd_diff, wr_diff;
    uint32_t read;

    rt_kprintf("cpu access dsmc speed\n");

    kt[0] = HAL_GetTick();
    for (i = 0; i < map->size; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, 0xffffffff);
    }
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, map->size);
    kt[1] = HAL_GetTick();
    wr_diff = kt[1] - kt[0];
    rt_kprintf("cpu write %uMB, cost %ums, %uMB/s\n",
               (uint32_t)map->size / 1024 / 1024,
               wr_diff,
               map->size / wr_diff / 1000);

    kt[0] = HAL_GetTick();
    for (i = 0; i < map->size; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
    }
    kt[1] = HAL_GetTick();
    rd_diff = kt[1] - kt[0];

    rt_kprintf("cpu read %uMB, cost %ums, %uMB/s\n",
               (uint32_t)map->size / 1024 / 1024,
               rd_diff,
               map->size / rd_diff / 1000);
}

static int dsmc_slave_speed_dma(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    rt_kprintf("dma access dsmc speed\n");

    rt_kprintf("DMA write DSMC\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    dst = (int *)map->phys;

    dma_memcpy(dst, src, size);

    rt_dma_free(src);

    rt_kprintf("DMA read DSMC\n");
    src = (int *)map->phys;
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    dma_memcpy(dst, src, size);

    rt_dma_free(dst);

    return 0;
}

void dsmc_test(int argc, char **argv)
{
    uint32_t i;
    rt_device_t dev;

    struct rockchip_rt_dsmc_host *dsmc_host;

    rt_kprintf("this is dsmc_test\n");

    dev = rt_device_find("dsmc_host");
    if (dev == RT_NULL)
    {
        rt_kprintf("dsmc_host not found\n");
        return;
    }
    dsmc_host = (struct rockchip_rt_dsmc_host *)dev;

    for (i = 0; i < DSMC_MAX_SLAVE_NUM; i++)
    {
        if (dsmc_host->host->dsmcHostDev.ChipSelCfg[i].deviceType == DEV_UNKNOWN)
            continue;
        psram_simple_test(dsmc_host, i);
        if (dsmc_host->host->dsmcHostDev.ChipSelCfg[i].deviceType == DEV_DSMC_LB)
        {
            plc_simple_test(dsmc_host, i);
            dsmc_slave_latency_cpu(dsmc_host, i);
        }
        dsmc_slave_speed_cpu(dsmc_host, i);
        dsmc_slave_speed_dma(dsmc_host, i);
    }
}

#ifdef RT_USING_FINSH
#include <finsh.h>
MSH_CMD_EXPORT(dsmc_test, dsmc tester);
#endif
#endif
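
The test above reports throughput as total bytes divided by elapsed milliseconds and then by 1000, and latency as elapsed milliseconds scaled to nanoseconds per access. The host-side sketch below isolates that arithmetic so it can be checked without the board. The helper names `throughput_mb_per_s` and `latency_ns` are hypothetical and are not part of the Rockchip HAL or RT-Thread; the zero checks and the 64-bit intermediate are additions, on the assumption that the tick value is in milliseconds as the source comment states.

```c
#include <stdint.h>

/*
 * Average throughput in MB/s (decimal megabytes) for total_bytes moved in
 * elapsed_ms. Bytes per millisecond equals kB/s, so one further division by
 * 1000 yields MB/s. Returns 0 when no time was measured, to avoid dividing
 * by zero.
 */
static uint32_t throughput_mb_per_s(uint64_t total_bytes, uint32_t elapsed_ms)
{
    if (elapsed_ms == 0)
        return 0;
    return (uint32_t)(total_bytes / elapsed_ms / 1000);
}

/*
 * Average latency in nanoseconds per access. The product elapsed_ms * 1000000
 * is computed in 64 bits, because it no longer fits in 32 bits once
 * elapsed_ms exceeds about 4294 ms. Returns 0 when accesses is 0.
 */
static uint32_t latency_ns(uint32_t elapsed_ms, uint32_t accesses)
{
    if (accesses == 0)
        return 0;
    return (uint32_t)((uint64_t)elapsed_ms * 1000000u / accesses);
}
```

With the values used in the listing (`DMA_SIZE` of 0x10000 bytes and `COUNTS` of 100), a run that takes 10 ms works out to 6553600 / 10 / 1000, which is 655 MB/s in integer arithmetic. One million accesses that take 5000 ms in total give 5000 ns per access, a case where a 32-bit product would already have overflowed.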