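1 Introduction

  • The earlier parts traced the overall flow and the main functions of the driver. The register-level details cannot be analysed, because the low-level register documentation is not available to us.

  • For application work, the key task is therefore to understand DMA reads and writes in detail. Once they are mastered, the data can be handed to the upper-layer application.

  • The DMA controller itself is not bound to DSMC. When the DMA driver is used, however, the DMA is configured with the memory address that DSMC is mapped to, so a DMA transfer accesses the memory-mapped DSMC region directly, block by block.

  • If you only need to use it, go straight to Section 3 (turning the example into a library for external callers): the example is reworked into a library source file and a header file that together provide the external interface.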


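2 RTOS DMA read/write analysis

2.1 What the DMA driver contains

For the detailed driver analysis, see the earlier article on the DSMC code under RTOS.

  • drv_dsmc_host.c does three things: it initializes the peripheral, implements DMA data copies, and implements direct reads and writes of a given address for user code.

  • In the RTOS, the init function rockchip_dsmc_host_probe in drv_dsmc_host.c calls down from the RTOS layer into HAL_DSMC_HOST_Init in the HAL-layer file hal_dsmc_host.c, which writes the registers that initialize the DSMC peripheral.

  • The function list of drv_dsmc_host.c shows the same split: apart from peripheral initialization, the file mainly implements DMA data copies and direct reads and writes of a given address.

  • The driver exposes the DMA copy functions and the direct read/write functions to upper layers through an ops operation table.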
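To make the ops idea concrete, here is a minimal host-side sketch of an operation table. Everything in it (demo_host, demo_ops, the array standing in for the mapped DSMC window) is hypothetical and only mimics the shape of the calls used later in this article; the real struct dsmc_host_ops is declared in drv_dsmc_host.h and may look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_WORDS 64 /* hypothetical size of the fake window, in 32-bit words */

struct demo_host; /* forward declaration so the ops table can refer to it */

/* Hypothetical operation table: callers only ever see these pointers. */
struct demo_ops {
    int (*read)(struct demo_host *host, uint32_t offset, uint32_t *value);
    int (*write)(struct demo_host *host, uint32_t offset, uint32_t value);
};

struct demo_host {
    const struct demo_ops *ops;
    uint32_t window[DEMO_WORDS]; /* stands in for the memory-mapped region */
};

/* Read one aligned 32-bit word; returns -1 on a bad offset or NULL output. */
static int demo_read(struct demo_host *host, uint32_t offset, uint32_t *value)
{
    if (value == NULL || offset % 4 != 0 || offset / 4 >= DEMO_WORDS)
        return -1;
    *value = host->window[offset / 4];
    return 0;
}

/* Write one aligned 32-bit word; returns -1 on a bad offset. */
static int demo_write(struct demo_host *host, uint32_t offset, uint32_t value)
{
    if (offset % 4 != 0 || offset / 4 >= DEMO_WORDS)
        return -1;
    host->window[offset / 4] = value;
    return 0;
}

static const struct demo_ops demo_ops_impl = { demo_read, demo_write };

/* Attach the ops table and clear the fake window. */
void demo_host_init(struct demo_host *host)
{
    assert(host != NULL);
    host->ops = &demo_ops_impl;
    for (size_t i = 0; i < DEMO_WORDS; i++)
        host->window[i] = 0;
}
```

A caller then writes host.ops->write(&host, 8, value) without knowing which function sits behind the pointer, which is the same indirection the test code relies on when it calls ops->read and ops->write.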

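2.2 DMA usage flow from the user's side

  • In practice the read/write test exercises the path host SoC → DSMC controller → Local Bus → FPGA (or another external device).

  • The user program writes data to and reads data from the DSMC controller, either by DMA or by direct address access. The data travels over the Local Bus to and from the external device.

  • All the DMA-related functions below (copy_to, copy_from and dma_memcpy) carry out the transfer internally by calling rt_device_control on the dma device to step it through its states.

  • The command passed to rt_device_control is finally handled by a switch statement in the rt_dma_control function of SDK/GA3506_Linux_Source/rtos/bsp/rockchip/common/drivers/dma.c, which performs the actual DMA operation.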

rt_device_control(dma, RT_DEVICE_CTRL_DMA_REQUEST_CHANNEL, m2m_xfer); /* request a DMA channel */
rt_device_control(dma, RT_DEVICE_CTRL_DMA_SINGLE_PREPARE, m2m_xfer);  /* prepare a single DMA transfer */
rt_device_control(dma, RT_DEVICE_CTRL_DMA_START, m2m_xfer);           /* start the DMA transfer */
rt_device_control(dma, RT_DEVICE_CTRL_DMA_STOP, m2m_xfer);            /* stop the DMA */
rt_device_control(dma, RT_DEVICE_CTRL_DMA_RELEASE_CHANNEL, m2m_xfer); /* release the DMA channel */
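The command-dispatch pattern can be sketched on a host machine as follows. The command names, the demo_chan state and the ordering checks are all assumptions made for illustration; this is not the content of rt_dma_control in dma.c, only the general shape of a control function built around a switch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical commands, mirroring the five control calls in order. */
enum demo_cmd {
    DEMO_REQUEST_CHANNEL,
    DEMO_SINGLE_PREPARE,
    DEMO_START,
    DEMO_STOP,
    DEMO_RELEASE_CHANNEL
};

/* Hypothetical channel states used to reject out-of-order commands. */
enum demo_state { DEMO_IDLE, DEMO_REQUESTED, DEMO_PREPARED, DEMO_RUNNING };

struct demo_chan {
    enum demo_state state;
};

/* Dispatch one command; returns 0 on success, -1 if it is out of order. */
int demo_dma_control(struct demo_chan *chan, enum demo_cmd cmd)
{
    assert(chan != NULL);

    switch (cmd) {
    case DEMO_REQUEST_CHANNEL:
        if (chan->state != DEMO_IDLE)
            return -1;
        chan->state = DEMO_REQUESTED;
        return 0;
    case DEMO_SINGLE_PREPARE:
        if (chan->state != DEMO_REQUESTED)
            return -1;
        chan->state = DEMO_PREPARED;
        return 0;
    case DEMO_START:
        if (chan->state != DEMO_PREPARED)
            return -1;
        chan->state = DEMO_RUNNING;
        return 0;
    case DEMO_STOP:
        if (chan->state != DEMO_RUNNING)
            return -1;
        /* The channel stays owned, so another prepare/start may follow. */
        chan->state = DEMO_REQUESTED;
        return 0;
    case DEMO_RELEASE_CHANNEL:
        if (chan->state != DEMO_REQUESTED)
            return -1;
        chan->state = DEMO_IDLE;
        return 0;
    }
    return -1; /* unknown command */
}
```

The state machine matches how dma_memcpy uses the real calls: the channel is requested once, prepare/start/stop repeat inside the loop, and the channel is released at the end.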
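  • The program is structured roughly like the pseudocode below. The structure already shows the main idea: after initialization, data is read and written either through the ops table or by DMA.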
main()
└──> dsmc_test()
     ├──> psram_simple_test()         // basic PSRAM test (CPU read/write + verify)
     │    ├──> ops->write()           // write test data to the PSRAM
     │    ├──> ops->read()            // read the test data back from the PSRAM
     │    └──> verify data
     ├──> dma_test_psram()            // PSRAM DMA copy test (DMA copy + verify)
     │    ├──> dma_memcpy()           // move data to/from the PSRAM by DMA
     │    └──> verify data
     ├──> plc_simple_test()           // basic test of a PLC-type peripheral
     │    ├──> ops->write()           // write test data to the PLC peripheral
     │    ├──> ops->read()            // read test data from the PLC peripheral
     │    ├──> ops->copy_to           // DMA copy from host memory to the FPGA
     │    ├──> ops->copy_to_state     // poll the copy_to status
     │    ├──> ops->copy_from         // DMA copy from the FPGA to host memory
     │    ├──> ops->copy_from_state   // poll the copy_from status
     │    └──> verify data
     ├──> dma_test_lb()               // Local Bus peripheral DMA test (DMA copy + verify)
     │    ├──> dma_memcpy()           // move data to/from the Local Bus peripheral by DMA
     │    └──> verify data
     ├──> fpga_simple_test()          // basic FPGA test (CPU read/write + verify)
     │    ├──> ops->write()           // write test data to the FPGA
     │    ├──> ops->read()            // read test data from the FPGA
     │    └──> verify data
     ├──> dsmc_slave_latency_cpu()    // CPU access latency (average over many accesses)
     │    └──> ops->read()/ops->write() in a loop      // time repeated reads/writes
     ├──> dsmc_slave_speed_cpu()      // CPU bandwidth (large block read/write)
     │    └──> ops->read()/ops->write() on a large block // time a CPU copy of a large block
     └──> dsmc_slave_speed_dma()      // DMA bandwidth (large block copy)
          └──> dma_memcpy() on a large block           // time a DMA copy of a large block
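1 Memory copy / performance test
dma_memcpy: copies memory to memory with the DMA controller and compares the speed with a CPU memcpy.

2 Simple PSRAM / peripheral functional test
psram_simple_test: runs several write/read/verify rounds on the PSRAM, each with a different data pattern.

3 DMA read/write test on PSRAM / peripheral
dma_test_psram: DMA copies in three directions: DDR->PSRAM, PSRAM->DDR and PSRAM->PSRAM.

4 PLC (Local Bus) functional test
plc_simple_test: uses the ops interface inside the DSMC driver to move data between the host and a Local Bus peripheral. It first writes from DDR to the peripheral by DMA, then reads from the peripheral back into DDR by DMA.

5 CPU access latency test
dsmc_slave_latency_cpu: repeatedly reads/writes a fixed address of the DSMC peripheral with the CPU, measures the total time, and computes the average latency of a single access.

6 CPU access bandwidth test
dsmc_slave_speed_cpu: writes/reads a large block of the DSMC peripheral with the CPU, measures the total time, and computes the bandwidth (MB/s).

7 DMA access bandwidth test
dsmc_slave_speed_dma: moves a large block to the DSMC peripheral by DMA and back again, and measures the DMA bandwidth (MB/s).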
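The latency and bandwidth figures all come from a millisecond tick, so the arithmetic is worth spelling out. The sketch below is a host-side illustration with hypothetical helper names (demo_mb_per_s, demo_avg_latency_ns); it assumes decimal megabytes (1 MB = 1,000,000 bytes), which is what dividing bytes per millisecond by 1000 yields, and it widens to 64 bits so that a long run cannot overflow a 32-bit intermediate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bandwidth in MB/s for `bytes` moved in `elapsed_ms` milliseconds.
 * bytes / ms is kB/s, and dividing by 1000 again gives MB/s.
 * Returns 0 when no full millisecond elapsed, instead of dividing by zero.
 */
uint32_t demo_mb_per_s(uint64_t bytes, uint32_t elapsed_ms)
{
    if (elapsed_ms == 0)
        return 0;
    return (uint32_t)(bytes / elapsed_ms / 1000u);
}

/*
 * Average latency in nanoseconds for `accesses` operations that took
 * `elapsed_ms` milliseconds in total. The product is computed in 64 bits:
 * elapsed_ms * 1000000 exceeds UINT32_MAX once elapsed_ms passes 4294.
 */
uint32_t demo_avg_latency_ns(uint32_t elapsed_ms, uint32_t accesses)
{
    assert(accesses != 0);
    return (uint32_t)((uint64_t)elapsed_ms * 1000000u / accesses);
}
```

For example, 100 copies of a 0x10000-byte buffer in 50 ms give demo_mb_per_s(0x10000 * 100, 50) = 131 MB/s, and one million accesses in 5000 ms give an average latency of 5000 ns.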



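3 Turning the example into a library for external callers

3.1 Calling the library from another file

  • The example code already implements the reads and writes, so it only needs to be repackaged as a library for other programs to use.

  • The calling framework stays essentially the same: put the source file in the same directory, include the rt_dsmc_lib.h header, and the functions can be called from a new user file.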



#include "rt_dsmc_lib.h"

/* DMA_SIZE is private to rt_dsmc_lib.c, so the caller defines its own copy. */
#define DMA_SIZE (0x10000)

void dsmc_test(int argc, char **argv)
{
    uint32_t cs = 0; /* only chip select 0 is tested */
    rt_device_t dev = rt_device_find("dsmc_host");
    if (dev == RT_NULL)
    {
        rt_kprintf("dsmc_host not found\n");
        return;
    }
    struct rockchip_rt_dsmc_host *dsmc_host = (struct rockchip_rt_dsmc_host *)dev;

    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    /* One DMA copy from DDR into the memory-mapped DSMC region. */
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);
    dst = (int *)map->phys;
    dma_memcpy(dst, src, size);
    rt_dma_free(src);

    /* Run the library tests on the selected chip select. */
    psram_simple_test(dsmc_host, cs);
    dsmc_slave_latency_cpu(dsmc_host, cs);
    dsmc_slave_speed_cpu(dsmc_host, cs);
    dma_test_psram(dsmc_host, cs);
    plc_simple_test(dsmc_host, cs);
}

#ifdef RT_USING_FINSH
#include <finsh.h>
MSH_CMD_EXPORT(dsmc_test, dsmc tester);
#endif




3.2 The library source file, reworked from the example

rt_dsmc_lib.c

#include "rt_dsmc_lib.h"


#define IO_BW_32 0
#define IO_BW_16 1
#define IO_BW_8 2
#define IO_TYPE_0 0


#define DMA_SIZE (0x10000)
#define COUNTS 100


static struct rt_device *dma;
static rt_sem_t mem_sem;


/* DMA completion callback: wakes the thread waiting in dma_memcpy(). */
static void m2m_complete(void *param)
{
    rt_sem_release(mem_sem);
}

/*
 * Test helper: copies `size` bytes from src_mem to dst_mem by DMA, COUNTS
 * times, then repeats the copy with the CPU for comparison.
 * Note: src_mem is overwritten with a test pattern before the copy.
 */
void dma_memcpy(int *dst_mem, int *src_mem, int size)
{
    struct rt_dma_transfer *m2m_xfer;
    rt_err_t ret;
    int i;
    uint32_t tick_s, tick_e, cost; /* ms */

    m2m_xfer = (struct rt_dma_transfer *)rt_malloc(sizeof(*m2m_xfer));
    mem_sem = rt_sem_create("memSem", 0, RT_IPC_FLAG_FIFO);
    RT_ASSERT(m2m_xfer != RT_NULL);
    RT_ASSERT(src_mem != RT_NULL);
    RT_ASSERT(dst_mem != RT_NULL);
    RT_ASSERT(mem_sem != RT_NULL);

    /* Clear the destination, fill the source, and flush both from the cache. */
    rt_memset(dst_mem, 0x0, size);
    rt_memset(src_mem, 0x6, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)src_mem, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)dst_mem, size);

    dma = rt_device_find("dmac0");
    RT_ASSERT(dma != RT_NULL);

    /* memcpy test */
    rt_memset(m2m_xfer, 0x0, sizeof(*m2m_xfer));
    m2m_xfer->direction = RT_DMA_MEM_TO_MEM;
    m2m_xfer->dst_addr = (rt_uint32_t)dst_mem;
    m2m_xfer->src_addr = (rt_uint32_t)src_mem;
    m2m_xfer->len = size;
    m2m_xfer->callback = m2m_complete;
    m2m_xfer->cparam = m2m_xfer;
    ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_REQUEST_CHANNEL, m2m_xfer);
    RT_ASSERT(ret == RT_EOK);

    tick_s = HAL_GetTick();

    for (i = 0; i < COUNTS; i++)
    {
        /* dma copy start */
        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_SINGLE_PREPARE, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);
        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_START, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);

        /* wait for complete */
        ret = rt_sem_take(mem_sem, RT_WAITING_FOREVER);
        RT_ASSERT(ret == RT_EOK);

        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_STOP, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);
    }

    tick_e = HAL_GetTick();
    cost = (tick_e - tick_s) ? (tick_e - tick_s) : 1; /* avoid dividing by zero */

    ret = rt_memcmp(src_mem, dst_mem, size);

    rt_kprintf("dma memcpy [%s]: avg: %7u MB/S with src: 0x%x dst: 0x%x len: %u counts: %u\n",
               ret ? "FAIL" : "PASS", size * COUNTS / cost / 1000,
               src_mem, dst_mem, size, COUNTS);

    ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_RELEASE_CHANNEL, m2m_xfer);
    RT_ASSERT(ret == RT_EOK);

    /* Same copy with the CPU, for comparison. */
    tick_s = HAL_GetTick();
    for (i = 0; i < COUNTS; i++)
        rt_memcpy(dst_mem, src_mem, size);
    tick_e = HAL_GetTick();
    cost = (tick_e - tick_s) ? (tick_e - tick_s) : 1; /* avoid dividing by zero */

    ret = rt_memcmp(dst_mem, src_mem, size);

    rt_kprintf("cpu memcpy [%s]: avg: %7u MB/S with src: 0x%x dst: 0x%x len: %u counts: %u\n",
               ret ? "FAIL" : "PASS", size * COUNTS / cost / 1000,
               src_mem, dst_mem, size, COUNTS);

    rt_sem_delete(mem_sem);
    rt_free(m2m_xfer);
}
int dma_test_psram(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    rt_kprintf("dma test1: copy from ddr to psram\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);
    dst = (int *)map->phys;
    dma_memcpy(dst, src, size);
    rt_dma_free(src);

    rt_kprintf("dma test2: copy from psram to ddr\n");
    src = (int *)map->phys;
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);
    dma_memcpy(dst, src, size);
    rt_dma_free(dst);

    rt_kprintf("dma test3: copy from psram to psram\n");
    src = (int *)map->phys;
    dst = (int *)(map->phys + size);
    dma_memcpy(dst, src, size);
    rt_kprintf("DMA test done\n");

    return 0;
}


void psram_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i, j;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t test_cap = map->size;
    uint32_t read, write;
    uint32_t test_data[] = {0x5aa5f00f, 0x0, 0xffffffff, 0x3cc3d22d};

    rt_kprintf("start cs%d simple test\n", cs);

    /* Fill the whole region with each fixed pattern, then read it back. */
    for (j = 0; j < 4; j++)
    {
        rt_kprintf("write, test_cap = 0x%x\n", test_cap);
        for (i = 0; i < test_cap; i += 4)
        {
            ops->write(dsmc_host, cs, 0, i, test_data[j]);
        }
        rt_kprintf("read\n");
        for (i = 0; i < test_cap; i += 4)
        {
            ops->read(dsmc_host, cs, 0, i, &read);
            if (read != test_data[j])
            {
                rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                           i, read, test_data[j], test_data[j] ^ read);
            }
        }
    }

    rt_kprintf("phase 1: write address to data\n");
    for (i = 0; i < test_cap; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, map->phys + i);
    }
    for (i = 0; i < test_cap; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
        write = map->phys + i;
        if (read != write)
        {
            rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                       i, read, write, write ^ read);
        }
    }

    rt_kprintf("phase 2: write 0x0f0f5aa5 + address\n");
    for (i = 0; i < test_cap; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, map->phys + i + 0x0f0f5aa5);
    }
    for (i = 0; i < test_cap; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
        write = map->phys + i + 0x0f0f5aa5;
        if (read != write)
        {
            rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                       i, read, write, write ^ read);
        }
    }

    rt_kprintf("phase 3: dma copy by software\n");
    dma_test_psram(dsmc_host, cs);

    rt_kprintf("cs%d simple test done.\n", cs);
}


int plc_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    int *src = NULL, *dst = NULL;
    uint32_t size = DMA_SIZE;
    int ret = 0;
    uint32_t timeout;

    rt_kprintf("phase 1: dma memcopy from host DDR to slave memory\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    /* init src address */
    rt_memset(src, 0x66, size);

    HAL_DCACHE_CleanInvalidateByRange((uint32_t)src, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, size);

    timeout = 10000; /* polled every 1 ms, so roughly 10 s */
    if (!ops->copy_to(dsmc_host, cs, 0, (uint32_t)src, 0x0, size))
    {
        while (ops->copy_to_state(dsmc_host))
        {
            rt_thread_mdelay(1);
            if (--timeout == 0)
            {
                rt_kprintf("DSMC: wait copy to complete timeout\n");
                ret = -1;
                goto out;
            }
        }
        /* Report the comparison result instead of discarding it. */
        if (rt_memcmp(src, (int *)map->phys, size))
        {
            rt_kprintf("phase 1: data mismatch\n");
            ret = -1;
        }
    }
    else
    {
        rt_kprintf("err: phase 1: test skip!!!\n");
    }

    rt_kprintf("phase 2: dma memcopy from slave memory to host DDR\n");
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    rt_memset(dst, 0x88, size);
    /* init plc src address */
    for (i = 0; i < size; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, 0x10100000 + i);
    }
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)dst, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, size);

    timeout = 10000;
    if (!ops->copy_from(dsmc_host, cs, 0x0, 0, (uint32_t)dst, size))
    {
        while (ops->copy_from_state(dsmc_host))
        {
            rt_thread_mdelay(1);
            if (--timeout == 0)
            {
                rt_kprintf("DSMC: wait copy from complete timeout\n");
                ret = -1;
                goto out;
            }
        }
        if (rt_memcmp((int *)map->phys, dst, size))
        {
            rt_kprintf("phase 2: data mismatch\n");
            ret = -1;
        }
    }
    else
    {
        rt_kprintf("err: phase 2: test skip!!!\n");
    }

out:
    /* Single exit path, so the DMA buffers are freed on errors as well. */
    if (src != NULL)
        rt_dma_free(src);
    if (dst != NULL)
        rt_dma_free(dst);

    rt_kprintf("plc_simple_test %s\n", ret ? "test error!" : "test done");
    return ret;
}


void dsmc_slave_latency_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    uint32_t read = 0, counter;
    uint32_t kt[2], diff;

    rt_kprintf("cpu access dsmc latency test\n");

    counter = 1000000;

    kt[0] = HAL_GetTick();
    for (i = 0; i < counter; i++)
    {
        ops->read(dsmc_host, cs, 3, 0x100, &read);
    }
    kt[1] = HAL_GetTick();
    diff = kt[1] - kt[0];
    /* diff is in ms; widen to 64 bits so diff * 1000000 cannot overflow. */
    rt_kprintf("counter %u, cost %ums, read latency %uns\n",
               counter, diff, (uint32_t)((uint64_t)diff * 1000000 / counter));
    rt_kprintf("read = 0x%x\n", read);

    kt[0] = HAL_GetTick();
    for (i = 0; i < counter; i++)
    {
        ops->write(dsmc_host, cs, 3, 0x100, 0x6);
    }
    kt[1] = HAL_GetTick();
    diff = kt[1] - kt[0];
    rt_kprintf("counter %u, cost %ums, write latency %uns\n",
               counter, diff, (uint32_t)((uint64_t)diff * 1000000 / counter));
}


void dsmc_slave_speed_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t kt[2], rd_diff, wr_diff;
    uint32_t read;

    rt_kprintf("cpu access dsmc speed\n");

    kt[0] = HAL_GetTick();
    for (i = 0; i < map->size; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, 0xffffffff);
    }
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, map->size);
    kt[1] = HAL_GetTick();
    wr_diff = kt[1] - kt[0];
    if (wr_diff == 0)
        wr_diff = 1; /* avoid dividing by zero on a very small region */
    rt_kprintf("cpu write %uMB, cost %ums, %uMB/s\n",
               (uint32_t)map->size / 1024 / 1024,
               wr_diff,
               map->size / wr_diff / 1000);

    kt[0] = HAL_GetTick();
    for (i = 0; i < map->size; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
    }
    kt[1] = HAL_GetTick();
    rd_diff = kt[1] - kt[0];
    if (rd_diff == 0)
        rd_diff = 1; /* avoid dividing by zero on a very small region */

    rt_kprintf("cpu read %uMB, cost %ums, %uMB/s\n",
               (uint32_t)map->size / 1024 / 1024,
               rd_diff,
               map->size / rd_diff / 1000);
}
int dsmc_slave_speed_dma(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    rt_kprintf("dma access dsmc speed\n");

    rt_kprintf("DMA write DSMC\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);
    dst = (int *)map->phys;
    dma_memcpy(dst, src, size);
    rt_dma_free(src);

    rt_kprintf("DMA read DSMC\n");
    src = (int *)map->phys;
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);
    dma_memcpy(dst, src, size);
    rt_dma_free(dst);

    return 0;
}
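The "write address to data" check in psram_simple_test is easy to try on a host machine before touching hardware. The sketch below applies the same idea to an ordinary buffer; the function names and the plain array are hypothetical stand-ins, and on the target the accesses would go through ops->write and ops->read instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store (base + byte offset) in every 32-bit word of the buffer. */
void demo_fill_address_pattern(uint32_t *mem, size_t bytes, uint32_t base)
{
    assert(mem != NULL && bytes % 4 == 0);
    for (size_t off = 0; off < bytes; off += 4)
        mem[off / 4] = base + (uint32_t)off;
}

/*
 * Count the words that no longer match the pattern. If first_bad is not
 * NULL and a mismatch exists, it receives the first failing byte offset.
 */
size_t demo_verify_address_pattern(const uint32_t *mem, size_t bytes,
                                   uint32_t base, size_t *first_bad)
{
    size_t errors = 0;

    assert(mem != NULL && bytes % 4 == 0);
    for (size_t off = 0; off < bytes; off += 4) {
        if (mem[off / 4] != base + (uint32_t)off) {
            if (errors == 0 && first_bad != NULL)
                *first_bad = off;
            errors++;
        }
    }
    return errors;
}
```

Because every word holds its own address, a stuck or shorted address line shows up as one location aliasing another, which a constant fill pattern cannot reveal.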




3.3 The library header file

rt_dsmc_lib.h


#ifndef __RT_DSMC_LIB_H__
#define __RT_DSMC_LIB_H__


#include <stdint.h>
#include <rthw.h>
#include <rtthread.h>
#include "hal_base.h"
#include "dma.h"
#include "drv_dsmc_host.h"


#ifdef __cplusplus
extern "C"
{
#endif


struct rockchip_rt_dsmc_host;


/* Test interfaces exported to external applications */
int dma_test_psram(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs);
void psram_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs);
void dsmc_slave_latency_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs);
void dsmc_slave_speed_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs);
int dsmc_slave_speed_dma(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs);
int plc_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs);
void dma_memcpy(int *dst_mem, int *src_mem, int size);


#ifdef __cplusplus
}
#endif


#endif /* __RT_DSMC_LIB_H__ */




4 Linux DMA example



#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <linux/input.h>
#include <errno.h>
#include <string.h>
#include <getopt.h>
#include <stdbool.h>
#include <libgen.h>
#include <sys/select.h>
#include <signal.h>


const char * const VERSION = "1.0";


#define DMA_MEMCPY_PATH "/dev/dsmc/cs0/region0"


/* commands codes */
#define DMA_MEMCPY_SETUP 0x01000000
#define DMA_MEMCPY_START 0x02000000
#define DMA_MEMCPY_GET_TIME 0x03000000
#define DMA_MEMCPY_SHUTDOWN 0x04000000
#define DSMC_MEM 0xc0000000


uint8_t *w_buffer = NULL;
uint8_t *r_buffer = NULL;


struct DMASetupParams {
unsigned int dev_phy_addr;
unsigned int dst_phy_addr;
unsigned int transfer_direction;
unsigned int transfer_single_size;
unsigned int transfer_total_size;
};


struct DMATransferInfo {
float dma_r_rate;
float dma_w_rate;
uint32_t dma_r_cost_time;
uint32_t dma_w_cost_time;
uint32_t error_count;
};


struct DmaParams {
uint32_t address;
uint32_t size;
uint32_t cycles;
char dev[64];
};


struct DMAInfo {
int dma_fd;
int input_fd;


uint16_t *map_base_addr;
uint16_t *vir_addr;


struct DmaParams params;
struct DMATransferInfo dma_trans_info;
};


struct CmdLineParams {
uint32_t address;
uint32_t size;
uint32_t cycles;
char dev[64];
};


/* Short option names */
static const char g_shortopts [] = ":s:c:vh";
/* Option names */
static const struct option g_longopts [] = {
{ "size", required_argument, NULL, 's' },
{ "cycles", required_argument, NULL, 'c' },
{ "version", no_argument, NULL, 'v' },
{ "help", no_argument, NULL, 'h' },
{ 0, 0, 0, 0 }
};


static void usage(char **argv)
{
    fprintf(stdout,
            "Usage: %s [options]\n\n"
            "Options:\n"
            " -s | --size Data size (Unit: Byte) \n"
            " -c | --cycles Loop read and write times \n"
            " -v | --version Display version information\n"
            " -h | --help Show help content\n\n"
            "Example:\n"
            " # ./%s -s 65536 -c 1000 \n",
            basename(argv[0]),
            basename(argv[0]));
}


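static bool parse_parameter(struct CmdLineParams *params, int argc, char **argv)
{
    int opt;

    memset(params, 0, sizeof(struct CmdLineParams));

    while ((opt = getopt_long(argc, argv, g_shortopts, g_longopts, NULL)) != -1) {
        switch (opt) {
        case 's':
            /* base 0 accepts decimal as well as 0x-prefixed hex */
            params->size = strtoul(optarg, NULL, 0);
            break;

        case 'c':
            params->cycles = strtoul(optarg, NULL, 0);
            break;

        case 'h':
            usage(argv);
            exit(0);

        case 'v':
            printf("version : %s\n", VERSION);
            exit(0);

        default:
            fprintf(stderr, "Unknown or incomplete option %c\n", optopt);
            return false;
        }
    }

    /* A zero size breaks malloc()/mmap(), and zero cycles runs nothing. */
    if (params->size == 0 || params->cycles == 0) {
        fprintf(stderr, "--size and --cycles must both be greater than 0\n");
        return false;
    }
    return true;
}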


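/*
 * Plain memcpy() wrapper that copies exactly sz bytes.
 * Rounding sz up to a multiple of 64 here would overrun w_buffer, r_buffer
 * and the mmap()ed window, which are all sized with the requested --size.
 */
static void neon_copy(void *dst, const void *src, int sz)
{
    memcpy(dst, src, sz);
}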


static int dma_init(struct DMAInfo *dma_info)
{
if (dma_info == NULL) {
return -1;
}


dma_info->dma_fd = open(DMA_MEMCPY_PATH, O_RDWR);
if(dma_info->dma_fd < 0) {
printf("can't open %s\n", DMA_MEMCPY_PATH);
return -1;
}
return 0;
}


static void dma_destroy(struct DMAInfo *dma_info)
{
if (dma_info->map_base_addr != NULL) {
munmap(dma_info->map_base_addr, dma_info->params.size);
dma_info->map_base_addr = NULL;
}


if (dma_info->dma_fd != -1) close(dma_info->dma_fd);
}


static int dma_write(struct DMAInfo *dma_info)
{
struct DMASetupParams dma_setup_params;
unsigned int dev_addr = 0;
int i = 0, ret = -1;
struct input_event input_buf;
int status = 0;
struct timeval start, end, time_diff;
fd_set input;
struct timeval timeout;
unsigned time2;
float rate;

/* Populate the write buffer */
for (i = 0; i < dma_info->params.size ; i++) {
w_buffer[i] = i+1;
}


/* Setup DMA */
dma_setup_params.dev_phy_addr = dma_info->params.address;
dma_setup_params.transfer_total_size = dma_info->params.size;
dma_setup_params.transfer_single_size = dma_info->params.size;
/* 0: write, 1: read */
dma_setup_params.transfer_direction = 0;
dev_addr = ioctl(dma_info->dma_fd, DMA_MEMCPY_SETUP, &dma_setup_params);
if (dev_addr != 0) {
printf("can't get device addr write: 0x%x\n", dev_addr);
return -1;
}


dma_info->vir_addr = mmap(NULL, dma_info->params.size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_info->dma_fd, 0);
if (dma_info->vir_addr == MAP_FAILED) {
printf("mmap failed\n");
return -1;
}


neon_copy((unsigned char *)dma_info->vir_addr, (unsigned char *)w_buffer, dma_info->params.size);



/* Start DMA */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_START, 0);
if (ret != 0) {
return -1;
}


time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
while(!time2){
time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
}
rate = ((dma_info->params.size * 1.0 / (time2 / 1000000.0)) / 1024 / 1024);


dma_info->dma_trans_info.dma_w_cost_time = time2;
dma_info->dma_trans_info.dma_w_rate = ((dma_info->params.size * 1.0 / (dma_info->dma_trans_info.dma_w_cost_time / 1000000.0)) / 1024 / 1024);
error:
/* Shutdown DMA transfer */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_SHUTDOWN, 0);
if (ret < 0) {
status = -1;
}
munmap(dma_info->vir_addr, dma_info->params.size);


return status;
}


static int dma_read(struct DMAInfo *dma_info)
{
struct DMASetupParams dma_setup_params;
unsigned int dev_addr = 0;
int status = 0;
int ret = -1, i = 0;
struct input_event input_buf;
struct timeval start, end, time_diff;
fd_set input;
unsigned time2;
float rate;


/* Setup DMA */
dma_setup_params.dev_phy_addr = dma_info->params.address;
dma_setup_params.transfer_total_size = dma_info->params.size;
dma_setup_params.transfer_single_size = dma_info->params.size;
/* 0: write, 1: read */
dma_setup_params.transfer_direction = 1;
dev_addr = ioctl(dma_info->dma_fd, DMA_MEMCPY_SETUP, &dma_setup_params);
if (dev_addr != 0) {
printf("can't get device addr read: 0x%x\n", dev_addr);
return -1;
}


/* Start DMA */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_START, 0);
if (ret != 0) {
return -1;
}


dma_info->vir_addr = mmap(NULL, dma_info->params.size, PROT_READ | PROT_WRITE, MAP_SHARED, dma_info->dma_fd, 0);
if (dma_info->vir_addr == MAP_FAILED) {
printf("mmap failed\n");
return -1;
}


time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
while(!time2){
time2 = ioctl(dma_info->dma_fd, DMA_MEMCPY_GET_TIME, 0);
}
rate = ((dma_info->params.size * 1.0 / (time2 / 1000000.0)) / 1024 / 1024);


neon_copy((unsigned char *)r_buffer, (unsigned char *)dma_info->vir_addr, dma_info->params.size);


/* Verify data */
for(i = 0; i < dma_info->params.size ; i ++) {
if(r_buffer[i] != w_buffer[i]) {
dma_info->dma_trans_info.error_count++;
if(dma_info->dma_trans_info.error_count == 1) {
printf("\nData verify error at offset %d (0x%x), expect: 0x%x, actual: 0x%x\n",
i, i * 2, w_buffer[i], r_buffer[i]);
}
}
}


dma_info->dma_trans_info.dma_r_cost_time = time2;
dma_info->dma_trans_info.dma_r_rate = ((dma_info->params.size * 1.0 / (dma_info->dma_trans_info.dma_r_cost_time / 1000000.0)) / 1024 / 1024);


error:
/* Shutdown DMA transfer */
ret = ioctl(dma_info->dma_fd, DMA_MEMCPY_SHUTDOWN, 0);
if (ret < 0) {
status = -1;
}


munmap(dma_info->vir_addr, dma_info->params.size);


return status;
}


static void dma_test(struct DMAInfo *dma_info)
{
int ret = -1;
unsigned int num = 0;
float total_dma_w_rate = 0, total_dma_r_rate = 0;
float max_dma_w_rate = 0, max_dma_r_rate = 0;
float min_dma_w_rate = -1, min_dma_r_rate = -1;
uint32_t total_dma_w_time = 0, total_dma_r_time = 0;
uint32_t max_dma_w_time = 0, max_dma_r_time = 0;
uint32_t min_dma_w_time = -1, min_dma_r_time = -1;
const uint32_t total_cycles = dma_info->params.cycles;


printf("\nStarting DMA test with %d cycles...\n\n", total_cycles);


// Initialize total error count
uint32_t total_error_count = 0;


while (dma_info->params.cycles-- > 0) {
ret = dma_write(dma_info);
if (ret < 0) {
printf("dma_write return error code: %d\r\n\n\n", ret);
break;
}
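/* Debug switch: with "#if 1" each cycle stops after the write, so dma_read()
 * and all the statistics below are skipped. Change it to "#if 0" to run the
 * full write/read/verify loop. */
#if 1
printf("write end ---------\n");
continue;
#endif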
printf("start read----\n");
ret = dma_read(dma_info);
if (ret < 0) {
printf("dma_read return error code: %d\r\n\n\n", ret);
break;
}


num++;


// Calculate error rate for this cycle
float error_rate = (dma_info->dma_trans_info.error_count / (float)(dma_info->params.size / 2)) * 100;


// Accumulate total error count
total_error_count += dma_info->dma_trans_info.error_count;
dma_info->dma_trans_info.error_count = 0;


// Print current cycle statistics
printf("\rCycle %d/%d - Data Size: %d bytes | Write: %.2f MB/s | Read: %.2f MB/s | Write Time: %d us | Read Time: %d us | Error Rate: %.2f%%",
num, total_cycles,
dma_info->params.size,
dma_info->dma_trans_info.dma_w_rate,
dma_info->dma_trans_info.dma_r_rate,
dma_info->dma_trans_info.dma_w_cost_time,
dma_info->dma_trans_info.dma_r_cost_time,
error_rate);
fflush(stdout);


// Accumulate for write average calculation
total_dma_w_rate += dma_info->dma_trans_info.dma_w_rate;


// Calculate the write maximum value
if (dma_info->dma_trans_info.dma_w_rate > max_dma_w_rate) {
max_dma_w_rate = dma_info->dma_trans_info.dma_w_rate;
}


// Calculate the write minimum value
if (min_dma_w_rate == -1.0 || dma_info->dma_trans_info.dma_w_rate < min_dma_w_rate) {
min_dma_w_rate = dma_info->dma_trans_info.dma_w_rate;
}


// Accumulate for read average calculation
total_dma_r_rate += dma_info->dma_trans_info.dma_r_rate;


// Calculate the read maximum value
if (dma_info->dma_trans_info.dma_r_rate > max_dma_r_rate) {
max_dma_r_rate = dma_info->dma_trans_info.dma_r_rate;
}


// Calculate the read minimum value
if (min_dma_r_rate == -1.0 || dma_info->dma_trans_info.dma_r_rate < min_dma_r_rate) {
min_dma_r_rate = dma_info->dma_trans_info.dma_r_rate;
}


// Accumulated is used to calculate the average time to write
total_dma_w_time += dma_info->dma_trans_info.dma_w_cost_time;


// Write maximum time calculation
if (dma_info->dma_trans_info.dma_w_cost_time > max_dma_w_time) {
max_dma_w_time = dma_info->dma_trans_info.dma_w_cost_time;
}


// Write minimum time calculation
if (min_dma_w_time == UINT32_MAX || dma_info->dma_trans_info.dma_w_cost_time < min_dma_w_time) {
min_dma_w_time = dma_info->dma_trans_info.dma_w_cost_time;
}


// Accumulated is used to calculate the average time to read
total_dma_r_time += dma_info->dma_trans_info.dma_r_cost_time;


// Read maximum time calculation
if (dma_info->dma_trans_info.dma_r_cost_time > max_dma_r_time) {
max_dma_r_time = dma_info->dma_trans_info.dma_r_cost_time;
}


// Read minimum time calculation
if (min_dma_r_time == UINT32_MAX || dma_info->dma_trans_info.dma_r_cost_time < min_dma_r_time) {
min_dma_r_time = dma_info->dma_trans_info.dma_r_cost_time;
}
}


if (num > 0) {
// Calculate average error rate
float average_error_rate = (total_error_count / (float)(num * (dma_info->params.size / 2))) * 100;


// Print final statistics
printf("\n\nTest Complete! Final Statistics:\n");
printf("Completed Cycles: %d/%d\n", num, total_cycles);
printf("Data Size: %d bytes\n", dma_info->params.size);
printf("Write Performance:\n");
printf(" Rate: %.2f MB/s (avg)\n", total_dma_w_rate / num);
printf(" Time: %d us (avg)\n", total_dma_w_time / num);
printf(" Rate: %.2f MB/s (max)\n", max_dma_w_rate);
printf(" Time: %d us (min)\n", min_dma_w_time);
printf(" Rate: %.2f MB/s (min)\n", min_dma_w_rate);
printf(" Time: %d us (max)\n", max_dma_w_time);

printf("\nRead Performance:\n");
printf(" Rate: %.2f MB/s (avg)\n", total_dma_r_rate / num);
printf(" Time: %d us (avg)\n", total_dma_r_time / num);
printf(" Rate: %.2f MB/s (max)\n", max_dma_r_rate);
printf(" Time: %d us (min)\n", min_dma_r_time);
printf(" Rate: %.2f MB/s (min)\n", min_dma_r_rate);
printf(" Time: %d us (max)\n", max_dma_r_time);


printf("\nAverage Error Rate: %.2f%%\n", average_error_rate);
}


return;
}


int main(int argc, char *argv[])
{
struct CmdLineParams params;
struct DMAInfo dma_info;
int ret = -1;


if (parse_parameter(&params, argc, argv) == false) {
printf("Please try --help to see usage.\n");
exit(1);
}


w_buffer = (uint8_t *)malloc(params.size);
if (w_buffer == NULL) {
printf("malloc w_buffer failed!\n");
exit(-1);
}


r_buffer = (uint8_t *)malloc(params.size);
if (r_buffer == NULL) {
free(w_buffer);
printf("malloc r_buffer failed!\n");
exit(-1);
}


memset(&dma_info, 0, sizeof(struct DMAInfo));
dma_info.params.address = DSMC_MEM;
dma_info.params.size = params.size;
dma_info.params.cycles = params.cycles;


ret = dma_init(&dma_info);
if (ret < 0) {
dma_destroy(&dma_info);
return -1;
}


dma_test(&dma_info);


dma_destroy(&dma_info);


if (w_buffer != NULL) {
free(w_buffer);
}


if (r_buffer != NULL) {
free(r_buffer);
}


return 0;
}
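Both examples wait for DMA completion by polling: the Linux program repeats the DMA_MEMCPY_GET_TIME ioctl until it returns a non-zero time, and plc_simple_test polls copy_to_state with a countdown. The Linux loop has no upper bound, so a stalled transfer would spin forever. A bounded version can be sketched as below; demo_poll_nonzero and the fake device callback are hypothetical names used only for illustration, and on the target the callback would wrap the real ioctl or state query.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical poll callback: returns 0 while busy, non-zero when done. */
typedef unsigned (*demo_poll_fn)(void *ctx);

/*
 * Call poll() up to max_tries times. On the first non-zero value, store it
 * in *result and return 0. Return -1 if every attempt reported busy.
 */
int demo_poll_nonzero(demo_poll_fn poll, void *ctx, unsigned max_tries,
                      unsigned *result)
{
    assert(poll != NULL && result != NULL);
    for (unsigned i = 0; i < max_tries; i++) {
        unsigned value = poll(ctx);
        if (value != 0) {
            *result = value;
            return 0;
        }
    }
    return -1;
}

/*
 * Fake device for the demonstration: reports busy (0) until the countdown
 * pointed to by ctx reaches zero, then reports an elapsed time of 1234.
 */
unsigned demo_fake_elapsed_us(void *ctx)
{
    unsigned *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1234;
}
```

In a real program a short sleep between attempts, as plc_simple_test does with rt_thread_mdelay(1), keeps the polling loop from monopolizing the CPU.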

5 RTOS DMA example

/**
  * Copyright (c) 2024 Rockchip Electronics Co., Ltd
  *
  * SPDX-License-Identifier: Apache-2.0
  ******************************************************************************
  * @file    dsmc_test.c
  * @version V1.0
  * @brief   dsmc test
  *
  * Change Logs:
  * Date           Author          Notes
  * 2024-10-10     Zhihuan He      the first version
  *
  ******************************************************************************
  */

#include <stdint.h>

#include <rthw.h>
#include <rtthread.h>

#ifdef RT_USING_COMMON_TEST_DSMC
#include "hal_base.h"
#include "dma.h"
#include "drv_dsmc_host.h"

#define IO_BW_32 0
#define IO_BW_16 1
#define IO_BW_8 2
#define IO_TYPE_0 0

#define DMA_SIZE (0x10000)
#define COUNTS 100

static struct rt_device *dma;
static rt_sem_t mem_sem;

static void m2m_complete(void *param)
{
    /* DMA completion callback: wake the thread waiting in dma_memcpy(). */
    rt_sem_release(mem_sem);
}

void dma_memcpy(int *dst_mem, int *src_mem, int size)
{
    struct rt_dma_transfer *m2m_xfer;
    rt_err_t ret;
    int i, len;
    uint32_t tick_s, tick_e; /* ms */

    m2m_xfer = (struct rt_dma_transfer *)rt_malloc(sizeof(*m2m_xfer));
    mem_sem = rt_sem_create("memSem", 0, RT_IPC_FLAG_FIFO);
    RT_ASSERT(m2m_xfer != RT_NULL);
    RT_ASSERT(src_mem != RT_NULL);
    RT_ASSERT(dst_mem != RT_NULL);
    RT_ASSERT(mem_sem != RT_NULL);

    /*
     * Note: this helper overwrites the source buffer. The ramp written by
     * the loop is itself replaced by the 0x06 fill two lines further down.
     */
    len = size / sizeof(int);
    for (i = 0; i < len; i++)
        src_mem[i] = len - i;
    rt_memset(dst_mem, 0x0, size);
    rt_memset(src_mem, 0x6, size);
    /* Clean and invalidate both ranges before the DMA engine touches them. */
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)src_mem, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)dst_mem, size);

    dma = rt_device_find("dmac0");
    RT_ASSERT(dma != RT_NULL);

    /* memcpy test */
    rt_memset(m2m_xfer, 0x0, sizeof(*m2m_xfer));
    m2m_xfer->direction = RT_DMA_MEM_TO_MEM;
    m2m_xfer->dst_addr = (rt_uint32_t)dst_mem;
    m2m_xfer->src_addr = (rt_uint32_t)src_mem;
    m2m_xfer->len = size;
    m2m_xfer->callback = m2m_complete;
    m2m_xfer->cparam = m2m_xfer;
    ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_REQUEST_CHANNEL, m2m_xfer);
    RT_ASSERT(ret == RT_EOK);

    tick_s = HAL_GetTick();

    for (i = 0; i < COUNTS; i++)
    {
        /* dma copy start */
        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_SINGLE_PREPARE, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);
        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_START, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);

        /* wait for complete */
        ret = rt_sem_take(mem_sem, RT_WAITING_FOREVER);
        RT_ASSERT(ret == RT_EOK);

        ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_STOP, m2m_xfer);
        RT_ASSERT(ret == RT_EOK);
    }

    tick_e = HAL_GetTick();

    ret = rt_memcmp(src_mem, dst_mem, size);

    rt_kprintf("dma memcpy [%s]: avg: %7u MB/S with src: 0x%x dst: 0x%x len: %u counts: %u\n",
               ret ? "FAIL" : "PASS", size * COUNTS / (tick_e - tick_s) / 1000,
               src_mem, dst_mem, size, COUNTS);

    ret = rt_device_control(dma, RT_DEVICE_CTRL_DMA_RELEASE_CHANNEL, m2m_xfer);
    RT_ASSERT(ret == RT_EOK);

    /* Same copy done by the CPU, as a baseline for the DMA figure above. */
    tick_s = HAL_GetTick();
    for (i = 0; i < COUNTS; i++)
        rt_memcpy(dst_mem, src_mem, size);
    tick_e = HAL_GetTick();

    ret = rt_memcmp(dst_mem, src_mem, size);

    rt_kprintf("cpu memcpy [%s]: avg: %7u MB/S with src: 0x%x dst: 0x%x len: %u counts: %u\n",
               ret ? "FAIL" : "PASS", size * COUNTS / (tick_e - tick_s) / 1000,
               src_mem, dst_mem, size, COUNTS);

    rt_sem_delete(mem_sem);
    rt_free(m2m_xfer);
}

static int dma_test_psram(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    rt_kprintf("dma test1: copy from ddr to psram\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    dst = (int *)map->phys;

    dma_memcpy(dst, src, size);
    rt_dma_free(src);

    rt_kprintf("dma test2: copy from psram to ddr\n");
    src = (int *)map->phys;

    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    dma_memcpy(dst, src, size);
    rt_dma_free(dst);

    rt_kprintf("dma test3: copy from psram to psram\n");
    src = (int *)map->phys;
    dst = (int *)(map->phys + size);

    dma_memcpy(dst, src, size);
    rt_kprintf("DMA test done\n");

    return 0;
}

void psram_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i, j;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t test_cap;
    uint32_t read, write;
    uint32_t test_data[] = {0x5aa5f00f, 0x0, 0xffffffff, 0x3cc3d22d};

    rt_kprintf("start cs%d simple test\n", cs);

    for (j = 0; j < 4; j++)
    {
        test_cap = map->size;
        rt_kprintf("write, test_cap = 0x%x\n", test_cap);
        rt_kprintf(" write\n");
        for (i = 0; i < test_cap; i += 4)
        {
            ops->write(dsmc_host, cs, 0, i, test_data[j]);
        }
        rt_kprintf("read\n");
        test_cap = map->size;
        rt_kprintf(" read\n");
        for (i = 0; i < test_cap; i += 4)
        {
            ops->read(dsmc_host, cs, 0, i, &read);
            if (read != test_data[j])
            {
                rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                           i, read, test_data[j], test_data[j] ^ read);
            }
        }
    }

    rt_kprintf("phase 1: write address to data\n");

    test_cap = map->size;
    for (i = 0; i < test_cap; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, map->phys + i);
    }
    for (i = 0; i < test_cap; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
        write = map->phys + i;
        if (read != write)
        {
            rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                       i, read, write, write ^ read);
        }
    }

    rt_kprintf("phase 2: write 0x0f0f5aa5 + address\n");
    test_cap = map->size;
    for (i = 0; i < test_cap; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, map->phys + i + 0x0f0f5aa5);
    }

    test_cap = map->size;
    for (i = 0; i < test_cap; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
        write = map->phys + i + 0x0f0f5aa5;
        if (read != write)
        {
            rt_kprintf("addr offset 0x%x: read = 0x%x, wr = 0x%x, error(0x%x)\n",
                       i, read, write, write ^ read);
        }
    }

    rt_kprintf("phase 3: dma copy by software\n");
    dma_test_psram(dsmc_host, cs);

    rt_kprintf("cs%d simple test done.\n", cs);
}

static int plc_simple_test(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    int *src = NULL, *dst = NULL;
    uint32_t size = DMA_SIZE;
    int ret = 0;
    uint32_t timeout = 1000000;

    rt_kprintf("phase 1: dma memcopy from host DDR to slave memory\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    /* init src address */
    rt_memset(src, 0x66, size);

    HAL_DCACHE_CleanInvalidateByRange((uint32_t)src, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, size);

    timeout = 10000;
    if (!ops->copy_to(dsmc_host, cs, 0, (uint32_t)src, 0x0, size))
    {
        /* poll once per millisecond until copy_to_state reads zero */
        while (ops->copy_to_state(dsmc_host))
        {
            rt_thread_mdelay(1);
            timeout--;
            if (!timeout)
            {
                rt_kprintf("DSMC: wait copy to complete timeout\n");
                ret = -1;
                goto err;
            }
        }
        /* report the comparison result instead of discarding it */
        if (rt_memcmp(src, (int *)map->phys, size))
            rt_kprintf("phase 1: data mismatch\n");
    }
    else
    {
        rt_kprintf("err: phase 1: test skip!!!\n");
    }

    rt_kprintf("phase 2: dma memcopy from slave memory to host DDR\n");
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    rt_memset(dst, 0x88, size);
    /* init plc src address */
    for (i = 0; i < size; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, 0x10100000 + i);
    }
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)dst, size);
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, size);

    timeout = 10000;
    if (!ops->copy_from(dsmc_host, cs, 0x0, 0, (uint32_t)dst, size))
    {
        /* poll once per millisecond until copy_from_state reads zero */
        while (ops->copy_from_state(dsmc_host))
        {
            rt_thread_mdelay(1);
            timeout--;
            if (!timeout)
            {
                rt_kprintf("DSMC: wait copy from complete timeout\n");
                ret = -1;
                goto err;
            }
        }
        /* report the comparison result instead of discarding it */
        if (rt_memcmp((int *)map->phys, dst, size))
            rt_kprintf("phase 2: data mismatch\n");
    }
    else
    {
        rt_kprintf("err: phase 2: test skip!!!\n");
    }

    rt_dma_free(src);
    rt_dma_free(dst);

    rt_kprintf("plc_simple_test test done\n");
    return 0;

err:
    rt_kprintf("plc_simple_test test error!\n");
    return ret;
}

void dsmc_slave_latency_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    uint32_t read, counter;
    uint32_t kt[2], diff;

    rt_kprintf("cpu access dsmc latency test\n");

    counter = 1000000;

    kt[0] = HAL_GetTick();
    for (i = 0; i < counter; i++)
    {
        ops->read(dsmc_host, cs, 3, 0x100, &read);
    }
    kt[1] = HAL_GetTick();
    diff = kt[1] - kt[0];
    /* 64-bit intermediate: diff * 1000000 overflows 32 bits above ~4294 ms */
    rt_kprintf("counter %u, cost %ums, read latency %uns\n",
               counter, diff, (uint32_t)((uint64_t)diff * 1000000 / counter));
    rt_kprintf("read = 0x%x\n", read);

    kt[0] = HAL_GetTick();
    for (i = 0; i < counter; i++)
    {
        ops->write(dsmc_host, cs, 3, 0x100, 0x6);
    }
    kt[1] = HAL_GetTick();
    diff = kt[1] - kt[0];
    rt_kprintf("counter %u, cost %ums, write latency %uns\n",
               counter, diff, (uint32_t)((uint64_t)diff * 1000000 / counter));
}

void dsmc_slave_speed_cpu(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    uint32_t i;
    struct dsmc_host_ops *ops = dsmc_host->ops;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t kt[2], rd_diff, wr_diff;
    uint32_t read;

    rt_kprintf("cpu access dsmc speed\n");

    kt[0] = HAL_GetTick();
    for (i = 0; i < map->size; i += 4)
    {
        ops->write(dsmc_host, cs, 0, i, 0xffffffff);
    }
    HAL_DCACHE_CleanInvalidateByRange((uint32_t)map->phys, map->size);
    kt[1] = HAL_GetTick();
    wr_diff = kt[1] - kt[0];
    rt_kprintf("cpu write %uMB, cost %ums, %uMB/s\n",
               (uint32_t)map->size / 1024 / 1024,
               wr_diff,
               map->size / wr_diff / 1000);

    kt[0] = HAL_GetTick();
    for (i = 0; i < map->size; i += 4)
    {
        ops->read(dsmc_host, cs, 0, i, &read);
    }
    kt[1] = HAL_GetTick();
    rd_diff = kt[1] - kt[0];

    rt_kprintf("cpu read %uMB, cost %ums, %uMB/s\n",
               (uint32_t)map->size / 1024 / 1024,
               rd_diff,
               map->size / rd_diff / 1000);
}

static int dsmc_slave_speed_dma(struct rockchip_rt_dsmc_host *dsmc_host, uint32_t cs)
{
    int *src = NULL, *dst = NULL;
    struct DSMC_MAP *map = &dsmc_host->host->dsmcHostDev.ChipSelMap[cs].regionMap[0];
    uint32_t size = DMA_SIZE;

    rt_kprintf("dma access dsmc speed\n");

    rt_kprintf("DMA write DSMC\n");
    src = rt_dma_malloc(size);
    RT_ASSERT(src != RT_NULL);

    dst = (int *)map->phys;

    dma_memcpy(dst, src, size);

    rt_dma_free(src);

    rt_kprintf("DMA read DSMC\n");
    src = (int *)map->phys;
    dst = rt_dma_malloc(size);
    RT_ASSERT(dst != RT_NULL);

    dma_memcpy(dst, src, size);

    rt_dma_free(dst);

    return 0;
}

void dsmc_test(int argc, char **argv)
{
    uint32_t i;
    rt_device_t dev;

    struct rockchip_rt_dsmc_host *dsmc_host;

    rt_kprintf("this is dsmc_test\n");

    dev = rt_device_find("dsmc_host");
    if (dev == RT_NULL)
    {
        rt_kprintf("dsmc_host not found\n");
        return;
    }
    dsmc_host = (struct rockchip_rt_dsmc_host *)dev;

    for (i = 0; i < DSMC_MAX_SLAVE_NUM; i++)
    {
        if (dsmc_host->host->dsmcHostDev.ChipSelCfg[i].deviceType == DEV_UNKNOWN)
            continue;
        psram_simple_test(dsmc_host, i);
        if (dsmc_host->host->dsmcHostDev.ChipSelCfg[i].deviceType == DEV_DSMC_LB)
        {
            plc_simple_test(dsmc_host, i);
            dsmc_slave_latency_cpu(dsmc_host, i);
        }
        dsmc_slave_speed_cpu(dsmc_host, i);
        dsmc_slave_speed_dma(dsmc_host, i);
    }
}

#ifdef RT_USING_FINSH
#include <finsh.h>
MSH_CMD_EXPORT(dsmc_test, dsmc tester);
#endif
#endif
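
The speed figures printed by `dma_memcpy` and `dsmc_slave_speed_cpu` in the listing above are obtained by dividing a byte count by the elapsed milliseconds and then by 1000, which yields decimal MB/s. The sketch below shows that arithmetic on its own, with a 64-bit byte count and a guard for a zero elapsed time. It assumes the tick counts milliseconds, as the `/* ms */` comment in the listing indicates; `throughput_mb_per_s` is a hypothetical helper, not part of the SDK.

```c
#include <stdint.h>

/*
 * Decimal MB/s from a byte count and an elapsed time in milliseconds.
 * bytes / ms equals kB/s (decimal), so one more division by 1000 gives MB/s.
 * Returns 0 when elapsed_ms is 0, because no rate can be derived.
 */
static uint32_t throughput_mb_per_s(uint64_t total_bytes, uint32_t elapsed_ms)
{
    if (elapsed_ms == 0)
        return 0;
    return (uint32_t)(total_bytes / elapsed_ms / 1000);
}
```

With the values used in the listing, `DMA_SIZE` (0x10000) times `COUNTS` (100) is 6,553,600 bytes. If that took 10 ms, `throughput_mb_per_s(6553600, 10)` would return 655.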