I'm writing a Linux device driver to DMA data from an FPGA into CPU RAM via PCI Express. The system runs 64-bit CentOS 8.1, kernel 4.18.0-147.3.1, on an Intel platform.
The implementation follows the DMA-API-HOWTO. The DMA is 32-bit and the driver uses a consistent mapping for the DMA descriptor ring. Accordingly, I set the DMA mask to inform the kernel about the device's DMA addressing capabilities.
pciRet = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
if (pciRet < 0) {
    printk(KERN_ERR "dma_set_mask_and_coherent returned: %d\n", pciRet);
    return -EIO;
}
The FPGA and the driver are designed to execute the transactions of a ring of 240 buffer descriptors until stopped by the user-space program. The buffer descriptor ring and the actual buffers are allocated by means of dma_alloc_coherent, mapped when transaction initialization is triggered by the user-space program, and unmapped at the end.
#define DMA_BD_CNT 240
#define DMA_BD_WORD_SIZE 16
#define BUFFER_SIZE 65536
bdsize = sizeof(u32) * DMA_BD_WORD_SIZE * DMA_BD_CNT;
// Linked list buffer descriptor list
BdAddr = dma_alloc_coherent(&pdev->dev, bdsize, &BdDmaAddr, GFP_KERNEL);
if (!BdAddr) {
    printk(KERN_ERR "%s: failed to allocate coherent buffer\n", DRIVER_NAME);
    err = -EIO;
    goto err1;
}
// actual buffers of size 64kB
for (i = 0; i < DMA_BD_CNT; i++) {
    RxData[i] = dma_alloc_coherent(&pdev->dev, BUFFER_SIZE, (dma_addr_t *)&RxDmaHandle[i], GFP_KERNEL);
    if (!RxData[i]) {
        printk(KERN_ERR "rx page allocation failure\n");
        err = -ENOMEM;
        goto err2;
    }
}
I'm facing two situations:
Our application works fine when SWIOTLB is enabled. This prevents the use of the hardware VT-d IOMMU, and it is the default on Intel machines.
intel_iommu=off iommu=soft
However, it does not work when the Intel VT-d IOMMU is enabled
intel_iommu=on
The first time I run the transaction everything goes fine, but from the next DMA start onward the buffer initialization fails because dma_alloc_coherent constantly gets PTE errors (from DMAR:), with the following message for each of the 16 pages it tries to map for each 64k buffer:
[ 164.470945] DMAR: ERROR: DMA PTE for vPFN 0xfef90 already set (to 1050b3f003 not 1057e1f003)
[ 164.470950] WARNING: CPU: 1 PID: 14017 at drivers/iommu/intel-iommu.c:2321 __domain_mapping.cold.88+0x46/0x4d
[ 164.470950] Modules linked in: dma_ch_dev(OE) 8021q garp mrp stp llc dell_rbu dcdbas dell_smbios dell_wmi_descriptor wmi_bmof intel_rapl skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf wdat_wdt pcspkr ipmi_ssif sg i2c_i801 lpc_ich ftdi_sio mei_me mei wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter ip_tables ext4 mbcache jbd2 sd_mod mgag200 mlx5_core i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci libahci mlxfw drm tg3 libata megaraid_sas dm_mirror dm_region_hash dm_log dm_mod sctp libcrc32c crc32c_intel
[ 164.470963] CPU: 1 PID: 14017 Comm: dma-app Kdump: loaded Tainted: G W OE --------- - - 4.18.0-147.3.1.el8_1.x86_64 #1
[ 164.470964] Hardware name: /01YM03, BIOS 2.2.11 06/13/2019
[ 164.470965] RIP: 0010:__domain_mapping.cold.88+0x46/0x4d
[ 164.470965] Code: 4c 24 08 e8 88 b3 bb ff 8b 05 54 48 de 00 4c 8b 4c 24 08 4c 8b 54 24 10 4c 8b 44 24 18 85 c0 74 09 83 e8 01 89 05 38 48 de 00 <0f> 0b e9 e6 ce ff ff 89 da 4c 89 c1 48 c7 c6 70 c8 4a 87 48 c7 c7
[ 164.470966] RSP: 0018:ffffb6f620bcf538 EFLAGS: 00010246
[ 164.470966] RAX: 0000000000000000 RBX: 0000001057e1f003 RCX: 0000000000000006
[ 164.470967] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff985ddf216a00
[ 164.470967] RBP: ffff985dddd1dcf8 R08: 0000000000000001 R09: ffff985dddd1dc80
[ 164.470968] R10: 0000000001057e1f R11: 00000000000ee800 R12: 0000000000000001
[ 164.470968] R13: 0000000000000000 R14: 00000000000fef9f R15: ffff985dd993ee00
[ 164.470969] FS: 00007efd1ca2a740(0000) GS:ffff985ddf200000(0000) knlGS:0000000000000000
[ 164.470969] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 164.470969] CR2: 00007efd1ac035f0 CR3: 0000000844df4003 CR4: 00000000007606e0
[ 164.470970] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 164.470970] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 164.470970] PKRU: 55555554
[ 164.470971] Call Trace:
[ 164.470972] domain_mapping+0x1b/0xe0
[ 164.470973] __intel_map_page+0xf1/0x140
[ 164.470975] intel_alloc_coherent+0x96/0x120
[ 164.470976] descriptor_init+0x4b/0xe0 [dma_ch_dev]
[ 164.470978] dma_register+0x147/0x210 [dma_ch_dev]
[ 164.470983] ? 0xffffffffc0836000
[ 164.470985] dma_start.cold.25+0x77/0x28d [dma_ch_dev]
[ 164.470985] ? 0xffffffffc0836000
[ 164.470995] ? get_page_from_freelist+0xd87/0x1210
[ 164.470997] ? get_page_from_freelist+0xd87/0x1210
[ 164.470999] ? mem_cgroup_commit_charge+0x7a/0x560
[ 164.471000] ? mem_cgroup_try_charge+0x8b/0x1a0
[ 164.471001] ? mem_cgroup_throttle_swaprate+0x17/0x10e
[ 164.471003] ? do_anonymous_page+0x1d2/0x370
[ 164.471004] ? __handle_mm_fault+0x66e/0x6b0
[ 164.471006] ? lookup_fast+0xc8/0x2f0
[ 164.471009] ? update_load_avg+0x87/0x590
[ 164.471011] ? account_entity_enqueue+0xc5/0xf0
[ 164.471011] ? enqueue_entity+0xf6/0x630
[ 164.471013] ? legitimize_path.isra.44+0x2d/0x60
[ 164.471015] ? enqueue_task_fair+0x7d/0x3e0
[ 164.471016] ? select_idle_sibling+0x22/0x3d0
[ 164.471017] ? check_preempt_curr+0x7a/0x90
[ 164.471017] ? ttwu_do_wakeup+0x19/0x130
[ 164.471019] ? try_to_wake_up+0x54/0x4b0
[ 164.471019] ? filename_lookup.part.64+0xe0/0x170
[ 164.471021] ? tty_insert_flip_string_fixed_flag+0x85/0xe0
[ 164.471023] ? pty_write+0x78/0x90
[ 164.471025] ? __wake_up_common_lock+0x89/0xc0
[ 164.471026] do_vfs_ioctl+0xa4/0x630
[ 164.471028] ? syscall_trace_enter+0x1d3/0x2c0
[ 164.471029] ksys_ioctl+0x60/0x90
[ 164.471030] __x64_sys_ioctl+0x16/0x20
[ 164.471031] do_syscall_64+0x5b/0x1b0
[ 164.471032] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 164.471033] RIP: 0033:0x7efd1ac9ceab
[ 164.471034] Code: 0f 1e fa 48 8b 05 dd 9f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ad 9f 2c 00 f7 d8 64 89 01 48
[ 164.471034] RSP: 002b:00007fff47df73a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 164.471035] RAX: ffffffffffffffda RBX: 00000000025b3ca0 RCX: 00007efd1ac9ceab
[ 164.471035] RDX: 0000000000000000 RSI: 0000000000005302 RDI: 0000000000000003
[ 164.471035] RBP: 0000000000000000 R08: 000000000000000a R09: 0000000000000001
[ 164.471036] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 164.471036] R13: 0000000000000498 R14: 00000000025b3a60 R15: 0000000000000002
[ 164.471037] ---[ end trace d8ffc4ec65fb8ee9 ]---
In the second run, it looks like intel_alloc_coherent tries to reuse some of the already allocated buffers.
The dma_alloc_coherent call for the buffer descriptor ring does not give any warning.
The buffers are correctly freed at the end of every run by means of dma_free_coherent.
Are there specific strategies to handle 32-bit DMA with Intel IOMMU enabled?
Thank you for any help you can provide.