I have an Intel Atom J1900 board with 2 GB of memory and a custom FPGA PCIe device that I developed. The `lspci` output for the FPGA PCIe device is shown below:
```
root@GNS:~# lspci -s 03:00.0 -vvv
03:00.0 Non-VGA unclassified device: Analog Devices Device 1536
    Subsystem: Analog Devices Device 0007
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 95
    Region 0: Memory at d0b00000 (32-bit, non-prefetchable) [size=1M]
    Region 1: Memory at d0a00000 (32-bit, non-prefetchable) [size=1M]
    Region 2: Memory at d0900000 (32-bit, non-prefetchable) [size=1M]
    Region 3: Memory at d0800000 (32-bit, non-prefetchable) [size=1M]
    Region 4: Memory at d0700000 (32-bit, non-prefetchable) [size=1M]
    Region 5: Memory at d0600000 (32-bit, non-prefetchable) [size=1M]
    Capabilities: [50] MSI: Enable+ Count=1/4 Maskable- 64bit+
        Address: 00000000fee0100c Data: 4172
    Capabilities: [78] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [80] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM not supported, Exit Latency L0s <4us, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [200 v1] Vendor Specific Information: ID=1172 Rev=0 Len=044 <?>
    Kernel driver in use: MyDriver_PCIe_i686
```
When I compiled a kernel without PAE support, the BAR I/O access performance of this device dropped significantly. Here are the test results under the non-PAE and PAE kernels. My test OS environment is Debian 9 (stretch).

Kernel Version | Length (bytes) | Write Throughput | Read Throughput | Write Time | Read Time |
---|---|---|---|---|---|
4.9.0-13-686 | 2048 | 96.45 Mib/s | 16.26 Mib/s | 162 us | 961 us |
4.9.0-13-686 | 1024 | 65.10 Mib/s | 15.20 Mib/s | 120 us | 514 us |
4.9.0-13-686 | 512 | 38.30 Mib/s | 13.38 Mib/s | 102 us | 292 us |
4.9.0-13-686 | 256 | 21.23 Mib/s | 10.56 Mib/s | 92 us | 185 us |
4.9.0-13-686 | 128 | 10.97 Mib/s | 7.57 Mib/s | 89 us | 129 us |
4.9.0-13-686 | 64 | 5.55 Mib/s | 4.70 Mib/s | 88 us | 104 us |
4.9.0-13-686 | 32 | 2.74 Mib/s | 2.81 Mib/s | 89 us | 87 us |
4.9.0-13-686 | 8 | 0.69 Mib/s | 0.78 Mib/s | 88 us | 78 us |
4.9.0-13-686 | 4 | 0.35 Mib/s | 0.40 Mib/s | 87 us | 77 us |
4.9.0-13-686-pae | 2048 | 181.69 Mib/s | 17.48 Mib/s | 86 us | 894 us |
4.9.0-13-686-pae | 1024 | 166.22 Mib/s | 17.32 Mib/s | 47 us | 451 us |
4.9.0-13-686-pae | 512 | 139.51 Mib/s | 17.13 Mib/s | 28 us | 228 us |
4.9.0-13-686-pae | 256 | 108.51 Mib/s | 16.41 Mib/s | 18 us | 119 us |
4.9.0-13-686-pae | 128 | 61.04 Mib/s | 15.50 Mib/s | 16 us | 63 us |
4.9.0-13-686-pae | 64 | 37.56 Mib/s | 13.95 Mib/s | 13 us | 35 us |
4.9.0-13-686-pae | 32 | 18.78 Mib/s | 11.10 Mib/s | 13 us | 22 us |
4.9.0-13-686-pae | 8 | 4.36 Mib/s | 5.09 Mib/s | 14 us | 12 us |
4.9.0-13-686-pae | 4 | 2.35 Mib/s | 3.39 Mib/s | 13 us | 9 us |
4.9.0-18-rt-686-pae | 2048 | 171.70 Mib/s | 17.42 Mib/s | 91 us | 897 us |
4.9.0-18-rt-686-pae | 1024 | 159.44 Mib/s | 17.17 Mib/s | 49 us | 455 us |
4.9.0-18-rt-686-pae | 512 | 126.01 Mib/s | 16.84 Mib/s | 31 us | 232 us |
4.9.0-18-rt-686-pae | 256 | 93.01 Mib/s | 15.88 Mib/s | 21 us | 123 us |
4.9.0-18-rt-686-pae | 128 | 54.25 Mib/s | 15.02 Mib/s | 18 us | 65 us |
4.9.0-18-rt-686-pae | 64 | 30.52 Mib/s | 13.20 Mib/s | 16 us | 37 us |
4.9.0-18-rt-686-pae | 32 | 15.26 Mib/s | 9.77 Mib/s | 16 us | 25 us |
4.9.0-18-rt-686-pae | 8 | 4.70 Mib/s | 4.07 Mib/s | 13 us | 15 us |
4.9.0-18-rt-686-pae | 4 | 2.35 Mib/s | 2.35 Mib/s | 13 us | 13 us |
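
To make the numbers easier to interpret, here is a small sketch of the arithmetic behind the table. It assumes the throughput column is in mebibits per second (2^20 bits/s), which matches the listed values, and it fits a straight line through the smallest and largest write transfers to separate a fixed per-call cost from a per-byte cost. The function names `throughput_mibit_s` and `two_point_cost` are made up for this illustration and are not part of my driver or test tool.

```python
def throughput_mibit_s(num_bytes, elapsed_us):
    """Throughput in Mibit/s (2**20 bits per second) for one transfer."""
    bits = num_bytes * 8
    seconds = elapsed_us * 1e-6
    return bits / seconds / 2**20


def two_point_cost(small, large):
    """Split transfer time into a fixed part and a per-byte part.

    small and large are (num_bytes, elapsed_us) pairs; a straight line
    through them gives (fixed_us, us_per_byte).
    """
    (n0, t0), (n1, t1) = small, large
    us_per_byte = (t1 - t0) / (n1 - n0)
    fixed_us = t0 - us_per_byte * n0
    return fixed_us, us_per_byte


# Write times copied from the table: (bytes, microseconds).
non_pae = two_point_cost((4, 87), (2048, 162))   # 4.9.0-13-686
pae = two_point_cost((4, 13), (2048, 86))        # 4.9.0-13-686-pae

print(f"2048 B in 162 us -> {throughput_mibit_s(2048, 162):.2f} Mibit/s")
print(f"non-PAE write: fixed {non_pae[0]:.1f} us, {non_pae[1] * 1000:.1f} ns/byte")
print(f"PAE write:     fixed {pae[0]:.1f} us, {pae[1] * 1000:.1f} ns/byte")
```

Under this rough two-point fit, the per-byte write cost comes out nearly the same on both kernels (about 36 to 37 ns per byte), while the fixed cost per call is roughly 87 us on the non-PAE kernel versus about 13 us on the PAE kernel. So the slowdown looks like a constant overhead per access rather than a slower transfer rate, although a two-point fit is only a crude estimate.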