6.18: PCIe GPU testing (AMD and Intel Xe) #7113
base: rpi-6.18.y
Also just to keep the breadcrumbs going:
d9f2960 to 4a4ea03
2ef2bea to b17e55e
Branch rebased to keep it relevant. I've just noticed https://lore.kernel.org/dri-devel/[email protected]/. It doesn't appear to have been merged and the thread seems to have stalled, so it would be useful to test it at some point.
e380363 to d003d86
This patch adds the device ID for the BCM4343A2 module, found e.g. in the Infineon (Cypress) CYW43439 chip. The required firmware file is named 'BCM4343A2.hcd'. Signed-off-by: Phil Elwell <[email protected]>
i2c_mux_add_adapter takes a force_nr parameter that allows an explicit bus number to be associated with a channel. However, only i2c-mux-reg and i2c-mux-gpio make use of it. To help with situations where it is desirable to have a fixed, known base address for the channels of a mux, create a "base-nr" property. When force_nr is 0 and base-nr is set and non-zero, form a force_nr value from the sum of base-nr and the channel ID. Signed-off-by: Phil Elwell <[email protected]>
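The numbering rule described in this commit message can be sketched as follows. This is an illustrative Python model rather than the kernel code; the function name `resolve_force_nr` and its parameter names are hypothetical, and a result of 0 is taken to mean "let the I2C core pick a bus number dynamically".

```python
def resolve_force_nr(force_nr: int, base_nr: int, chan_id: int) -> int:
    """Model of the bus-number choice for one mux channel (0 = dynamic)."""
    # An explicit force_nr supplied by the mux driver always wins.
    if force_nr != 0:
        return force_nr
    # Otherwise a non-zero base-nr gives every channel a fixed, known number.
    if base_nr != 0:
        return base_nr + chan_id
    # Neither is set: fall back to dynamic allocation.
    return 0


# Hypothetical mux with base-nr = 10: channel 3 would become bus 13.
print(resolve_force_nr(force_nr=0, base_nr=10, chan_id=3))
```

With `base_nr` left at 0 the function returns 0 for every channel, which matches the behaviour before the property was introduced.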
In deep sleep mode (DS1) the ARM core is off. Once the exit trigger arrives, a mailbox interrupt is raised to the host, and the ARM must be fully reinitialised before TX/RX can start. Also fix the following issues for DS1 exit:
1. Send the Tx Control frame only after the firmware redownload is complete (check F2 Ready before sending the Tx Control frame to the firmware).
2. Intermittent high DS1 TX exit latency (almost 3 seconds). This is fixed by skipping the host mailbox interrupt multiple times (ULP state mechanism).
3. RX glom save/restore in firmware.
4. Add ULP event enable and event_msgs_ext iovar configuration in FMAC.
5. Add ULP_EVENT_RECV state machine for sbwad support.
6. Support 2-byte shared memory read for the DS1 exit HUDI implementation.
Signed-off-by: Praveen Babu C <[email protected]> Signed-off-by: Naveen Gupta <[email protected]> [Merge from 4.14.77 to 5.4.18; set BRCMF_SDIO_MAX_ACCESS_ERRORS to 20] Signed-off-by: Chi-hsien Lin <[email protected]> JIRA: SWWLAN-135583 JIRA: SWWLAN-136577
1. If the firmware supports 4-way handshake offload but not DPP 4-way offload, then when the user first connects to an encrypted network the driver sets "sup_wpa 1" in the firmware. This later causes DPP connections to fail, because the firmware won't send EAPOL frames to the host.
2. Fix handling of action frames in DPP AP mode.
3. For some firmware without fwsup support, the join procedure is skipped because the "sup_wpa" iovar returns not-supported. Check the fwsup feature before issuing that iovar.
Signed-off-by: Kurt Lee <[email protected]> Signed-off-by: Double Lo <[email protected]> Signed-off-by: Chi-hsien Lin <[email protected]>
Commit 7d239fb broke 802.1X authentication by setting profile->use_fwsup = NONE whenever PSK is not used. However 802.1X does not use PSK and requires profile->use_fwsup set to 1X, or brcmf_cfg80211_set_pmk() fails. Fix this by checking that profile->use_fwsup is not already set to 1X and avoid setting it to NONE in that case. Fixes: 7d239fb (brcmfmac: Fix interoperating DPP and other encryption network access) Fixes: raspberrypi#5964
Application class A2 cards require CQ to be enabled to realise their stated performance figures. Add support to enable/disable card CQ via the Performance Enhancement extension register, and cater for the slight differences in command set versus eMMC. Signed-off-by: Jonathan Bell <[email protected]>
The Performance Extension register is regularly accessed in a hot path to do write cache flushes. Don't invoke kmalloc/kfree for every access, preallocate a 512B buffer for this purpose. Also remove an unused alloc in sd_enable_cache(). Signed-off-by: Jonathan Bell <[email protected]>
Add a LED_FULL trigger equivalent to mmc_start_request() in mmc_cqe_start_req(), otherwise it stays off forever. Signed-off-by: Jonathan Bell <[email protected]>
For unknown reasons the controller seems to reset the idle polling timer interval to 8 clocks on CQE enable/disable, which is extremely short. Just use the reset value in the eMMC spec (4096 clock periods, which at 200MHz is ~20µs). Signed-off-by: Jonathan Bell <[email protected]>
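As a quick check of the figures quoted in this commit message, the polling interval is just the number of clock periods divided by the clock frequency. The helper name below is made up for illustration.

```python
def idle_poll_interval_us(clock_periods: int, clock_hz: float) -> float:
    """Convert an idle polling interval from clock periods to microseconds."""
    return clock_periods / clock_hz * 1e6


# Spec reset value (4096 periods) versus the 8 periods seen after
# CQE enable/disable, both at a 200 MHz clock.
print(idle_poll_interval_us(4096, 200e6))
print(idle_poll_interval_us(8, 200e6))
```

The first value comes out at roughly 20.5 microseconds and the second at roughly 40 nanoseconds, which is why the 8-clock interval is described as extremely short.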
The eMMC spec says that in certain circumstances the controller can't respond to a halt request - in practice, this occurs if a CMD timeout happens (card went away/crashed). Clear the halt request by writing 0 to CQHCI_CTL. Also fix a logic error testing for halt in cqhci_request. Signed-off-by: Jonathan Bell <[email protected]>
Certain status bits in these registers may need polling outside of SD-specific code. Export in sd_ops.h Signed-off-by: Jonathan Bell <[email protected]>
Don't attempt to turn on CQ if the other mandatory features are not indicated as supported by the card. Also make sure that the register write actually stuck, as some cards claim support but never report back that the queue engine is enabled. Signed-off-by: Jonathan Bell <[email protected]>
Also report the card's supported queue depth in the message log. Signed-off-by: Jonathan Bell <[email protected]>
The spec allows for up to two 512-byte pages to be allocated for the Extension Register General Info block, so allocate accordingly. Signed-off-by: Jonathan Bell <[email protected]>
The attached PHY performs parameter validation, so the switch from HS200 to HS (before selecting HS400/HS400es) with a 200MHz clock fails to update pad timings and results in CRC errors from the card. Underclocking the interface is safe, so do that in the downgrade callback. Signed-off-by: Jonathan Bell <[email protected]>
This gains about 8-12% sequential write speed with the fastest SD/eMMC cards, and Class A1/A2 card sequential performance is only assured with a 4MiB write length. Signed-off-by: Jonathan Bell <[email protected]>
If the controller is being reset, then the CQE needs to be reset as well. For removable cards, CQHCI_SSC1 must specify a polling mode (CBC=0) otherwise it's possible that the controller stops emitting periodic CMD13s on card removal, without raising an error status interrupt. Signed-off-by: Jonathan Bell <[email protected]>
Recovery claims the MMC card so the card-detect work gets significantly delayed - leading to lots of error recovery loops that can never do anything but fail. Explicitly detect the card after CQE has halted and bail if it's not there. Also ratelimit a not-very-descriptive warning - one occurrence in dmesg is enough to signal that something is amiss. Signed-off-by: Jonathan Bell <[email protected]>
Command Queueing requires Write Cache and Power off Notification support from the card - but using the write cache forms a contract with the host whereby the card expects to be told about impending power-down. The implication is that (for performance) the card can do unsafe things with pending write data - including reordering what gets committed to nonvolatile storage at what time. Exposed SD slots and platforms powered by hotpluggable means (e.g. Raspberry Pis) can't guarantee that surprise removal won't happen. To limit the scope for cards to invent new ways to trash filesystems, limit pending writes to 1 (equivalent to the non-CQ behaviour). Signed-off-by: Jonathan Bell <[email protected]>
fixup: mmc: restrict posted write counts for SD cards in CQ mode
Leaving card->max_posted_writes uninitialised was a bad thing to do. Also, cqe_enable is 1 if hsq is enabled, as hsq substitutes the cqhci implementation with its own. Signed-off-by: Jonathan Bell <[email protected]>
Posted write tracking introduced in the commit below raced with re-use of the requests between completion and submission, potentially causing underflow of the pending write count. Fixes: e6c1e86 ("mmc: restrict posted write counts for SD cards in CQ mode") Signed-off-by: Jonathan Bell <[email protected]>
Drop from RGB to YUV422 output if RGB couldn't be supported within the defined max_bpc and TMDS rates, and then try dropping max_bpc. Signed-off-by: Dave Stevenson <[email protected]>
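One reading of the fallback order in this commit message is sketched below. It is a Python model with hypothetical names, not the driver code. It assumes that the TMDS character rate for RGB scales with bits per component, that YCbCr 4:2:2 is carried at the 8 bpc rate for depths up to 12 bits, and that `max_bpc` is at most 12 and stepped down in twos.

```python
from typing import Optional, Tuple


def tmds_char_rate(pixel_clock_hz: int, bpc: int, fmt: str) -> int:
    """TMDS character rate needed for a given output format and depth."""
    if fmt == "YUV422":
        # Assumption: 4:2:2 carries up to 12 bits at the 8 bpc rate.
        return pixel_clock_hz
    return pixel_clock_hz * bpc // 8


def pick_output(pixel_clock_hz: int, max_bpc: int,
                max_tmds_hz: int) -> Optional[Tuple[str, int]]:
    """Try RGB first, then YUV422, before dropping to a lower bpc."""
    for bpc in range(max_bpc, 7, -2):
        for fmt in ("RGB", "YUV422"):
            if tmds_char_rate(pixel_clock_hz, bpc, fmt) <= max_tmds_hz:
                return fmt, bpc
    return None  # nothing fits within the TMDS limit


# 594 MHz pixel clock with a 600 MHz TMDS limit: RGB at 12 bpc would need
# 891 MHz, so this model falls back to YUV422 at 12 bpc.
print(pick_output(594_000_000, 12, 600_000_000))
```

A mode whose RGB rate already fits, such as a 148.5 MHz pixel clock under a 340 MHz limit, stays on RGB at the full depth.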
The upstream version has limited functionality. Signed-off-by: Dave Stevenson <[email protected]> Fixup downstream pinctrl-rp1 driver
While we continue to use the downstream RP1 driver, update some other Kconfig settings to recognise MFD_RP1 as a valid RP1 driver. Signed-off-by: Phil Elwell <[email protected]>
The DesignWare AXI DMAC IP can be configured with heterogeneous channel parameters. Allow maximum burst length to be set per-channel by making snps,axi-max-burst-len an array. Signed-off-by: Phil Elwell <[email protected]>
Add a mechanism to allow clients to prefer some DMA channels over others. This is required to allow high-bandwidth clients to request one of the two "heavy" channels, but could also be used to prevent some clients from hogging all channels. Signed-off-by: Phil Elwell <[email protected]>
The patch "dmaengine: dw-axi-dmac: add per-channel AXI burst length
support" programs ARLEN/AWLEN from the snps,axi-max-burst-len array but
still exposed a single max_burst value via dma_get_slave_caps(). As a
result all channels reported 8 even when limited to 4, leading to
warnings:
dma dma2chan5: requested source burst length 8 exceeds supported 4
Add a .device_caps callback to return the correct per-channel max_burst.
This allows drivers like amba-pl011 to clamp burst lengths properly.
Fixes: 0e4e6a0c4f4e ("dmaengine: dw-axi-dmac: add per-channel AXI burst length support")
Signed-off-by: Nicolai Buchwitz <[email protected]>
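The effect of the per-channel callback can be modelled in a few lines. This Python sketch is not the dmaengine API: `get_slave_caps`, `make_per_channel_caps` and the `burst_lens` values are hypothetical stand-ins, chosen so that channel 5 is limited to 4 as in the warning quoted above.

```python
from typing import Callable, Dict, Optional, Sequence

Caps = Dict[str, int]


def get_slave_caps(device_max_burst: int, chan_index: int,
                   device_caps: Optional[Callable[[int, Caps], None]] = None) -> Caps:
    """Start from the device-wide value, then let a per-channel hook refine it."""
    caps = {"max_burst": device_max_burst}
    if device_caps is not None:
        device_caps(chan_index, caps)
    return caps


def make_per_channel_caps(burst_lens: Sequence[int]) -> Callable[[int, Caps], None]:
    """Build a hook that reports each channel's own burst limit."""
    def device_caps(chan_index: int, caps: Caps) -> None:
        caps["max_burst"] = burst_lens[chan_index]
    return device_caps


burst_lens = [8, 8, 8, 8, 8, 4]  # hypothetical per-channel limits
hook = make_per_channel_caps(burst_lens)

# Without the hook channel 5 advertises 8; with it, a client can clamp to 4.
print(get_slave_caps(8, 5)["max_burst"])
print(min(8, get_slave_caps(8, 5, hook)["max_burst"]))
```

A client that clamps its requested burst to the reported `max_burst` no longer asks channel 5 for more than it supports.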
Enable by adding the following to cmdline.txt: `fullscreen_logo_name=logo.tga fullscreen_logo=1` This shows the logo file present in /lib/firmware/ on the screen, fullscreen and rendered early at boot. Any remaining space is filled with a solid color taken from the image border. If the TGA file is too big, the image is clipped accordingly. Signed-off-by: Ben Benson <[email protected]>
Splash Screen: bug fix
Prevents fullscreen logos from being drawn multiple times. With small enough logos, the image would be drawn multiple times across the screen. Signed-off-by: Ben Benson <[email protected]>
fbcon: Add defensive coding to logo loader
There were various points where the loader was using uninitialised data, had the potential to run off the end of an array, or was handling core functions incorrectly. Fix these up. Also handle 24bpp and 32bpp framebuffers. Signed-off-by: Dave Stevenson <[email protected]>
The step-wise governor increases the mitigation level when the temperature goes above a threshold and decreases it when the temperature falls below the threshold. If the temperature hovers around a threshold, the mitigation is applied and removed at every iteration, which is inefficient for performance. Using a hysteresis temperature avoids this ping-pong by relaxing the mitigation only when the temperature goes below the lower hysteresis value. Signed-off-by: Ram Chandrasekar <[email protected]> Signed-off-by: Lina Iyer <[email protected]>
drivers: thermal: step_wise: avoid throttling at hysteresis temperature after dropping below it
Signed-off-by: Serge Schneider <[email protected]>
Fix hysteresis support in gov_step_wise.c
Directly get the hyst value instead of going through an optional and, now, unimplemented function. Signed-off-by: Jürgen Kreileder <[email protected]>
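The ping-pong described in this commit message, and how a hysteresis band suppresses it, can be shown with a small model. This is a simplified Python sketch with hypothetical names. It ignores the temperature trend that the real governor also considers, and it treats temperatures as millidegrees Celsius.

```python
from typing import List


def next_state(state: int, temp: int, trip: int, hyst: int, max_state: int) -> int:
    """One governor step: throttle above trip, relax only below trip - hyst."""
    if temp >= trip:
        return min(state + 1, max_state)
    if temp < trip - hyst:
        return max(state - 1, 0)
    return state  # inside the hysteresis band: hold the current level


def run(temps: List[int], trip: int, hyst: int, max_state: int = 3) -> List[int]:
    """Feed a sequence of readings through the governor and record each level."""
    state, history = 0, []
    for temp in temps:
        state = next_state(state, temp, trip, hyst, max_state)
        history.append(state)
    return history


temps = [81000, 79000, 81000, 79000, 74000]  # hovering around an 80 C trip
print(run(temps, trip=80000, hyst=0))
print(run(temps, trip=80000, hyst=5000))
```

With `hyst=0` the level alternates as `[1, 0, 1, 0, 0]`, while with a 5 degree band it follows `[1, 1, 2, 2, 1]` and only relaxes once the reading drops below 75 degrees. A cooling level of 0 is also held inside the band, so a rise back into the band after a full cool-down does not start throttling.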
Upstream series https://lore.kernel.org/linux-media/[email protected]/ The subdev format documentation has a subsection describing how to use the media bus pixel codes for serial buses. While it describes the sampling part well, it doesn't really describe the current convention used for the components order. Let's improve that. Signed-off-by: Maxime Ripard <[email protected]>
Upstream series https://lore.kernel.org/linux-media/[email protected]/ The tc358743 is an HDMI to MIPI-CSI2 bridge. It can output all three HDMI 1.4 video formats: RGB 4:4:4, YCbCr 4:2:2, and YCbCr 4:4:4. RGB 4:4:4 is converted to the MIPI-CSI2 RGB888 video format, and listed in the driver as MEDIA_BUS_FMT_RGB888_1X24. Most CSI2 receiver drivers then map MEDIA_BUS_FMT_RGB888_1X24 to V4L2_PIX_FMT_RGB24. However, V4L2_PIX_FMT_RGB24 is defined as having its color components in the R, G and B order, from left to right. MIPI-CSI2 however defines the RGB888 format with blue first. This essentially means that the R and B will be swapped compared to what V4L2_PIX_FMT_RGB24 defines. The proper MBUS format would be BGR888, so let's use that. Fixes: d32d986 ("[media] Driver for Toshiba TC358743 HDMI to CSI-2 bridge") Signed-off-by: Maxime Ripard <[email protected]>
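The R/B swap described in this commit message is easy to see on a single pixel. The Python sketch below assumes the receiver writes the three bytes to memory in the order they arrive on the bus (blue first); the helper names are hypothetical, and each returns pixels as (R, G, B) tuples.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)


def unpack_rgb24(buf: bytes) -> List[Pixel]:
    """Interpret memory as R, G, B byte order."""
    return [(buf[i], buf[i + 1], buf[i + 2]) for i in range(0, len(buf), 3)]


def unpack_bgr24(buf: bytes) -> List[Pixel]:
    """Interpret memory as B, G, R byte order."""
    return [(buf[i + 2], buf[i + 1], buf[i]) for i in range(0, len(buf), 3)]


# A pure red pixel as it would arrive blue first: B=0x00, G=0x00, R=0xFF.
wire = bytes([0x00, 0x00, 0xFF])

print(unpack_rgb24(wire))  # red and blue end up swapped
print(unpack_bgr24(wire))  # red is recovered
```

Reading the buffer as R, G, B turns the red pixel blue, whereas the B, G, R interpretation recovers it, which is the motivation for advertising the BGR888 bus format instead.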
The mappings are the reverse of r8g8b8 and r5g6b5 respectively Signed-off-by: Dave Stevenson <[email protected]>
Pass-through mode disables all gamma and brightness processing, sending the raw pixel data directly to the LEDs. It is enabled by setting the brightness to zero, either in Device Tree or using the runtime method of writing a single byte (in this case 0) to the device. See: raspberrypi#7108 Signed-off-by: Phil Elwell <[email protected]>
Signed-off-by: Waveshare_Team <[email protected]>
d003d86 to dff72a6
Leave pcie1/pciex1/nvme disabled unless a DT parameter is used. Signed-off-by: Phil Elwell <[email protected]>
Corrects typo that set that register to only be 8 bit. Fixes: 7736218 ("media: imx477: Convert to use V4L2_CCI library") Signed-off-by: Dave Stevenson <[email protected]>
The Raspberry Pi firmware has assumed that top level #size-cells value in dtb files is 1. As a result, the dts source files have had to use 32-bit sizes, making it awkward to declare memory regions of 4GB or larger, requiring them to be split into chunks. This primarily affects Pi 5, where the dts source has made use of conditional compilation to choose either 64-bit or 32-bit sizes, based on the presence or absence of the defined cpp symbol FIRMWARE_UPDATED. As of EEPROM release pieeprom-2025-02-11, the firmware has read and made use of the actual #size-cells value declared in the dtb, allowing the use of 64-bit sizes. Remove the conditional sections, retaining the 64-bit size values. Signed-off-by: Phil Elwell <[email protected]>
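To illustrate why 32-bit sizes force large regions to be split, the sketch below encodes a memory region as devicetree `reg` cells. It is a Python model with hypothetical helper names. The 2 GiB chunk size used when `size_cells` is 1 is an arbitrary aligned choice made for the example, not a value taken from the Pi 5 dts files.

```python
from typing import List


def to_cells(value: int, ncells: int) -> List[int]:
    """Split a value into big-endian 32-bit cells."""
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in reversed(range(ncells))]


def encode_reg(base: int, size: int, address_cells: int = 2,
               size_cells: int = 1) -> List[List[int]]:
    """Encode a region as reg entries, splitting it if the size field is too narrow."""
    max_chunk = 1 << (32 * size_cells - 1)  # illustrative aligned limit
    entries = []
    while size > 0:
        chunk = min(size, max_chunk)
        entries.append(to_cells(base, address_cells) + to_cells(chunk, size_cells))
        base += chunk
        size -= chunk
    return entries


eight_gib = 8 << 30
print(len(encode_reg(0, eight_gib, size_cells=1)))  # several entries needed
print(encode_reg(0, eight_gib, size_cells=2))       # a single entry suffices
```

With 64-bit sizes the whole 8 GiB region fits in one entry, which is what makes the conditional 32-bit variants unnecessary once the firmware honours the declared #size-cells value.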
Various PCIe controllers on ARM64 platforms don't support cache snooping, which leads to numerous issues when attempting to use PCIe graphics cards. Switching ttm_prot_from_caching to return pgprot_dmacoherent for ttm_cached pages solves the issue, albeit with a performance hit. There is a second check in ttm_prot_from_caching that also needs updating. Signed-off-by: Yang Bo <[email protected]> Signed-off-by: Dave Stevenson <[email protected]>
Also includes SND_HDA_* modules for audio on AMD GPUs. Signed-off-by: Dave Stevenson <[email protected]>
b17e55e to 6bf3935
Taken from https://github.com/chimera-linux/cports/blob/master/main/linux-stable/patches/xe-nonx86.patch Signed-off-by: Dave Stevenson <[email protected]>
Signed-off-by: Dave Stevenson <[email protected]>
Signed-off-by: Dave Stevenson <[email protected]>
Branch rebased.
I guess you don't want noise and user testing here. But just as a thumbs up, I've installed this kernel on an rpi5 with an OCuLink-connected RX 6800 XT, and the combo seems rock-solid from a user perspective. I've run multiple games and benchmarks, as well as Jeff's llama examples in both chat and web-server mode. Not a single hiccup. The card even seems to go into suspend mode and recover on use.
With the old PR on 6.17.x, I was able to get the AI Pro R9700 working. With this PR on 6.18.x, I get:
Maybe the AMD driver in 6.18 has a new thing where it forces a BAR resize, and that's currently failing? The only thing I could find that's remotely related is this patch. Or it could be related to this post: [REGRESSION] amdgpu fails to load external RX 580 since PCI: Allow relaxed bridge window tail sizing. The RX 7900 XT is loading fine, however.
@pigong I haven't really the time to dig into these
@geerlingguy There were other patches around for resizing the BAR and releasing the old allocation before creating the new one. I don't know if those would help in this situation. They seem to be referenced in #6621
@6by9 - With a fresh rebuild of the entire OS, I was able to get the R9700 working again, not sure why it wasn't before... but that previous install was about a month old at this point, and I generally rebuild things weekly, oops :) It's nice to have just one or two commands to run, and not have to wait for a kernel recompile!
aa8e7a1 to e132677
PR to create CI artifacts supporting AMD and Intel Xe GPUs.