summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2023-03-03arm64: dts: uniphier: Fix property name in PXs3 USB nodeKunihiko Hayashi2-2/+2
commit 2508d5efd7a588d07915a762e1731173854525f9 upstream. The property "snps,usb2_gadget_lpm_disable" is wrong. It should be fixed to "snps,usb2-gadget-lpm-disable". Cc: stable@vger.kernel.org Fixes: 19fee1a1096d ("arm64: dts: uniphier: Add USB-device support for PXs3 reference board") Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/20230207021429.28925-1-hayashi.kunihiko@socionext.com Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-03x86/cpu: Add Lunar Lake MKan Liang1-0/+2
[ Upstream commit f545e8831e70065e127f903fc7aca09aa50422c7 ] Intel confirmed the existence of this CPU in Q4'2022 earnings presentation. Add the CPU model number. [ dhansen: Merging these as soon as possible makes it easier on all the folks developing model-specific features. ] Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20230208172340.158548-1-tony.luck%40intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03ARM: dts: stihxxx-b2120: fix polarity of reset line of tsin0 portDmitry Torokhov1-1/+1
[ Upstream commit 4722dd4029c63f10414ffd8d3ffdd6c748391cd7 ] According to c8sectpfe driver code we first drive reset line low and then high to reset the port, therefore the reset line is supposed to be annotated as "active low". This will be important when we convert the driver to gpiod API. Reviewed-by: Patrice Chotard <patrice.chotard@foss.st.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03powerpc: Don't select ARCH_WANTS_NO_INSTRMichael Ellerman1-1/+0
[ Upstream commit e33416fca8a2313b8650bd5807aaf34354d39a4c ] Commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support") added a select of ARCH_WANTS_NO_INSTR, because it also added some uses of noinstr. However noinstr is always defined, regardless of ARCH_WANTS_NO_INSTR, so there's no need to select it just for that. As PeterZ says [1]: Note that by selecting ARCH_WANTS_NO_INSTR you effectively state to abide by its rules. As of now the powerpc code does not abide by those rules, and trips some new warnings added by Peter in linux-next. So until the code can be fixed to avoid those warnings, disable ARCH_WANTS_NO_INSTR. Note that ARCH_WANTS_NO_INSTR is also used to gate building KCOV and parts of KCSAN. However none of the noinstr annotations in powerpc were added for KCOV or KCSAN, instead instrumentation is blocked at the file level using KCOV_INSTRUMENT_foo.o := n. [1]: https://lore.kernel.org/linuxppc-dev/Y9t6yoafrO5YqVgM@hirez.programming.kicks-ass.net Reported-by: Sachin Sant <sachinp@linux.ibm.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03arm64: dts: rockchip: align rk3399 DMC OPP table with bindingsKrzysztof Kozlowski1-1/+1
[ Upstream commit b67b09733d8a41eec33d5d37be2f8cff8af82a5e ] Bindings expect certain pattern for OPP table node name and underscores are not allowed: rk3399-rock-pi-4a-plus.dtb: dmc_opp_table: $nodename:0: 'dmc_opp_table' does not match '^opp-table(-[a-z0-9]+)?$' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230119124631.91080-1-krzysztof.kozlowski@linaro.org Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03arm64: dts: rockchip: fix probe of analog sound card on rock-3aJonas Karlman1-0/+2
[ Upstream commit 1104693cdfcd337e73ab585a225f05445ff7a864 ] The following was observed on my Radxa ROCK 3 Model A board: rockchip-pinctrl pinctrl: pin gpio1-9 already requested by vcc-cam-regulator; cannot claim for fe410000.i2s ... platform rk809-sound: deferred probe pending Fix this by supplying a board specific pinctrl with the i2s1 pins used by pmic codec according to the schematic [1]. [1] https://dl.radxa.com/rock3/docs/hw/3a/ROCK-3A-V1.3-SCH.pdf Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Acked-by: Michael Riesch <michael.riesch@wolfvision.net> Link: https://lore.kernel.org/r/20230115211553.445007-1-jonas@kwiboo.se Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03arm64: dts: rockchip: add missing #interrupt-cells to rk356x pcie2x1Jensen Huang1-0/+1
[ Upstream commit a323e6b5737bb6e3d3946369b97099abb7dde695 ] This fixes the following issue: pcieport 0000:00:00.0: of_irq_parse_pci: failed with rc=-22 Signed-off-by: Jensen Huang <jensenhuang@friendlyarm.com> Link: https://lore.kernel.org/r/20230113064457.7105-1-jensenhuang@friendlyarm.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03ARM: dts: rockchip: add power-domains property to dp node on rk3288Johan Jonker1-0/+1
[ Upstream commit 80422339a75088322b4d3884bd12fa0fe5d11050 ] The clocks in the Rockchip rk3288 DisplayPort node are included in the power-domain@RK3288_PD_VIO logic, but the power-domains property in the dp node is missing, so fix it. Signed-off-by: Johan Jonker <jbx6244@gmail.com> Link: https://lore.kernel.org/r/dab85bfb-9f55-86a1-5cd5-7388c43e0ec5@gmail.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03arm64: dts: rockchip: drop unused LED mode property from rk3328-roc-ccKrzysztof Kozlowski1-2/+0
[ Upstream commit 1692bffec674551163a7a4be32f59fdde04ecd27 ] GPIO LEDs do not have a 'mode' property: rockchip/rk3328-roc-pc.dtb: leds: led-0: Unevaluated properties are not allowed ('mode' was unexpected) Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20221125144135.477144-1-krzysztof.kozlowski@linaro.org Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-03arm64: dts: rockchip: reduce thermal limits on rk3399-pinephone-proJarrah Gosbell1-0/+7
[ Upstream commit 33e24f0738b922b6f5f4118dbdc26cac8400d7b9 ] While this device uses the rk3399 it is also enclosed in a tight package and cooled through the screen and back case. The default rk3399 thermal limits can result in a burnt screen. These lower limits have resulted in the existing burn not expanding and will hopefully result in future devices not experiencing the issue. Signed-off-by: Jarrah Gosbell <kernel@undef.tools> Link: https://lore.kernel.org/r/20221207113212.8216-1-kernel@undef.tools Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25sh: define RUNTIME_DISCARD_EXITTom Saeger1-0/+1
commit c1c551bebf928889e7a8fef7415b44f9a64975f4 upstream. sh vmlinux fails to link with GNU ld < 2.40 (likely < 2.36) since commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv"). This is similar to fixes for powerpc and s390: commit 4b9880dbf3bd ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT"). commit a494398bde27 ("s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36"). $ sh4-linux-gnu-ld --version | head -n1 GNU ld (GNU Binutils for Debian) 2.35.2 $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- microdev_defconfig $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- `.exit.text' referenced in section `__bug_table' of crypto/algboss.o: defined in discarded section `.exit.text' of crypto/algboss.o `.exit.text' referenced in section `__bug_table' of drivers/char/hw_random/core.o: defined in discarded section `.exit.text' of drivers/char/hw_random/core.o make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make[1]: *** [Makefile:1252: vmlinux] Error 2 arch/sh/kernel/vmlinux.lds.S keeps EXIT_TEXT: /* * .exit.text is discarded at runtime, not link time, to deal with * references from __bug_table */ .exit.text : AT(ADDR(.exit.text)) { EXIT_TEXT } However, EXIT_TEXT is thrown away by DISCARD(include/asm-generic/vmlinux.lds.h) because sh does not define RUNTIME_DISCARD_EXIT. GNU ld 2.40 does not have this issue and builds fine. This corresponds with Masahiro's comments in a494398bde27: "Nathan [Chancellor] also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it." Link: https://lkml.kernel.org/r/9166a8abdc0f979e50377e61780a4bba1dfa2f52.1674518464.git.tom.saeger@oracle.com Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Link: https://lore.kernel.org/all/20230123194218.47ssfzhrpnv3xfez@oracle.com/ Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dennis Gilmore <dennis@ausil.us> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36Masahiro Yamada1-0/+2
commit a494398bde273143c2352dd373cad8211f7d94b2 upstream. Nathan Chancellor reports that the s390 vmlinux fails to link with GNU ld < 2.36 since commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv"). It happens for defconfig, or more specifically for CONFIG_EXPOLINE=y. $ s390x-linux-gnu-ld --version | head -n1 GNU ld (GNU Binutils for Debian) 2.35.2 $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- allnoconfig $ ./scripts/config -e CONFIG_EXPOLINE $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- olddefconfig $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- `.exit.text' referenced in section `.s390_return_reg' of drivers/base/dd.o: defined in discarded section `.exit.text' of drivers/base/dd.o make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make: *** [Makefile:1252: vmlinux] Error 2 arch/s390/kernel/vmlinux.lds.S wants to keep EXIT_TEXT: .exit.text : { EXIT_TEXT } But, at the same time, EXIT_TEXT is thrown away by DISCARD because s390 does not define RUNTIME_DISCARD_EXIT. I still do not understand why the latter wins after 99cb0d917ffa, but defining RUNTIME_DISCARD_EXIT seems correct because the comment line in arch/s390/kernel/vmlinux.lds.S says: /* * .exit.text is discarded at runtime, not link time, * to deal with references from __bug_table */ Nathan also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it. Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20230105031306.1455409-1-masahiroy@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25powerpc/vmlinux.lds: Don't discard .rela* for relocatable buildsMichael Ellerman1-1/+4
commit 07b050f9290ee012a407a0f64151db902a1520f5 upstream. Relocatable kernels must not discard relocations, they need to be processed at runtime. As such they are included for CONFIG_RELOCATABLE builds in the powerpc linker script (line 340). However they are also unconditionally discarded later in the script (line 414). Previously that worked because the earlier inclusion superseded the discard. However commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") introduced an earlier use of DISCARD as part of the RO_DATA macro (line 137). With binutils < 2.36 that causes the DISCARD directives later in the script to be applied earlier, causing .rela* to actually be discarded at link time, leading to build warnings and a kernel that doesn't boot: ld: warning: discarding dynamic section .rela.init.rodata Fix it by conditionally discarding .rela* only when CONFIG_RELOCATABLE is disabled. Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230105132349.384666-2-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tom Saeger <tom.saeger@oracle.com>
2023-02-25powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXITMichael Ellerman1-0/+1
commit 4b9880dbf3bdba3a7c56445137c3d0e30aaa0a40 upstream. The powerpc linker script explicitly includes .exit.text, because otherwise the link fails due to references from __bug_table and __ex_table. The code is freed (discarded) at runtime along with .init.text and data. That has worked in the past despite powerpc not defining RUNTIME_DISCARD_EXIT because DISCARDS appears late in the powerpc linker script (line 410), and the explicit inclusion of .exit.text earlier (line 280) supersedes the discard. However commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") introduced an earlier use of DISCARD as part of the RO_DATA macro (line 136). With binutils < 2.36 that causes the DISCARD directives later in the script to be applied earlier [1], causing .exit.text to actually be discarded at link time, leading to build errors: '.exit.text' referenced in section '__bug_table' of crypto/algboss.o: defined in discarded section '.exit.text' of crypto/algboss.o '.exit.text' referenced in section '__ex_table' of drivers/nvdimm/core.o: defined in discarded section '.exit.text' of drivers/nvdimm/core.o Fix it by defining RUNTIME_DISCARD_EXIT, which causes the generic DISCARDS macro to not include .exit.text at all. 1: https://lore.kernel.org/lkml/87fscp2v7k.fsf@igel.home/ Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230105132349.384666-1-mpe@ellerman.id.au Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25x86/static_call: Add support for Jcc tail-callsPeter Zijlstra1-3/+46
commit 923510c88d2b7d947c4217835fd9ca6bd65cc56c upstream. Clang likes to create conditional tail calls like: 0000000000000350 <amd_pmu_add_event>: 350: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 351: R_X86_64_NONE __fentry__-0x4 355: 48 83 bf 20 01 00 00 00 cmpq $0x0,0x120(%rdi) 35d: 0f 85 00 00 00 00 jne 363 <amd_pmu_add_event+0x13> 35f: R_X86_64_PLT32 __SCT__amd_pmu_branch_add-0x4 363: e9 00 00 00 00 jmp 368 <amd_pmu_add_event+0x18> 364: R_X86_64_PLT32 __x86_return_thunk-0x4 Where 0x35d is a static call site that's turned into a conditional tail-call using the Jcc class of instructions. Teach the in-line static call text patching about this. Notably, since there is no conditional-ret, in that case patch the Jcc to point at an empty stub function that does the ret -- or the return thunk when needed. Reported-by: "Erhard F." <erhard_f@mailbox.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/Y9Kdg9QjHkr9G5b5@hirez.programming.kicks-ass.net [nathan: Backport to 6.1: - Use __x86_return_thunk instead of x86_return_thunk for func in __static_call_transform() - Remove ASM_FUNC_ALIGN in __static_call_return() asm, as call depth tracking was merged in 6.2] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructionsPeter Zijlstra1-11/+48
commit ac0ee0a9560c97fa5fe1409e450c2425d4ebd17a upstream. In order to re-write Jcc.d32 instructions text_poke_bp() needs to be taught about them. The biggest hurdle is that the whole machinery is currently made for 5 byte instructions and extending this would grow struct text_poke_loc which is currently a nice 16 bytes and used in an array. However, since text_poke_loc contains a full copy of the (s32) displacement, it is possible to map the Jcc.d32 2 byte opcodes to Jcc.d8 1 byte opcode for the int3 emulation. This then leaves the replacement bytes; fudge that by only storing the last 5 bytes and adding the rule that 'length == 6' instruction will be prefixed with a 0x0f byte. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20230123210607.115718513@infradead.org [nathan: Introduce is_jcc32() as part of this change; upstream introduced it in 3b6c1747da48] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25x86/alternatives: Introduce int3_emulate_jcc()Peter Zijlstra2-30/+39
commit db7adcfd1cec4e95155e37bc066fddab302c6340 upstream. Move the kprobe Jcc emulation into int3_emulate_jcc() so it can be used by more code -- specifically static_call() will need this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20230123210607.057678245@infradead.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25powerpc/64s/radix: Fix RWX mapping with relocated kernelMichael Ellerman1-0/+13
[ Upstream commit 111bcb37385353f0510e5847d5abcd1c613dba23 ] If a relocatable kernel is loaded at a non-zero address and told not to relocate to zero (kdump or RELOCATABLE_TEST), the mapping of the interrupt code at zero is left with RWX permissions. That is a security weakness, and leads to a warning at boot if CONFIG_DEBUG_WX is enabled: powerpc/mm: Found insecure W+X mapping at address 00000000056435bc/0xc000000000000000 WARNING: CPU: 1 PID: 1 at arch/powerpc/mm/ptdump/ptdump.c:193 note_page+0x484/0x4c0 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc1-00001-g8ae8e98aea82-dirty #175 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-dd0dca hv:linux,kvm pSeries NIP: c0000000004a1c34 LR: c0000000004a1c30 CTR: 0000000000000000 REGS: c000000003503770 TRAP: 0700 Not tainted (6.2.0-rc1-00001-g8ae8e98aea82-dirty) MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24000220 XER: 00000000 CFAR: c000000000545a58 IRQMASK: 0 ... NIP note_page+0x484/0x4c0 LR note_page+0x480/0x4c0 Call Trace: note_page+0x480/0x4c0 (unreliable) ptdump_pmd_entry+0xc8/0x100 walk_pgd_range+0x618/0xab0 walk_page_range_novma+0x74/0xc0 ptdump_walk_pgd+0x98/0x170 ptdump_check_wx+0x94/0x100 mark_rodata_ro+0x30/0x70 kernel_init+0x78/0x1a0 ret_from_kernel_thread+0x5c/0x64 The fix has two parts. Firstly the pages from zero up to the end of interrupts need to be marked read-only, so that they are left with R-X permissions. Secondly the mapping logic needs to be taught to ensure there is a page boundary at the end of the interrupt region, so that the permission change only applies to the interrupt text, and not the region following it. Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230110124753.1325426-2-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESETPaolo Bonzini1-3/+27
[ Upstream commit a79b53aaaab53de017517bf9579b6106397a523c ] While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running, if that happened it could cause a deadlock. This is due to kvm_xen_eventfd_reset() doing a synchronize_srcu() inside a kvm->lock critical section. To avoid this, first collect all the evtchnfd objects in an array and free all of them once the kvm->lock critical section is over and th SRCU grace period has expired. Reported-by: Michal Luczaj <mhal@rbox.co> Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25powerpc: dts: t208x: Disable 10G on MAC1 and MAC2Sean Anderson1-0/+16
[ Upstream commit 8d8bee13ae9e316443c6666286360126a19c8d94 ] There aren't enough resources to run these ports at 10G speeds. Disable 10G for these ports, reverting to the previous speed. Fixes: 36926a7d70c2 ("powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G") Reported-by: Camelia Alexandra Groza <camelia.groza@nxp.com> Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Camelia Groza <camelia.groza@nxp.com> Link: https://lore.kernel.org/r/20221216172937.2960054-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25KVM: VMX: Execute IBPB on emulated VM-exit when guest has IBRSJim Mattson2-2/+15
[ Upstream commit 2e7eab81425ad6c875f2ed47c0ce01e78afc38a5 ] According to Intel's document on Indirect Branch Restricted Speculation, "Enabling IBRS does not prevent software from controlling the predicted targets of indirect branches of unrelated software executed later at the same predictor mode (for example, between two different user applications, or two different virtual machines). Such isolation can be ensured through use of the Indirect Branch Predictor Barrier (IBPB) command." This applies to both basic and enhanced IBRS. Since L1 and L2 VMs share hardware predictor modes (guest-user and guest-kernel), hardware IBRS is not sufficient to virtualize IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts, hardware IBRS is actually sufficient in practice, even though it isn't sufficient architecturally.) For virtual CPUs that support IBRS, add an indirect branch prediction barrier on emulated VM-exit, to ensure that the predicted targets of indirect branches executed in L1 cannot be controlled by software that was executed in L2. Since we typically don't intercept guest writes to IA32_SPEC_CTRL, perform the IBPB at emulated VM-exit regardless of the current IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit. This is CVE-2022-2196. Fixes: 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02") Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't validSean Christopherson1-2/+8
[ Upstream commit 5c30e8101e8d5d020b1d7119117889756a6ed713 ] Skip the WRMSR fastpath in SVM's VM-Exit handler if the next RIP isn't valid, e.g. because KVM is running with nrips=false. SVM must decode and emulate to skip the WRMSR if the CPU doesn't provide the next RIP. Getting the instruction bytes to decode the WRMSR requires reading guest memory, which in turn means dereferencing memslots, and that isn't safe because KVM doesn't hold SRCU when the fastpath runs. Don't bother trying to enable the fastpath for this case, e.g. by doing only the WRMSR and leaving the "skip" until later. NRIPS is supported on all modern CPUs (KVM has considered making it mandatory), and the next RIP will be valid the vast, vast majority of the time. ============================= WARNING: suspicious RCU usage 6.0.0-smp--4e557fcd3d80-skip #13 Tainted: G O ----------------------------- include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by stable/206475: #0: ffff9d9dfebcc0f0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8b/0x620 [kvm] stack backtrace: CPU: 152 PID: 206475 Comm: stable Tainted: G O 6.0.0-smp--4e557fcd3d80-skip #13 Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022 Call Trace: <TASK> dump_stack_lvl+0x69/0xaa dump_stack+0x10/0x12 lockdep_rcu_suspicious+0x11e/0x130 kvm_vcpu_gfn_to_memslot+0x155/0x190 [kvm] kvm_vcpu_gfn_to_hva_prot+0x18/0x80 [kvm] paging64_walk_addr_generic+0x183/0x450 [kvm] paging64_gva_to_gpa+0x63/0xd0 [kvm] kvm_fetch_guest_virt+0x53/0xc0 [kvm] __do_insn_fetch_bytes+0x18b/0x1c0 [kvm] x86_decode_insn+0xf0/0xef0 [kvm] x86_emulate_instruction+0xba/0x790 [kvm] kvm_emulate_instruction+0x17/0x20 [kvm] __svm_skip_emulated_instruction+0x85/0x100 [kvm_amd] svm_skip_emulated_instruction+0x13/0x20 [kvm_amd] handle_fastpath_set_msr_irqoff+0xae/0x180 [kvm] svm_vcpu_run+0x4b8/0x5a0 [kvm_amd] vcpu_enter_guest+0x16ca/0x22f0 [kvm] kvm_arch_vcpu_ioctl_run+0x39d/0x900 [kvm] kvm_vcpu_ioctl+0x538/0x620 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 404d5d7bff0d ("KVM: X86: Introduce more exit_fastpath_completion enum values") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930234031.1732249-1-seanjc@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25KVM: x86: Fail emulation during EMULTYPE_SKIP on any exceptionSean Christopherson1-1/+3
[ Upstream commit 17122c06b86c9f77f45b86b8e62c3ed440847a59 ] Treat any exception during instruction decode for EMULTYPE_SKIP as a "full" emulation failure, i.e. signal failure instead of queuing the exception. When decoding purely to skip an instruction, KVM and/or the CPU has already done some amount of emulation that cannot be unwound, e.g. on an EPT misconfig VM-Exit KVM has already processeed the emulated MMIO. KVM already does this if a #UD is encountered, but not for other exceptions, e.g. if a #PF is encountered during fetch. In SVM's soft-injection use case, queueing the exception is particularly problematic as queueing exceptions while injecting events can put KVM into an infinite loop due to bailing from VM-Enter to service the newly pending exception. E.g. multiple warnings to detect such behavior fire: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9873 kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Not tainted 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9987 kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Tainted: G W 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- Fixes: 6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930233632.1725475-1-seanjc@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25powerpc: dts: t208x: Mark MAC1 and MAC2 as 10GSean Anderson3-2/+90
[ Upstream commit 36926a7d70c2d462fca1ed85bfee000d17fd8662 ] On the T208X SoCs, MAC1 and MAC2 support XGMII. Add some new MAC dtsi fragments, and mark the QMAN ports as 10G. Fixes: da414bb923d9 ("powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)") Signed-off-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22perf/x86: Refuse to export capabilities for hybrid PMUsSean Christopherson1-5/+7
commit 4b4191b8ae1278bde3642acaaef8f92810ed111a upstream. Now that KVM disables vPMU support on hybrid CPUs, WARN and return zeros if perf_get_x86_pmu_capability() is invoked on a hybrid CPU. The helper doesn't provide an accurate accounting of the PMU capabilities for hybrid CPUs and needs to be enhanced if KVM, or anything else outside of perf, wants to act on the PMU capabilities. Cc: stable@vger.kernel.org Cc: Andrew Cooper <Andrew.Cooper3@citrix.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230208204230.1360502-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22kvm: initialize all of the kvm_debugregs structure before sending it to ↵Greg Kroah-Hartman1-2/+1
userspace commit 2c10b61421a28e95a46ab489fd56c0f442ff6952 upstream. When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some unitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it. Bonus is that this reduces the lines of code as the explicit flag setting and reserved space zeroing out can be removed. Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <x86@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable <stable@kernel.org> Reported-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20230214103304.3689213-1-gregkh@linuxfoundation.org> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)Sean Christopherson1-7/+19
commit 4d7404e5ee0066e9a9e8268675de8a273b568b08 upstream. Disable KVM support for virtualizing PMUs on hosts with hybrid PMUs until KVM gains a sane way to enumeration the hybrid vPMU to userspace and/or gains a mechanism to let userspace opt-in to the dangers of exposing a hybrid vPMU to KVM guests. Virtualizing a hybrid PMU, or at least part of a hybrid PMU, is possible, but it requires careful, deliberate configuration from userspace. E.g. to expose full functionality, vCPUs need to be pinned to pCPUs to prevent migrating a vCPU between a big core and a little core, userspace must enumerate a reasonable topology to the guest, and guest CPUID must be curated per vCPU to enumerate accurate vPMU capabilities. The last point is especially problematic, as KVM doesn't control which pCPU it runs on when enumerating KVM's vPMU capabilities to userspace, i.e. userspace can't rely on KVM_GET_SUPPORTED_CPUID in it's current form. Alternatively, userspace could enable vPMU support by enumerating the set of features that are common and coherent across all cores, e.g. by filtering PMU events and restricting guest capabilities. But again, that requires userspace to take action far beyond reflecting KVM's supported feature set into the guest. For now, simply disable vPMU support on hybrid CPUs to avoid inducing seemingly random #GPs in guests, and punt support for hybrid CPUs to a future enabling effort. Reported-by: Jianfeng Gao <jianfeng.gao@intel.com> Cc: stable@vger.kernel.org Cc: Andrew Cooper <Andrew.Cooper3@citrix.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230208204230.1360502-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22s390/decompressor: specify __decompress() buf len to avoid overflowVasily Gorbik1-1/+1
[ Upstream commit 7ab41c2c08a32132ba8c14624910e2fe8ce4ba4b ] Historically calls to __decompress() didn't specify "out_len" parameter on many architectures including s390, expecting that no writes beyond uncompressed kernel image are performed. This has changed since commit 2aa14b1ab2c4 ("zstd: import usptream v1.5.2") which includes zstd library commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer (#2751)"). Now zstd decompression code might store literal buffer in the unwritten portion of the destination buffer. Since "out_len" is not set, it is considered to be unlimited and hence free to use for optimization needs. On s390 this might corrupt initrd or ipl report which are often placed right after the decompressor buffer. Luckily the size of uncompressed kernel image is already known to the decompressor, so to avoid the problem simply specify it in the "out_len" parameter. Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-here.call-01675030179-ext-9637@work.hours Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22powerpc/64: Fix perf profiling asynchronous interrupt handlersNicholas Piggin4-15/+32
[ Upstream commit c28548012ee2bac55772ef7685138bd1124b80c3 ] Interrupt entry sets the soft mask to IRQS_ALL_DISABLED to match the hard irq disabled state. So when should_hard_irq_enable() returns true because we want PMI interrupts in irq handlers, MSR[EE] is enabled but PMIs just get soft-masked. Fix this by clearing IRQS_PMI_DISABLED before enabling MSR[EE]. This also tidies some of the warnings, no need to duplicate them in both should_hard_irq_enable() and do_hard_irq_enable(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121100156.2824054-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14KVM: x86: Mitigate the cross-thread return address predictions bugTom Lendacky1-11/+32
commit 6f0f2d5ef895d66a3f2b32dd05189ec34afa5a55 upstream. By default, KVM/SVM will intercept attempts by the guest to transition out of C0. However, the KVM_CAP_X86_DISABLE_EXITS capability can be used by a VMM to change this behavior. To mitigate the cross-thread return address predictions bug (X86_BUG_SMT_RSB), a VMM must not be allowed to override the default behavior to intercept C0 transitions. Use a module parameter to control the mitigation on processors that are vulnerable to X86_BUG_SMT_RSB. If the processor is vulnerable to the X86_BUG_SMT_RSB bug and the module parameter is set to mitigate the bug, KVM will not allow the disabling of the HLT, MWAIT and CSTATE exits. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <4019348b5e07148eb4d593380a5f6713b93c9a16.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14x86/speculation: Identify processors vulnerable to SMT RSB predictionsTom Lendacky2-2/+8
commit be8de49bea505e7777a69ef63d60e02ac1712683 upstream. Certain AMD processors are vulnerable to a cross-thread return address predictions bug. When running in SMT mode and one of the sibling threads transitions out of C0 state, the other sibling thread could use return target predictions from the sibling thread that transitioned out of C0. The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB when context switching to the idle thread. However, KVM allows a VMM to prevent exiting guest mode when transitioning out of C0. A guest could act maliciously in this situation, so create a new x86 BUG that can be used to detect if the processor is vulnerable. Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <91cec885656ca1fcd4f0185ce403a53dd9edecb7.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitiveHeiner Kallweit1-2/+2
commit d182bcf300772d8b2e5f43e47fa0ebda2b767cc4 upstream. The usage of edge-triggered interrupts lead to lost interrupts under load, see [0]. This was confirmed to be fixed by using level-triggered interrupts. The report was about SDIO. However, as the host controller is the same for SD and MMC, apply the change to all mmc controller instances. [0] https://www.spinics.net/lists/linux-mmc/msg73991.html Fixes: 221cf34bac54 ("ARM64: dts: meson-axg: enable the eMMC controller") Reported-by: Peter Suti <peter.suti@streamunlimited.com> Tested-by: Vyacheslav Bocharov <adeep@lexina.in> Tested-by: Peter Suti <peter.suti@streamunlimited.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/c00655d3-02f8-6f5f-4239-ca2412420cad@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: meson-g12-common: Make mmc host controller interrupts ↵Heiner Kallweit1-3/+3
level-sensitive commit ac8db4cceed218cca21c84f9d75ce88182d8b04f upstream. The usage of edge-triggered interrupts lead to lost interrupts under load, see [0]. This was confirmed to be fixed by using level-triggered interrupts. The report was about SDIO. However, as the host controller is the same for SD and MMC, apply the change to all mmc controller instances. [0] https://www.spinics.net/lists/linux-mmc/msg73991.html Fixes: 4759fd87b928 ("arm64: dts: meson: g12a: add mmc nodes") Tested-by: FUKAUMI Naoki <naoki@radxa.com> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Jerome Brunet <jbrunet@baylibre.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/27d89baa-b8fa-baca-541b-ef17a97cde3c@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitiveHeiner Kallweit1-3/+3
commit 66e45351f7d6798751f98001d1fcd572024d87f0 upstream. The usage of edge-triggered interrupts lead to lost interrupts under load, see [0]. This was confirmed to be fixed by using level-triggered interrupts. The report was about SDIO. However, as the host controller is the same for SD and MMC, apply the change to all mmc controller instances. [0] https://www.spinics.net/lists/linux-mmc/msg73991.html Fixes: ef8d2ffedf18 ("ARM64: dts: meson-gxbb: add MMC support") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/76e042e0-a610-5ed5-209f-c4d7f879df44@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switchNicholas Piggin1-2/+4
commit 2ea31e2e62bbc4d11c411eeb36f1b02841dbcab1 upstream. The RFI and STF security mitigation options can flip the interrupt_exit_not_reentrant static branch condition concurrently with the interrupt exit code which tests that branch. Interrupt exit tests this condition to set MSR[EE|RI] for exit, then again in the case a soft-masked interrupt is found pending, to recover the MSR so the interrupt can be replayed before attempting to exit again. If the condition changes between these two tests, the MSR and irq soft-mask state will become corrupted, leading to warnings and possible crashes. For example, if the branch is initially true then false, MSR[EE] will be 0 but PACA_IRQ_HARD_DIS clear and EE may not get enabled, leading to warnings in irq_64.c. Fixes: 13799748b957 ("powerpc/64: use interrupt restart table to speed up return from interrupt") Cc: stable@vger.kernel.org # v5.14+ Reported-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206042240.92103-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14riscv: kprobe: Fixup misaligned load textGuo Ren1-3/+5
commit eb7423273cc9922ee2d05bf660c034d7d515bb91 upstream. The current kprobe would cause a misaligned load for the probe point. This patch fixup it with two half-word loads instead. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/linux-riscv/878rhig9zj.fsf@all.your.base.are.belong.to.us/ Reported-by: Bjorn Topel <bjorn.topel@gmail.com> Reviewed-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20230204063531.740220-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14riscv: Fixup race condition on PG_dcache_clean in flush_icache_pteGuo Ren1-1/+3
commit 950b879b7f0251317d26bae0687e72592d607532 upstream. In commit 588a513d3425 ("arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache()"), we found RISC-V has the same issue as the previous arm64. The previous implementation didn't guarantee the correct sequence of operations, which means flush_icache_all() hasn't been called when the PG_dcache_clean was set. That would cause a risk of page synchronization. Fixes: 08f051eda33b ("RISC-V: Flush I$ when making a dirty page executable") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230127035306.1819561-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: mediatek: mt8195: Fix vdosys* compatible stringsChen-Yu Tsai1-2/+2
[ Upstream commit 97801cfcf9565247bcc53b67ea47fa87b1704375 ] When vdosys1 was initially added, it was incorrectly assumed to be compatible with vdosys0, and thus both had the same mt8195-mmsys compatible attached. This has since been corrected in commit b237efd47df7 ("dt-bindings: arm: mediatek: mmsys: change compatible for MT8195") and commit 82219cfbef18 ("dt-bindings: arm: mediatek: mmsys: add vdosys1 compatible for MT8195"). The device tree needs to be fixed as well, otherwise the vdosys1 block fails to work, and causes its dependent power domain controller to not work either. Change the compatible string of vdosys1 to "mediatek,mt8195-vdosys1". While at it, also add the new "mediatek,mt8195-vdosys0" compatible to vdosys0. Fixes: 6aa5b46d1755 ("arm64: dts: mt8195: Add vdosys and vppsys clock nodes") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Acked-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20230202104014.2931517-1-wenst@chromium.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14riscv: stacktrace: Fix missing the first frameLiu Shixin1-1/+2
[ Upstream commit cb80242cc679d6397e77d8a964deeb3ff218d2b5 ] When running kfence_test, I found some testcases failed like this: # test_out_of_bounds_read: EXPECTATION FAILED at mm/kfence/kfence_test.c:346 Expected report_matches(&expect) to be true, but is false not ok 1 - test_out_of_bounds_read The corresponding call-trace is: BUG: KFENCE: out-of-bounds read in kunit_try_run_case+0x38/0x84 Out-of-bounds read at 0x(____ptrval____) (32B right of kfence-#10): kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc The kfence_test using the first frame of call trace to check whether the testcase is succeed or not. Commit 6a00ef449370 ("riscv: eliminate unreliable __builtin_frame_address(1)") skip first frame for all case, which results the kfence_test failed. Indeed, we only need to skip the first frame for case (task==NULL || task==current). With this patch, the call-trace will be: BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0x88/0x19e Out-of-bounds read at 0x(____ptrval____) (1B left of kfence-#7): test_out_of_bounds_read+0x88/0x19e kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc Fixes: 6a00ef449370 ("riscv: eliminate unreliable __builtin_frame_address(1)") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Tested-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/r/20221207025038.1022045-1-liushixin2@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14arm64: dts: rockchip: set sdmmc0 speed to sd-uhs-sdr50 on rock-3aDan Johansen1-1/+1
[ Upstream commit bc121b707e816616567683e51fd9194c2309977a ] As other rk336x based devices, the Rock 3 Model A has issues with high speed SD cards, so lower the speed to 50 instead of 104 in the same manor has the Quartz64 Model B has. Fixes: 22a442e6586c ("arm64: dts: rockchip: add basic dts for the radxa rock3 model a") Signed-off-by: Dan Johansen <strit@manjaro.org> Link: https://lore.kernel.org/r/20230128112432.132302-1-strit@manjaro.org Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14arm64: dts: rockchip: fix input enable pinconf on rk3399Arnaud Ferraris1-2/+0
[ Upstream commit 6f515b663d49a14fb63f8c5d0a2a4ae53d44790a ] When the input enable pinconf was introduced, a default drive-strength value of 2 was set for the pull up/down configs. However, this parameter is unneeded when configuring the pin as input, and having a single hardcoded value here is actually harmful: GPIOs on the RK3399 have various same drive-strength capabilities depending on the bank and port they belong to. As an example, trying to configure the GPIO4_PD3 pin as an input with pull-up enabled fails with the following output: [ 10.706542] rockchip-pinctrl pinctrl: unsupported driver strength 2 [ 10.713661] rockchip-pinctrl pinctrl: pin_config_set op failed for pin 155 (acceptable drive-strength values for this pin being 3, 6, 9 and 12) Let's drop the drive-strength property from all input pinconfs in order to solve this issue. Fixes: ec48c3e82ca3 ("arm64: dts: rockchip: add an input enable pinconf to rk3399") Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> Reviewed-by: Caleb Connolly <kc@postmarketos.org> Link: https://lore.kernel.org/r/20221215101947.254896-1-arnaud.ferraris@collabora.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09powerpc/imc-pmu: Revert nest_init_lock to being a mutexMichael Ellerman1-7/+7
commit ad53db4acb415976761d7302f5b02e97f2bd097e upstream. The recent commit 76d588dddc45 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") fixed warnings (and possible deadlocks) in the IMC PMU driver by converting the locking to use spinlocks. It also converted the init-time nest_init_lock to a spinlock, even though it's not used at runtime in IRQ disabled sections or while holding other spinlocks. This leads to warnings such as: BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 1, expected: 0 CPU: 7 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc2-14719-gf12cd06109f4-dirty #1 Hardware name: Mambo,Simulated-System POWER9 0x4e1203 opal:v6.6.6 PowerNV Call Trace: dump_stack_lvl+0x74/0xa8 (unreliable) __might_resched+0x178/0x1a0 __cpuhp_setup_state+0x64/0x1e0 init_imc_pmu+0xe48/0x1250 opal_imc_counters_probe+0x30c/0x6a0 platform_probe+0x78/0x110 really_probe+0x104/0x420 __driver_probe_device+0xb0/0x170 driver_probe_device+0x58/0x180 __driver_attach+0xd8/0x250 bus_for_each_dev+0xb4/0x140 driver_attach+0x34/0x50 bus_add_driver+0x1e8/0x2d0 driver_register+0xb4/0x1c0 __platform_driver_register+0x38/0x50 opal_imc_driver_init+0x2c/0x40 do_one_initcall+0x80/0x360 kernel_init_freeable+0x310/0x3b8 kernel_init+0x30/0x1a0 ret_from_kernel_thread+0x5c/0x64 Fix it by converting nest_init_lock back to a mutex, so that we can call sleeping functions while holding it. There is no interaction between nest_init_lock and the runtime spinlocks used by the actual PMU routines. Fixes: 76d588dddc45 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") Tested-by: Kajol Jain<kjain@linux.ibm.com> Reviewed-by: Kajol Jain<kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230130014401.540543-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09powerpc/64s: Fix local irq disable when PMIs are disabledNicholas Piggin1-1/+1
commit bc88ef663265676419555df2dc469a471c0add31 upstream. When PMI interrupts are soft-masked, local_irq_save() will clear the PMI mask bit, allowing PMIs in and causing a race condition. This causes a deadlock in native_hpte_insert via hash_preload, which depends on PMIs being disabled since commit 8b91cee5eadd ("powerpc/64s/hash: Make hash faults work in NMI context"). native_hpte_insert calls local_irq_save(). It's possible the lpar hash code is also affected when tracing is enabled because __trace_hcall_entry() calls local_irq_save(). Fix this by making arch_local_irq_save() _or_ the IRQS_DISABLED bit into the mask. This was found with the stress_hpt option with a kbuild workload running together with `perf record -g`. Fixes: f442d004806e ("powerpc/64s: Add support to mask perf interrupts and replay them") Fixes: 8b91cee5eadd ("powerpc/64s/hash: Make hash faults work in NMI context") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Just take the fix without the new warning] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121095352.2823517-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09powerpc/64s/radix: Fix crash with unaligned relocated kernelMichael Ellerman1-0/+11
commit 98d0219e043e09013e883eacde3b93e0b2bf944d upstream. If a relocatable kernel is loaded at an address that is not 2MB aligned and told not to relocate to zero, the kernel can crash due to mark_rodata_ro() incorrectly changing some read-write data to read-only. Scenarios where the misalignment can occur are when the kernel is loaded by kdump or using the RELOCATABLE_TEST config option. Example crash with the kernel loaded at 5MB: Run /sbin/init as init process BUG: Unable to handle kernel data access on write at 0xc000000000452000 Faulting instruction address: 0xc0000000005b6730 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries NIP: c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380 REGS: c000000004503250 TRAP: 0300 Not tainted (6.2.0-rc1-00011-g349188be4841) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 44288480 XER: 00000000 CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0 ... NIP memset+0x68/0x104 LR zero_user_segments.constprop.0+0xa8/0xf0 Call Trace: ext4_mpage_readpages+0x7f8/0x830 ext4_readahead+0x48/0x60 read_pages+0xb8/0x380 page_cache_ra_unbounded+0x19c/0x250 filemap_fault+0x58c/0xae0 __do_fault+0x60/0x100 __handle_mm_fault+0x1230/0x1a40 handle_mm_fault+0x120/0x300 ___do_page_fault+0x20c/0xa80 do_page_fault+0x30/0xc0 data_access_common_virt+0x210/0x220 This happens because mark_rodata_ro() tries to change permissions on the range _stext..__end_rodata, but _stext sits in the middle of the 2MB page from 4MB to 6MB: radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec) The logic that changes the permissions assumes the linear mapping was split correctly at boot, so it marks the entire 2MB page read-only. That leads to the write fault above. To fix it, the boot time mapping logic needs to consider that if the kernel is running at a non-zero address then _stext is a boundary where it must split the mapping. That leads to the mapping being split correctly, allowing the rodata permission change to take happen correctly, with no spillover: radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec) radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec) If the kernel is loaded at a 2MB aligned address, the mapping continues to use 2MB pages as before: radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230110124753.1325426-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09ia64: fix build error due to switch case label appearing next to declarationJames Morse1-2/+5
commit 6f28a2613497fc587e347afa99fa2c52230678a7 upstream. Since commit aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic: | ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres': | ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement | 189 | s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq); This line appears immediately after a case label in a switch. Move the declarations out of the case, to the top of the function. Link: https://lkml.kernel.org/r/20230117151632.393836-1-james.morse@arm.com Fixes: aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency") Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Sergei Trofimovich <slyich@gmail.com> Cc: Émeric Maschino <emeric.maschino@gmail.com> Cc: matoro <matoro_mailinglist_kernel@matoro.tk> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09x86/debug: Fix stack recursion caused by wrongly ordered DR7 accessesJoerg Roedel1-2/+24
commit 9d2c7203ffdb846399b82b0660563c89e918c751 upstream. In kernels compiled with CONFIG_PARAVIRT=n, the compiler re-orders the DR7 read in exc_nmi() to happen before the call to sev_es_ist_enter(). This is problematic when running as an SEV-ES guest because in this environment the DR7 read might cause a #VC exception, and taking #VC exceptions is not safe in exc_nmi() before sev_es_ist_enter() has run. The result is stack recursion if the NMI was caused on the #VC IST stack, because a subsequent #VC exception in the NMI handler will overwrite the stack frame of the interrupted #VC handler. As there are no compiler barriers affecting the ordering of DR7 reads/writes, make the accesses to this register volatile, forbidding the compiler to re-order them. [ bp: Massage text, make them volatile too, to make sure some aggressive compiler optimization pass doesn't discard them. ] Fixes: 315562c9af3d ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Reported-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230127035616.508966-1-aik@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09riscv: disable generation of unwind tablesAndreas Schwab1-0/+3
commit 2f394c0e7d1129a35156e492bc8f445fb20f43ac upstream. GCC 13 will enable -fasynchronous-unwind-tables by default on riscv. In the kernel, we don't have any use for unwind tables yet, so disable them. More importantly, the .eh_frame section brings relocations (R_RISC_32_PCREL, R_RISCV_SET{6,8,16}, R_RISCV_SUB{6,8,16}) into modules that we are not prepared to handle. Signed-off-by: Andreas Schwab <schwab@suse.de> Link: https://lore.kernel.org/r/mvmzg9xybqu.fsf@suse.de Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat caseHelge Deller1-2/+13
commit 316f1f42b5cc1d95124c1f0387c867c1ba7b6d0e upstream. Wire up the missing ptrace requests PTRACE_GETREGS, PTRACE_SETREGS, PTRACE_GETFPREGS and PTRACE_SETFPREGS when running 32-bit applications on 64-bit kernels. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09parisc: Replace hardcoded value with PRIV_USER constant in ptrace.cHelge Deller1-3/+3
commit 3f0c17809a098d3f0c1ec83f1fb3ca61638d3dcd upstream. Prefer usage of the PRIV_USER constant over the hard-coded value to set the lowest 2 bits for the userspace privilege. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 5.16+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09parisc: Fix return code of pdc_iodc_print()Helge Deller1-2/+3
commit 5d1335dabb3c493a3d6d5b233953b6ac7b6c1ff2 upstream. There is an off-by-one if the printed string includes a new-line char. Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>