summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2023-03-03selftests: ocelot: tc_flower_chains: make test_vlan_ingress_modify() more ↵Vladimir Oltean1-1/+1
comprehensive [ Upstream commit bbb253b206b9c417928a6c827d038e457f3012e9 ] We have two IS1 filters of the OCELOT_VCAP_KEY_ANY key type (the one with "action vlan pop" and the one with "action vlan modify") and one of the OCELOT_VCAP_KEY_IPV4 key type (the one with "action skbedit priority"). But we have no IS1 filter with the OCELOT_VCAP_KEY_ETYPE key type, and there was an uncaught breakage there. To increase test coverage, convert one of the OCELOT_VCAP_KEY_ANY filters to OCELOT_VCAP_KEY_ETYPE, by making the filter also match on the MAC SA of the traffic sent by mausezahn, $h1_mac. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230205192409.1796428-2-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25selftests: kvm: move declaration at the beginning of main()Paolo Bonzini1-4/+3
[ Upstream commit 50aa870ba2f7735f556e52d15f61cd0f359c4c0b ] Placing a declaration of evt_reset is pedantically invalid according to the C standard. While GCC does not really care and only warns with -Wpedantic, clang ignores the declaration altogether with an error: x86_64/xen_shinfo_test.c:965:2: error: expected expression struct kvm_xen_hvm_attr evt_reset = { ^ x86_64/xen_shinfo_test.c:969:38: error: use of undeclared identifier evt_reset vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &evt_reset); ^ Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com> Reported-by: Sean Christopherson <seanjc@google.com> Fixes: a79b53aaaab5 ("KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET", 2022-12-28) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESETPaolo Bonzini1-0/+6
[ Upstream commit a79b53aaaab53de017517bf9579b6106397a523c ] While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running, if that happened it could cause a deadlock. This is due to kvm_xen_eventfd_reset() doing a synchronize_srcu() inside a kvm->lock critical section. To avoid this, first collect all the evtchnfd objects in an array and free all of them once the kvm->lock critical section is over and th SRCU grace period has expired. Reported-by: Michal Luczaj <mhal@rbox.co> Cc: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22Revert "mm: Always release pages to the buddy allocator in ↵Aaron Thompson1-4/+0
memblock_free_late()." commit 647037adcad00f2bab8828d3d41cd0553d41f3bd upstream. This reverts commit 115d9d77bb0f9152c60b6e8646369fa7f6167593. The pages being freed by memblock_free_late() have already been initialized, but if they are in the deferred init range, __free_one_page() might access nearby uninitialized pages when trying to coalesce buddies. This can, for example, trigger this BUG: BUG: unable to handle page fault for address: ffffe964c02580c8 RIP: 0010:__list_del_entry_valid+0x3f/0x70 <TASK> __free_one_page+0x139/0x410 __free_pages_ok+0x21d/0x450 memblock_free_late+0x8c/0xb9 efi_free_boot_services+0x16b/0x25c efi_enter_virtual_mode+0x403/0x446 start_kernel+0x678/0x714 secondary_startup_64_no_verify+0xd2/0xdb </TASK> A proper fix will be more involved so revert this change for the time being. Fixes: 115d9d77bb0f ("mm: Always release pages to the buddy allocator in memblock_free_late().") Signed-off-by: Aaron Thompson <dev@aaront.org> Link: https://lore.kernel.org/r/20230207082151.1303-1-dev@aaront.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22selftests: mptcp: userspace: fix v4-v6 test in v6.1Matthieu Baerts1-0/+11
The commit 4656d72c1efa ("selftests: mptcp: userspace: validate v4-v6 subflows mix") has been backported to v6.1.8 without any conflicts. But it looks like it was depending on a previous one: commit 1cc94ac1af4b ("selftests: mptcp: make evts global in userspace_pm") Without it, the test fails with: ./userspace_pm.sh: line 788: : No such file or directory # ADD_ADDR4 id:14 10.0.2.1 (ns1) => ns2, reuse port [FAIL] sed: can't read : No such file or directory This dependence refactors the way the monitoring files are being created: only once for all the different sub-tests instead of per sub-test. It is probably better to avoid backporting the refactoring. That is why the new sub-test has been adapted to work using the previous way that is still in place here in v6.1: the monitoring is started at the beginning of each sub-test and the created file is removed at the end. Fixes: f59549814a64 ("selftests: mptcp: userspace: validate v4-v6 subflows mix") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibilityAndrei Gherzan1-1/+1
[ Upstream commit a6efc42a86c0c87cfe2f1c3d1f09a4c9b13ba890 ] "tcpdump" is used to capture traffic in these tests while using a random, temporary and not suffixed file for it. This can interfere with apparmor configuration where the tool is only allowed to read from files with 'known' extensions. The MINE type application/vnd.tcpdump.pcap was registered with IANA for pcap files and .pcap is the extension that is both most common but also aligned with standard apparmor configurations. See TCPDUMP(8) for more details. This improves compatibility with standard apparmor configurations by using ".pcap" as the file extension for the tests' temporary files. Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22tools/virtio: fix the vringh test for virtio ring changesShunsuke Mie8-5/+45
[ Upstream commit 3f7b75abf41cc4143aa295f62acbb060a012868d ] Fix the build caused by missing kmsan_handle_dma() and is_power_of_2() that are used in drivers/virtio/virtio_ring.c. Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Message-Id: <20230110034310.779744-1-mie@igel.co.jp> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22selftests/bpf: Verify copy_register_state() preserves parent/live fieldsEduard Zingerman1-0/+36
[ Upstream commit b9fa9bc839291020b362ab5392e5f18ba79657ac ] A testcase to check that verifier.c:copy_register_state() preserves register parentage chain and livness information. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230106142214.1040390-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14selftests: mptcp: stop tests earlierMatthieu Baerts1-4/+10
commit 070d6dafacbaa9d1f2e4e3edc263853d194af15e upstream. These 'endpoint' tests from 'mptcp_join.sh' selftest start a transfer in the background and check the status during this transfer. Once the expected events have been recorded, there is no reason to wait for the data transfer to finish. It can be stopped earlier to reduce the execution time by more than half. For these tests, the exchanged data were not verified. Errors, if any, were ignored but that's fine, plenty of other tests are looking at that. It is then OK to mute stderr now that we are sure errors will be printed (and still ignored) because the transfer is stopped before the end. Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14selftests: mptcp: allow more slack for slow test-casePaolo Abeni1-2/+8
commit a635a8c3df66ab68dc088c08a4e9e955e22c0e64 upstream. A test-case is frequently failing on some extremely slow VMs. The mptcp transfer completes before the script is able to do all the required PM manipulation. Address the issue in the simplest possible way, making the transfer even more slow. Additionally dump more info in case of failures, to help debugging similar problems in the future and init dump_stats var. Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/323 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14selftests: forwarding: lib: quote the sysctl valuesHangbin Liu1-2/+2
[ Upstream commit 3a082086aa200852545cf15159213582c0c80eba ] When set/restore sysctl value, we should quote the value as some keys may have multi values, e.g. net.ipv4.ping_group_range Fixes: f5ae57784ba8 ("selftests: forwarding: lib: Add sysctl_set(), sysctl_restore()") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20230208032110.879205-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14selftests: Fix failing VXLAN VNI filtering testIdo Schimmel1-13/+5
[ Upstream commit b963d9d5b9437a6b99504987310f98537c9e77d4 ] iproute2 does not recognize the "group6" and "remote6" keywords. Fix by using "group" and "remote" instead. Before: # ./test_vxlan_vnifiltering.sh [...] Tests passed: 25 Tests failed: 2 After: # ./test_vxlan_vnifiltering.sh [...] Tests passed: 27 Tests failed: 0 Fixes: 3edf5f66c12a ("selftests: add new tests for vxlan vnifiltering") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230207141819.256689-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask()Waiman Long1-0/+1
commit e5ae8803847b80fe9d744a3174abe2b7bfed222a upstream. It was found that the check to see if a partition could use up all the cpus from the parent cpuset in update_parent_subparts_cpumask() was incorrect. As a result, it is possible to leave parent with no effective cpu left even if there are tasks in the parent cpuset. This can lead to system panic as reported in [1]. Fix this probem by updating the check to fail the enabling the partition if parent's effective_cpus is a subset of the child's cpus_allowed. Also record the error code when an error happens in update_prstate() and add a test case where parent partition and child have the same cpu list and parent has task. Enabling partition in the child will fail in this case. [1] https://www.spinics.net/lists/cgroups/msg36254.html Fixes: f0af1bfc27b5 ("cgroup/cpuset: Relax constraints to partition & cpus changes") Cc: stable@vger.kernel.org # v6.1 Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy ↵Andrei Gherzan1-7/+27
benchmarking [ Upstream commit 329c9cd769c2e306957df031efff656c40922c76 ] The test tool can check that the zerocopy number of completions value is valid taking into consideration the number of datagram send calls. This can catch the system into a state where the datagrams are still in the system (for example in a qdisk, waiting for the network interface to return a completion notification, etc). This change adds a retry logic of computing the number of completions up to a configurable (via CLI) timeout (default: 2 seconds). Fixes: 79ebc3c26010 ("net/udpgso_bench_tx: options to exercise TX CMSG") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-4-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09selftests: net: udpgso_bench: Fix racing bug between the rx/tx programsAndrei Gherzan1-4/+20
[ Upstream commit dafe93b9ee21028d625dce347118b82659652eff ] "udpgro_bench.sh" invokes udpgso_bench_rx/udpgso_bench_tx programs subsequently and while doing so, there is a chance that the rx one is not ready to accept socket connections. This racing bug could fail the test with at least one of the following: ./udpgso_bench_tx: connect: Connection refused ./udpgso_bench_tx: sendmsg: Connection refused ./udpgso_bench_tx: write: Connection refused This change addresses this by making udpgro_bench.sh wait for the rx program to be ready before firing off the tx one - up to a 10s timeout. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-3-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are providedAndrei Gherzan2-0/+4
[ Upstream commit db9b47ee9f5f375ab0c5daeb20321c75b4fa657d ] Leaving unrecognized arguments buried in the output, can easily hide a CLI/script typo. Avoid this by exiting when wrong arguments are provided to the udpgso_bench test programs. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-2-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warningAndrei Gherzan1-1/+1
[ Upstream commit c03c80e3a03ffb4f790901d60797e9810539d946 ] This change fixes the following compiler warning: /usr/include/x86_64-linux-gnu/bits/error.h:40:5: warning: ‘gso_size’ may be used uninitialized [-Wmaybe-uninitialized] 40 | __error_noreturn (__status, __errnum, __format, __va_arg_pack ()); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ udpgso_bench_rx.c: In function ‘main’: udpgso_bench_rx.c:253:23: note: ‘gso_size’ was declared here 253 | int ret, len, gso_size, budget = 256; Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-1-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09selftests/filesystems: grant executable permission to run_fat_tests.shPengfei Xu1-0/+0
[ Upstream commit 24b5308cf5ee9f52dd22f3af78a5b0cdc9d35e72 ] When use tools/testing/selftests/kselftest_install.sh to make the kselftest-list.txt under tools/testing/selftests/kselftest_install. Then use tools/testing/selftests/kselftest_install/run_kselftest.sh to run all the kselftests in kselftest-list.txt, it will be blocked by case "filesystems/fat: run_fat_tests.sh" with "Warning: file run_fat_tests.sh is not executable", so grant executable permission to run_fat_tests.sh to fix this issue. Link: https://lkml.kernel.org/r/dfdbba6df8a1ab34bb1e81cd8bd7ca3f9ed5c369.1673424747.git.pengfei.xu@intel.com Fixes: dd7c9be330d8 ("selftests/filesystems: add a vfat RENAME_EXCHANGE test") Signed-off-by: Pengfei Xu <pengfei.xu@intel.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-06kselftest: Fix error message for unconfigured LLVM buildsMark Brown1-1/+1
[ Upstream commit 9fdaca2c1e157dc0a3c0faecf3a6a68e7d8d0c7b ] We are missing a ) when we attempt to complain about not having enough configuration for clang, resulting in the rather inscrutable error: ../lib.mk:23: *** unterminated call to function 'error': missing ')'. Stop. Add the required ) so we print the message we were trying to print. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01tools: gpio: fix -c option of gpio-event-monIvo Borisov Shopov1-0/+1
[ Upstream commit 677d85e1a1ee69fa05ccea83847309484be3781c ] Following line should listen for a rising edge and exit after the first one since '-c 1' is provided. # gpio-event-mon -n gpiochip1 -o 0 -r -c 1 It works with kernel 4.19 but it doesn't work with 5.10. In 5.10 the above command doesn't exit after the first rising edge it keep listening for an event forever. The '-c 1' is not taken into an account. The problem is in commit 62757c32d5db ("tools: gpio: add multi-line monitoring to gpio-event-mon"). Before this commit the iterator 'i' in monitor_device() is used for counting of the events (loops). In the case of the above command (-c 1) we should start from 0 and increment 'i' only ones and hit the 'break' statement and exit the process. But after the above commit counting doesn't start from 0, it start from 1 when we listen on one line. It is because 'i' is used from one more purpose, counting of lines (num_lines) and it isn't restore to 0 after following code for (i = 0; i < num_lines; i++) gpiotools_set_bit(&values.mask, i); Restore the initial value of the iterator to 0 in order to allow counting of loops to work for any cases. Fixes: 62757c32d5db ("tools: gpio: add multi-line monitoring to gpio-event-mon") Signed-off-by: Ivo Borisov Shopov <ivoshopov@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> [Bartosz: tweak the commit message] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01Revert "selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID"Sasha Levin2-51/+0
This reverts commit 2e5d5c4ae77dedc7eba0e496474ab93f05b25adf. Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01tools/nolibc: prevent gcc from making memset() loop over itselfWilly Tarreau1-1/+4
[ Upstream commit 1bfbe1f3e96720daf185f03d101f072d69753f88 ] When building on ARM in thumb mode with gcc-11.3 at -O2 or -O3, nolibc-test segfaults during the select() tests. It turns out that at this level, gcc recognizes an opportunity for using memset() to zero the fd_set, but it miscompiles it because it also recognizes a memset pattern as well, and decides to call memset() from the memset() code: 000122bc <memset>: 122bc: b510 push {r4, lr} 122be: 0004 movs r4, r0 122c0: 2a00 cmp r2, #0 122c2: d003 beq.n 122cc <memset+0x10> 122c4: 23ff movs r3, #255 ; 0xff 122c6: 4019 ands r1, r3 122c8: f7ff fff8 bl 122bc <memset> 122cc: 0020 movs r0, r4 122ce: bd10 pop {r4, pc} Simply placing an empty asm() statement inside the loop suffices to avoid this. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01tools/nolibc: fix missing includes causing build issues at -O0Willy Tarreau10-0/+29
[ Upstream commit 55abdd1f5e1e07418bf4a46c233a92f83cb5ae97 ] After the nolibc includes were split to facilitate portability from standard libcs, programs that include only what they need may miss some symbols which are needed by libgcc. This is the case for raise() which is needed by the divide by zero code in some architectures for example. Regardless, being able to include only the apparently needed files is convenient. Instead of trying to move all exported definitions to a single file, since this can change over time, this patch takes another approach consisting in including the nolibc header at the end of all standard include files. This way their types and functions are already known at the moment of inclusion, and including any single one of them is sufficient to bring all the required ones. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01tools/nolibc: Fix S_ISxxx macrosWarner Losh1-7/+7
[ Upstream commit 16f5cea74179b5795af7ce359971f5128d10f80e ] The mode field has the type encoded as an value in a field, not as a bit mask. Mask the mode with S_IFMT instead of each type to test. Otherwise, false positives are possible: eg S_ISDIR will return true for block devices because S_IFDIR = 0040000 and S_IFBLK = 0060000 since mode is masked with S_IFDIR instead of S_IFMT. These macros now match the similar definitions in tools/include/uapi/linux/stat.h. Signed-off-by: Warner Losh <imp@bsdimp.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01nolibc: fix fd_set typeSven Schnelle1-23/+30
[ Upstream commit feaf75658783a919410f8c2039dbc24b6a29603d ] The kernel uses unsigned long for the fd_set bitmap, but nolibc use u32. This works fine on little endian machines, but fails on big endian. Convert to unsigned long to fix this. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01selftests/net: toeplitz: fix race on tpacket_v3 block closeWillem de Bruijn1-5/+7
[ Upstream commit 903848249a781d76d59561d51676c95b3a4d7162 ] Avoid race between process wakeup and tpacket_v3 block timeout. The test waits for cfg_timeout_msec for packets to arrive. Packets arrive in tpacket_v3 rings, which pass packets ("frames") to the process in batches ("blocks"). The sk waits for req3.tp_retire_blk_tov msec to release a block. Set the block timeout lower than the process waiting time, else the process may find that no block has been released by the time it scans the socket list. Convert to a ring of more than one, smaller, blocks with shorter timeouts. Blocks must be page aligned, so >= 64KB. Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230118151847.4124260-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24selftests: mptcp: userspace: validate v4-v6 subflows mixMatthieu Baerts1-0/+47
commit 4656d72c1efa495a58ad6d8b073a60907073e4e6 upstream. MPTCP protocol supports having subflows in both IPv4 and IPv6. In Linux, it is possible to have that if the MPTCP socket has been created with AF_INET6 family without the IPV6_V6ONLY option. Here, a new IPv4 subflow is being added to the initial IPv6 connection, then being removed using Netlink commands. Cc: stable@vger.kernel.org # v5.19+ Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24proc: fix PIE proc-empty-vm, proc-pid-vm testsAlexey Dobriyan2-9/+12
commit 5316a017d093f644675a56523bcf5787ba8f4fef upstream. vsyscall detection code uses direct call to the beginning of the vsyscall page: asm ("call %P0" :: "i" (0xffffffffff600000)) It generates "call rel32" instruction but it is not relocated if binary is PIE, so binary segfaults into random userspace address and vsyscall page status is detected incorrectly. Do more direct: asm ("call *%rax") which doesn't do need any relocaltions. Mark g_vsyscall as volatile for a good measure, I didn't find instruction setting it to 0. Now the code is obviously correct: xor eax, eax mov rdi, rbp mov rsi, rbp mov DWORD PTR [rip+0x2d15], eax # g_vsyscall = 0 mov rax, 0xffffffffff600000 call rax mov DWORD PTR [rip+0x2d02], 1 # g_vsyscall = 1 mov eax, DWORD PTR ds:0xffffffffff600000 mov DWORD PTR [rip+0x2cf1], 2 # g_vsyscall = 2 mov edi, [rip+0x2ceb] # exit(g_vsyscall) call exit Note: fixed proc-empty-vm test oopses 5.19.0-28-generic kernel but this is separate story. Link: https://lkml.kernel.org/r/Y7h2xvzKLg36DSq8@p183 Fixes: 5bc73bb3451b9 ("proc: test how it holds up with mapping'less process") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24memblock tests: Fix compilation error.Aaron Thompson2-1/+3
[ Upstream commit 340726747336716350eb5a928b860a29db955f05 ] Commit cf4694be2b2cf ("tools: Add atomic_test_and_set_bit()") changed tools/arch/x86/include/asm/atomic.h to include <asm/asm.h>, which causes 'make -C tools/testing/memblock' to fail with: In file included from ../../include/asm/atomic.h:6, from ../../include/linux/atomic.h:5, from ./linux/mmzone.h:5, from ../../include/linux/mm.h:5, from ../../include/linux/pfn.h:5, from ./linux/memory_hotplug.h:6, from ./linux/init.h:7, from ./linux/memblock.h:11, from tests/common.h:8, from tests/basic_api.h:5, from main.c:2: ../../include/asm/../../arch/x86/include/asm/atomic.h:11:10: fatal error: asm/asm.h: No such file or directory 11 | #include <asm/asm.h> | ^~~~~~~~~~~ Create a symlink to asm/asm.h in the same manner as the existing one to asm/cmpxchg.h. Signed-off-by: Aaron Thompson <dev@aaront.org> Link: https://lore.kernel.org/r/010101857c402765-96e2dbc6-b82b-47e2-a437-4834dbe0b96b-000000@us-west-2.amazonses.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24selftests: net: fix cmsg_so_mark.sh test hangPo-Hsu Lin1-1/+1
[ Upstream commit 1573c6882018f69991aead951d09423ce978adac ] This cmsg_so_mark.sh test will hang on non-amd64 systems because of the infinity loop for argument parsing in cmsg_sender. Variable "o" in cs_parse_args() for taking getopt() should be an int, otherwise it will be 255 when getopt() returns -1 on non-amd64 system and thus causing infinity loop. Link: https://lore.kernel.org/lkml/CA+G9fYsM2k7mrF7W4V_TrZ-qDauWM394=8yEJ=-t1oUg8_40YA@mail.gmail.com/t/ Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24tools/virtio: initialize spinlocks in vring_test.cRicardo Cañuelo1-0/+2
[ Upstream commit c262f75cb6bb5a63828e72ce3b8fe808e5029479 ] The virtio_device vqs_list spinlocks must be initialized before use to prevent functions that manipulate the device virtualqueues, such as vring_new_virtqueue(), from blocking indefinitely. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Message-Id: <20221012062949.1526176-1-ricardo.canuelo@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_IDHao Sun2-0/+51
[ Upstream commit cedebd74cf3883f0384af9ec26b4e6f8f1964dd4 ] Verify that nullness information is not porpagated in the branches of register to register JEQ and JNE operations if one of them is PTR_TO_BTF_ID. Implement this in C level so we can use CO-RE. Signed-off-by: Hao Sun <sunhao.th@gmail.com> Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221222024414.29539-2-sunhao.th@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf kmem: Support field "node" in evsel__process_alloc_event() coping with ↵Leo Yan1-12/+24
recent tracepoint restructuring [ Upstream commit dce088ab0d51ae3b14fb2bd608e9c649aadfe5dc ] Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") adds the field "node" into the tracepoints 'kmalloc' and 'kmem_cache_alloc', so this patch modifies the event process function to support the field "node". If field "node" is detected by checking function evsel__field(), it stats the cross allocation. When the "node" value is NUMA_NO_NODE (-1), it means the memory can be allocated from any memory node, in this case, we don't account it as a cross allocation. Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20230108062400.250690-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf kmem: Support legacy tracepointsLeo Yan1-3/+26
[ Upstream commit b3719108ae60169eda5c941ca5e1be1faa371c57 ] Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") removed tracepoints 'kmalloc_node' and 'kmem_cache_alloc_node', we need to consider the tool should be backward compatible. If it detect the tracepoint "kmem:kmalloc_node", this patch enables the legacy tracepoints, otherwise, it will ignore them. Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20230108062400.250690-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf build: Properly guard libbpf includesIan Rogers2-0/+8
[ Upstream commit d891f2b724b39a2a41e3ad7b57110193993242ff ] Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT. In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL. Fixes: d6a735ef3277c45f ("perf bpf_counter: Move common functions to bpf_counter.h") Reported-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Mike Leach <mike.leach@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests/net: l2_tos_ttl_inherit.sh: Ensure environment cleanup on failure.Guillaume Nault1-4/+36
[ Upstream commit d68ff8ad3351b8fc8d6f14b9a4f5cc8ba3e8bd13 ] Use 'set -e' and an exit handler to stop the script if a command fails and ensure the test environment is cleaned up in any case. Also, handle the case where the script is interrupted by SIGINT. The only command that's expected to fail is 'wait $ping_pid', since it's killed by the script. Handle this case with '|| true' to make it play well with 'set -e'. Finally, return the Kselftest SKIP code (4) when the script breaks because of an environment problem or a command line failure. The 0 and 1 return codes should now reliably indicate that all tests have been run (0: all tests run and passed, 1: all tests run but at least one failed, 4: test script didn't run completely). Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests/net: l2_tos_ttl_inherit.sh: Run tests in their own netns.Guillaume Nault1-69/+93
[ Upstream commit c53cb00f7983a5474f2d36967f84908b85af9159 ] This selftest currently runs half in the current namespace and half in a netns of its own. Therefore, the test can fail if the current namespace is already configured with incompatible parameters (for example if it already has a veth0 interface). Adapt the script to put both ends of the veth pair in their own netns. Now veth0 is created in NS0 instead of the current namespace, while veth1 is set up in NS1 (instead of the 'testing' netns). The user visible netns names are randomised to minimise the risk of conflicts with already existing namespaces. The cleanup() function doesn't need to remove the virtual interface anymore: deleting NS0 and NS1 automatically removes the virtual interfaces they contained. We can remove $ns, which was only used to run ip commands in the 'testing' netns (let's use the builtin "-netns" option instead). However, we still need a similar functionality as ping and tcpdump now need to run in NS0. So we now have $RUN_NS0 for that. Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests/net: l2_tos_ttl_inherit.sh: Set IPv6 addresses with "nodad".Guillaume Nault1-4/+4
[ Upstream commit e59370b2e96eb8e7e057a2a16e999ff385a3f2fb ] The ping command can run before DAD completes. In that case, ping may fail and break the selftest. We don't need DAD here since we're working on isolated device pairs. Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: fix the O_* fcntl/open macro definitions for riscvWilly Tarreau1-7/+7
[ Upstream commit 00b18da4089330196906b9fe075c581c17eb726c ] When RISCV port was imported in 5.2, the O_* macros were taken with their octal value and written as-is in hex, resulting in the getdents64() to fail in nolibc-test. Fixes: 582e84f7b779 ("tool headers nolibc: add RISCV support") #5.2 Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18tools/nolibc: restore mips branch ordering in the _start blockWilly Tarreau1-0/+2
[ Upstream commit 184177c3d6e023da934761e198c281344d7dd65b ] Depending on the compiler used and the optimization options, the sbrk() test was crashing, both on real hardware (mips-24kc) and in qemu. One such example is kernel.org toolchain in version 11.3 optimizing at -Os. Inspecting the sys_brk() call shows the following code: 0040047c <sys_brk>: 40047c: 24020fcd li v0,4045 400480: 27bdffe0 addiu sp,sp,-32 400484: 0000000c syscall 400488: 27bd0020 addiu sp,sp,32 40048c: 10e00001 beqz a3,400494 <sys_brk+0x18> 400490: 00021023 negu v0,v0 400494: 03e00008 jr ra It is obviously wrong, the "negu" instruction is placed in beqz's delayed slot, and worse, there's no nop nor instruction after the return, so the next function's first instruction (addiu sip,sip,-32) will also be executed as part of the delayed slot that follows the return. This is caused by the ".set noreorder" directive in the _start block, that applies to the whole program. The compiler emits code without the delayed slots and relies on the compiler to swap instructions when this option is not set. Removing the option would require to change the startup code in a way that wouldn't make it look like the resulting code, which would not be easy to debug. Instead let's just save the default ordering before changing it, and restore it at the end of the _start block. Now the code is correct: 0040047c <sys_brk>: 40047c: 24020fcd li v0,4045 400480: 27bdffe0 addiu sp,sp,-32 400484: 0000000c syscall 400488: 10e00002 beqz a3,400494 <sys_brk+0x18> 40048c: 27bd0020 addiu sp,sp,32 400490: 00021023 negu v0,v0 400494: 03e00008 jr ra 400498: 00000000 nop Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") #5.0 Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18af_unix: selftest: Fix the size of the parameter to connect()Mirsad Goran Todorovac1-1/+1
[ Upstream commit 7d6ceeb1875cc08dc3d1e558e191434d94840cd5 ] Adjust size parameter in connect() to match the type of the parameter, to fix "No such file or directory" error in selftests/net/af_unix/ test_oob_unix.c:127. The existing code happens to work provided that the autogenerated pathname is shorter than sizeof (struct sockaddr), which is why it hasn't been noticed earlier. Visible from the trace excerpt: bind(3, {sa_family=AF_UNIX, sun_path="unix_oob_453059"}, 110) = 0 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa6a6577a10) = 453060 [pid <child>] connect(6, {sa_family=AF_UNIX, sun_path="unix_oob_45305"}, 16) = -1 ENOENT (No such file or directory) BUG: The filename is trimmed to sizeof (struct sockaddr). Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Florian Westphal <fw@strlen.de> Reviewed-by: Florian Westphal <fw@strlen.de> Fixes: 314001f0bf92 ("af_unix: Add OOB support") Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18selftests: netfilter: fix transaction test script timeout handlingFlorian Westphal2-7/+10
commit c273289fac370b6488757236cd62cc2cf04830b7 upstream. The kselftest framework uses a default timeout of 45 seconds for all test scripts. Increase the timeout to two minutes for the netfilter tests, this should hopefully be enough, Make sure that, should the script be canceled, the net namespace and the spawned ping instances are removed. Fixes: 25d8bcedbf43 ("selftests: add script to stress-test nft packet path vs. control plane") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18mm: Always release pages to the buddy allocator in memblock_free_late().Aaron Thompson1-0/+4
commit 115d9d77bb0f9152c60b6e8646369fa7f6167593 upstream. If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() only releases pages to the buddy allocator if they are not in the deferred range. This is correct for free pages (as defined by for_each_free_mem_pfn_range_in_zone()) because free pages in the deferred range will be initialized and released as part of the deferred init process. memblock_free_pages() is called by memblock_free_late(), which is used to free reserved ranges after memblock_free_all() has run. All pages in reserved ranges have been initialized at that point, and accordingly, those pages are not touched by the deferred init process. This means that currently, if the pages that memblock_free_late() intends to release are in the deferred range, they will never be released to the buddy allocator. They will forever be reserved. In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), which is also correct for free pages but is not correct for reserved pages. KMSAN metadata for reserved pages is initialized by kmsan_init_shadow(), which runs shortly before memblock_free_all(). For both of these reasons, memblock_free_pages() should only be called for free pages, and memblock_free_late() should call __free_pages_core() directly instead. One case where this issue can occur in the wild is EFI boot on x86_64. The x86 EFI code reserves all EFI boot services memory ranges via memblock_reserve() and frees them later via memblock_free_late() (efi_reserve_boot_services() and efi_free_boot_services(), respectively). If any of those ranges happens to fall within the deferred init range, the pages will not be released and that memory will be unavailable. For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI: v6.2-rc2: # grep -E 'Node|spanned|present|managed' /proc/zoneinfo Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 178867 v6.2-rc2 + patch: # grep -E 'Node|spanned|present|managed' /proc/zoneinfo Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 222816 # +43,949 pages Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Signed-off-by: Aaron Thompson <dev@aaront.org> Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18perf auxtrace: Fix address filter duplicate symbol selectionAdrian Hunter1-1/+1
commit cf129830ee820f7fc90b98df193cd49d49344d09 upstream. When a match has been made to the nth duplicate symbol, return success not error. Example: Before: $ cat file.c cat: file.c: No such file or directory $ cat file1.c #include <stdio.h> static void func(void) { printf("First func\n"); } void other(void); int main() { func(); other(); return 0; } $ cat file2.c #include <stdio.h> static void func(void) { printf("Second func\n"); } void other(void) { func(); } $ gcc -Wall -Wextra -o test file1.c file2.c $ perf record -e intel_pt//u --filter 'filter func @ ./test' -- ./test Multiple symbols with name 'func' #1 0x1149 l func which is near main #2 0x1179 l func which is near other Disambiguate symbol name by inserting #n after the name e.g. func #2 Or select a global symbol by inserting #0 or #g or #G Failed to parse address filter: 'filter func @ ./test' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test Failed to parse address filter: 'filter func #2 @ ./test' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. After: $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test First func Second func [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data ] $ perf script --itrace=b -Ftime,flags,ip,sym,addr --ns 1231062.526977619: tr strt 0 [unknown] => 558495708179 func 1231062.526977619: tr end call 558495708188 func => 558495708050 _init 1231062.526979286: tr strt 0 [unknown] => 55849570818d func 1231062.526979286: tr end return 55849570818f func => 55849570819d other Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Reported-by: Dmitrii Dolgov <9erthalion6@gmail.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230110185659.15979-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-14selftests/vm/pkeys: Add a regression test for setting PKRU through ptraceKyle Huey2-2/+141
commit 6ea25770b043c7997ab21d1ce95ba5de4d3d85d9 upstream. This tests PTRACE_SETREGSET with NT_X86_XSTATE modifying PKRU directly and removing the PKRU bit from XSTATE_BV. Signed-off-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20221115230932.7126-7-khuey%40kylehuey.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-14parisc: Align parisc MADV_XXX constants with all other architecturesHelge Deller2-18/+6
commit 71bdea6f798b425bc0003780b13e3fdecb16a010 upstream. Adjust some MADV_XXX constants to be in sync what their values are on all other platforms. There is currently no reason to have an own numbering on parisc, but it requires workarounds in many userspace sources (e.g. glibc, qemu, ...) - which are often forgotten and thus introduce bugs and different behaviour on parisc. A wrapper avoids an ABI breakage for existing userspace applications by translating any old values to the new ones, so this change allows us to move over all programs to the new ABI over time. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-12perf stat: Fix handling of --for-each-cgroup with --bpf-counters to match ↵Namhyung Kim1-5/+18
non BPF mode [ Upstream commit 54b353a20c7e8be98414754f5aff98c8a68fcc1f ] The --for-each-cgroup can have the same cgroup multiple times, but this confuses BPF counters (since they have the same cgroup id), making only the last cgroup events to be counted. Let's check the cgroup name before adding a new entry to the cgroups list. Before: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': <not counted> msec cpu-clock / <not counted> context-switches / <not counted> cpu-migrations / <not counted> page-faults / <not counted> cycles / <not counted> instructions / <not counted> branches / <not counted> branch-misses / 8,016.04 msec cpu-clock / # 7.998 CPUs utilized 6,152 context-switches / # 767.461 /sec 250 cpu-migrations / # 31.187 /sec 442 page-faults / # 55.139 /sec 613,111,487 cycles / # 0.076 GHz 280,599,604 instructions / # 0.46 insn per cycle 57,692,724 branches / # 7.197 M/sec 3,385,168 branch-misses / # 5.87% of all branches 1.002220125 seconds time elapsed After it becomes similar to the non-BPF mode: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': 8,013.38 msec cpu-clock / # 7.998 CPUs utilized 6,859 context-switches / # 855.944 /sec 334 cpu-migrations / # 41.680 /sec 345 page-faults / # 43.053 /sec 782,326,119 cycles / # 0.098 GHz 471,645,724 instructions / # 0.60 insn per cycle 94,963,430 branches / # 11.851 M/sec 3,685,511 branch-misses / # 3.88% of all branches 1.001864539 seconds time elapsed Committer notes: As a reminder, to test with BPF counters one has to use BUILD_BPF_SKEL=1 in the make command line and have clang/llvm installed when building perf, otherwise the --bpf-counters option will not be available: # perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Error: unknown option `bpf-counters' Usage: perf stat [<options>] [<command>] -a, --all-cpus system-wide collection from all CPUs <SNIP> # Fixes: bb1c15b60b981d10 ("perf stat: Support regex pattern in --for-each-cgroup") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: bpf@vger.kernel.org Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20230104064402.1551516-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12perf stat: Fix handling of unsupported cgroup events when using BPF countersNamhyung Kim1-11/+3
[ Upstream commit 2d656b0f81b22101db0447f890e39fdd736b745e ] When --for-each-cgroup option is used, it fails when any of events is not supported and exits immediately. This is not how 'perf stat' handles unsupported events. Let's ignore the failure and proceed with others so that the output is similar to when BPF counters are not used: Before: $ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1 Failed to open first cgroup events $ After it shows output similat to when --bpf-counters isn't specified: $ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1 Performance counter stats for 'system wide': <not supported> L1-icache-loads system.slice 29,892,418 L1-dcache-loads system.slice <not supported> L1-icache-loads user.slice 52,497,220 L1-dcache-loads user.slice $ Fixes: 944138f048f7d759 ("perf stat: Enable BPF counter with --for-each-cgroup") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20230104064402.1551516-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12perf lock contention: Fix core dump related to not finding the ↵Thomas Richter1-0/+2
"__sched_text_end" symbol on s/390 [ Upstream commit d8d85ce86dc82de4f88b821a78f533b9d5b22a45 ] The test case perf lock contention dumps core on s390. Run the following commands: # ./perf lock record -- ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 2.799 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ] # # ./perf lock contention Segmentation fault (core dumped) # The function call stack is lengthy, here are the top 5 functions: # gdb ./perf core.24048 GNU gdb (GDB) Fedora Linux 12.1-6.fc37 Core was generated by `./perf lock contention'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start); (gdb) where #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 #1 0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957 #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586 #3 0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004 #4 0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254 #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464 ..... The issue is in function machine__is_lock_function() in file ./util/machine.c lines 3355: /* should not fail from here */ sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap); machine->sched.text_end = kmap->unmap_ip(kmap, sym->start) On s390 the symbol __sched_text_end is *NOT* in the symbol list and the resulting pointer sym is set to NULL. The sym->start is then a NULL pointer access and generates the core dump. The reason why __sched_text_end is not in the symbol list on s390 is simple: When the symbol list is created at perf start up with function calls dso__load +--> dso__load_vmlinux_path +--> dso__load_vmlinux +--> dso__load_sym +--> dso__load_sym_internal (reads kernel symbols) +--> symbols__fixup_end +--> symbols__fixup_duplicate The issue is in function symbols__fixup_duplicate(). It deletes all symbols with have the same address. On s390: # nm -g ~/linux/vmlinux| fgrep c68390 0000000000c68390 T __cpuidle_text_start 0000000000c68390 T __sched_text_end # two symbols have identical addresses and __sched_text_end is considered duplicate (in ascending sort order) and removed from the symbol list. Therefore it is missing and an invalid pointer reference occurs. The code checks for symbol __sched_text_start and when it exists assumes symbol __sched_text_end is also in the symbol table. However this is not the case on s390. Same situation exists for symbol __lock_text_start: 0000000000c68770 T __cpuidle_text_end 0000000000c68770 T __lock_text_start This symbol is also removed from the symbol table but used in function machine__is_lock_function(). To fix this and keep duplicate symbols in the symbol table, set symbol_conf.allow_aliases to true. This prevents the removal of duplicate symbols in function symbols__fixup_duplicate(). Output After: # ./perf lock contention contended total wait max wait avg wait type caller 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8 # Fixes: 0d2997f750d1de39 ("perf lock: Look up callchain for the contended locks") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12perf tools: Fix resources leak in perf_data__open_dir()Miaoqian Lin1-0/+2
[ Upstream commit 0a6564ebd953c4590663c9a3c99a3ea9920ade6f ] In perf_data__open_dir(), opendir() opens the directory stream. Add missing closedir() to release it after use. Fixes: eb6176709b235b96 ("perf data: Add perf_data__open_dir_data function") Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221229090903.1402395-1-linmq006@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>