Age | Commit message (Collapse) | Author | Files | Lines |
|
comprehensive
[ Upstream commit bbb253b206b9c417928a6c827d038e457f3012e9 ]
We have two IS1 filters of the OCELOT_VCAP_KEY_ANY key type (the one with
"action vlan pop" and the one with "action vlan modify") and one of the
OCELOT_VCAP_KEY_IPV4 key type (the one with "action skbedit priority").
But we have no IS1 filter with the OCELOT_VCAP_KEY_ETYPE key type, and
there was an uncaught breakage there.
To increase test coverage, convert one of the OCELOT_VCAP_KEY_ANY
filters to OCELOT_VCAP_KEY_ETYPE, by making the filter also match on the
MAC SA of the traffic sent by mausezahn, $h1_mac.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230205192409.1796428-2-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 50aa870ba2f7735f556e52d15f61cd0f359c4c0b ]
Placing a declaration of evt_reset is pedantically invalid
according to the C standard. While GCC does not really care
and only warns with -Wpedantic, clang ignores the declaration
altogether with an error:
x86_64/xen_shinfo_test.c:965:2: error: expected expression
struct kvm_xen_hvm_attr evt_reset = {
^
x86_64/xen_shinfo_test.c:969:38: error: use of undeclared identifier evt_reset
vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &evt_reset);
^
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Reported-by: Sean Christopherson <seanjc@google.com>
Fixes: a79b53aaaab5 ("KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET", 2022-12-28)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a79b53aaaab53de017517bf9579b6106397a523c ]
While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running,
if that happened it could cause a deadlock. This is due to
kvm_xen_eventfd_reset() doing a synchronize_srcu() inside
a kvm->lock critical section.
To avoid this, first collect all the evtchnfd objects in an
array and free all of them once the kvm->lock critical section
is over and th SRCU grace period has expired.
Reported-by: Michal Luczaj <mhal@rbox.co>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
memblock_free_late()."
commit 647037adcad00f2bab8828d3d41cd0553d41f3bd upstream.
This reverts commit 115d9d77bb0f9152c60b6e8646369fa7f6167593.
The pages being freed by memblock_free_late() have already been
initialized, but if they are in the deferred init range,
__free_one_page() might access nearby uninitialized pages when trying to
coalesce buddies. This can, for example, trigger this BUG:
BUG: unable to handle page fault for address: ffffe964c02580c8
RIP: 0010:__list_del_entry_valid+0x3f/0x70
<TASK>
__free_one_page+0x139/0x410
__free_pages_ok+0x21d/0x450
memblock_free_late+0x8c/0xb9
efi_free_boot_services+0x16b/0x25c
efi_enter_virtual_mode+0x403/0x446
start_kernel+0x678/0x714
secondary_startup_64_no_verify+0xd2/0xdb
</TASK>
A proper fix will be more involved so revert this change for the time
being.
Fixes: 115d9d77bb0f ("mm: Always release pages to the buddy allocator in memblock_free_late().")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/20230207082151.1303-1-dev@aaront.org
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The commit 4656d72c1efa ("selftests: mptcp: userspace: validate v4-v6 subflows mix")
has been backported to v6.1.8 without any conflicts. But it looks like
it was depending on a previous one:
commit 1cc94ac1af4b ("selftests: mptcp: make evts global in userspace_pm")
Without it, the test fails with:
./userspace_pm.sh: line 788: : No such file or directory
# ADD_ADDR4 id:14 10.0.2.1 (ns1) => ns2, reuse port [FAIL]
sed: can't read : No such file or directory
This dependence refactors the way the monitoring files are being
created: only once for all the different sub-tests instead of per
sub-test.
It is probably better to avoid backporting the refactoring. That is why
the new sub-test has been adapted to work using the previous way that is
still in place here in v6.1: the monitoring is started at the beginning
of each sub-test and the created file is removed at the end.
Fixes: f59549814a64 ("selftests: mptcp: userspace: validate v4-v6 subflows mix")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a6efc42a86c0c87cfe2f1c3d1f09a4c9b13ba890 ]
"tcpdump" is used to capture traffic in these tests while using a random,
temporary and not suffixed file for it. This can interfere with apparmor
configuration where the tool is only allowed to read from files with
'known' extensions.
The MINE type application/vnd.tcpdump.pcap was registered with IANA for
pcap files and .pcap is the extension that is both most common but also
aligned with standard apparmor configurations. See TCPDUMP(8) for more
details.
This improves compatibility with standard apparmor configurations by
using ".pcap" as the file extension for the tests' temporary files.
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3f7b75abf41cc4143aa295f62acbb060a012868d ]
Fix the build caused by missing kmsan_handle_dma() and is_power_of_2() that
are used in drivers/virtio/virtio_ring.c.
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Message-Id: <20230110034310.779744-1-mie@igel.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b9fa9bc839291020b362ab5392e5f18ba79657ac ]
A testcase to check that verifier.c:copy_register_state() preserves
register parentage chain and livness information.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230106142214.1040390-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 070d6dafacbaa9d1f2e4e3edc263853d194af15e upstream.
These 'endpoint' tests from 'mptcp_join.sh' selftest start a transfer in
the background and check the status during this transfer.
Once the expected events have been recorded, there is no reason to wait
for the data transfer to finish. It can be stopped earlier to reduce the
execution time by more than half.
For these tests, the exchanged data were not verified. Errors, if any,
were ignored but that's fine, plenty of other tests are looking at that.
It is then OK to mute stderr now that we are sure errors will be printed
(and still ignored) because the transfer is stopped before the end.
Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a635a8c3df66ab68dc088c08a4e9e955e22c0e64 upstream.
A test-case is frequently failing on some extremely slow VMs.
The mptcp transfer completes before the script is able to do
all the required PM manipulation.
Address the issue in the simplest possible way, making the
transfer even more slow.
Additionally dump more info in case of failures, to help debugging
similar problems in the future and init dump_stats var.
Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/323
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3a082086aa200852545cf15159213582c0c80eba ]
When set/restore sysctl value, we should quote the value as some keys
may have multi values, e.g. net.ipv4.ping_group_range
Fixes: f5ae57784ba8 ("selftests: forwarding: lib: Add sysctl_set(), sysctl_restore()")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20230208032110.879205-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b963d9d5b9437a6b99504987310f98537c9e77d4 ]
iproute2 does not recognize the "group6" and "remote6" keywords. Fix by
using "group" and "remote" instead.
Before:
# ./test_vxlan_vnifiltering.sh
[...]
Tests passed: 25
Tests failed: 2
After:
# ./test_vxlan_vnifiltering.sh
[...]
Tests passed: 27
Tests failed: 0
Fixes: 3edf5f66c12a ("selftests: add new tests for vxlan vnifiltering")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230207141819.256689-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit e5ae8803847b80fe9d744a3174abe2b7bfed222a upstream.
It was found that the check to see if a partition could use up all
the cpus from the parent cpuset in update_parent_subparts_cpumask()
was incorrect. As a result, it is possible to leave parent with no
effective cpu left even if there are tasks in the parent cpuset. This
can lead to system panic as reported in [1].
Fix this probem by updating the check to fail the enabling the partition
if parent's effective_cpus is a subset of the child's cpus_allowed.
Also record the error code when an error happens in update_prstate()
and add a test case where parent partition and child have the same cpu
list and parent has task. Enabling partition in the child will fail in
this case.
[1] https://www.spinics.net/lists/cgroups/msg36254.html
Fixes: f0af1bfc27b5 ("cgroup/cpuset: Relax constraints to partition & cpus changes")
Cc: stable@vger.kernel.org # v6.1
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
benchmarking
[ Upstream commit 329c9cd769c2e306957df031efff656c40922c76 ]
The test tool can check that the zerocopy number of completions value is
valid taking into consideration the number of datagram send calls. This can
catch the system into a state where the datagrams are still in the system
(for example in a qdisk, waiting for the network interface to return a
completion notification, etc).
This change adds a retry logic of computing the number of completions up to
a configurable (via CLI) timeout (default: 2 seconds).
Fixes: 79ebc3c26010 ("net/udpgso_bench_tx: options to exercise TX CMSG")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-4-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dafe93b9ee21028d625dce347118b82659652eff ]
"udpgro_bench.sh" invokes udpgso_bench_rx/udpgso_bench_tx programs
subsequently and while doing so, there is a chance that the rx one is not
ready to accept socket connections. This racing bug could fail the test
with at least one of the following:
./udpgso_bench_tx: connect: Connection refused
./udpgso_bench_tx: sendmsg: Connection refused
./udpgso_bench_tx: write: Connection refused
This change addresses this by making udpgro_bench.sh wait for the rx
program to be ready before firing off the tx one - up to a 10s timeout.
Fixes: 3a687bef148d ("selftests: udp gso benchmark")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-3-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit db9b47ee9f5f375ab0c5daeb20321c75b4fa657d ]
Leaving unrecognized arguments buried in the output, can easily hide a
CLI/script typo. Avoid this by exiting when wrong arguments are provided to
the udpgso_bench test programs.
Fixes: 3a687bef148d ("selftests: udp gso benchmark")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-2-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c03c80e3a03ffb4f790901d60797e9810539d946 ]
This change fixes the following compiler warning:
/usr/include/x86_64-linux-gnu/bits/error.h:40:5: warning: ‘gso_size’ may
be used uninitialized [-Wmaybe-uninitialized]
40 | __error_noreturn (__status, __errnum, __format,
__va_arg_pack ());
|
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
udpgso_bench_rx.c: In function ‘main’:
udpgso_bench_rx.c:253:23: note: ‘gso_size’ was declared here
253 | int ret, len, gso_size, budget = 256;
Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-1-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 24b5308cf5ee9f52dd22f3af78a5b0cdc9d35e72 ]
When use tools/testing/selftests/kselftest_install.sh to make the
kselftest-list.txt under tools/testing/selftests/kselftest_install.
Then use tools/testing/selftests/kselftest_install/run_kselftest.sh to run
all the kselftests in kselftest-list.txt, it will be blocked by case
"filesystems/fat: run_fat_tests.sh" with "Warning: file run_fat_tests.sh
is not executable", so grant executable permission to run_fat_tests.sh to
fix this issue.
Link: https://lkml.kernel.org/r/dfdbba6df8a1ab34bb1e81cd8bd7ca3f9ed5c369.1673424747.git.pengfei.xu@intel.com
Fixes: dd7c9be330d8 ("selftests/filesystems: add a vfat RENAME_EXCHANGE test")
Signed-off-by: Pengfei Xu <pengfei.xu@intel.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9fdaca2c1e157dc0a3c0faecf3a6a68e7d8d0c7b ]
We are missing a ) when we attempt to complain about not having enough
configuration for clang, resulting in the rather inscrutable error:
../lib.mk:23: *** unterminated call to function 'error': missing ')'. Stop.
Add the required ) so we print the message we were trying to print.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 677d85e1a1ee69fa05ccea83847309484be3781c ]
Following line should listen for a rising edge and exit after the first
one since '-c 1' is provided.
# gpio-event-mon -n gpiochip1 -o 0 -r -c 1
It works with kernel 4.19 but it doesn't work with 5.10. In 5.10 the
above command doesn't exit after the first rising edge it keep listening
for an event forever. The '-c 1' is not taken into an account.
The problem is in commit 62757c32d5db ("tools: gpio: add multi-line
monitoring to gpio-event-mon").
Before this commit the iterator 'i' in monitor_device() is used for
counting of the events (loops). In the case of the above command (-c 1)
we should start from 0 and increment 'i' only ones and hit the 'break'
statement and exit the process. But after the above commit counting
doesn't start from 0, it start from 1 when we listen on one line.
It is because 'i' is used from one more purpose, counting of lines
(num_lines) and it isn't restore to 0 after following code
for (i = 0; i < num_lines; i++)
gpiotools_set_bit(&values.mask, i);
Restore the initial value of the iterator to 0 in order to allow counting
of loops to work for any cases.
Fixes: 62757c32d5db ("tools: gpio: add multi-line monitoring to gpio-event-mon")
Signed-off-by: Ivo Borisov Shopov <ivoshopov@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
[Bartosz: tweak the commit message]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
This reverts commit 2e5d5c4ae77dedc7eba0e496474ab93f05b25adf.
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1bfbe1f3e96720daf185f03d101f072d69753f88 ]
When building on ARM in thumb mode with gcc-11.3 at -O2 or -O3,
nolibc-test segfaults during the select() tests. It turns out that at
this level, gcc recognizes an opportunity for using memset() to zero
the fd_set, but it miscompiles it because it also recognizes a memset
pattern as well, and decides to call memset() from the memset() code:
000122bc <memset>:
122bc: b510 push {r4, lr}
122be: 0004 movs r4, r0
122c0: 2a00 cmp r2, #0
122c2: d003 beq.n 122cc <memset+0x10>
122c4: 23ff movs r3, #255 ; 0xff
122c6: 4019 ands r1, r3
122c8: f7ff fff8 bl 122bc <memset>
122cc: 0020 movs r0, r4
122ce: bd10 pop {r4, pc}
Simply placing an empty asm() statement inside the loop suffices to
avoid this.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 55abdd1f5e1e07418bf4a46c233a92f83cb5ae97 ]
After the nolibc includes were split to facilitate portability from
standard libcs, programs that include only what they need may miss
some symbols which are needed by libgcc. This is the case for raise()
which is needed by the divide by zero code in some architectures for
example.
Regardless, being able to include only the apparently needed files is
convenient.
Instead of trying to move all exported definitions to a single file,
since this can change over time, this patch takes another approach
consisting in including the nolibc header at the end of all standard
include files. This way their types and functions are already known
at the moment of inclusion, and including any single one of them is
sufficient to bring all the required ones.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 16f5cea74179b5795af7ce359971f5128d10f80e ]
The mode field has the type encoded as an value in a field, not as a bit
mask. Mask the mode with S_IFMT instead of each type to test. Otherwise,
false positives are possible: eg S_ISDIR will return true for block
devices because S_IFDIR = 0040000 and S_IFBLK = 0060000 since mode is
masked with S_IFDIR instead of S_IFMT. These macros now match the
similar definitions in tools/include/uapi/linux/stat.h.
Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit feaf75658783a919410f8c2039dbc24b6a29603d ]
The kernel uses unsigned long for the fd_set bitmap,
but nolibc use u32. This works fine on little endian
machines, but fails on big endian. Convert to unsigned
long to fix this.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 903848249a781d76d59561d51676c95b3a4d7162 ]
Avoid race between process wakeup and tpacket_v3 block timeout.
The test waits for cfg_timeout_msec for packets to arrive. Packets
arrive in tpacket_v3 rings, which pass packets ("frames") to the
process in batches ("blocks"). The sk waits for req3.tp_retire_blk_tov
msec to release a block.
Set the block timeout lower than the process waiting time, else
the process may find that no block has been released by the time it
scans the socket list. Convert to a ring of more than one, smaller,
blocks with shorter timeouts. Blocks must be page aligned, so >= 64KB.
Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230118151847.4124260-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 4656d72c1efa495a58ad6d8b073a60907073e4e6 upstream.
MPTCP protocol supports having subflows in both IPv4 and IPv6. In Linux,
it is possible to have that if the MPTCP socket has been created with
AF_INET6 family without the IPV6_V6ONLY option.
Here, a new IPv4 subflow is being added to the initial IPv6 connection,
then being removed using Netlink commands.
Cc: stable@vger.kernel.org # v5.19+
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5316a017d093f644675a56523bcf5787ba8f4fef upstream.
vsyscall detection code uses direct call to the beginning of
the vsyscall page:
asm ("call %P0" :: "i" (0xffffffffff600000))
It generates "call rel32" instruction but it is not relocated if binary
is PIE, so binary segfaults into random userspace address and vsyscall
page status is detected incorrectly.
Do more direct:
asm ("call *%rax")
which doesn't do need any relocaltions.
Mark g_vsyscall as volatile for a good measure, I didn't find instruction
setting it to 0. Now the code is obviously correct:
xor eax, eax
mov rdi, rbp
mov rsi, rbp
mov DWORD PTR [rip+0x2d15], eax # g_vsyscall = 0
mov rax, 0xffffffffff600000
call rax
mov DWORD PTR [rip+0x2d02], 1 # g_vsyscall = 1
mov eax, DWORD PTR ds:0xffffffffff600000
mov DWORD PTR [rip+0x2cf1], 2 # g_vsyscall = 2
mov edi, [rip+0x2ceb] # exit(g_vsyscall)
call exit
Note: fixed proc-empty-vm test oopses 5.19.0-28-generic kernel
but this is separate story.
Link: https://lkml.kernel.org/r/Y7h2xvzKLg36DSq8@p183
Fixes: 5bc73bb3451b9 ("proc: test how it holds up with mapping'less process")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 340726747336716350eb5a928b860a29db955f05 ]
Commit cf4694be2b2cf ("tools: Add atomic_test_and_set_bit()") changed
tools/arch/x86/include/asm/atomic.h to include <asm/asm.h>, which causes
'make -C tools/testing/memblock' to fail with:
In file included from ../../include/asm/atomic.h:6,
from ../../include/linux/atomic.h:5,
from ./linux/mmzone.h:5,
from ../../include/linux/mm.h:5,
from ../../include/linux/pfn.h:5,
from ./linux/memory_hotplug.h:6,
from ./linux/init.h:7,
from ./linux/memblock.h:11,
from tests/common.h:8,
from tests/basic_api.h:5,
from main.c:2:
../../include/asm/../../arch/x86/include/asm/atomic.h:11:10: fatal error: asm/asm.h: No such file or directory
11 | #include <asm/asm.h>
| ^~~~~~~~~~~
Create a symlink to asm/asm.h in the same manner as the existing one to
asm/cmpxchg.h.
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/010101857c402765-96e2dbc6-b82b-47e2-a437-4834dbe0b96b-000000@us-west-2.amazonses.com
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1573c6882018f69991aead951d09423ce978adac ]
This cmsg_so_mark.sh test will hang on non-amd64 systems because of the
infinity loop for argument parsing in cmsg_sender.
Variable "o" in cs_parse_args() for taking getopt() should be an int,
otherwise it will be 255 when getopt() returns -1 on non-amd64 system
and thus causing infinity loop.
Link: https://lore.kernel.org/lkml/CA+G9fYsM2k7mrF7W4V_TrZ-qDauWM394=8yEJ=-t1oUg8_40YA@mail.gmail.com/t/
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c262f75cb6bb5a63828e72ce3b8fe808e5029479 ]
The virtio_device vqs_list spinlocks must be initialized before use to
prevent functions that manipulate the device virtualqueues, such as
vring_new_virtqueue(), from blocking indefinitely.
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Message-Id: <20221012062949.1526176-1-ricardo.canuelo@collabora.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cedebd74cf3883f0384af9ec26b4e6f8f1964dd4 ]
Verify that nullness information is not porpagated in the branches
of register to register JEQ and JNE operations if one of them is
PTR_TO_BTF_ID. Implement this in C level so we can use CO-RE.
Signed-off-by: Hao Sun <sunhao.th@gmail.com>
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221222024414.29539-2-sunhao.th@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
recent tracepoint restructuring
[ Upstream commit dce088ab0d51ae3b14fb2bd608e9c649aadfe5dc ]
Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of
tracepoints") adds the field "node" into the tracepoints 'kmalloc' and
'kmem_cache_alloc', so this patch modifies the event process function to
support the field "node".
If field "node" is detected by checking function evsel__field(), it
stats the cross allocation.
When the "node" value is NUMA_NO_NODE (-1), it means the memory can be
allocated from any memory node, in this case, we don't account it as a
cross allocation.
Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints")
Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20230108062400.250690-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b3719108ae60169eda5c941ca5e1be1faa371c57 ]
Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of
tracepoints") removed tracepoints 'kmalloc_node' and
'kmem_cache_alloc_node', we need to consider the tool should be backward
compatible.
If it detect the tracepoint "kmem:kmalloc_node", this patch enables the
legacy tracepoints, otherwise, it will ignore them.
Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints")
Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20230108062400.250690-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d891f2b724b39a2a41e3ad7b57110193993242ff ]
Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT.
In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL.
Fixes: d6a735ef3277c45f ("perf bpf_counter: Move common functions to bpf_counter.h")
Reported-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d68ff8ad3351b8fc8d6f14b9a4f5cc8ba3e8bd13 ]
Use 'set -e' and an exit handler to stop the script if a command fails
and ensure the test environment is cleaned up in any case. Also, handle
the case where the script is interrupted by SIGINT.
The only command that's expected to fail is 'wait $ping_pid', since
it's killed by the script. Handle this case with '|| true' to make it
play well with 'set -e'.
Finally, return the Kselftest SKIP code (4) when the script breaks
because of an environment problem or a command line failure. The 0 and
1 return codes should now reliably indicate that all tests have been
run (0: all tests run and passed, 1: all tests run but at least one
failed, 4: test script didn't run completely).
Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c53cb00f7983a5474f2d36967f84908b85af9159 ]
This selftest currently runs half in the current namespace and half in
a netns of its own. Therefore, the test can fail if the current
namespace is already configured with incompatible parameters (for
example if it already has a veth0 interface).
Adapt the script to put both ends of the veth pair in their own netns.
Now veth0 is created in NS0 instead of the current namespace, while
veth1 is set up in NS1 (instead of the 'testing' netns).
The user visible netns names are randomised to minimise the risk of
conflicts with already existing namespaces. The cleanup() function
doesn't need to remove the virtual interface anymore: deleting NS0 and
NS1 automatically removes the virtual interfaces they contained.
We can remove $ns, which was only used to run ip commands in the
'testing' netns (let's use the builtin "-netns" option instead).
However, we still need a similar functionality as ping and tcpdump
now need to run in NS0. So we now have $RUN_NS0 for that.
Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e59370b2e96eb8e7e057a2a16e999ff385a3f2fb ]
The ping command can run before DAD completes. In that case, ping may
fail and break the selftest.
We don't need DAD here since we're working on isolated device pairs.
Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 00b18da4089330196906b9fe075c581c17eb726c ]
When RISCV port was imported in 5.2, the O_* macros were taken with
their octal value and written as-is in hex, resulting in the getdents64()
to fail in nolibc-test.
Fixes: 582e84f7b779 ("tool headers nolibc: add RISCV support") #5.2
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 184177c3d6e023da934761e198c281344d7dd65b ]
Depending on the compiler used and the optimization options, the sbrk()
test was crashing, both on real hardware (mips-24kc) and in qemu. One
such example is kernel.org toolchain in version 11.3 optimizing at -Os.
Inspecting the sys_brk() call shows the following code:
0040047c <sys_brk>:
40047c: 24020fcd li v0,4045
400480: 27bdffe0 addiu sp,sp,-32
400484: 0000000c syscall
400488: 27bd0020 addiu sp,sp,32
40048c: 10e00001 beqz a3,400494 <sys_brk+0x18>
400490: 00021023 negu v0,v0
400494: 03e00008 jr ra
It is obviously wrong, the "negu" instruction is placed in beqz's
delayed slot, and worse, there's no nop nor instruction after the
return, so the next function's first instruction (addiu sip,sip,-32)
will also be executed as part of the delayed slot that follows the
return.
This is caused by the ".set noreorder" directive in the _start block,
that applies to the whole program. The compiler emits code without the
delayed slots and relies on the compiler to swap instructions when this
option is not set. Removing the option would require to change the
startup code in a way that wouldn't make it look like the resulting
code, which would not be easy to debug. Instead let's just save the
default ordering before changing it, and restore it at the end of the
_start block. Now the code is correct:
0040047c <sys_brk>:
40047c: 24020fcd li v0,4045
400480: 27bdffe0 addiu sp,sp,-32
400484: 0000000c syscall
400488: 10e00002 beqz a3,400494 <sys_brk+0x18>
40048c: 27bd0020 addiu sp,sp,32
400490: 00021023 negu v0,v0
400494: 03e00008 jr ra
400498: 00000000 nop
Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") #5.0
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7d6ceeb1875cc08dc3d1e558e191434d94840cd5 ]
Adjust size parameter in connect() to match the type of the parameter, to
fix "No such file or directory" error in selftests/net/af_unix/
test_oob_unix.c:127.
The existing code happens to work provided that the autogenerated pathname
is shorter than sizeof (struct sockaddr), which is why it hasn't been
noticed earlier.
Visible from the trace excerpt:
bind(3, {sa_family=AF_UNIX, sun_path="unix_oob_453059"}, 110) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa6a6577a10) = 453060
[pid <child>] connect(6, {sa_family=AF_UNIX, sun_path="unix_oob_45305"}, 16) = -1 ENOENT (No such file or directory)
BUG: The filename is trimmed to sizeof (struct sockaddr).
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Cc: Florian Westphal <fw@strlen.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit c273289fac370b6488757236cd62cc2cf04830b7 upstream.
The kselftest framework uses a default timeout of 45 seconds for
all test scripts.
Increase the timeout to two minutes for the netfilter tests, this
should hopefully be enough,
Make sure that, should the script be canceled, the net namespace and
the spawned ping instances are removed.
Fixes: 25d8bcedbf43 ("selftests: add script to stress-test nft packet path vs. control plane")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 115d9d77bb0f9152c60b6e8646369fa7f6167593 upstream.
If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
only releases pages to the buddy allocator if they are not in the
deferred range. This is correct for free pages (as defined by
for_each_free_mem_pfn_range_in_zone()) because free pages in the
deferred range will be initialized and released as part of the deferred
init process. memblock_free_pages() is called by memblock_free_late(),
which is used to free reserved ranges after memblock_free_all() has
run. All pages in reserved ranges have been initialized at that point,
and accordingly, those pages are not touched by the deferred init
process. This means that currently, if the pages that
memblock_free_late() intends to release are in the deferred range, they
will never be released to the buddy allocator. They will forever be
reserved.
In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
which is also correct for free pages but is not correct for reserved
pages. KMSAN metadata for reserved pages is initialized by
kmsan_init_shadow(), which runs shortly before memblock_free_all().
For both of these reasons, memblock_free_pages() should only be called
for free pages, and memblock_free_late() should call __free_pages_core()
directly instead.
One case where this issue can occur in the wild is EFI boot on
x86_64. The x86 EFI code reserves all EFI boot services memory ranges
via memblock_reserve() and frees them later via memblock_free_late()
(efi_reserve_boot_services() and efi_free_boot_services(),
respectively). If any of those ranges happens to fall within the
deferred init range, the pages will not be released and that memory will
be unavailable.
For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
v6.2-rc2:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 178867
v6.2-rc2 + patch:
# grep -E 'Node|spanned|present|managed' /proc/zoneinfo
Node 0, zone DMA
spanned 4095
present 3999
managed 3840
Node 0, zone DMA32
spanned 246652
present 245868
managed 222816 # +43,949 pages
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cf129830ee820f7fc90b98df193cd49d49344d09 upstream.
When a match has been made to the nth duplicate symbol, return
success not error.
Example:
Before:
$ cat file.c
cat: file.c: No such file or directory
$ cat file1.c
#include <stdio.h>
static void func(void)
{
printf("First func\n");
}
void other(void);
int main()
{
func();
other();
return 0;
}
$ cat file2.c
#include <stdio.h>
static void func(void)
{
printf("Second func\n");
}
void other(void)
{
func();
}
$ gcc -Wall -Wextra -o test file1.c file2.c
$ perf record -e intel_pt//u --filter 'filter func @ ./test' -- ./test
Multiple symbols with name 'func'
#1 0x1149 l func
which is near main
#2 0x1179 l func
which is near other
Disambiguate symbol name by inserting #n after the name e.g. func #2
Or select a global symbol by inserting #0 or #g or #G
Failed to parse address filter: 'filter func @ ./test'
Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
Where multiple filters are separated by space or comma.
$ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test
Failed to parse address filter: 'filter func #2 @ ./test'
Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
Where multiple filters are separated by space or comma.
After:
$ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test
First func
Second func
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data ]
$ perf script --itrace=b -Ftime,flags,ip,sym,addr --ns
1231062.526977619: tr strt 0 [unknown] => 558495708179 func
1231062.526977619: tr end call 558495708188 func => 558495708050 _init
1231062.526979286: tr strt 0 [unknown] => 55849570818d func
1231062.526979286: tr end return 55849570818f func => 55849570819d other
Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters")
Reported-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230110185659.15979-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6ea25770b043c7997ab21d1ce95ba5de4d3d85d9 upstream.
This tests PTRACE_SETREGSET with NT_X86_XSTATE modifying PKRU directly and
removing the PKRU bit from XSTATE_BV.
Signed-off-by: Kyle Huey <me@kylehuey.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20221115230932.7126-7-khuey%40kylehuey.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 71bdea6f798b425bc0003780b13e3fdecb16a010 upstream.
Adjust some MADV_XXX constants to be in sync what their values are on
all other platforms. There is currently no reason to have an own
numbering on parisc, but it requires workarounds in many userspace
sources (e.g. glibc, qemu, ...) - which are often forgotten and thus
introduce bugs and different behaviour on parisc.
A wrapper avoids an ABI breakage for existing userspace applications by
translating any old values to the new ones, so this change allows us to
move over all programs to the new ABI over time.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
non BPF mode
[ Upstream commit 54b353a20c7e8be98414754f5aff98c8a68fcc1f ]
The --for-each-cgroup can have the same cgroup multiple times, but this
confuses BPF counters (since they have the same cgroup id), making only
the last cgroup events to be counted.
Let's check the cgroup name before adding a new entry to the cgroups
list.
Before:
$ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1
Performance counter stats for 'system wide':
<not counted> msec cpu-clock /
<not counted> context-switches /
<not counted> cpu-migrations /
<not counted> page-faults /
<not counted> cycles /
<not counted> instructions /
<not counted> branches /
<not counted> branch-misses /
8,016.04 msec cpu-clock / # 7.998 CPUs utilized
6,152 context-switches / # 767.461 /sec
250 cpu-migrations / # 31.187 /sec
442 page-faults / # 55.139 /sec
613,111,487 cycles / # 0.076 GHz
280,599,604 instructions / # 0.46 insn per cycle
57,692,724 branches / # 7.197 M/sec
3,385,168 branch-misses / # 5.87% of all branches
1.002220125 seconds time elapsed
After it becomes similar to the non-BPF mode:
$ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1
Performance counter stats for 'system wide':
8,013.38 msec cpu-clock / # 7.998 CPUs utilized
6,859 context-switches / # 855.944 /sec
334 cpu-migrations / # 41.680 /sec
345 page-faults / # 43.053 /sec
782,326,119 cycles / # 0.098 GHz
471,645,724 instructions / # 0.60 insn per cycle
94,963,430 branches / # 11.851 M/sec
3,685,511 branch-misses / # 3.88% of all branches
1.001864539 seconds time elapsed
Committer notes:
As a reminder, to test with BPF counters one has to use BUILD_BPF_SKEL=1
in the make command line and have clang/llvm installed when building
perf, otherwise the --bpf-counters option will not be available:
# perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1
Error: unknown option `bpf-counters'
Usage: perf stat [<options>] [<command>]
-a, --all-cpus system-wide collection from all CPUs
<SNIP>
#
Fixes: bb1c15b60b981d10 ("perf stat: Support regex pattern in --for-each-cgroup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20230104064402.1551516-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2d656b0f81b22101db0447f890e39fdd736b745e ]
When --for-each-cgroup option is used, it fails when any of events is
not supported and exits immediately. This is not how 'perf stat'
handles unsupported events.
Let's ignore the failure and proceed with others so that the output is
similar to when BPF counters are not used:
Before:
$ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1
Failed to open first cgroup events
$
After it shows output similat to when --bpf-counters isn't specified:
$ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1
Performance counter stats for 'system wide':
<not supported> L1-icache-loads system.slice
29,892,418 L1-dcache-loads system.slice
<not supported> L1-icache-loads user.slice
52,497,220 L1-dcache-loads user.slice
$
Fixes: 944138f048f7d759 ("perf stat: Enable BPF counter with --for-each-cgroup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20230104064402.1551516-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
"__sched_text_end" symbol on s/390
[ Upstream commit d8d85ce86dc82de4f88b821a78f533b9d5b22a45 ]
The test case perf lock contention dumps core on s390. Run the following
commands:
# ./perf lock record -- ./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 2.799 [sec]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
#
# ./perf lock contention
Segmentation fault (core dumped)
#
The function call stack is lengthy, here are the top 5 functions:
# gdb ./perf core.24048
GNU gdb (GDB) Fedora Linux 12.1-6.fc37
Core was generated by `./perf lock contention'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356
3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);
(gdb) where
#0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356
#1 0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957
#2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586
#3 0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004
#4 0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254
#5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
.....
The issue is in function machine__is_lock_function() in file
./util/machine.c lines 3355:
/* should not fail from here */
sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap);
machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)
On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
access and generates the core dump.
The reason why __sched_text_end is not in the symbol list on s390 is
simple:
When the symbol list is created at perf start up with function calls
dso__load
+--> dso__load_vmlinux_path
+--> dso__load_vmlinux
+--> dso__load_sym
+--> dso__load_sym_internal (reads kernel symbols)
+--> symbols__fixup_end
+--> symbols__fixup_duplicate
The issue is in function symbols__fixup_duplicate(). It deletes all
symbols with have the same address. On s390:
# nm -g ~/linux/vmlinux| fgrep c68390
0000000000c68390 T __cpuidle_text_start
0000000000c68390 T __sched_text_end
#
two symbols have identical addresses and __sched_text_end is considered
duplicate (in ascending sort order) and removed from the symbol list.
Therefore it is missing and an invalid pointer reference occurs. The
code checks for symbol __sched_text_start and when it exists assumes
symbol __sched_text_end is also in the symbol table. However this is not
the case on s390.
Same situation exists for symbol __lock_text_start:
0000000000c68770 T __cpuidle_text_end
0000000000c68770 T __lock_text_start
This symbol is also removed from the symbol table but used in function
machine__is_lock_function().
To fix this and keep duplicate symbols in the symbol table, set
symbol_conf.allow_aliases to true. This prevents the removal of
duplicate symbols in function symbols__fixup_duplicate().
Output After:
# ./perf lock contention
contended total wait max wait avg wait type caller
48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a
47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132
5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140
4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8
#
Fixes: 0d2997f750d1de39 ("perf lock: Look up callchain for the contended locks")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0a6564ebd953c4590663c9a3c99a3ea9920ade6f ]
In perf_data__open_dir(), opendir() opens the directory stream. Add
missing closedir() to release it after use.
Fixes: eb6176709b235b96 ("perf data: Add perf_data__open_dir_data function")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221229090903.1402395-1-linmq006@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|