Performance Tracing on Linux
============================

When you encounter a performance problem on Linux, use these instructions to gather detailed information about what was happening on the machine at the time.

# Required Tools #
- **perfcollect**: Bash script that automates data collection.
	- Available at <http://aka.ms/perfcollect>.
- **PerfView**: Windows-based performance tool that can also analyze trace files collected with Perfcollect.
	- Available at <http://aka.ms/perfview>.

# Preparing Your Machine #
Follow these steps to prepare your machine to collect a performance trace.

1. Download Perfcollect.

	> ```bash
	> curl -OL http://aka.ms/perfcollect
	> ```

2. Make the script executable.

	> ```bash
	> chmod +x perfcollect
	> ```

3. Install tracing prerequisites - these are the actual tracing libraries.  For details on prerequisites, see [below](#prerequisites).

	> ```bash
	> sudo ./perfcollect install
	> ```

# Collecting a Trace #
1. Have two shell windows available - one for controlling tracing, referred to as **[Trace]**, and one for running the application, referred to as **[App]**.
2. **[App]** Set up the application shell - this enables tracing configuration inside of CoreCLR.

	> ```bash 
	> export COMPlus_PerfMapEnabled=1
	> export COMPlus_EnableEventLog=1
	> ```

3. **[Trace]** Start collection. 

	> ```bash
	> sudo ./perfcollect collect sampleTrace
	> ```

	Expected Output:

	> ```bash
	> Collection started.  Press CTRL+C to stop.
	> ```

4. **[App]** Run the app - let it run as long as you need to in order to capture the performance problem.  Generally, you don't need very long.  As an example, for a CPU investigation, 5-10 seconds of the high CPU situation is usually enough.

	> ```bash
	> dotnet run
	> ```

5. **[Trace]** Stop collection - hit CTRL+C.

	> ```bash
	> ^C
	> ...STOPPED.
	>
	> Starting post-processing. This may take some time.
	>
	> Generating native image symbol files
	> ...SKIPPED
	> Saving native symbols
	> ...FINISHED
	> Exporting perf.data file
	> ...FINISHED
	> Compressing trace files
	> ...FINISHED
	> Cleaning up artifacts
	> ...FINISHED
	>
	> Trace saved to sampleTrace.trace.zip
	> ```

	The compressed trace file is now stored in the current working directory.
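
The **[App]** shell setup from step 2 can also be scoped to a single command instead of the whole shell.  The following is a minimal sketch under that assumption, not part of perfcollect: `run_with_tracing_env` and `trace_archive_name` are hypothetical helper names introduced here for illustration.  The sketch relies only on the two environment variables from step 2 and the `<name>.trace.zip` naming shown in the step 5 output.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of perfcollect), for illustration only.

# Run a command with the CoreCLR tracing variables from step 2 set
# only for that command, leaving the calling shell's environment untouched.
run_with_tracing_env() {
    env COMPlus_PerfMapEnabled=1 COMPlus_EnableEventLog=1 "$@"
}

# Print the archive name that the step 5 output shows for a given
# trace name (for example, sampleTrace -> sampleTrace.trace.zip).
trace_archive_name() {
    printf '%s.trace.zip\n' "$1"
}
```

For example, `run_with_tracing_env dotnet run` in the **[App]** shell would stand in for the two `export` lines, and `ls -l "$(trace_archive_name sampleTrace)"` in the **[Trace]** shell checks that the archive was written.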

# Resolving Framework Symbols #
Framework symbols must be generated explicitly at the time the trace is collected.  They differ from app-level symbols because the framework is precompiled, while apps are just-in-time compiled.

The easiest way to generate framework symbols is to have perfcollect do it for you automatically when the trace is collected.  For this to work, crossgen must be present on the collection machine and located next to libcoreclr.so.  You can put crossgen in place manually using the following steps:

1. Download the CoreCLR nuget package.  A simple way to do this is to create a self-contained version of your application using the dotnet CLI:
	> ```bash
	> dotnet publish --self-contained -r linux-x64
	> ```	

	This creates a `bin/*/netcoreapp2.0/linux-x64/publish` directory that contains all of the files required to run your app, including the .NET runtime and framework.  As a side effect, the dotnet CLI downloads and extracts the CoreCLR nuget package.  You should be able to find crossgen at:

	> ```bash
	> ~/.nuget/packages/runtime.linux-x64.microsoft.netcore.app/2.0.0/tools/crossgen
	> ``` 

2. Copy crossgen next to libcoreclr.so.  If you run your application out of the publish directory, copy it into the publish directory created in step 1.  If you run your application using the dotnet CLI, you likely need to copy it to `/usr/share/dotnet/shared/Microsoft.NETCore.App/<Version>`, where `<Version>` is the version number of CoreCLR.  This should match the version number in the path to crossgen from step 1.

Now, traces containing apps that were run on the updated runtime should contain framework-level symbols.  To view the trace with framework-level symbols, you will need to use PerfView.  Details on using PerfView to view a trace are available below in this document.
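
The copy in step 2 can be guarded so that crossgen only lands in a directory that really contains libcoreclr.so.  The following is a minimal sketch, not part of perfcollect: `place_crossgen` is a hypothetical helper name, and both paths are supplied by the caller rather than discovered automatically.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of perfcollect), for illustration only.

# Copy crossgen into a runtime directory, but only if that directory
# contains libcoreclr.so.
#   $1: path to the crossgen binary
#   $2: directory expected to contain libcoreclr.so
place_crossgen() {
    local crossgen_path
    local runtime_dir
    crossgen_path="$1"
    runtime_dir="$2"

    if [ ! -f "$crossgen_path" ]; then
        echo "crossgen not found at: $crossgen_path" >&2
        return 1
    fi
    if [ ! -f "$runtime_dir/libcoreclr.so" ]; then
        echo "libcoreclr.so not found in: $runtime_dir" >&2
        return 1
    fi

    # Copy the binary and make sure it stays executable.
    cp "$crossgen_path" "$runtime_dir/crossgen" && chmod +x "$runtime_dir/crossgen"
}
```

Call it with the crossgen path from step 1 and the directory chosen in step 2.  Copying into a system-wide dotnet installation will typically need root privileges.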

# Collecting in a Docker Container #
Perfcollect can be used to collect data for an application running inside a Docker container.  The main thing to know is that collecting a trace requires elevated privileges because the [default seccomp profile](https://docs.docker.com/engine/security/seccomp/) blocks a required syscall - `perf_event_open`.

To use the instructions in this document to collect a trace, spawn a new privileged shell inside the container.

>```bash
>docker exec -it --privileged <container_name> /bin/bash
>```

Even though the application hosted in the container isn't privileged, this new shell is, and it will have all the privileges it needs to collect trace data.  Now, simply follow the instructions in [Collecting a Trace](#collecting-a-trace) using the privileged shell.

If you want to try tracing in a container, we've written a [demo Dockerfile](https://raw.githubusercontent.com/dotnet/corefx-tools/master/src/performance/perfcollect/docker-demo/Dockerfile) that installs all of the performance tracing pre-requisites, sets the environment up for tracing, and starts a sample CPU-bound app.

# Filtering #
Filtering is implemented on Windows through the latest mechanisms provided with the [EventSource](https://msdn.microsoft.com/en-us/library/system.diagnostics.tracing.eventsource(v=vs.110).aspx) class. 

On Linux, those mechanisms are not available yet.  Instead, two environment variables that exist only on Linux provide some basic filtering.

* COMPlus_EventSourceFilter – filter event sources by name
* COMPlus_EventNameFilter – filter events by name

When one or both of these variables are set, only events whose names contain the string you specify as a substring are collected.  Matching is case-insensitive.
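
The matching rule described above (substring match, ignoring case) can be sketched as a small shell function.  This is an illustration of the rule only, not the CoreCLR implementation: `name_matches_filter` is a hypothetical name, and treating an empty filter as "match everything" is an assumption made here.

```bash
#!/usr/bin/env bash
# Illustration only: mimics the matching rule described above.
# This is not the CoreCLR implementation.

# Succeed (exit status 0) if the filter occurs in the name, ignoring case.
# An empty filter matches every name (an assumption of this sketch).
#   $1: event or event source name
#   $2: filter string
name_matches_filter() {
    local name
    local filter
    # Lowercase both sides so the comparison ignores case.
    name="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    filter="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
    case "$name" in
        *"$filter"*) return 0 ;;  # quoted, so the filter is matched literally
        *) return 1 ;;
    esac
}
```

With a hypothetical event source named `MyCompany-Orders`, the filters `orders` and `MYCOMPANY` would both match under this rule, while `billing` would not.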

# Viewing a Trace #
Traces are best viewed using PerfView on Windows.  Note that we're currently looking into porting the analysis pieces of PerfView to Linux so that the entire investigation can occur on Linux.

## Open the Trace File ##
1. Copy the trace.zip file from Linux to a Windows machine.
2. Download PerfView from <http://aka.ms/perfview>.
3. Run PerfView.exe, passing it the path to the trace file.

	> ```cmd
	> PerfView.exe <path to trace.zip file>
	> ```

## Select a View ##
PerfView will display the list of views that are supported based on the data contained in the trace file.

- For CPU investigations, choose **CPU stacks**.
- For very detailed GC information, choose **GCStats**.
- For per-process/module/method JIT information, choose **JITStats**.
- If there is not a view for the information you need, you can try looking for the events in the raw events view.  Choose **Events**. 

For more details on how to interpret views in PerfView, see help links in the view itself, or from the main window in PerfView choose **Help->Users Guide**.

# Extra Information #
This information is not strictly required to collect and analyze traces, but is provided for those who are interested.

## Prerequisites ##
Perfcollect will alert users to any prerequisites that are not installed and offer to install them.  Prerequisites can be installed automatically by running:

>```bash
>sudo ./perfcollect install
>```

The current prerequisites are:

1. perf: Also known as perf_event, the Linux Performance Events sub-system and companion user-mode collection/viewer application.  perf is part of the Linux kernel source, but is not usually installed by default.
2. LTTng: Stands for "Linux Tracing Toolkit Next Generation", and is used to capture event data emitted at runtime by CoreCLR.  This data is then used to analyze the behavior of various runtime components such as the GC, JIT and thread pool.
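
To see whether the prerequisites are already present before running the installer, you can check for their command-line tools on `PATH`.  The following is a minimal sketch, not part of perfcollect: `missing_commands` is a hypothetical helper, and it assumes the tools are exposed under the command names `perf` and `lttng` on your distribution.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of perfcollect), for illustration only.

# Print each given command name that cannot be found on PATH, one per line.
# Prints nothing when every command is available.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}
```

Running `missing_commands perf lttng` prints the name of each tool that still needs to be installed.  It only checks for the commands themselves, so it is a quick sanity check rather than a replacement for `perfcollect install`.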