Performance Tracing on Linux
============================

Use these instructions to gather detailed information about what was happening on a Linux machine at the time of a performance problem.

# Required Tools #
- **perfcollect**: Bash script that automates data collection.
	- Available at <http://aka.ms/perfcollect>.
- **PerfView**: Windows-based performance tool that can also analyze trace files collected with Perfcollect.
	- Available at <http://aka.ms/perfview>.

# Preparing Your Machine #
Follow these steps to prepare your machine to collect a performance trace.

1. Download Perfcollect.

	> ```bash
	> curl -OL http://aka.ms/perfcollect
	> ```

2. Make the script executable.

	> ```bash
	> chmod +x perfcollect
	> ```

3. Install tracing prerequisites - these are the actual tracing libraries.  For details on prerequisites, see [below](#prerequisites).

	> ```bash
	> sudo ./perfcollect install
	> ```

# Collecting a Trace #
1. Have two shell windows available - one for controlling tracing, referred to as **[Trace]**, and one for running the application, referred to as **[App]**.
2. **[App]** Set up the application shell - this enables tracing configuration inside of CoreCLR.

	> ```bash 
	> export COMPlus_PerfMapEnabled=1
	> export COMPlus_EnableEventLog=1
	> ```

3. **[Trace]** Start collection. 

	> ```bash
	> sudo ./perfcollect collect sampleTrace
	> ```

	Expected Output:

	> ```bash
	> Collection started.  Press CTRL+C to stop.
	> ```

4. **[App]** Run the app - let it run as long as you need to in order to capture the performance problem.  Generally, you don't need very long.  As an example, for a CPU investigation, 5-10 seconds of the high CPU situation is usually enough.

	> ```bash
	> dotnet run
	> ```

5. **[Trace]** Stop collection - hit CTRL+C.

	> ```bash
	> ^C
	> ...STOPPED.
	>
	> Starting post-processing. This may take some time.
	>
	> Generating native image symbol files
	> ...SKIPPED
	> Saving native symbols
	> ...FINISHED
	> Exporting perf.data file
	> ...FINISHED
	> Compressing trace files
	> ...FINISHED
	> Cleaning up artifacts
	> ...FINISHED
	>
	> Trace saved to sampleTrace.trace.zip
	> ```

	The compressed trace file is now stored in the current working directory.
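
Step 2 exports the tracing variables for the whole **[App]** shell session.  As a minimal sketch, the same two variables can instead be scoped to a single command.  The helper name `run_traced` is hypothetical and is not part of perfcollect.

```bash
# Hypothetical helper: run one command with the CoreCLR tracing variables
# from step 2 set only for that command, leaving the shell itself unchanged.
run_traced() {
    env COMPlus_PerfMapEnabled=1 COMPlus_EnableEventLog=1 "$@"
}
```

With this defined, step 4 would become `run_traced dotnet run`.  Because the helper uses `env`, it only works for external programs, not for shell functions or builtins.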

# Resolving Framework Symbols #
Framework symbols must be generated manually at the time the trace is collected.  They differ from app-level symbols because the framework is precompiled, while apps are just-in-time compiled.  For code like the framework that was precompiled to native
code, you need a special tool called crossgen, which knows how to generate the mapping from the native code to the names of the
methods.

Perfcollect can handle most of the details for you, but it needs the crossgen tool, which by default is NOT part of
the standard .NET distribution.  If crossgen is missing, perfcollect warns you and refers you to these instructions.  To fix this, you
need to fetch EXACTLY the right version of crossgen for the runtime you are
using.  If you place the crossgen tool in the same directory as the .NET Runtime DLLs (e.g. libcoreclr.so), then perfcollect can
find it and add the framework symbols to the trace file for you.

Normally when you create a .NET application, the build generates only the DLL for the code you wrote and uses a shared copy of the runtime
for the rest.  However, you can also generate what is called a 'self-contained' version of an application, which contains all the
runtime DLLs.  The crossgen tool is part of the NuGet package that is used to create these self-contained apps, so
one way of getting the right crossgen tool is to create a self-contained package of any application.

For example, you could do the following:
   >```bash
   > mkdir helloWorld
   > cd helloWorld
   > dotnet new console
   > dotnet publish --self-contained -r linux-x64
   >```
This creates a new helloWorld application and builds it as a self-contained app.  The only subtlety here is that if you have
multiple versions of the .NET Runtime installed, the instructions above will use the latest.  As long as your app also uses
the latest (likely), these instructions will work without modification.

As a side effect of creating the self-contained application, the dotnet tool downloads a NuGet package
called runtime.linux-x64.microsoft.netcore.app and places it in
the directory ~/.nuget/packages/runtime.linux-x64.microsoft.netcore.app/VERSION, where VERSION is the version number of
your .NET Core runtime (e.g. 2.1.0).  That directory contains a tools directory, which holds the crossgen tool you need.

The crossgen tool needs to be placed next to the runtime that your application actually uses.  Typically your app uses the shared
version of .NET Core that is installed at /usr/share/dotnet/shared/Microsoft.NETCore.App/VERSION, where VERSION is the
version number of the .NET Runtime.  This is a shared location, so you need to be super-user to modify it.  If the
VERSION is 2.1.0, the command to copy crossgen into place would be:
   >```bash
   > sudo cp ~/.nuget/packages/runtime.linux-x64.microsoft.netcore.app/2.1.0/tools/crossgen /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.0
   >```
  
Once you have done this, perfcollect will use crossgen to include framework symbols.  The warning that perfcollect used to
issue should go away.  This only has to be done once per machine (until you update your runtime).
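
As a rough sketch, the check below reports whether crossgen is already next to the shared runtime or is still only in the NuGet cache.  The function name `crossgen_status` and its optional second and third arguments (alternative base directories) are hypothetical.  The default paths are the ones named in this section.

```bash
# Sketch: report where crossgen can be found for a given runtime version.
# $1 = runtime version (e.g. 2.1.0)
# $2 = shared runtime base directory (optional, defaults to the path above)
# $3 = NuGet package base directory (optional, defaults to the path above)
crossgen_status() {
    local version runtime_dir nuget_dir
    version="$1"
    runtime_dir="${2:-/usr/share/dotnet/shared/Microsoft.NETCore.App}/$version"
    nuget_dir="${3:-$HOME/.nuget/packages/runtime.linux-x64.microsoft.netcore.app}/$version"

    if [ -x "$runtime_dir/crossgen" ]; then
        echo "installed: $runtime_dir/crossgen"
    elif [ -x "$nuget_dir/tools/crossgen" ]; then
        echo "available: $nuget_dir/tools/crossgen"
    else
        echo "missing"
        return 1
    fi
}
```

Running `crossgen_status 2.1.0` prints `installed: ...` once the copy step has been done, `available: ...` if the self-contained publish has run but the copy has not, and `missing` otherwise.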

## Alternative: Turn off use of precompiled code ##

If you don't have the ability to update the .NET Runtime (to add crossgen), or if the above procedure did not work
for some reason, there is another approach to getting framework symbols.  You can tell the runtime simply
not to use the precompiled framework code.  The code will then be just-in-time compiled, and the special crossgen tool
is not needed.  This works, but it increases startup time for your code by something like a second or two.  If you
can tolerate that (you probably can), then this is an alternative.  You are already setting environment variables
in order to get symbols, so you simply need to add one more:
	> ```bash 
	> export COMPlus_ZapDisable=1
	> ```
With this change you should get the symbols for all .NET code. 

## Getting Symbols For the Native Runtime ##

Most of the time you are interested in your own code, which perfcollect resolves by default.  Sometimes it is very
useful to see what is going on inside the .NET Framework DLLs (which is what the last section was about), and sometimes
what is going on inside the NATIVE runtime DLLs (typically libcoreclr.so) is interesting.  Perfcollect resolves the
symbols for these when it converts its data, but ONLY if the symbols for these native DLLs are present (and are beside
the library they are for).

There is a global command called [dotnet symbol](https://github.com/dotnet/symstore/blob/master/src/dotnet-symbol/README.md#symbol-downloader-dotnet-cli-extension) which downloads these symbols.  This tool was mostly designed to download symbols
for debugging, but it works for perfcollect as well.  There are three steps to getting the symbols:

   1. Install dotnet symbol
   2. Download the symbols.
   3. Copy the symbols to the correct place 

To install dotnet symbol, issue the command:
```
     dotnet tool install -g dotnet-symbol
```
With that installed, download the symbols to a local directory.  If your installed version of the .NET Core runtime is
2.1.0, the command to do this is:
```
    mkdir mySymbols
    dotnet symbol --symbols --output mySymbols  /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.0/lib*.so
```
Now all the symbols for those native DLLs are in mySymbols.  You then have to copy them (as super-user) next to the
DLLs that they are for:
```
    sudo cp mySymbols/* /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.0
```
After this, you should get symbolic names for the native dlls when you run perfcollect.  
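
The commands above hard-code 2.1.0.  As a sketch, the version can be looked up from the shared runtime directory instead.  The function name `latest_runtime_version` is hypothetical, and the sketch assumes GNU `sort` (for `-V`) and that your app uses the highest installed version.  If it uses a different one, pass that version explicitly.

```bash
# Sketch: print the highest version directory under the shared runtime
# location. $1 optionally overrides the base directory.
latest_runtime_version() {
    local base
    base="${1:-/usr/share/dotnet/shared/Microsoft.NETCore.App}"
    # sort -V orders version strings numerically, so 2.1.10 sorts after 2.1.9.
    ls "$base" | sort -V | tail -n 1
}
```

For example, `version=$(latest_runtime_version)` can then replace the literal `2.1.0` in the `dotnet symbol` and `sudo cp` commands.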

# Collecting in a Docker Container #
Perfcollect can be used to collect data for an application running inside a Docker container.  The main thing to know is that collecting a trace requires elevated privileges because the [default seccomp profile](https://docs.docker.com/engine/security/seccomp/) blocks a required syscall - perf_event_open.

In order to use the instructions in this document to collect a trace, spawn a new shell inside the container that is privileged.

>```bash
>docker exec -it --privileged <container_name> /bin/bash
>```

Even though the application hosted in the container isn't privileged, this new shell is, and it will have all the privileges it needs to collect trace data.  Now, simply follow the instructions in [Collecting a Trace](#collecting-a-trace) using the privileged shell.

If you want to try tracing in a container, we've written a [demo Dockerfile](https://raw.githubusercontent.com/dotnet/corefx-tools/master/src/performance/perfcollect/docker-demo/Dockerfile) that installs all of the performance tracing pre-requisites, sets the environment up for tracing, and starts a sample CPU-bound app.
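
For a container that was not built from the demo Dockerfile, the perfcollect script has to get into the container somehow.  One possible approach, sketched below, is `docker cp` followed by the privileged shell described above.  The function name `container_trace_plan` and the `/tmp/perfcollect` destination are assumptions for illustration.  The function only prints the commands so that they can be reviewed before being run.

```bash
# Sketch: print (do not execute) the commands for copying perfcollect into a
# running container and opening a privileged shell in it.
# $1 = container name. The /tmp destination is an arbitrary choice.
container_trace_plan() {
    local container
    container="$1"
    echo "docker cp perfcollect $container:/tmp/perfcollect"
    echo "docker exec -it --privileged $container /bin/bash"
}
```

Calling `container_trace_plan <container_name>` prints the two commands in order.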

# Filtering #
Filtering is implemented on Windows through the latest mechanisms provided with the [EventSource](https://msdn.microsoft.com/en-us/library/system.diagnostics.tracing.eventsource(v=vs.110).aspx) class. 

On Linux, those mechanisms are not available yet. Instead, two environment variables that exist only on Linux provide some basic filtering.

* COMPlus_EventSourceFilter – filter event sources by name
* COMPlus_EventNameFilter – filter events by name

When one or both of these variables are set, only event sources or events whose names contain the specified string as a substring are collected. Matching is case-insensitive.
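
The sketch below models the matching rule described above, which is a case-insensitive substring test.  It is only an illustration of the documented behavior, not the runtime's implementation.  The function name `matches_filter` and the filter values `MyCompany` and `Request` are hypothetical.

```bash
# Model of the filter rule: succeed when the name ($1) contains the
# filter ($2) as a substring, ignoring case.
matches_filter() {
    local name filter
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    filter=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *"$filter"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical filter values, set in the shell that runs the app.
export COMPlus_EventSourceFilter=MyCompany
export COMPlus_EventNameFilter=Request
```

Under this rule, an event source named `MyCompany-Orders` passes the first filter, while an event named `GCStart` does not pass the second.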

# Viewing a Trace #
Traces are best viewed using PerfView on Windows.  Note that we're currently looking into porting the analysis pieces of PerfView to Linux so that the entire investigation can occur on Linux.

## Open the Trace File ##
1. Copy the trace.zip file from Linux to a Windows machine.
2. Download PerfView from <http://aka.ms/perfview>.
3. Run PerfView.exe

	> ```cmd
	> PerfView.exe <path to trace.zip file>
	> ```

## Select a View ##
PerfView will display the list of views that are supported based on the data contained in the trace file.

- For CPU investigations, choose **CPU stacks**.
- For very detailed GC information, choose **GCStats**.
- For per-process/module/method JIT information, choose **JITStats**.
- If there is not a view for the information you need, you can try looking for the events in the raw events view.  Choose **Events**. 

For more details on how to interpret views in PerfView, see help links in the view itself, or from the main window in PerfView choose **Help->Users Guide**.

# Extra Information #
This information is not strictly required to collect and analyze traces, but is provided for those who are interested.

## Prerequisites ##
Perfcollect will alert users to any prerequisites that are not installed and offer to install them.  Prerequisites can be installed automatically by running:

>```bash
>sudo ./perfcollect install
>```

The current prerequisites are:

1. perf: Also known as perf_event, the Linux Performance Events sub-system and companion user-mode collection/viewer application.  perf is part of the Linux kernel source, but is not usually installed by default.
2. LTTng: Stands for "Linux Tracing Toolkit Next Generation", and is used to capture event data emitted at runtime by CoreCLR.  This data is then used to analyze the behavior of various runtime components such as the GC, JIT and thread pool.
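
As a quick sketch, the following checks whether the command-line tools for both prerequisites are on the `PATH`.  The function name `missing_tools` is hypothetical.  The check assumes the tools are invoked as `perf` and `lttng`, and it says nothing about whether the installed versions are the ones perfcollect expects.

```bash
# Sketch: print the name of each given command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

`missing_tools perf lttng` prints nothing when both tools are found.  Otherwise it prints the names of the missing ones, which `sudo ./perfcollect install` should then take care of.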