Changes between Version 31 and Version 32 of RTSsummaryEvents
Timestamp: 01/24/12 14:09:47 (16 months ago)
RTSsummaryEvents
v31 v32
149 149   == The list of needed new events or even parameters ==
150 150
151     - === New memory stats events ===
    151 + There is a wealth of statistics around heaps and GC. Some of the stats are reasonably common, shared by different implementations while many more are highly specific to a particular implementation. Even if we ignore RTSs other than GHC, we still have an issue of flexibility for future changes in GHC's RTS.
152 152
153     - * `EVENT_HEAP_ALLOCATED (bytes)`: is the total bytes allocated over the whole run by this HEC. That is we count allocation on a per-HEC basis. This event is in principle not tied to GC, it could be emitted any time.
    153 + Our solution is to split the stats into two groups, a general group that make sense in many implementations, and a second that are highly GHC-specific. Analyses and visualisations based on the first group are likely to be portable to other RTS instances and changes in GHC's RTS. The second is likely to have to change when GHC changes, but it does at least contain the less frequently used info and does not need so much visualisation.
154 154
155     - * `EVENT_HEAP_SIZE (bytes)`: is the current bytes allocated from the OS to use for the heap. For the current GHC RTS this is the `MBlock`s, kept in the `mblocks_allocated` var. Again, this in principle could be emitted any time. The maximum accuracy would be to emit the event exactly when MBlocks are allocated or freed.
    155 + === New general memory stats events ===
156 156
157     - * `EVENT_HEAP_LIVE (bytes)`: is the current amount of live/reachable data in the heap. This is almost certainly only known after a major GC.
    157 + * `EVENT_HEAP_ALLOCATED (alloc_bytes)`: is the total bytes allocated over the whole run by this HEC. That is we count allocation on a per-HEC basis. This event is in principle not tied to GC, it could be emitted any time.
158 158
159     - * `EVENT_GC_STATS (copied, slop, fragmentation)`: various less used GC stats (probably GHC specific, and specific to current GC design)
    159 + * `EVENT_HEAP_SIZE (heap_capset, size_bytes)`: is the current bytes allocated from the OS to use for the heap. For the current GHC RTS this is the `MBlock`s, kept in the `mblocks_allocated` var. Again, this in principle could be emitted any time. The maximum accuracy would be to emit the event exactly when MBlocks are allocated or freed.
    160 +
    161 + * `EVENT_HEAP_LIVE (heap_capset, live_bytes)`: is the current amount of live/reachable data in the heap. This is almost certainly only known after a major GC.
    162 +
    163 +
    164 + === New GHC-specific general memory stats events ===
    165 +
    166 + * `EVENT_HEAP_INFO_GHC (heap_capset, gens, max_size, nursary_size)`: various static parameters relating to the heap. In particular it tells us the number of generations.
    167 +
    168 + * `EVENT_GC_STATS_GHC (heap_capset, gen, copied, slop, frag, was_par, max_copied, avg_copied)`: various less used GC stats. These are all GHC-specific, and specific to current GC design. It includes the generation of this GC so along with knowing the total number of generations we can identify major/minor GCs. We also include memory lost to slop and fragmentation. The final three are to do with parallel GC: the first is just a flag to indicate if this GC was a parallel GC, the ratio of the other two gives the parallel work balance (this ratio can be extended to multiple GCs to get an average figure).
    169 +
160 170
161 171   === Identifying heaps in eventlogs ===
162 172
163     - In the above events, the "allocated since prog start" is done per-HEC, but the heap total size and live data size apply to the heap as a whole, not a particular HEC.
    173 + In the above events, the "allocated since program start" is done per-HEC, but the others apply to the heap as a whole, not a particular HEC.
164 174
165     - For completeness / future-proofing it may be wise to explicitly identify heaps and to have the heap size/live events tag the heap to which they apply. Remember that we can merge event logs from multiple processes, so there is already no truly global notion of heap, implicitly it would be the single heap belonging to the HEC that emits the event. We would also have to make the assumption that there is a single heap per OS process (we can already identify which HECs belong to the same OS process). Alternatively we can explicitly identify heaps using the existing capset (capability set) mechanism. We would add a new capset type:
    175 + For completeness explicitly identify heaps by identifying the heap to which the events apply. (Remember that we can merge event logs from multiple processes, so there is already no truly global notion of heap, implicitly it would be the single heap belonging to the HEC that emits the event. We would also have to make the assumption that there is a single heap per OS process (we can already identify which HECs belong to the same OS process). Alternatively we can explicitly identify heaps using the existing capset (capability set) mechanism.)
166 176
167     - {{{
168     - /*
169     -  * Capset type values for EVENT_CAPSET_CREATE
170     -  */
171     - #define CAPSET_TYPE_CUSTOM 1 /* reserved for end-user applications */
172     - #define CAPSET_TYPE_OSPROCESS 2 /* caps belong to the same OS process */
173     - #define CAPSET_TYPE_CLOCKDOMAIN 3 /* caps share a local clock/time */
174     - #define CAPSET_TYPE_GCHEAP 4 /* caps share a GC'd heap */
175     - }}}
    177 + The GHC RTS already puts all of its HECs into a capset for the OS process. We can reuse this capset.
176 178
177     - We would then make a capset for the main heap and add all HECs to it. The `EVENT_HEAP_SIZE` and `EVENT_HEAP_LIVE` events would then have a capset argument to indicate the heap.
    179 + If in future we allow multiple independent heaps in the same OS process (e.g. separate RTS instances) then this scheme would let us cope by making a separate capset. Similarly it'd cope with implementations like GdH which use a global heap spanning multiple OS processes.
178 180
179     - If in future we allow multiple independent heaps in the same OS process (e.g. separate RTS instances) then this would let us cope. Similarly it'd cope with implementations like GdH which use a global heap spanning multiple OS processes. Would it be useful for talking about per-HEC local heaps?
180     -
181     - === New parameters for GC stats events ===
182     -
183     - * modify `EVENT_GC_START` to add a `(generation)` field. The generation number in a generational GC scheme. Use -1 if not applicable.
184     -
185     - When the RequestSeqGC and RequestParGC events are emitted, it's not yet know if the GC will be major or minor, so no extra parameters should be added to them.
    181 + === Tinkering ===
186 182
187 183   While we tinker with these events, we could try to ensure
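
The v32 text says that `EVENT_HEAP_INFO_GHC` and `EVENT_GC_STATS_GHC` together let a reader identify major/minor GCs and derive a parallel work balance. The following is a minimal sketch of how an eventlog consumer might do that, not the RTS implementation. The struct names, field widths and helper functions are all hypothetical; only the field names come from the event descriptions in the diff. It also assumes that generations are numbered from 0 with the oldest collected by a major GC, and that the balance ratio is taken as `avg_copied / max_copied`; the wiki text only says "the ratio of the other two".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of EVENT_HEAP_INFO_GHC.
 * Field widths are assumptions, not the eventlog wire format. */
typedef struct {
    uint32_t heap_capset;   /* capset identifying the heap */
    uint16_t gens;          /* total number of generations */
    uint64_t max_size;
    uint64_t nursary_size;  /* spelling as in the wiki text */
} HeapInfoGhc;

/* Hypothetical decoded form of EVENT_GC_STATS_GHC. */
typedef struct {
    uint32_t heap_capset;
    uint16_t gen;           /* generation collected by this GC */
    uint64_t copied;
    uint64_t slop;
    uint64_t frag;
    bool     was_par;       /* was this a parallel GC? */
    uint64_t max_copied;
    uint64_t avg_copied;
} GcStatsGhc;

/* Assumption: generations are numbered 0..gens-1 and a major GC
 * is one that collects the oldest generation. */
static bool is_major_gc(const HeapInfoGhc *info, const GcStatsGhc *gc)
{
    assert(info->gens > 0);
    assert(info->heap_capset == gc->heap_capset); /* same heap */
    return gc->gen == info->gens - 1;
}

/* Assumed ratio direction: avg_copied / max_copied, so 1.0 means the
 * copying work was spread evenly. Sequential GCs, and GCs that copied
 * nothing, are reported as 1.0. */
static double par_work_balance(const GcStatsGhc *gc)
{
    if (!gc->was_par || gc->max_copied == 0) {
        return 1.0;
    }
    return (double)gc->avg_copied / (double)gc->max_copied;
}

/* The same ratio extended over several GCs: sum both fields over the
 * parallel GCs first, then divide. Returns 1.0 if there were none. */
static double par_work_balance_over(const GcStatsGhc *gcs, size_t n)
{
    uint64_t sum_avg = 0, sum_max = 0;
    for (size_t i = 0; i < n; i++) {
        if (gcs[i].was_par) {
            sum_avg += gcs[i].avg_copied;
            sum_max += gcs[i].max_copied;
        }
    }
    return sum_max == 0 ? 1.0 : (double)sum_avg / (double)sum_max;
}
```

Summing before dividing weights each GC by the amount it copied; averaging the per-GC ratios instead would be an equally plausible reading of the wiki text.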