Changes between Initial Version and Version 1 of HowToAddNewGraphs

03/06/12 11:37:56
MikolajKonarski



  • HowToAddNewGraphs

== How to add new graphs to TS ==
I'm afraid this is not understandable without the context of ThreadScope internals and without some extra explanation. Sorry. Please ask questions.
To draw a graph from an eventlog, I need the data preprocessed; then I just pick a portion to show on-screen. It is best if I can take the data from a validation profile and then just process it some more, as we do for the histogram (from the spark profile). In that case we know the data makes sense, and we can use the finite-automaton (FA) validation engine for all the FA mangling of the list of events.
So the workflow for drawing a graph (and implementing a new one), as seen in the histogram, is:

 1. parse the events in ghc-events
 2. optionally validate them in ghc-events
 3. generate a profile, using a profile function specific to the validator
 4. preprocess the profile some more and store it until the eventlog is reloaded
 5. select the data for the required interval (or zoom/pan factor)
 6. process it yet some more
 7. draw
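The steps above can be sketched as a pipeline of pure functions. This is only an illustrative sketch: all names here (`Event`, `Profile`, `parseEvents`, `mkProfile`, `selectInterval`, `draw`) are invented stand-ins, not the actual ghc-events or ThreadScope API.

```haskell
-- Hypothetical sketch of the graph-drawing pipeline; types and names
-- are illustrative only.

-- A raw event: (timestamp, payload).
type Event = (Double, Int)
-- A profile: preprocessed per-event values kept until eventlog reload.
type Profile = [(Double, Int)]

-- Steps 1-2: parse the events and (optionally) validate them.
parseEvents :: [Event] -> [Event]
parseEvents = filter (\(t, _) -> t >= 0)  -- drop malformed events

-- Steps 3-4: generate and preprocess a profile.
mkProfile :: [Event] -> Profile
mkProfile = id

-- Step 5: select the data for the required interval.
selectInterval :: (Double, Double) -> Profile -> Profile
selectInterval (lo, hi) = filter (\(t, _) -> lo <= t && t <= hi)

-- Steps 6-7: process some more and "draw" (here: render as text).
draw :: Profile -> String
draw = unwords . map (show . snd)
```

A full redraw is then just the composition `draw . selectInterval iv . mkProfile . parseEvents`, with the `mkProfile` result cached between redraws.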
If the data is very dense, storing it in step 4 requires a zoom-tree of some kind. To speed things up under certain usage scenarios (many redraws with only some parameters changing), the data can be cached in step 5 (for user-defined graphs) or step 6 (for fixed graphs) until the relevant interval, zoom/pan factor, or other parameter changes.
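One way to picture the zoom-tree of step 4: each inner node caches an aggregate (here, the sum) of its leaves, so a redraw at a coarse zoom level reads a few inner nodes instead of every sample. A minimal sketch with invented names, not the actual ThreadScope or zoom-cache data structure:

```haskell
-- Minimal zoom-tree sketch: inner nodes cache the sum of their leaves,
-- so coarse zoom levels need only as many node reads as there are pixels.
data ZoomTree = Leaf Int | Node Int ZoomTree ZoomTree  -- Node caches the sum

total :: ZoomTree -> Int
total (Leaf n)     = n
total (Node s _ _) = s

-- Build a balanced zoom tree over a non-empty list of samples.
build :: [Int] -> ZoomTree
build [x] = Leaf x
build xs  = Node (total l + total r) l r
  where (as, bs) = splitAt (length xs `div` 2) xs
        l = build as
        r = build bs

-- Read the tree only down to the given depth; greater depth = finer zoom.
atDepth :: Int -> ZoomTree -> [Int]
atDepth _ (Leaf n)     = [n]
atDepth 0 t            = [total t]
atDepth d (Node _ l r) = atDepth (d - 1) l ++ atDepth (d - 1) r
```

For example, `atDepth 0` gives a single fully zoomed-out aggregate, and each extra level of depth doubles the resolution until the raw samples are reached.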
For simple events that do not require a lot of finite-automaton mangling, we may skip steps 2 and 3. If our spark graphs were built from the detailed spark events, we'd be best off using the validator profiles; instead we use the spark counters, so the profiling work is actually done in GHC and steps 2 and 3 are not needed. For such simple graphs without an FA (allocation rate is an example), the existing zoom-trees or the zoom-cache library suffice.
Allocation rate happens to be sampled as often as the spark counters, so it actually fits best into the spark trees; no new kind of tree is needed. But in general we may be best off setting up a single zoom tree with a very small sampling interval and resampling all data (sparks included) into that tree. We'd lose some data, unless the sampling interval is 1, but we'd gain flexibility, and the accuracy of the visualization of spark rates would actually improve (except at very high zoom levels, where the resampling noise outweighs the more accurate rate-of-change calculation that equal sample intervals allow).
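Resampling everything onto one fixed grid could look like the following sketch (function name and representation invented for illustration; real spark counter samples carry more structure). It also shows the lossiness mentioned above: samples landing in the same bucket are merged.

```haskell
-- Resample irregular (time, value) samples onto a fixed grid by summing
-- the values that fall into each bucket; dt is the new sampling interval.
-- Precondition: the sample list is non-empty and dt > 0.
resample :: Double -> [(Double, Double)] -> [Double]
resample dt samples = [bucketSum i | i <- [0 .. nBuckets - 1]]
  where
    tMax     = maximum (map fst samples)
    nBuckets = floor (tMax / dt) + 1 :: Int
    bucketSum i = sum [ v | (t, v) <- samples
                          , fromIntegral i * dt <= t
                          , t < fromIntegral (i + 1) * dt ]
```

With `dt = 1`, the samples `[(0.2, 1), (0.7, 2), (1.5, 3)]` collapse to two buckets: the two sub-second samples are merged, which is exactly the data loss the text refers to.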
GC is a borderline case: there is clearly an FA, but now that we don't have to track RequestParGC, it has only 6 states, and the transitions are simple compared to the actual data processing that they trigger. So if we don't want to validate GC events just for validation's sake, it's IMHO not mandatory to encode the FA rules in the validator profile and rewrite the code to use that. But if we do already validate GC, then we should also make use of the validation profile, if only to ensure consistency between validation and visualization.
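To illustrate the point that the FA itself is tiny next to the processing its transitions trigger, here is a toy state machine over GC-like events. It is deliberately smaller than the real 6-state automaton, and the state and event names are made up for the sketch; they do not match the actual GHC eventlog event names.

```haskell
-- Toy finite automaton over GC-like events; names are illustrative.
-- Each transition triggers data processing (here: accumulating time
-- spent in GC), which in practice dwarfs the transition logic itself.
data GcState = Mutating | GarbageCollecting deriving (Eq, Show)
data GcEvent = StartGC Double | EndGC Double  -- payload: timestamp

-- Machine state: (FA state, time current GC started, total GC time so far).
step :: (GcState, Double, Double) -> GcEvent -> (GcState, Double, Double)
step (Mutating,          _,  acc) (StartGC t) = (GarbageCollecting, t, acc)
step (GarbageCollecting, t0, acc) (EndGC t)   = (Mutating, t0, acc + (t - t0))
step s _ = s  -- out-of-order event: ignored here; a validator would flag it

-- Total time spent in GC over a whole event stream.
gcTime :: [GcEvent] -> Double
gcTime evs = let (_, _, acc) = foldl step (Mutating, 0, 0) evs in acc
```

Encoding such transition rules in the validator profile is cheap; the question raised above is only whether it is worth doing when the graph code already has them inline.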
Totally new kinds of graphs for old events require changes from step 3 onward. For user-defined graphs, if we gather enough data in an efficient format (zoom trees) in steps 3--5, we may just recompute steps 6 and 7 for each drawing, based on the current user graph definitions.