
Introduction
The GraphNavigator is an interactive tool for rapid visualisation of performance traces. Simulations of the LIL machine using various caches may generate traces for memory access. These traces are visualised for direct comparison and overview with the GraphNavigator tool.The GraphNavigator tool visualises instantaneous hit rate, and instantaneous locality as defined by Dee Weikle et al. On this page we see applets to demonstrate navigation of an instruction trace. The program under investigation is a quicksort of 100 elements. The navigator tools allows simultaneous comparison of multiple traces (multiple cache configurations). There are visualisations of the locality and hit rate of the address streams.
From a partitioned caching perspective, it must be noted that the current compiler defined a partition size of 32 lines (in a 4 word per line cache) for this program. This selection is plotted with surrounding cache sizes, and in the case of locality, the locality of the original trace in included.
This page demonstrates two key aspects. Firstly (since our research is in the partitioned caching area) that we can automatically partition the instruction caching requirements of programs (recursive or not). Secondly, it is a technology demonstration of the cache efficiency measurements by Dee Weikle et al.
How To Make It Work
The GraphNavigator is written in Java and so should be visible on the page as an applet. If this fails to work, you can download a copy of the system to play with locally. In either case, you will some sort of Java 1.2 runtime environment installed for your system either in plugin form, for your browser, or JDK/JRE form for you local system. In the latter case, you need to be able to issue the command :In terms of browser compatibility, the applets will probably only work with Netscape systems. This is because IE only understands OBJECT tags, Netscape only understants EMBED tags and HotJava only understands basic APPLET tags. This is wrong and breaks what was a nice platform independent system. Do something about it by checking out The Java Lobby.
Performance overview
To briefly offer some traditional performance information, the 16 line, 32 line, and 64 line caches each achieve 85.5% (4571 misses), 92.7% (2295 misses), and 99.8% (78 misses) in overall hit ratio respectively.Navigation landmarks
The program executes a simple loop at the beginning up to trace point 950(ish) that fills an array with unordered data. Then the quicksort starts. We see a regular winged locality for the first, long descent of the quicksort recursion where each small loop hump is an itteration of a loop in the partition function of the quicksort. There follows a dispersed set of executions of partition and unravelling of the nested quicksort calls.Aligning both instantaneous hit rate and locality visualisations offers two perspectives, almost validation, of performance with the opportunity for discussion (or resoning) about performance.
Instantaneous hit rate
This is a navigable graph of instruction traces generated from a complete quicksort of 100 elements. There are three traces on this graph. Direct mapped caches of 16 lines in blue, 32 lines in red, and 64 lines in green.
Instantaneous locality
This is a navigable graph of instruction traces generated from a complete quicksort of 100 elements. There are four traces on this graph. Direct mapped caches of 16 lines in blue, 32 lines in red, 64 lines in green, and the original instruction address trace (as emitted by the CPU) in magenta.
Dan Page, page@cs.bris.ac.uk, James Irwin, jimbob@cs.bris.ac.uk, Henk Muller, Henk.Muller@bristol.ac.uk, David May, David.May@bristol.ac.uk. Last modified on Friday 6 August 1999 at 14:19. © 1999 University of Bristol

