Paper #5
Authors: Gibert E., Sanchez J., Gonzalez A.
Title: Local Scheduling Techniques for Memory Coherence in a Clustered VLIW Processor with a Distributed Data Cache

Reviewer name (here not confidential)
Adrian Pop, adrpo@ida.liu.se

Short summary (max 10 lines)
The paper presents two software-only local scheduling techniques for architectures consisting of a clustered VLIW processor with a distributed data cache. Both techniques extend local instruction scheduling so that it takes memory coherence into account. The first solution, called Memory Dependent Chains, schedules all instructions from one set of memory-dependent instructions on the same cluster. The second solution is based on two transformations of the Data Dependence Graph: store replication and load-store synchronization. The evaluation of the two methods is presented in detail. The same solutions are also evaluated on the same architecture extended with attraction buffers.

The main contributions
Local scheduling algorithms that take memory coherence into account and do not rely on additional hardware.

Merits and weaknesses
+ Instruction scheduling is considered in more depth.
- Some things are not explained sufficiently; the authors only refer to their previous work.

Numerical rating in the interval from 0 (very bad) to 10 (excellent)
* Significance: 8
* Originality: 9
* Interest to a journal on programming languages and compiler technology: 5
* Quality of experimental evaluation: 7
* Overall organization: 6
* Presentation (language and style): 7
* Length appropriate: 6
* References appropriate: 8
* Overall evaluation (0..10): 7
* Recommendation: weak accept
* Your confidence in your review (1=novice, 10=leading expert): 7

Comments to the authors
In Sections 3.2 and 3.3, a figure showing how the scheduling works would be better than the long description.

Suggestions for improvement (for the authors) (short - no minor details here)
- The organization of the paper could be improved.
- Put more emphasis on the problem formulation.
- Explain how the preferred cluster is obtained in the PrefClus method, and explain the profiling method.
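
Illustration (reviewer's sketch, not taken from the paper)
To make the Memory Dependent Chains idea from the summary concrete, the following minimal Python sketch groups memory operations that are connected by memory dependences into chains and places each whole chain on a single cluster. All names (build_memory_chains, assign_chains_to_clusters, the operation labels) are hypothetical, and the least-loaded placement policy is an assumption made for illustration only; the paper's actual cluster-selection heuristic may differ.

```python
def build_memory_chains(mem_ops, mem_deps):
    """Group memory operations into chains, i.e. connected components
    of the memory-dependence relation, using a small union-find."""
    parent = {op: op for op in mem_ops}

    def find(x):
        # Follow parent links to the representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in mem_deps:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two chains

    chains = {}
    for op in mem_ops:
        chains.setdefault(find(op), []).append(op)
    return list(chains.values())


def assign_chains_to_clusters(chains, num_clusters):
    """Place every operation of a chain on the same cluster.

    Assumption: the cluster is chosen greedily as the least-loaded one,
    longest chains first. This policy is illustrative only.
    """
    load = [0] * num_clusters
    placement = {}
    for chain in sorted(chains, key=len, reverse=True):
        cluster = load.index(min(load))
        for op in chain:
            placement[op] = cluster
        load[cluster] += len(chain)
    return placement


# Hypothetical example: ld1, st1 and ld2 may access the same address.
ops = ["ld1", "st1", "ld2", "st2", "ld3"]
deps = [("ld1", "st1"), ("st1", "ld2")]
chains = build_memory_chains(ops, deps)
placement = assign_chains_to_clusters(chains, num_clusters=2)
```

Because all operations of a chain end up on one cluster, dependent memory accesses are issued from the same cluster in program order, which is the property the summary attributes to this technique. The sketch also shows the likely cost: a long chain concentrates load on one cluster.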