The CORG group conducts research in programming language implementation, compiler optimisation, computer architecture and compiler support for embedded systems. Our research has been supported by a number of competitive national research and industry grants.

The group meets weekly to discuss research topics in programming languages and compilers. The topics are often related to our current research projects, with students presenting their recent progress. From time to time, a seminar is devoted to one or more papers on a given topic, with a "discussion leader" presenting the papers and answering questions.

If you would like to be added to the group's mailing list or are interested in giving a talk, please send an email to me.

Staff

PhD Students

Recent Publications

Compiler Techniques for Multicores

  1. Jiacheng Zhao, Huimin Cui, Jingling Xue, Xiaobing Feng, Youliang Yan and Wensen Yang. An Empirical Model for Predicting Cross-Core Performance Interference on Multicore Processors. In 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT'13), pages ? -- ?, Edinburgh, 2013. (PDF)

  2. Xuejun Yang, Zhiyuan Wang, Jingling Xue and Yun Zhou. The Reliability Wall for Exascale Supercomputing. IEEE Transactions on Computers (TC), 61(6):767 -- 779, 2012. (PDF)

  3. Wei Mi, Xiaobing Feng, Jingling Xue and Yao-Cang Jia. Software-Hardware Cooperative DRAM Bank Partitioning for Chip Multiprocessors. In 7th IFIP International Conference on Network and Parallel Computing (NPC'10), pages 329 -- 343, Zhengzhou, 2010.

  4. Lin Gao, Jingling Xue and Tin-Fook Ngai. Loop Recreation for Thread-Level Speculation on Multicore Processors. Software -- Practice and Experience (SPE), 40(1):45 -- 72, 2010. (PDF)

  5. Lin Gao, Lian Li, Jingling Xue and Tin-Fook Ngai. Exploiting Speculative TLP in Recursive Programs by Dynamic Thread Prediction. In 2009 International Conference on Compiler Construction (CC'09), pages 78 -- 93, York, UK, 2009. (PDF)

  6. Lin Gao, Quan Hoang Nguyen, Lian Li, Jingling Xue and Tin-Fook Ngai. Thread-Sensitive Modulo Scheduling for Multicore Processors. In 2008 International Conference on Parallel Processing (ICPP'08), pages 132 -- 140, Portland, Oregon, 2008. (PDF)

  7. L. Gao, L. Li, J. Xue and T.-F. Ngai. Loop Recreation for Thread-Level Speculation. In 2007 International Conference on Parallel and Distributed Systems (ICPADS'07), Hsinchu, Taiwan, 2007. (PDF)

Backend Optimisations

  1. Huimin Cui, Qing Yi, Jingling Xue and Xiaobing Feng. Layout-oblivious Compiler Optimization for Matrix Computations. The 8th International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC'13), Berlin, Germany, 2013. (PDF)

  2. Huimin Cui, Jingling Xue, Lei Wang, Yang Yang, Xiaobing Feng and Dongrui Fan. Extendable Pattern-Oriented Optimization Directives. ACM Transactions on Architecture and Code Optimization (TACO), 9(3):14:1 -- 14:37, 2012. (PDF)

  3. Huimin Cui, Jingling Xue, Lei Wang, Yang Yang, Xiaobing Feng and Dongrui Fan. Extendable Pattern-Oriented Optimization Directives. In 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'11), pages 107 -- 118, Chamonix, France, 2011. (PDF)

  4. J. Xue and Q. Cai. A lifetime optimal algorithm for speculative PRE. ACM Transactions on Architecture and Code Optimization (TACO), 3(2):115 -- 155, 2006. (PDF)

  5. J. Xue and J. Knoop. A fresh look at PRE as a maximum flow problem. In 2006 International Conference on Compiler Construction (CC'06), pages 139 -- 154, Vienna, Austria, 2006. (PDF)

  6. J. Xue, Q. Cai and L. Gao. Partial dead code elimination on predicated code regions. Software -- Practice and Experience (SPE), 36(15):1655 -- 1685, 2006.

  7. C. Yang, X. Yang and J. Xue. Improving the Performance of GCC by Exploiting IA-64 Architectural Features. In 10th Asia-Pacific Computer Systems Architecture Conference (ACSAC'05), pages 236 -- 251, Singapore, 2005. (Postscript)

  8. Q. Cai, L. Gao and J. Xue. Region-based partial dead code elimination on predicated code. In 2004 International Conference on Compiler Construction (CC'04), pages 150 -- 166, Barcelona, Spain, 2004. (Postscript)

  9. Q. Cai and J. Xue. Optimal and efficient speculation-based partial redundancy elimination. In 1st Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'03), pages 91 -- 102, San Francisco, 2003. (Postscript)

Cache Analyses and Optimisations

  1. Wei Mi, Xiao-Bing Feng, Yao-Cang Jia, Li Chen and Jingling Xue. PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization. Journal of Computer Science and Technology (JCST), 24(6): 1086 -- 1097, 2009.

  2. X. Vera, B. Lisper and J. Xue. Data Cache Locking for Tight Timing Calculations. ACM Transactions on Embedded Computing Systems (TECS), 7(1):14:1 -- 14:38, 2007.

  3. J. Xue, Q. Huang and M. Guo. Enabling Loop Fusion and Tiling for Cache Performance by Fixing Fusion-Preventing Data Dependences. In 2005 International Conference on Parallel Processing (ICPP'05), pages 107 -- 115, Oslo, Norway, 2005. (Postscript)

  4. J. Xue and Q. Huang. Code Tiling: One Size Fits All. In G. T. Yang and M. Guo, editors, High Performance Computing: Paradigm and Infrastructure, Chapter 11, pages 219 -- 240. John Wiley & Sons Inc., 2004.

  5. J. Xue and X. Vera. Efficient and accurate analytical modeling of whole-program data cache behavior. IEEE Transactions on Computers, 53(5):547--566, 2004.

  6. X. Vera, B. Lisper and J. Xue. Data caches in multitasking hard real-time systems. In 24th IEEE International Real-Time Systems Symposium (RTSS'03), pages 154 -- 165, Cancun, Mexico, 2003. (Postscript)

  7. Q. Huang, J. Xue and X. Vera. Code tiling for improving the cache performance of PDE solvers. In 2003 International Conference on Parallel Processing (ICPP'03), pages 615 -- 626, Kaohsiung, Taiwan, 2003. (Postscript)

  8. X. Vera, B. Lisper and J. Xue. Data cache locking for higher program predictability. In 2003 ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'03), pages 272 -- 282, San Diego, 2003. (Postscript)

  9. X. Vera and J. Xue. Let's study whole-program cache behaviour analytically. In 8th International Symposium on High-Performance Computer Architecture (HPCA-8), pages 175 -- 186, Boston, MA, 2002. (Postscript)

  10. X. Vera and J. Xue. Efficient compile-time analysis of cache behavior for programs with IF statements. In 5th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP'02), pages 396 -- 407, Beijing, 2002.

  11. J. Xue and C.-H. Huang. Reuse-driven tiling for improving data locality. International Journal of Parallel Programming, 26(6):671-696, 1998. (Postscript)

Compiler Techniques for Embedded Systems

  1. Jianli Li, Jingling Xue, Xinwei Xie, Qing Wan, Qingping Tan and Lanfang Tan. Epipe: a Low-Cost Fault-Tolerance Technique Considering WCET Constraints. Journal of Systems Architecture, 2013.

  2. Xuemeng Zhang, Hui Wu and Jingling Xue. Instruction scheduling with k-successor tree for clustered VLIW processors. Design Automation for Embedded Systems, pages 1 -- 20, 2013. (PDF)

  3. Qing Wan, Hui Wu and Jingling Xue. Scratchpad Memory Aware Task Scheduling with Minimum Number of Preemptions on a Single Processor. In 18th Asia and South Pacific Design Automation Conference (ASP-DAC 2013), pages 741 -- 748, Yokohama, Japan, 2013. (PDF)

  4. Qing Wan, Hui Wu and Jingling Xue. WCET-Aware Data Selection and Allocation for Scratchpad Memory. In ACM SIGPLAN/SIGBED 2012 International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'12), pages 41 -- 50, Beijing, 2012. (PDF)

  5. Xuejun Yang, Li Wang and Jingling Xue. Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors. ACM Transactions on Architecture and Code Optimization (TACO), 9(1), 2012. (PDF)

  6. Xuemeng Zhang, Hui Wu and Jingling Xue. An Efficient Heuristic for Instruction Scheduling on Clustered VLIW Processors. In International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES'11), Taipei, 2011. (PDF)

  7. Lian Li, Jingling Xue and Jens Knoop. Scratchpad Memory Allocation for Data Aggregates via Interval Coloring in Superperfect Graphs. ACM Transactions on Embedded Computing Systems (TECS), 10(2):28:1 -- 28:48, 2011. (PDF)

  8. Meng Wang, Zili Shao and Jingling Xue. On Reducing Hidden Redundant Memory Accesses for DSP Applications. IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 19(6):997--1010, 2011. (PDF)

  9. Hui Wu, Jingling Xue and Sridevan Parameswaran. Optimal WCET-Aware Code Selection for Scratchpad Memory. In 2010 International Conference on Embedded Software (EMSOFT'10), Scottsdale, AZ, 2010. (PDF)

  10. Xuejun Yang, Ying Zhang, Xicheng Lu, Jingling Xue, Ian Rogers, Gen Li and Xudong Fang. Exploiting the Reuse Supplied by Loop-Dependent Stream References for Stream Processors. ACM Transactions on Architecture and Code Optimization (TACO), 2010. (PDF)

  11. Li Wang, Jingling Xue and Xuejun Yang. Reuse-Aware Modulo Scheduling for Stream Processors. In International Conference on Design, Automation and Test in Europe (DATE'10), Dresden, 2010. (PDF)

  12. Xuejun Yang, Li Wang, Jingling Xue, Yu Deng and Ying Zhang. Comparability Graph Coloring for Optimizing Utilization of Stream Register Files in Stream Processors. In 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'09), pages 111 -- 120, North Carolina, 2009. (PDF)

  13. Lian Li, Hui Feng and Jingling Xue. Compiler-directed scratchpad memory management via graph coloring. ACM Transactions on Architecture and Code Optimization (TACO), 6(3), 2009. (PDF)

  14. Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers, Gen Li and Guibin Wang. Exploiting Loop-Dependent Stream Reuse for Stream Processors. In 17th International Conference on Parallel Architectures and Compilation Techniques (PACT'08), pages 28 -- 37, Toronto, 2008. (PDF)

  15. L. Wang, X. Yang, J. Xue, Y. Deng, X. Yan, T. Tang and Q. H. Nguyen. Optimizing Scientific Application Loops on Stream Processors. In ACM SIGPLAN/SIGBED 2008 International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'08), pages 161 -- 170, Tucson, AZ, 2008.

  16. L. Li, H. Wu, H. Feng and J. Xue. Towards Data Tiling for Whole Programs in Scratchpad Memory Allocation. In 12th Asia-Pacific Computer Systems Architecture Conference (ACSAC'07), pages 63 -- 74, Seoul, Korea, 2007.

  17. B. Scholz, B. Burgstaller and J. Xue. Minimizing Bank Selection Instructions for Partitioned Memory Architectures. ACM Transactions on Embedded Computing Systems (TECS), 7(2), 2008.

  18. L. Li, Q. H. Nguyen and J. Xue. Scratchpad Allocation for Data Aggregates in Superperfect Graphs. In ACM SIGPLAN/SIGBED 2007 International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'07), pages 207 -- 216, San Diego, 2007. (PDF)

  19. L. Li and J. Xue. Trace-based Leakage Energy Optimisations at Link Time. Journal of Systems Architecture, 53(1):1 -- 20, 2007.

  20. B. Scholz, B. Burgstaller and J. Xue. Minimizing Bank Selection Instructions for Partitioned Memory Architectures. In International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES'06), pages 201 -- 211, Seoul, Korea, 2006. (One of the Four Best Paper Candidates) (PDF)

  21. H. Wu, J. Jaffar and J. Xue. Instruction Scheduling with Release Times and Deadlines on ILP Processors. In 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'06), pages 51 -- 60, Sydney, Australia, 2006. (PDF)

  22. L. Li and J. Xue. Trace-based Data Cache Leakage Reduction at Link Time. In 11th Asia-Pacific Computer Systems Architecture Conference (ACSAC'06), pages 175 -- 188, Shanghai, China, 2006.

  23. L. Li, L. Gao and J. Xue. Memory coloring: a compiler approach for automatic scratchpad memory management. In 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05), pages 329 -- 338, Saint Louis, Missouri, 2005. (PDF)

  24. L. Li and J. Xue. A trace-based binary compilation framework for energy-aware computing. In ACM SIGPLAN/SIGBED 2004 International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'04), pages 95 -- 106, Washington, DC, 2004. (PDF)

Compiler Techniques for GPGPUs

  1. Huimin Cui, Lei Wang, Jingling Xue, Xiaobing Feng and Yang Yang. Automatic Library Generation for BLAS3 on GPUs. In 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS'11), pages 1080 -- 1092, Anchorage (Alaska), USA, 2011. (PDF)

  2. Peng Di, Hui Wu, Jingling Xue, Feng Wang and Canqun Yang. Parallelizing SOR for GPGPUs Using Alternate Loop Tiling. Parallel Computing, 38(6-7):310 -- 328, 2012. (PDF)

  3. Xinhai Xu, Xuejun Yang, Jingling Xue, Yufei Lin and Yisong Lin. PartialRC: A Partial Recomputing Method for Efficient Fault Recovery on GPGPUs. Journal of Computer Science and Technology (JCST), V(2):240--255, 2012.

  4. Yang Yang, Huimin Cui, Xiao-Bing Feng and Jingling Xue. A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs. Journal of Computer Science and Technology (JCST), 27(1):57 -- 74, 2012.

  5. Huimin Cui, Qing Yi, Jingling Xue, Lei Wang, Yang Yang and Xiaobing Feng. A highly-parallel reuse distance analysis algorithm on GPUs. In 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS'12), pages 1080 -- 1092, Shanghai, China, 2012. (PDF)

  6. Peng Di and Jingling Xue. Model-Driven Tile Size Selection for DOACROSS Loops on GPUs. In 17th International European Conference on Parallel and Distributed Computing (Euro-Par'11), Bordeaux, France, 2011. (PDF)

  7. Peng Di, Qing Wan, Xuemeng Zhang, Hui Wu and Jingling Xue. Toward Harnessing DOACROSS Parallelism for Multi-GPGPUs. In 2010 International Conference on Parallel Processing (ICPP'10), San Diego, 2010. (PDF)

Static and Dynamic Program Analysis

  1. Sen Ye, Yulei Sui and Jingling Xue. Region-based Selective Flow-Sensitive Pointer Analysis. In 21st International Static Analysis Symposium (SAS'14), pages ?? -- ??, Munich, 2014. (PDF)

  2. Yue Li, Tian Tan, Yulei Sui and Jingling Xue. Self-Inferencing Reflection Resolution for Java. In 28th European Conference on Object-Oriented Programming (ECOOP'14), pages ?? -- ??, Uppsala, 2014. (PDF)

  3. Yulei Sui, Ding Ye and Jingling Xue. Detecting Memory Leaks Statically with Full-Sparse Value-Flow Analysis. IEEE Transactions on Software Engineering (TSE), ??(?):??? -- ???, 2014. To appear. (PDF)

  4. Yulei Sui, Sen Ye, Jingling Xue and Jie Zhang. Making Context-Sensitive Inclusion-based Pointer Analysis Practical for Compilers Using Parameterised Summarisation. Software -- Practice and Experience (SPE), ??(?):??? -- ???, 2014. To appear. (PDF)

  5. Ding Ye, Yulei Sui and Jingling Xue. Accelerating Dynamic Detection of Uses of Undefined Variables with Static Value-Flow Analysis. In 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'14), pages ?? -- ??, Orlando, Florida, 2014. (PDF)

  6. Yu Su, Ding Ye and Jingling Xue. Accelerating Inclusion-based Pointer Analysis on Heterogeneous CPU-GPU Systems. In 2013 IEEE International Conference on High Performance Computing (HiPC'13), pages 149 -- 158, 2013. (PDF)

  7. Yi Lu, Lei Shang, Xinwei Xie and Jingling Xue. An Incremental Points-to Analysis with CFL-Reachability. In 2013 International Conference on Compiler Construction (CC'13), pages 61 -- 81, Rome, Italy, 2013. (PDF)

  8. Yulei Sui, Yue Li and Jingling Xue. Query-Directed Adaptive Heap Cloning For Optimizing Compilers. In 11th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'13), pages 1 -- 11, Shenzhen, China, 2013. (PDF)

  9. Yian Zhu, Yue Li, Jingling Xue, Tian Tan, Jialong Shi, Yang Shen and Chunyan Ma. What is System Hang and How to Handle it. In 23rd IEEE International Symposium on Software Reliability Engineering (ISSRE'12), pages 141 -- 150, Dallas, TX, 2012. (PDF)

  10. Lei Shang, Yi Lu and Jingling Xue. Fast and Precise Points-to Analysis with Incremental CFL-Reachability Summarisation. In 27th IEEE/ACM International Conference on Automated Software Engineering (ASE'12), pages 270 -- 273, Essen, Germany, 2012. (PDF)

  11. Yulei Sui, Ding Ye and Jingling Xue. Static Memory Leak Detection Using Full-Sparse Value-Flow Analysis. In International Symposium on Software Testing and Analysis (ISSTA'12), pages 254 -- 264, Minneapolis, MN, 2012. (PDF)

  12. Xinwei Xie, Jingling Xue and Jie Zhang. AccuLock: Accurate and Efficient Detection of Data Races. Software -- Practice and Experience (SPE), 43(5):543 -- 576, 2013. (PDF)

  13. Lei Shang, Xinwei Xie and Jingling Xue. On-Demand Dynamic Summary-Based Points-to Analysis. In 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'12), pages 264 -- 274, San Jose, California, 2012. (PDF)

  14. Yulei Sui, Sen Ye, Jingling Xue and Pen-Chung Yew. SPAS: Scalable Path-Sensitive Pointer Analysis on Full-Sparse SSA. In 9th Asian Symposium on Programming Languages and Systems (APLAS'11), Kenting, Taiwan, 2011. (PDF)

  15. Xinwei Xie and Jingling Xue. AccuLock: Accurate and Efficient Detection of Data Races. In 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'11), pages 201 -- 212, Chamonix, France, 2011. (PDF)

  16. Hongtao Yu, Jingling Xue, Wei Huo, Xiaobing Feng and Zhaoqing Zhang. Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code. In 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'10), pages 218 -- 229, Toronto, 2010. (PDF)

  17. J. Xue, P. Nguyen and J. Potter. Interprocedural Side-Effect Analysis for Incomplete Object-Oriented Software Modules. Journal of Systems and Software, 80(1):92-105, 2007.

  18. P. Nguyen and J. Xue. Interprocedural side-effect analysis for Java programs in the presence of dynamic class loading. In 28th Australasian Computer Science Conference (ACSC'05), pages 9 -- 18, Newcastle, Australia, 2005. (Best Paper Award)

  19. J. Xue and P. Nguyen. Completeness analysis for incomplete object-oriented programs. In 2005 International Conference on Compiler Construction (CC'05), Edinburgh, UK, 2005. (PDF)

  20. P. Nguyen and J. Xue. Strength Reduction for Loop-Invariant Types. In 27th Australasian Computer Science Conference (ACSC'04), Dunedin, New Zealand, 2004. (Best Student Paper Award)

Programming Languages and Models

  1. Yi Lu, John Potter and Jingling Xue. Structural Lock Correlation with Ownership Types. In 2013 European Symposium on Programming (ESOP'13), pages 391 -- 410, Rome, Italy, 2013. (PDF)

  2. Yi Lu, John Potter and Jingling Xue. Ownership Types for Object Synchronisation. In 10th Asian Symposium on Programming Languages and Systems (APLAS'12), pages 18 -- 33, Kyoto, Japan, 2012. (PDF)

  3. Lin Gao, Lian Li, Jingling Xue and Pen-Chung Yew. SEED: A Statically-Greedy and Dynamically-Adaptive Approach for Speculative Loop Execution. IEEE Transactions on Computers (TC), 62(5):1004--1016, 2013. (PDF)

  4. Yi Lu, John Potter, Chenyi Zhang and Jingling Xue. A Type and Effect System for Determinism in Multithreaded Programs. In 2012 European Symposium on Programming (ESOP'12), pages 518 -- 538, Tallinn, Estonia, 2012. (PDF)

  5. Yi Lu, John Potter and Jingling Xue. Ownership Downgrading for Ownership Types. In 7th Asian Symposium on Programming Languages and Systems (APLAS'09), pages 144 -- 160, Seoul, 2009. (PDF)

  6. Y. Lu, J. Potter and J. Xue. Validity Invariants and Effects. In 21st European Conference on Object-Oriented Programming (ECOOP'07), pages 202 -- 226, Berlin, 2007. (PDF)

Parallelising Compiler Techniques

  1. Duo Liu, Yi Wang, Zili Shao, Minyi Guo and Jingling Xue. Optimally Maximizing Iteration-Level Loop Parallelism. IEEE Transactions on Parallel and Distributed Systems (TPDS), 23(3):564 -- 572, 2011. (PDF)

  2. Duo Liu, Zili Shao, Meng Wang, Minyi Guo and Jingling Xue. Optimal Loop Parallelization for Maximizing Iteration-Level Parallelism. In International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES'09), pages 67 -- 76, Grenoble, 2009. (PDF)

  3. J. Xue, M. Guo and D. Wei. Improving the Parallelism of Iterative Methods by Aggressive Loop Fusion. Journal of Supercomputing, 43(2):147-164, 2008.

  4. L. Pan, J. Xue, M. Lai, M. Dillencourt and L. Bic. Toward Automatic Data Distribution for Migrating Computations. In 2007 International Conference on Parallel Processing (ICPP'07), Xian, 2007. (PDF)

  5. J. Xue. Aggressive loop fusion for improving locality and parallelism of iterative methods. In 3rd International Symposium on Parallel and Distributed Processing and Applications (ISPA'05), pages 224 -- 238, Nanjing, China, 2005. (Postscript)

  6. J. Xue and W. Cai. Time-minimal tiling when rise is larger than zero. Parallel Computing, 28(6):915--939, 2002. (Postscript)

  7. P. Lenders and J. Xue. Eigenvectors-based parallelisation of nested loops with affine dependences. Parallel Algorithms and Applications, 17(3):227--248, 2002. (Postscript)

  8. P. Tang and J. Xue. Generating efficient tiled code for distributed memory machines. Parallel Computing, 26(11):1369--1410, 2000. (Postscript)

  9. J. Xue. Loop Tiling for Parallelism. Kluwer Academic Publishers, Boston, August 2000.