Topic ID: |
1074 | |
Title: |
Understanding and Improving Operational Processes in Large-scale Distributed Systems | |
Supervisor: |
Liming Zhu | |
Research Area: |
Software Engineering, Distributed Systems | |
| Associated Staff | ||
|---|---|---|
Assessor: |
Xiwei Xu | |
| Topic Details | ||
Status: |
Active | |
Type: |
R & D | |
Programs: |
CS CE BIOM BINF SE | |
Group Suitable: |
Yes | |
Industrial: |
Yes | |
Pre-requisites: |
-- | |
Description: |
Most failures of modern large-scale distributed systems happen during sporadic operational processes (such as upgrade, backup, recovery and configuration changes). The overall goal of the project is to understand the characteristics of these complex operational processes so that we can improve the overall dependability of the systems, especially during sporadic operational activities. These operational processes are always a combination of automated and human-intensive processes, which require different types of resources (software artifacts, computation power and humans) to carry out. These resources vary widely in their error-proneness, undoability, availability over time and dependencies. This variation poses significant challenges for choosing and scheduling the resources appropriately and for understanding their impact on overall system dependability. The project will expose students to a set of real-world operational processes for large-scale distributed systems. The project involves modelling these processes in (semi-)formal process languages and investigating analysis methods (and tools) to determine the impact of an operational process on system dependability. | |
Comments: |
Email limingz@cse.unsw.edu.au | |
| Past Student Reports | ||
| Ping AN (s2, 2009): Adaptive Software Process Engineering |
||
Download report from the CSE Thesis Report Library. Note: only current CSE students can log in to view and select reports to download. | ||