Techniques for Classifying Executions of Deployed Software to Support Software Engineering Tasks

Murali Haran; Alan F. Karr; Michael Last; Alessandro Orso; Adam Porter; Ashish Sanil; Sandro Fouché

doi:10.1109/tse.2007.1004

Abstract

There is increasing interest in techniques that support analysis and measurement of fielded software systems. These techniques typically deploy numerous instrumented instances of a software system, collect execution data when the instances run in the field, and analyze the remotely collected data to better understand the system's in-the-field behavior. One common need for these techniques is the ability to distinguish execution outcomes (e.g., to collect only data corresponding to some behavior or to determine how often and under which conditions a specific behavior occurs). Most current approaches, however, do not perform any kind of classification of remote executions and either focus on easily observable behaviors (e.g., crashes) or assume that outcome classifications are externally provided (e.g., by the users). To address the limitations of existing approaches, we have developed three techniques for automatically classifying execution data as belonging to one of several classes. In this paper, we introduce our techniques and apply them to the binary classification of passing and failing behaviors. Our three techniques impose different overheads on program instances and, thus, each is appropriate for different application scenarios. We performed several empirical studies to evaluate and refine our techniques and to investigate the trade-offs among them. Our results show that 1) the first technique can build very accurate models, but requires a complete set of execution data; 2) the second technique produces slightly less accurate models, but needs only a small fraction of the total execution data; and 3) the third technique allows for even further cost reductions by building the models incrementally, but requires some sequential ordering of the software instances' instrumentation.
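To make the idea of classifying execution data as passing or failing more concrete, here is a minimal sketch in Python. It is not one of the three techniques from the paper, which the abstract does not specify. It trains a simple one-feature threshold classifier (a decision stump) on per-run counters. The counter names, the data, and the helper functions `train_stump` and `classify` are all hypothetical and exist only for illustration.

```python
# Illustrative sketch only: a decision stump over made-up execution counters.
# This is NOT the paper's method; all names and data below are assumptions.

def train_stump(runs, labels):
    """Return (feature, threshold, label_if_above) with the fewest training errors."""
    best = None  # (errors, feature, threshold, label_if_above)
    for feature in runs[0]:
        for threshold in sorted({run[feature] for run in runs}):
            for above in ("fail", "pass"):
                below = "pass" if above == "fail" else "fail"
                errors = sum(
                    (above if run[feature] > threshold else below) != label
                    for run, label in zip(runs, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, above)
    return best[1:]


def classify(model, run):
    """Label a single run as 'pass' or 'fail' using the trained stump."""
    feature, threshold, above = model
    below = "pass" if above == "fail" else "fail"
    return above if run[feature] > threshold else below


# Synthetic execution data: one dict of counters per run (hypothetical names).
runs = [
    {"parse_calls": 12, "retry_calls": 0, "cache_misses": 3},
    {"parse_calls": 9, "retry_calls": 1, "cache_misses": 5},
    {"parse_calls": 14, "retry_calls": 0, "cache_misses": 2},
    {"parse_calls": 11, "retry_calls": 7, "cache_misses": 4},
    {"parse_calls": 10, "retry_calls": 9, "cache_misses": 3},
    {"parse_calls": 13, "retry_calls": 6, "cache_misses": 5},
]
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

model = train_stump(runs, labels)
new_run = {"parse_calls": 10, "retry_calls": 8, "cache_misses": 1}
print(model, classify(model, new_run))
```

In this toy data only `retry_calls` separates the two outcomes, so the stump ends up thresholding on that counter. A realistic setting would involve far more counters and noisier labels, which is where the trade-offs between accuracy and data collection cost described in the abstract come into play.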

Keywords

Computer science; Field (mathematics); Software; Focus (optics); Set (abstract data type); Software system; Data mining; Software engineering; Distributed computing; Programming language

Publication Info

Year: 2007
Type: article
Volume: 33
Issue: 5
Pages: 287-304
Citations: 47
Access: Closed

Cite This

APA Style

Murali Haran, Alan F. Karr, Michael Last et al. (2007). Techniques for Classifying Executions of Deployed Software to Support Software Engineering Tasks. IEEE Transactions on Software Engineering, 33(5), 287-304. https://doi.org/10.1109/tse.2007.1004

Identifiers

DOI: 10.1109/tse.2007.1004