Adaptive Context-sensitive Analysis for JavaScript
Shiyi Wei and Barbara G. Ryder

Department of Computer Science
Virginia Tech
Blacksburg, VA, USA
{wei, ryder}@cs.vt.edu

        Abstract
Context sensitivity is a technique to improve program analysis precision by distinguishing
between function calls. A specific context-sensitive analysis is usually designed to
accommodate the programming paradigm of a particular programming language. JavaScript
features both the object-oriented and functional programming paradigms. Our empirical study
suggests that there is no single context-sensitive analysis that always produces precise
results for JavaScript applications. This observation motivated us to design an adaptive
analysis, selecting a context-sensitive analysis from multiple choices for each function.
Our two-staged adaptive context-sensitive analysis first extracts function characteristics
from an inexpensive points-to analysis and then chooses a specialized context-sensitive
analysis per function based on the heuristics. The experimental results show that our
adaptive analysis achieved more precise results than any single context-sensitive analysis
for several JavaScript programs in the benchmarks.

1998 ACM Subject Classification D.3.4 Processors, F.3.2 Semantics of Programming Languages,
D.2.5 Testing and Debugging

Keywords and phrases Context Sensitivity, JavaScript, Static Program Analysis

Digital Object Identifier 10.4230/LIPIcs.ECOOP.2015.712


    1    Introduction
JavaScript is the lingua franca of client-side web applications and is becoming the most
popular programming language choice for open source projects [13]. Its flexible programming
paradigm is one of the reasons behind its popularity. JavaScript is known as a dynamic
object-oriented language, supporting prototype-based programming [9, 20], a different model
from class-based languages. However, it also supports features of functional programming
(e.g., first class functions). This flexibility helps in programming JavaScript applications
for different requirements and goals, but it also renders program analysis techniques more
complicated and often ineffective.
    Context sensitivity is a general technique to achieve more precise program analysis by
distinguishing between calls to a specific function. Historically, call-strings and functional
are the two approaches to enable context sensitivity in an analysis [14]. The call-strings
approach, using the information on the call stack as a context element1, was originally
explored by Shivers (known as call-site sensitivity or k-CFA) for functional programming
languages [15] and was later adapted to object-oriented languages such as Java [2]. The
functional approach, using information about the computation state as a context element,
was investigated in several variations for object-oriented languages (e.g., [1, 10]). Several


1
    A context element refers to the label that is used to distinguish between different function calling
    contexts.
            © Shiyi Wei and Barbara G. Ryder;
            licensed under Creative Commons License CC-BY
29th European Conference on Object-Oriented Programming (ECOOP’15).
Editor: John Tang Boyland; pp. 712–734
                   Leibniz International Proceedings in Informatics
                   Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany


studies were performed to compare the precision of different context-sensitive analyses for
Java [7, 16]. These studies revealed that object sensitivity [10], a functional approach that
uses the receiver object represented by its allocation site as a context element, often produced
more precise results than call-site sensitivity in Java applications.
    Despite the challenges of effective analysis for JavaScript programs, different context-
sensitive analyses have been developed to improve precision. Jensen et al. implemented
object sensitivity in the TAJS analysis framework [4]. Several other context-sensitive analyses
were designed for specific features in JavaScript programs (e.g., dynamic property accesses
[19] and object property changes [22]). These approaches have been demonstrated to be
effective when certain program features are present. Recently, Kashyap et al. conducted a
study to compare the performance and precision of different context-sensitive analyses for
JavaScript [5]. Unlike the results for Java analyses [7, 16], the authors concluded that there
was no clear winner among context-sensitive analyses for JavaScript across all benchmarks [5].
    Because JavaScript features flexible programming paradigms and no single context-
sensitive analysis seems best for analyzing JavaScript programs, there are opportunities for
an adaptive (i.e., multi-choice) analysis to improve precision. In this work, we first performed
a fine-grained study that compares the precision of four different JavaScript analyses on
the function level (Section 2.2) on the same benchmarks used by Kashyap et al. [5]. We
observed that JavaScript functions in the same program may benefit from use of different
context-sensitive analyses depending on specific characteristics of the functions.
    The results of our empirical study guided us to design a novel adaptive context-sensitive
analysis for JavaScript that selectively applies a specialized context-sensitive analysis per
function chosen from call-site, object and parameter sensitivity. This two-staged analysis
first applies an inexpensive points-to analysis (i.e., mostly context-insensitive, Section 2.2)
to a JavaScript program to extract function characteristics. Each function characteristic
is relevant to the precision of a context-sensitive analysis of the function (e.g., the call
sites that invoke function foo determine the context elements to be generated if call-site
sensitivity is applied to foo). We designed heuristics according to our observations on the
relationship between function characteristics and the precision of a specific analysis using
the empirical results on the benchmark programs. Finally, an adaptive analysis based on
the heuristics-based selection of a context-sensitive analysis per function is performed.
We have evaluated our new approach on two sets of benchmarks. The experimental results
show that our adaptive context-sensitive analysis produces more precise results than any
single context-sensitive analysis we evaluated for several JavaScript programs and that the
heuristics for selecting a context-sensitive analysis per function are fairly accurate.
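As a rough illustration, the two-staged scheme can be sketched as follows; the function characteristics, thresholds, and names used here are our own simplifications for exposition, not the actual heuristics of Section 3:

```javascript
// Sketch of the two-staged adaptive analysis (illustrative only).
// Stage 1: a cheap, mostly context-insensitive points-to pass reports
// per-function characteristics. Stage 2: a heuristic picks one
// context-sensitive policy per function from those characteristics.
function chooseContextPolicy(fn) {
  // fn.callSites: number of call sites invoking the function
  // fn.receiverObjects: number of distinct abstract receiver objects
  // fn.paramObjects: points-to set sizes, one entry per parameter
  const maxParam = Math.max(0, ...fn.paramObjects);
  if (fn.receiverObjects > fn.callSites && fn.receiverObjects >= maxParam) {
    return "1-object";       // many distinct receivers: object sensitivity pays off
  }
  if (maxParam > fn.callSites) {
    // ith-parameter sensitivity on the parameter with the largest points-to set
    return "parameter-" + (fn.paramObjects.indexOf(maxParam) + 1);
  }
  if (fn.callSites > 1) {
    return "1-call-site";    // several call sites, few receivers and parameters
  }
  return "baseline";         // a single monomorphic call gains nothing from contexts
}
```

The point of the sketch is the structure, not the thresholds: each policy is justified only when the corresponding characteristic would actually generate distinguishing context elements.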
    The major contributions of this work are:
   A new empirical study on JavaScript benchmarks to compare the precision of different
   context-sensitive analyses (i.e., 1-call-site, 1-object and 1st-parameter as defined in Section
   2.1). We measure the precision per function on two simple clients of points-to analysis (i.e.,
   Pts-Size and REF ). We have made several observations: (i) any specific context-sensitive
   analysis is precise only for a subset of programs in the benchmarks. More interestingly,
   any specific context-sensitive analysis is often effective only on a portion of a program.
   (ii) The precision of a context-sensitive analysis also depends on the analysis client.
   These findings motivated and guided us to design a novel JavaScript analysis.
   An adaptive context-sensitive analysis for JavaScript. The heuristics to select from various
   context-sensitive analyses (i.e., 1-call-site, 1-object and ith-parameter) for a function are
   based on the function characteristics computed from an inexpensive points-to solution.
   This adaptive analysis is the first analysis for JavaScript that automatically and selectively





           applies a context-sensitive analysis depending on the programming paradigms (i.e., coding
           styles) of a function.
           An empirical evaluation of the adaptive context-sensitive analysis on two sets of benchmark
           programs. The experimental results show that our adaptive analysis was more precise
           than any single context-sensitive analysis for several applications in the benchmarks,
           especially for those using multiple programming paradigms. Our results also show that
           the heuristics were accurate in selecting an appropriate context-sensitive analysis at
           the function level.

      Overview. Section 2 introduces various context-sensitive analyses and then presents our
      motivating empirical study. Section 3 discusses heuristics that represent the relationship
      between function characteristics and analysis precision. Section 4 describes the design of our
      new analysis. Section 5 presents experimental results. Section 6 further discusses related
      work. Section 7 offers conclusions.

       2      Background and Empirical Study
      In this section, we first introduce the context-sensitive analyses used in our study. We then
      present the new comparisons among different context-sensitive analyses with empirical results
      and observations.

      2.1     Context Sensitivity
      As discussed in Section 1, various context-sensitive analyses have been designed for different
       programming languages. In this section, we discuss only the approaches that are most
       relevant to our empirical study of context-sensitive analysis for JavaScript. We provide
       further discussion of related context-sensitive analyses in Section 6.

      Call-strings approach. A call-strings approach distinguishes function calls using information
      on the call stack. The most widely known call-strings approach is call-site-sensitive (k-CFA)
      analysis [15]. A k-call-site sensitive analysis uses a sequence of the top k call sites on the call
      stack as the context element. k is a parameter that determines the maximum length of the call
      string maintained to adjust the precision and performance of call-site-sensitive analysis. We
      used 1-call-site sensitivity for comparison. 1-call-site-sensitive analysis separately analyzes
      each different call site of a function. Intuitively in the code example below, 1-call-site-sensitive
      analysis will analyze function foo in two calling contexts L1 and L2, such that local variables
      (including parameters) of foo will be analyzed independently for each context element.
       L1: x.foo(p1, p3);
       L2: y.foo(p2, p4);
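To make the context bookkeeping concrete, the following minimal model (our own sketch, not WALA's API) keeps a separate environment for foo's parameters per call-site context element, so the facts from L1 and L2 never merge:

```javascript
// Minimal model of 1-call-site sensitivity: points-to facts for a
// function's locals are stored per context element (the call-site label).
const contexts = new Map(); // context element -> (param index -> Set of abstract objects)

function analyzeCall(callSite, argObjects) {
  if (!contexts.has(callSite)) contexts.set(callSite, new Map());
  const env = contexts.get(callSite);
  argObjects.forEach((objs, i) => {
    if (!env.has(i)) env.set(i, new Set());
    for (const o of objs) env.get(i).add(o);
  });
}

// The two calls from the example: foo is analyzed once per call site,
// so p1's abstract objects flow only into the L1 context.
analyzeCall("L1", [["O_p1"], ["O_p3"]]);
analyzeCall("L2", [["O_p2"], ["O_p4"]]);
```

A context-insensitive analysis corresponds to collapsing this map into a single entry, merging the argument facts of all call sites.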



      Functional approach. A functional approach distinguishes function calls using information
      about the computation state at the call. Object sensitivity analyzes a function separately for
      each of the abstract object names on which this function may be invoked [10]. Milanova et al.
       presented object sensitivity as a parameterized k-object-sensitive analysis, where k denotes
       the maximum length of the allocation-site sequence used to represent an object name. We
       used 1-object
      sensitivity for comparison. 1-object-sensitive analysis separately analyzes a function for each
      of its receiver objects with a different allocation site. Intuitively in the code example above,
      1-object-sensitive analysis will analyze function foo separately if x and/or y may point to


different abstract objects. If x points to objects O1 and O2 , while y points to object O3 ,
1-object-sensitive analysis will analyze function foo for three context elements differentiated
as O1 , O2 and O3 .
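The receiver-based context elements in this example can be computed directly from a points-to map; a small sketch (our own illustration):

```javascript
// 1-object sensitivity: one context element per abstract object that a
// receiver variable of the function may point to.
function objectContexts(ptsTo, receivers) {
  const elements = new Set();
  for (const r of receivers) {
    for (const obj of ptsTo[r] || []) elements.add(obj);
  }
  return [...elements].sort();
}

// The example: x -> {O1, O2} and y -> {O3} yield three contexts for foo.
```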
    Other functional approaches use the computation state of a parameter instead
of the receiver object as a context element [1, 19]. The Cartesian Product Algorithm (CPA)
uses tuples of parameter types as a context element for Self [1]. The context-sensitive analysis
presented by Sridharan et al., designed specifically for JavaScript programs, analyzes a
function separately using the values of a parameter p if p is used as the property name in a
dynamic property access (e.g., v[p]) [19]. To capture these approaches, we define a simplified,
parameterized ith-parameter-sensitive analysis, where i means we use the abstract objects
corresponding to the ith parameter as a context element. We used 1st-parameter sensitivity
for comparisons. Intuitively in the code example above, 1st-parameter-sensitive analysis will
analyze function foo separately if p1 and/or p2 may point to different abstract objects. If p1
points to object O4, while p2 points to objects O4 and O5, 1st-parameter-sensitive analysis
will analyze function foo for two context elements distinguished as O4 and O5.
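The analogous computation for 1st-parameter sensitivity unions the abstract objects of the first argument over all call sites, which is why O4 appears only once; a sketch under the same illustrative conventions as above:

```javascript
// ith-parameter sensitivity: the context elements are the abstract objects
// that the ith argument may point to, unioned over all calls.
function parameterContexts(callsArgs, i) {
  const elements = new Set();
  for (const args of callsArgs) {
    for (const obj of args[i] || []) elements.add(obj);
  }
  return [...elements].sort();
}

// The example: p1 -> {O4} at L1 and p2 -> {O4, O5} at L2 give two
// context elements for foo, O4 and O5.
```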


2.2      Empirical Study
There is no theoretical comparison between the functional and call-strings approaches to
context sensitivity that proves one better than the other. Therefore, we study the precision
of different context-sensitive analyses in practice based on experimental observations. Such
comparisons have been conducted for call-site and object sensitivity on Java [7, 16] as well
as JavaScript [5] applications. Object sensitivity produced more precise results for Java
benchmarks [7, 16], while there was no clear winner across all benchmarks for JavaScript
[5]. The latter observation motivated us to perform an in-depth, fine-grained, function-level
study which led to our design of a new context-sensitive analysis for JavaScript.

Hypothesis. Our hypothesis is that a specific context-sensitive analysis may be more
effective on a portion of a JavaScript program (i.e., some functions), while another kind
of context sensitivity may produce better results for other portions of the same program.
To test this hypothesis, we compared the precision of different context-sensitive analyses at
the function level (i.e., we collect the results of different analyses for a specific function and
compare their precision). In contrast, all previous work [7, 16, 5] reported overall precision
results per benchmark program. The results of our empirical study were obtained on a 2.4
GHz Intel Core i5 MacBook Pro with 8GB memory running the Mac OS X 10.10 operating
system.

Analyses for comparisons. We compared four different flow-insensitive analyses to
study their precision. The baseline analysis is an analysis that applies the default context-
sensitive analysis for JavaScript in WALA2 (i.e., only uses 1-call-site-sensitive analysis for
the constructors to name abstract objects by their allocation sites and for nested functions
to properly access variables accessible through lexical scoping [19]). In the implementation,
the other three analyses (i.e., 1-call-site, 1-object and 1st-parameter) all apply this default
analysis. In principle, these analyses should be at least as precise as the baseline analysis for
all functions.


2
    Our implementation is based on the IBM T.J. Watson Libraries for Analysis (WALA). http://wala.sourceforge.net/






      Analysis clients and precision metrics. We compare precision results on two simple clients
      of points-to analysis. Points-to analysis calculates the set of values a reference property or
      variable may have during execution, an important enabling analysis for many clients. The
      first client, Pts-Size, is a points-to query returning the cardinality of the set of all values
      of all local variables in a function (i.e., the total number of abstract objects pointed to by
      local variables). The second client, REF [22], is a points-to query returning the cardinality
      of the set of all property values in all the property read (e.g., x=y.p) or call statements
      (e.g., x = y.p(...)) in a function (i.e., the total number of abstract objects returned by all
       the property lookups). Because both clients count the number of abstract objects returned
       and the object-naming scheme is unified across different analyses, if an analysis A1 produces a
      smaller result than another analysis A2 for a function foo, we say that A1 is more precise
      than A2 for foo.
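A hypothetical sketch of the Pts-Size metric and the resulting per-function comparison (the variable and function names are ours):

```javascript
// Pts-Size for one function: total number of abstract objects pointed to
// by its local variables. Since all analyses share one object-naming
// scheme, a smaller total means a more precise result for that function.
function ptsSize(localPts) {
  // localPts: { variableName -> array of abstract object names }
  return Object.values(localPts)
    .reduce((n, objs) => n + new Set(objs).size, 0);
}

// A1 is more precise than A2 for this function if its metric is smaller.
function morePrecise(a1, a2) {
  return ptsSize(a1) < ptsSize(a2);
}
```

The REF client would be computed the same way, but over the abstract objects returned by property reads and calls rather than over local variables.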

      Benchmarks. We conduct our comparisons on the same set of benchmarks used for the
      study performed by Kashyap et al. [5]. There are in total 28 JavaScript programs divided
      into four categories: standard (i.e., from SunSpider3 and V84 ), addon (i.e., Firefox browser
      plugins), generated (i.e., from the Emscripten LLVM test suite5 ) and opensrc (i.e., open
      source JavaScript frameworks). There are seven programs in each benchmark category.6 In
      our study, all benchmark programs are transformed with function extraction for correlated
      property accesses [19] before running any analysis. This program transformation preserves
      the semantics of the original programs and may help handle dynamic property accesses more
      accurately [19].
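Function extraction for correlated property accesses [19] rewrites a loop whose body reads and writes the same dynamic property name into a call of a fresh function, turning the property name into a parameter that a parameter-sensitive analysis can track per context. A before/after illustration (our own example; the transformation preserves semantics):

```javascript
// Before: one variable p names every property of src, so a
// context-insensitive analysis conflates all of dst's properties.
function copyBefore(dst, src) {
  for (var p in src) { dst[p] = src[p]; }
  return dst;
}

// After extraction: each iteration calls a fresh function whose parameter
// is the property name, letting a parameter-sensitive analysis keep the
// read and the write of each property correlated.
function copyAfter(dst, src) {
  for (var p in src) {
    (function (name) { dst[name] = src[name]; })(p);
  }
  return dst;
}
```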

      Results. We ran each analysis of a benchmark program under a time limit of 10 minutes.
      The baseline and 1-call-site-sensitive analyses finished analyzing all 28 programs under
      the time limit. 1-object-sensitive analysis timed out on 4 programs (i.e., linq_aggregate,
      linq_enumerable and linq_functional in the opensrc benchmarks and fourinarow in the
      generated benchmarks), while 1st-parameter-sensitive analysis timed out on 2 programs (i.e.,
      fasta and fourinarow in the generated benchmarks).
           Figures 1a and 1b show the relative precision results for Pts-Size and REF, respectively.
      In both figures, the horizontal axis represents the results from four benchmark categories
      (i.e., standard, addon, generated and opensrc) and the vertical axis represents the percentages
      of functions in each benchmark category on which an analysis produces the best results (i.e.,
      more precise results than those from all other three analyses) or equally precise results. We
      consider the results of an analysis as equally precise as follows. (i) Baseline analysis is equally
      precise on a function if its results are as precise as each of the other three context-sensitive
      analyses. (ii) 1-call-site-sensitive, 1-object-sensitive or 1st-parameter-sensitive analysis is
       equally precise on a function if its results are more precise than those of baseline analysis, and
       if the analysis does not produce the best results but the results are at least as precise as the
      other two context-sensitive analyses. This definition indicates that multiple context-sensitive
      analyses (e.g., 1-call-site-sensitive and 1-object-sensitive analyses) may produce equally precise
      results on a function.
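Read literally, that definition yields the following per-function classification (a sketch under our reading; smaller metric values mean more precise results):

```javascript
// Classify each analysis as "best", "equal", or neither for one function.
// results: { baseline, callSite, object, parameter } metric values,
// where a smaller value means a more precise result.
function classify(results) {
  const { baseline, ...ctx } = results;
  const min = Math.min(...Object.values(ctx));
  const labels = {};
  // (i) baseline is "equal" only if it matches every context-sensitive result.
  labels.baseline =
    Object.values(ctx).every(v => baseline <= v) ? "equal" : "none";
  // (ii) a context-sensitive analysis is "best" if strictly more precise than
  // the other three, "equal" if better than baseline and tied for the minimum.
  for (const [name, v] of Object.entries(ctx)) {
    const others = Object.entries(ctx)
      .filter(([n]) => n !== name)
      .map(([, x]) => x);
    if (v < baseline && others.every(o => v < o)) labels[name] = "best";
    else if (v < baseline && v === min) labels[name] = "equal";
    else labels[name] = "none";
  }
  return labels;
}
```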
           For example, the left four bars in Figure 1a present the precision results for the addon


3 https://www.webkit.org/perf/sunspider/sunspider.html
4 http://v8.googlecode.com/svn/data/benchmarks/v7/run.html
5 http://kripken.github.io/emscripten-site/
6 Details of the benchmark programs were provided in Kashyap et al. [5].


[Figure 1a: Relative precision results for the Pts-Size client. Horizontal axis: benchmark
category (addon, opensrc, standard, generated); vertical axis: percentage of functions on
which each analysis is best or equally precise. Legend: baseline_equal, 1-call-site_best,
1-call-site_equal, 1-object_best, 1-object_equal, 1st-parameter_best, 1st-parameter_equal.]

[Figure 1b: Relative precision results for the REF client, with the same axes and legend as
Figure 1a.]