Authors Anders Møller, Benjamin Barslev Nielsen,
License CC-BY-3.0
Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript Benjamin Barslev Nielsen Aarhus University, Denmark barslev@cs.au.dk Anders Møller Aarhus University, Denmark amoeller@cs.au.dk Abstract In static analysis of modern JavaScript libraries, relational analysis at key locations is critical to provide sound and useful results. Prior work addresses this challenge by the use of various forms of trace partitioning and syntactic patterns, which is fragile and does not scale well, or by incorporating complex backwards analysis. In this paper, we propose a new lightweight variant of trace partitioning named value partitioning that refines individual abstract values instead of entire abstract states. We describe how this approach can effectively capture important relational properties involving dynamic property accesses, functions with free variables, and predicate functions. Furthermore, we extend an existing JavaScript analyzer with value partitioning and demonstrate experimentally that it is a simple, precise, and efficient alternative to the existing approaches for analyzing widely used JavaScript libraries. 2012 ACM Subject Classification Theory of computation → Program analysis Keywords and phrases JavaScript, dataflow analysis, abstract interpretation Digital Object Identifier 10.4230/LIPIcs.ECOOP.2020.16 Funding This work was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 647544). 1 Introduction JavaScript programs are challenging to analyze statically due to the dynamic nature of the language. One of the main obstacles is the presence of dynamic property access operations that allow objects to be manipulated using object property names that are dynamically computed strings. A typical pattern that has received much attention is correlated read/write pairs [25], a simple variant of which looks as follows: t = x[p]; ... y[p] = t; At run-time, this code copies a property whose name is the value of p from the x object to the y object. If the static analysis does not know precisely the string value of p, then the properties of x will be mixed together in y. Experience with analyzers such as WALA [25, 24, 28], SAFE [17, 22], JSAI [13], and TAJS [11, 2, 26] has shown that when analyzing real-world JavaScript code, including jQuery, Lodash, Underscore and other widely used libraries, such situations often cause an avalanche of spurious dataflow that makes the analysis results useless. If, for example, x is the object {m1: f1, m2: f2, ..., m10: f10} where f1, f2, . . . , f10 are functions, then any subsequent function call, for example y.m3(...), will be treated by the analysis as a call to any of the 10 functions. Several analysis techniques have been proposed to address this challenge. The techniques based on correlation tracking [25], static/dynamic determinacy [24, 2], and loop sensitivity [22] aim to increase precision by the use of context sensitivity or loop unrolling to ensure that © Benjamin Barslev Nielsen and Anders Møller; licensed under Creative Commons License CC-BY 34th European Conference on Object-Oriented Programming (ECOOP 2020). Editors: Robert Hirschfeld and Tobias Pape; Article No. 16; pp. 16:1–16:28 Leibniz International Proceedings in Informatics Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany 16:2 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript the analysis has precise information about p in the example above. Although this approach works well in many cases, the aggressive use of context sensitivity or loop unrolling can be expensive on analysis time. Even more importantly, it falls short when p is not determinate (i.e., when its value is not fixed even when the call context is known). An important step forward is the approach used in the CompAbs analyzer, which is built on SAFE [16]: Even if p is imprecise, the loss of precision at the property write operation can be avoided by applying trace partitioning [23] at the property read operation, based on which properties exist on x. Intuitively, it is often not necessary to have precise information about p; instead we can refine the current abstract state into a collection of more precise partitions, one for each of the 10 properties of x (plus one extra for the case where p is none of those strings, but let us ignore that for now), and after the property write operation merge them again. (The same idea was used earlier in TAJS, but only at for-in loops, not at dynamic property reads [2].) This approach, however, also has drawbacks. Trace partitioning is expensive, so it must be used scarcely: in the example, the code between the dynamic property read and the dynamic property write is essentially analyzed 10 times. For this reason, CompAbs relies on a syntactic pre-analysis to recognize different kinds of correlated read/write pairs for guiding the creation and merging of partitions. Recent work [26] has shown that the syntactic pre-analysis approach of CompAbs is too fragile, for example, it is incapable of analyzing the Lodash library (see Section 2), and demand-driven value refinement has been proposed as an alternative. Instead of relying on context sensitivity, loop unrolling, or trace partitioning, that approach applies, during the analysis when encountering a dynamic property write operation with an imprecise property name, a separate backwards analysis to regain the relation between the property name and the value to be written. Although demand-driven value refinement has been shown to work quite well in practice, building a backwards analysis for the full JavaScript language and its standard library is a major endeavor, so developing simpler alternatives is desirable. Our approach builds upon the observation from CompAbs that sufficient precision can be obtained using trace partitioning based on the properties of the object being read. Our key insight is that we do not need to partition the entire abstract state as done by CompAbs: It suffices to only partition the abstract values for the property name p and the value being read x[p] in the above example. This means that instead of analyzing the code 10 times, we only analyze it once, but using partitioned abstract values that retain the correlation between p and x[p]. The partitioned abstract values are introduced at t = x[p] and used at y[p] = t by means of specialized transfer functions. We refer to this variant of trace partitioning as value partitioning. Since partitioning individual abstract values does not increase the analysis complexity as much as partitioning entire states, it becomes feasible to apply value partitioning more extensively, at every dynamic property read where the property name is imprecise, thereby obviating the need for the syntactic pre-analysis. In this paper we present a theoretical framework for value partitioning, together with three instantiations: property-name partitioning (which is the one used in the example above), free-variable partitioning (to improve precision for free variables of closures), and type partitioning (to improve precision for predicate functions). Additionally, we extend the static analyzer TAJS with all three kinds of value partitioning and demonstrate that the approach is effective for analyzing popular JavaScript libraries. Value partitioning is a lightweight alternative to the existing approaches to relational analysis for JavaScript: Compared to CompAbs-style trace partitioning it avoids many redundant computations caused by similarities between different partitions, and compared to demand-driven value refinement it avoids the need for creating a separate backwards analysis. B. B. Nielsen and A. Møller 16:3 1 function mixin(object, source) { 2 baseFor(source, function (func, methodName) { 3 if (!isFunction(func)) 4 return; 5 object[methodName] = func; 6 if (isFunction(object)) 7 object.prototype[methodName] = function() { 8 ... 9 func.apply(...); 10 } 11 }); 12 } 13 14 function baseFor(source, iteratee) { 15 Object.keys(source).forEach(function (key) { 16 iteratee(source[key], key); 17 }); 18 } 19 20 // usage of mixin during initialization 21 mixin(lodash, lodash); Figure 1 Motivating example based on code from the Lodash library. In summary our contributions are: Value partitioning: a general static analysis technique that is capable of reasoning about relations between abstract values. Three instantiations of value partitioning, which tackle different challenges in static analysis for JavaScript, each involving relational properties: property-name partitioning: relations between dynamically computed object property names and values; free-variable partitioning: relations between functions and their free variables; and type partitioning: relations between arguments and return values of predicate functions. Experimental results: We show that value partitioning makes TAJS more precise than CompAbs [16] for several real-world JavaScript libraries, including Lodash, which is the most widely used library. The resulting precision is comparable to (and in case of the Lodash4 benchmark group substantially higher than) that of demand-driven value refinement [26], without the need for a separate backwards analysis. 2 Motivating Example and Overview Figure 1 shows a small code example based on Lodash (version 4.17.10), which is the most depended-upon of all npm packages.1 Lines 1–12 define the function mixin, which copies all function properties from source to object. If object is a function, a new function (which 1 Lodash (https://lodash.com/) has more than 115 000 dependents in npm and more than 27 million weekly downloads as of May 2020. ECOOP 2020 16:4 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript on invocation calls the function to be copied) is also copied to object.prototype, such that instantiations of object (using the keyword new) also will have these functions. In line 21, which is executed during the initialization of Lodash, mixin is called with the library object as both arguments. The function mixin uses a helper function baseFor defined in lines 14– 18. It is called with source and a callback function defined in lines 2–11. The baseFor function then gets all the object property names from the source object using Object.keys, and the callback function is called (line 16) for each property name and corresponding property value. Line 3 checks whether func is a function. If so, the function is copied to object[methodName] in line 5. Note that func actually is the value source[methodName]. Line 6 checks whether object is also a function and if so, a new function is declared and written to object.prototype[methodName] in line 7. When invoked, that new function calls func using func.apply(...) in line 9. Such complex code is not unusual in modern JavaScript libraries. For a static analysis reasoning about the dataflow in this code, the correlation between methodName and func is critical. An analysis that loses track of this correlation will mix together all the properties of the library object lodash when analyzing the call mixin(lodash, lodash) in line 21. As a consequence, if the program being analyzed contains a call to, for example, lodash.map, that will be treated by the analysis as a call to any of Lodash’s more than 100 different functions, not only the actual map function, thereby triggering an avalanche of spurious dataflow. Existing approaches Existing JavaScript analyzers do not have precise information about the value of key in line 16, for various different reasons. (Most importantly, Object.keys produces an array of property names in unspecified order.) Previous work has suggested two approaches to analyze such code precisely even when key is imprecise. The CompAbs [16] approach uses trace partitioning guided by syntactic patterns. If trace partitioning is used at the dynamic property read operation in line 16, the abstract state is partitioned into a set of refined abstract states corresponding to the properties of the source object. This way the value of key is precise in each of those states, and the call in line 16 is analyzed separately for each of them. Trace partitioning, however, is expensive, so CompAbs limits the use of trace partitioning according to certain syntactic patterns. At this specific dynamic property operation, CompAbs chooses not to apply trace partitioning and fails to detect that the relation between methodName and func is important. The second approach is demand-driven value refinement [26], which can analyze the example code with sufficient precision to avoid mixing together the Lodash functions. With this approach, the analysis detects imprecision at the dynamic property write in line 5: methodName is an imprecise string and func can be many different functions. It then queries a backwards abstract interpreter asking for the possible value of methodName for each of the functions. The backwards analysis returns a precise property name for each function and thereby enables the dynamic property write operation to be modeled precisely. For the dynamic property write in line 7, the function defined in lines 7–10 is written to all properties of object.prototype, but the abstract value being written is augmented, such that the value of methodName remains precise. When reading func in line 9, the backwards analysis is queried to get the value of func relative to the value of methodName, thereby retrieving a precise value for func. This ensures the desired precision, but the approach requires a complicated backwards analysis. B. B. Nielsen and A. Møller 16:5 Value partitioning We will now informally explain how value partitioning can provide similar precision as demand-driven value refinement, but without the need for a backwards abstract interpreter. With traditional trace partitioning, as used by, for example, CompAbs, the analysis can track multiple abstract states for each program point, such that the different abstract states cover different assumptions about the execution paths that lead to that point. (Correlation tracking [25], determinacy-based analysis [24, 2], and loop sensitivity [22] can also be viewed as variations of trace partitioning.) The key idea behind value partitioning is that we can obtain a similar effect as trace partitioning by instead performing the partitioning at the level of individual abstract values. In principle, the resulting abstract domain is isomorphic to a traditional trace partitioning domain, but this approach provides more flexibility for using different kinds of partitioning for different parts of the abstract states. This general idea can be instantiated in multiple ways to track different kinds of relational properties. We next describe three instantiations that enable precise analysis of challenging JavaScript code, including the Lodash example. Property name partitioning One instantiation is property name partitioning, which performs partitioning at dynamic property reads, similar to the CompAbs technique, but on abstract values instead of abstract states. To illustrate this mechanism by example, consider the read operation in line 16 and the correlated write operation in line 5. Assume for simplicity that the source ob- ject has only two properties, {map: f1, trim: f2} where f1 and f2 are functions, and methodName is an abstract value that overapproximates all valid property names. When read- ing source[methodName], an analysis without value partitioning will read all the properties of source. When using value partitioning, we instead partition this value according to the prop- erty names of source, meaning that we obtain a value [t1 7→ f1, t2 7→ f2, t3 7→ undefined] where t1 , t2 , and t3 represent different partitions.2 Intuitively, t1 represents the execution traces where the property name being read is map, t2 similarly represents traces where the property name being read is trim, and t3 represents all other traces. We similarly write the partitioned value [t1 7→ "map", t2 7→ "trim", t3 7→ AnyString] to methodName.3 In this way, the resulting abstract state retains the correlation between the values of methodName and source[methodName]. Later the analysis reaches the write operation object[methodName] = func, with an abstract state where methodName is [t1 7→ "map", t2 7→ "trim", t3 7→ AnyString] and func is [t1 7→ f1, t2 7→ f2, t3 7→ undefined]. Since the property name and the value to be written have the same partitions, we can perform the dynamic property write separately for each partition, meaning that f1 is written to the map property, and analogously for the other two partitions, thereby avoiding mixing together the properties. Since the partitioning is performed at the value level, unlike traditional trace partitioning we do not need any extra call contexts to the callback function defined in line 2, so the overhead of value partitioning is negligible, even when the correlated read/write pairs span multiple functions. For this reason, we can apply property name partitioning at all dynamic property reads where the property name is imprecise, without the use of syntactic patterns. 2 In JavaScript, reading an absent property yields the special value undefined. 3 AnyString is an abstract value that represents any string. In practice we instead use a slightly more precise abstract value representing AnyString\{"map", "trim"}. ECOOP 2020 16:6 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript Free variable partitioning A second instantiation of value partitioning is for handling free variables more precisely. In the example, this is useful for func in line 9, which is a free variable in the function defined in lines 7–10. At that function definition, we partition both the resulting abstract function value ` and the abstract value of func according to the existing partitioning of func, intuitively to be able to distinguish functions created with different values of the free variable. This means that the function value being written at the dynamic property write in line 7 is [t1 7→ `t01 , t2 7→ `t02 , t3 7→ `t03 ] where `t01 represents the function created at a point where func is f1 (i.e., that point is at the end of a t1 trace), and similarly for the other partitions. At the same time, the value of func becomes [t1 7→ f1, t2 7→ f2, t3 7→ undefined, t01 7→ f1, t02 7→ f2, t03 7→ undefined] where the three new partitions t01 , t02 , and t03 denote the new partitioning we have made (one abstract value can thus have multiple partitionings simultaneously). Using the property name partitioning mechanism described above, at the dynamic property write in line 7, `t01 is written to the map property of object.prototype, and similarly for the other properties. We can exploit the free variable partitioning information when the function is later called. Assume the analysis encounters a call to the map method. The abstract value of lodash.prototype.map is then `t01 . We now use t01 as a context in ordinary context sensitive analysis of the function, so that when reaching func in line 9, it suffices to consider only the t01 partition of func, which yields the precise value f1, so again, we successfully avoided mixing together the properties. Type partitioning The above two uses of value partitioning are sufficient for analyzing the motivating example without critical precision losses, but we can make the analysis even more precise using a third variant. The function named isFunction used in the branch condition in line 6 is a typical example of a predicate function, i.e., a one-parameter function that returns a boolean, in this case testing whether the value passed in is a function. Assume the abstract value of the argument object is fun1|obj2, meaning that it represents either a function fun1 or a non-function object obj2. With a simple analysis, the abstract return value and hence the branch condition is Bool representing any boolean value, so the analysis does not know that object cannot be obj2 inside the branch. This causes the analysis to spuriously raise a type error when writing to object.prototype in line 7. Type partitioning avoids that imprecision as follows. Type partitioning is triggered at any call to a function with one argument, and partitions that argument according to its types. In this case, the value of object is partitioned into [a 7→ fun1, b 7→ obj2]. The result value from isFunction then becomes [a 7→ true, b 7→ false], which we can exploit using ordinary control sensitivity [10] (also called type refinement [14]) at the “true” branch such that object in line 7 will only be fun1 and not obj2. Overview In Section 3 we give a brief introduction to the analysis domain of TAJS. Section 4 explains the general value partitioning mechanism, and Section 5 details the three instantiations: property name partitioning, free variable partitioning, and type partitioning. Section 6 describes our experimental evaluation, and Section 7 discusses related work. B. B. Nielsen and A. Møller 16:7 r1 [r2 ] ← r3 : Writes r3 to the property named r2 of the object r1 r1 ← r2 [r3 ]: Reads the property named r3 of the object r2 to r1 r1 ← x: Reads the value of the variable x to r1 x ← r1 : Writes r1 to the variable x r1 ← c: Assigns the constant c to r1 r1 ← function(x){· · · }: Creates a closure for the function and stores it in r1 if(r1 ): Conditionally propagates dataflow (to model if and while) r1 ← r2 (r3 ): Calls the function r2 with argument r3 and stores the result in r1 r1 ← r2 ⊕ r3 : Computes the binary operation r2 ⊕ r3 and stores the result in r1 Figure 2 The main flow graph instructions in TAJS. X ∈ AnalysisLattice = L → State n ∈ N : Nodes σ ∈ State = (L → Obj) × Registers c ∈ C : Contexts o ∈ Obj = P → Value p ∈ P : Property names r ∈ Registers = R → Value ` ∈ L = N × C : Locations v ∈ Value = Prim × P(L) Figure 3 Simplified abstract domain. 3 Background: The TAJS Analyzer In this section we give a brief introduction to a heavily simplified version of the analysis domain and program representation used in TAJS [11, 2], which lays the foundation for our extensions in the following sections. TAJS is an open-source dataflow analysis tool for JavaScript built as a monotone frame- work [12]. A JavaScript program is represented as a control flow graph for each function, with nodes representing primitive instructions of the different kinds listed in Figure 2. Each instruction operates on registers, which can be thought of as special local variables. For simplicity, we ignore this and receiver objects at calls, and we assume all functions have only one parameter. As an example, the single JavaScript statement y[p] = x[p] is represented as six flow graph nodes as shown in Figure 4. The components of the abstract domain are summarized in Figure 3. A location is a pair of a node and a context. The contexts allow for context sensitivity (using the context-sensitivity strategy described by Andreasen and Møller [2]). The main abstract domain, AnalysisLattice, is a lattice that maps locations to abstract states, where each state contains abstract values of object properties and registers. Objects are modeled using context-sensitive allocation-site abstraction [6, 20], so abstract object addresses are simply locations.4 Functions are special kinds of objects. Abstract values are modeled using a product of a constant-propagation lattice [15] named Prim of primitive values (strings, numbers, etc.) and a powerset lattice of object addresses. The analysis is control sensitive by pruning infeasible dataflow at if nodes. This includes not only eliminating flow along unreachable branches, for example when a branch condition is definitely false [27], but also filtering abstract values based on the branch condition [10, 14]. 4 TAJS models absence/presence of object properties and uses two artificial properties DefaultNumeric and DefaultOther to model properties with unknown numeric/non-numeric names; we ignore that here. ECOOP 2020 16:8 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript n1 : r1 ← y n2 : r2 ← p n3 : r3 ← x n4 : r4 ← p n5 : r5 ← r3 [r4 ] n6 : r1 [r2 ] ← r5 Figure 4 Fragment of a control flow graph, for the single statement y[p] = x[p]. As an example, the JavaScript code if(z) is represented by two primitive instructions, r6 ← z and if(r6 ). In the “true” branch, not only r6 but also z must have the value true.5 To track the connection between r6 and z, a simple intraprocedural must-equals analysis is performed alongside the main dataflow analysis. We leverage this mechanism in Section 5, for example to obtain the information that r2 , r4 , and p must have the same value at the property read operation in Figure 4 (unless a property accessor changes p). To keep Figure 3 simple, we omit the must-equals information in the description of the State lattice. In the following sections, with a slight abuse of notation we let σ(r) denote the value of register r in state σ, and similarly, σ(x) denotes the value of variable x. Also, we use the notation σ(r) := . . . to describe the operation of writing a given value to register r and also to the variables and registers that are equal to r according to the must-equals information. If ` ∈ L is a location representing an object address, we sometimes write ` for the abstract value (⊥, {`}) ∈ Value. Similarly, for abstract values that represent primitive values only, we omit the location sets, for example, "foo" denotes the abstract value ("foo", ∅) ∈ Value. We omit many details of TAJS, including the definitions of the concretizations of the lattice elements, the definitions of the transfer functions for the different instructions, how values of variables are being stored in special activation objects, and how a call graph is built during the analysis. Analyzing full JavaScript also requires reasoning about prototypes, scope chains, implicit type conversions, exceptions, the standard library, property accessors (getters and setters), and much more. It suffices to know that the resulting abstract states soundly overapproximate the possible program behavior [7]. A trace is a concrete execution of the program expressed as a finite sequence of pairs (`, γ) where ` is a location and γ is a concrete state, starting at the program entry point with the initial call context in an empty state. The semantics of a program is defined as a set of traces. The collecting semantics is the program semantics projected onto the program locations: Given a location `, the collecting semantics for `, denoted J`K, is the set of states that appear at ` in the set of traces defined by the program semantics. The analysis result is thus a lattice element X ∈ AnalysisLattice such that J`K is a subset of the concretization of X(`) for all locations ` ∈ L. 5 In actual JavaScript, the value must be truthy, which also includes nonempty strings, nonzero numbers, and objects. B. B. Nielsen and A. Møller 16:9 t ∈ T : Partition tokens o ∈ Obj = P → PartitionedValue r ∈ Registers = R → PartitionedValue pv ∈ PartitionedValue = T ,→ Value Figure 5 Extension of the abstract domain for value partitioning. 4 Value Partitioning To prepare the analysis for value partitioning, we introduce a set T of partition tokens and replace occurrences of Value by PartitionedValue in the abstract domain, as shown in Figure 5. A partitioned value is a partial map from partition tokens to ordinary values. We use the notation [t1 7→ v1 , . . . , tk 7→ vk ] (or set-builder notation like [ti 7→ vi | i = 1, . . . , k]) to denote the partitioned value that maps ti to vi for each i = 1, . . . , k and is undefined for all other partition tokens. The partition tokens play a similar role as in trace partitioning [23], but at the level of abstract values. (We explain the differences between value partitioning and traditional trace partitioning in more detail in Section 7.) A partition token intuitively represents a set of execution traces. The special token any represents all traces, so the partitioned value [any 7→ v] has the same meaning as the ordinary value v in the original abstract domain. As an invariant, all partitioned values we use are defined for the token any.6 We extend partitioned values to be total functions pv : T → Value by defining pv(t) = pv(any) when t∈ / dom(pv).7 Assume X ∈ AnalysisLattice is the result of analyzing a given program, σ = X(`) is the abstract state at some location `, and [. . . , t 7→ v, . . . ] = σ(r) is the partitioned value of some register r. The meaning of such a partitioned value is that for any trace that ends at ` and is in the set of traces represented by t, the concrete value of r is in the concretization of the abstract value v. A covering 8 at a location ` is a set of partition tokens where the union of the sets of traces they represent is the set of all traces that lead to `. This means that if σ(x) = [. . . , t1 7→ v1 , ..., tk 7→ vk , . . . ] where σ = X(`) for some program variable x at location ` where {t1 , . . . , tk } is a covering, then for every concrete state in J`K, the value of x is in the concretization of at least one of the abstract values v1 , . . . , vk . For the initial abstract state at the program entry, all partitioned values use the trivial covering {any}. Now that we have generalized the abstract domain, it is easy to adjust all transfer functions for the different kinds of nodes to operate on partitioned values instead of ordinary values. As an example, the original transfer function for r1 ← r2 ⊕ r3 updates a given abstract state σ by σ(r1 ) := σ(r2 )⊕σ(r3 ) (where ⊕ applied to abstract values works as in constant propagation).9 6 When we define a partitioned value [ti 7→ vi | i = 1, . . . , k] without an any token, an any partition is implicitly created with value v1 t · · · t vk . 7 In trace partitioning terminology, this use of any corresponds to a simple pre-ordering of partition tokens. 8 For formal definitions of the notions of traces and coverings, see Rival and Mauborgne [23]. Basing our approach on partitions instead of coverings (a partition is a covering where all the trace sets are disjoint) could improve precision but would complicate the analysis without much practical benefit. 9 The actual TAJS analysis also models implicit type conversions. ECOOP 2020 16:10 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript T :: = any (Section 4) | valhN, R, Valuei (Section 5.1) | funhF, C, T i (Section 5.2) | typehN, R, Typesi (Section 5.3) Types :: = undefined | null | number | string | boolean | object | array | function | regexp Figure 6 Partition tokens used by property name partitioning, free variable partitioning, and type partitioning. When switching to the domain with partitioned values, we simply replace σ(r2 ) ⊕ σ(r3 ) by [t 7→ pv 2 (t) ⊕ pv 3 (t) | t ∈ dom(pv 2 ) ∪ dom(pv 3 ) where pv 2 = σ(r2 ) and pv 3 = σ(r3 )]. The other transfer functions and least-upper-bound are adapted similarly. A small example can illustrate how partitioning can make the analysis relational. Assume the binary operation is equality, r1 ← r2 == r3 , and that we have two partitions, t1 and t2 , where both registers r2 and r3 have the value 42 in partition t1 , and both have the value "foo" in partition t2 . With partitioning, the value of r1 becomes [t1 7→ true, t2 7→ true] (i.e., definitely true), whereas without partitioning, r2 and r3 both have the value 42|"foo", so the value of r1 becomes AnyBool (i.e., true or false). To get any advantage of the new abstract domain, we of course need to modify specific transfer functions to selectively introduce partition tokens and further exploit the extra information available regarding relational properties between values. We show how that can be accomplished in Section 5. Those mechanisms rely on some general operations for manipulating the partitions in partitioned values. Most importantly, we use an operation ] when introducing new coverings: pv 1 ] pv 2 where pv 1 , pv 2 ∈ PartitionedValue denotes the combined partitioned value. For each token that is only present in one of pv 1 or pv 2 , the new value will be the value for that token, and for each token shared by pv 1 and pv 2 , the new value will be the join of the two respective values. 5 Three Instantiations of Value Partitioning We now present three instantiations of the value partitioning framework. Each of them targets a category of relational properties that are relevant to analysis of JavaScript libraries. Each instantiation introduces a family of partition tokens, as shown in Figure 6, along with some modification of the analysis transfer functions. Each partition token represents a set of traces, as explained in the following. 5.1 Property Name Partitioning The first use of value partitioning is for improving precision at correlated object property read/write operations as in the motivating example. Partition tokens for property name partitioning We introduce a family of partition tokens, valhn, r, vi, where n ∈ N , r ∈ R, and v ∈ Value. Such a token represents the set of traces where at the last occurrence of n, the value of register r is v. In all valhn, r, vi tokens we use in property name partitioning, the node n B. B. Nielsen and A. Møller 16:11 σ(r3 ) ] [valhn, r3 , pi 7→ p | p ∈ propNames(σ(r2 ))] σ(r3 ) := ] [valhn, r3 , otheri 7→ AnyString] if σ(r3 )(any) = AnyString σ(r ) otherwise 3 [valhn, r3 , pi 7→ readProp(σ(r2 )(any), p) | p ∈ propNames(σ(r2 ))] σ(r1 ) := ] [valhn, r3 , otheri 7→ undefined] if σ(r3 )(any) = AnyString [any 7→ readProp(σ(r )(any), σ(r )(any))] otherwise 2 3 Figure 7 Introduction of partitioned values at a dynamic property read node n of the form r1 ← r2 [r3 ]. is a property read node (i.e., of the form r1 ← r2 [r3 ]), the register r is the one holding the property name in that instruction (i.e., r3 in r1 ← r2 [r3 ]), and the value v is a property name (i.e., an element of P ).10 As an example, assume the code from Figure 4 appears inside a loop, and consider the following two traces that both end at n6 : τa = · · · (n1 , γ1a )(n2 , γ2a )(n3 , γ3a )(n4 , γ4a )(n5 , γ5a )(n6 , γ6a ) and τb = · · · (n1 , γ1b )(n2 , γ2b )(n3 , γ3b )(n4 , γ4b )(n5 , γ5b )(n6 , γ6b ) where each “· · · ” is a trace prefix leading from the program entry point to this part of the code, γ1a , . . . , γ6b are concrete states, and τa is a prefix of τb . The last occurrence of n5 (which is the instruction r5 ← r3 [r4 ]) is emphasized in each of the traces. Also assume that the value of the register r4 is "foo" in γ5a and "bar" in γ5b . Note that r4 is the register holding the property name at the n5 instruction. In this situation, the token valhn5 , r4 , "foo"i represents τa but not τb . Dynamic property reads Figure 7 shows the modified transfer function for read-property nodes, r1 ← r2 [r3 ]. The function readProp(v1 , v2 ) looks up the abstract value of properties named v2 in the abstract objects pointed to by v1 in the current state σ.11 Property name partitioning is triggered if the property name is not precise (here modeled as AnyString), so in that case we partition the property name r3 with respect to the properties that appear in the abstract objects pointed to by r2 (expressed as propNames(σ(r2 ))), and perform the property read for each partition to obtain a partitioned value for r1 . We use the artificial abstract value other ∈ Value to represent all other properties; for that partition, the result value becomes undefined.12 If 10 In JavaScript, property names are either strings, which we model in the sub-lattice Prim, or symbols, which can be modeled as special heap locations. 11 Reading an object property is a nontrivial operation in JavaScript because of prototypes, getters, and implicit type conversions. Importantly, the value partitioning mechanism is orthogonal to such JavaScript technicalities. 12 In our implementation we use a more precise string lattice, which allows us to express more precisely that σ(r3 ) for the valhn, r3 , otheri partition is AnyString\propNames(σ(r2 )), i.e., any string except for the property names that are covered by other partitions. See also footnote 3. ECOOP 2020 16:12 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript for each t ∈ chooseCommonCovering(σ(r2), σ(r3 )): writeProp σ(r1 )(any), σ(r2 )(t), σ(r3 )(t) Figure 8 Exploiting partitioned values at a dynamic property write node, r1 [r2 ] ← r3 . the property name r3 is already precise (corresponding to the “otherwise” cases), there is no need to introduce new partitions, so in that case r3 is unmodified and the result value r1 is obtained directly using readProp and the any partition token. Recall that a valhn, r, pi token represents the set of traces where at the last occurrence of n, the value of register r is v. To respect this property we need to remove all existing valhn, _, _i tokens from the abstract state before applying the transfer function for dynamic property reads. (This is safe because every abstract value still has other coverings, in particular {any}.) Notice that for both r3 and r1 , the new partitions use the partition tokens valhn, r3 , pi where n is the read-property node. Evidently, the new partition tokens form a covering. Also, this new transfer function respects the interpretation of the newly added valhn, r, pi tokens, and due to the partitioning, the resulting abstract states maintain the relation between the involved object property names and values. Dynamic property writes Next, we modify the transfer function for dynamic property writes, r1 [r2 ] ← r3 , as shown in Figure 8, to take advantage of the partitionings introduced at dynamic property reads. The function writeProp(v1 , v2 , v3 ) writes v3 to the properties named v2 in the objects referred to by v1 .13 The function chooseCommonCovering finds a covering shared by the property name σ(r2 ) and the value to be written σ(r3 ). (An example is given below.) If multiple such coverings exist, a largest one (i.e., one with the largest number of partition tokens) is selected.14 Recall that the two values always share the {any} covering, which will be used if no other covering exist. When a covering has been chosen, the value is written to the appropriate object property for each partition, thereby exploiting the relational information. In case the {any} covering is chosen, the transfer function behaves as the original version without value partitioning. Example To better understand property name partitioning, we now explain the mechanism in more detail on the example given in Figure 4. Let us assume that σ(p) = [any 7→ AnyString], σ(x) = [any 7→ obj 2 ] and σ(y) = [any 7→ obj 1 ] where in state σ, obj 1 is the location of an empty abstract object and obj 2 is the location of an abstract object with two properties, {foo: 1, bar: 2}. This means when analyzing the read property node r5 ← r3 [r4 ] we have σ(r3 ) = [any 7→ obj 2 ] and σ(r4 ) = [any 7→ AnyString]. Since the property name r4 is 13 We omit the details of how the implementation of writeProp in TAJS handles strong/weak updates, setters, and implicit type conversions. Importantly, the value partitioning mechanism is orthogonal to such JavaScript technicalities. 14 Multiple coverings can arise if, for example, the same property name is used at two different property read operations. We choose the largest covering based on the heuristic that fine-grained coverings lead to higher precision than coarse-grained coverings. The most important consequence of this heuristic is that we avoid the {any} covering if others are available. In case of multiple largest ones, an arbitrary one is selected among them. B. B. Nielsen and A. Møller 16:13 pv ∈ PartitionedValue = T ,→ FPValue fv ∈ FPValue = FunctionPartitions × Value fp ∈ FunctionPartitions = P(T ) AnalysisLattice = C 0 × N → State c ∈ C 0 = FunctionPartitions × C Figure 9 Extensions of the abstract domain for free variable partitioning. imprecise, the first case in each definition in Figure 7 applies, meaning that value partitioning is triggered. Since propNames(σ(r2 )) = {"foo", "bar"}, we update r4 such that σ(r4 ) equals [valhn, r4 , "foo"i 7→ "foo", valhn, r4 , "bar"i 7→ "bar", valhn, r4 , otheri 7→ AnyString], where n is the read property node. Recall from Section 3 that the operation σ(r4 ) := . . . does not only modify r4 but also the must-equals variables and registers, meaning that this partitioned value is also written to r2 and p. The value being read gets the same partitions, such that σ(r5 ) becomes [valhn, r4 , "foo"i 7→ 1, valhn, r4 , "bar"i 7→ 2, valhn, r4 , otheri 7→ undefined]. When reaching the property write operation r1 [r2 ] ← r5 , the state σ contains σ(r2 ) = [valhn, r4 , "foo"i 7→ "foo", valhn, r4 , "bar"i 7→ "bar", valhn, r4 , otheri 7→ AnyString] and σ(r5 ) = [valhn, r4 , "foo"i 7→ 1, valhn, r4 , "bar"i 7→ 2, valhn, r4 , otheri 7→ undefined]. We now apply the transfer function from Figure 8. The two values σ(r2 ) and σ(r5 ) share two coverings: {any} and {valhn, r4 , "foo"i, valhn, r4 , "bar"i, valhn, r4 , otheri}. Since the second covering is largest, that one is picked by chooseCommonCovering(σ(r2 ), σ(r5 )). We therefore perform three writes corresponding to the abstract assignments obj 1["foo"]=1, obj 1["bar"]=2, and obj 1[AnyString]=undefined; notably, the properties foo and bar are not mixed together. 5.2 Free Variable Partitioning We now explain how to leverage value partitioning to gain precision for free variables, such as func in line 9 in the motivating example from Figure 1. Extending the abstract domain The first step is to extend the abstract domain as shown in Figure 9. The Value component in PartitionedValue is replaced by FPValue, which is a product of FunctionPartitions and Value. The component FunctionPartitions is a set of partition tokens, which we use for tracking which partitions the functions described in the Value component may have been declared in. (For instance, for the motivating example from Figure 1, the function declared in lines 7–10 was created in the partitions t01 , t02 , and t03 so the corresponding abstract values become15 ({t01 }, `), ({t02 }, `), and ({t03 }, `), where ` denotes the location for the created closure.) To preserve this information when analyzing calls to such functions, we also augment the set of contexts to include this information (replacing C by C 0 in AnalysisLattice). The FunctionPartitions set is empty for values and contexts that do not use free variable partitioning. 15 These three abstract values are denoted `t01 , `t02 , and `t03 , respectively, in the motivating example in Section 2. ECOOP 2020 16:14 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript pv(t) if t ∈ dom(pv) otherwise if t = funhf, c, t0 i ∧ ⊥ pv(t) = ∃c0 , t00 : c 6= c0 ∧ funhf, c0 , t00 i ∈ dom(pv) pv(any) otherwise Figure 10 Redefining how partitioned values are extended to total functions, exploiting free variable partitioning. Next, we introduce a new kind of partition tokens, and we then describe how elements of FunctionPartitions are created at function expressions and used at read variable nodes. Partition tokens for free variable partitioning We introduce a new kind of partition tokens, funhf, c, ti, where f is a function, c ∈ C 0 is a context, and t ∈ T is a partition token. A trace is represented by such a token if (1) the trace ends at a program location that belongs to a closure that was created when the trace up to that point was a t trace, and (2) that point in the trace is in function f in context c. (For instance, in the motivating example, a trace ending in line 9 where the currently executed closure was created in line 7 at the end of a t1 trace can be represented by funhf, c, t1 i, where f is the function at lines 2–11 and c is the context for the call to that function.) We only allow such partition tokens to appear in abstract values of variables that are declared in f . Intuitively, we use these partition tokens to obtain a form of heap specialization (also called heap cloning or context sensitive heap) [20] for the activation objects of f .16 An important property is that if the abstract value of a variable x declared in a function f contains partition token funhf, c0 , t00 i for some c0 , t00 but not funhf, c, t0 i for any c, t0 where c=6 c0 , then f has not been invoked with context c in any trace represented by funhf, c, t0 i. This means that it is safe to redefine how partitioned values are extended to total functions as shown in Figure 10. The only difference between the new and the original definition from Section 4 is the second case, where ⊥ is returned to indicate that the set of traces for the given partition is empty due to the above mentioned property being satisfied. Function definitions Assume the analysis reaches a function definition node, r1 ← function(· · · ){· · · }, while analyzing a function f in context c, and that the function being defined has free variables x1 , . . . , xn that are declared in f (i.e., as parameters or local variables). Note that f is the function containing the function definition node being analyzed, not the function being defined. Let ` denote the location of the newly created closure according to the original transfer function without free variable partitioning. We now partition both the resulting function value of register r1 and the values of x1 , . . . , xn as shown in Figure 11. First, we use a function chooseCovering that finds a largest covering, denoted LC , among the values of x1 , . . . , xn . (If multiple such coverings exist, an arbitrary one is selected among them, as before.) 16 Local variables and arguments are stored as properties on activation objects, which are created on each invocation. B. B. Nielsen and A. Møller 16:15 LC = chooseCovering(σ(x1 ), . . . , σ(xn )) ( σ(xi ) ] [funhf, c, ti 7→ σ(xi )(t) | t ∈ LC ] if LC ⊆ dom(σ(xi )) σ(xi ) := σ(xi ) otherwise ( [t 7→ ({funhf, c, ti} ∪ fp, `) | t ∈ LC ] if LC ⊆ dom(σ(xi )) for some xi σ(r1 ) := [any 7→ (∅, `)] otherwise Figure 11 Introduction of partitioned values at a function definition node r1 ← function(· · · ){· · · }. For each xi for i = 1, . . . , n, if the current value of xi contains the covering LC , we add funhf, c, ti 7→ σ(xi )(t) to the value of xi for each t ∈ LC . (This evidently respects the meaning of funhf, c, ti tokens informally described in the beginning of the section.) Otherwise, if the current value of xi does not contain LC , we leave xi unmodified. For the result register r1 , we augment the function location ` by the same partition tokens. If at least one free variable has been partitioned (i.e., LC ⊆ dom(σ(xi )) for some xi ), then for each of the partition tokens t ∈ LC , the value of r1 becomes the augmented value ({funhf, c, ti} ∪ fp, `) where fp is the set of function partitions in the current context c. By augmenting the value using the funhf, c, ti token, the information about the partitioning is available when ` is later invoked, which is explained below. (The function partitions fp of the current context describe how the current function was declared in an outer scope, so by inheriting those, the partitioning also works for multiple layers of nested functions.) Otherwise, if none of the free variables have been partitioned, register r1 is assigned the partitioned value [any 7→ (∅, `)], which is equivalent to the original transfer function without free variable partitioning. Function calls At a function call r1 ← r2 (r3 ) where σ(r2 ) is an augmented function value (fp, v) (i.e., fp is a set of partition tokens introduced at function definitions and v refers to the set of closures that may be invoked), we use fp to augment the context for each callee. (The set of augmented contexts C 0 contains the FunctionPartitions component exactly for this purpose.) Assume for simplicity that v refers to a single closure location so we only have one callee. By augmenting the context, when analyzing the body of the callee we retain the information about the partitions where the callee closure was created, which we can exploit when reading its free variables as explained next. Variable reads Figure 12 shows the updated transfer function for read variable nodes r1 ← x, where we read a variable x in a calling context with function partitions fp. The set of function partitions fp tells us which partitions the current closure may have been created in. For this reason, if the abstract value of x contains partition tokens that are also in fp, we can obtain a covering for x by considering only those partition tokens. If there is no such partition token, we just read the value of x as in the original transfer function. ECOOP 2020 16:16 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript ( F any 7→ {σ(x)(t) | t ∈ dom(σ(x)) ∩ fp} if dom(σ(x)) ∩ fp 6= ∅ σ(r1 ) := σ(x) otherwise Figure 12 Exploiting partitioned values at a read variable node, r1 ← x. 32 var o1 = {x: 1, y: 2}; 22 function f(v) { 33 var o2 = {}; 23 return function g() { 34 Object.keys(o1).forEach( 24 return v; 35 function h(p) { 25 } 36 var v = o1[p]; 26 } 37 o2[p] = function j() { 27 38 return v; 28 var foo = f("foo"); 39 } 29 var bar = f("bar"); 40 } 30 41 ); 31 assert(bar() != "foo"); 42 assert(o2.y() != 1); Figure 13 Free variable partitioning Figure 14 Free variable partitioning example with different contexts. example with partitioned argument. As an example, assume σ(x) = [funhf, c, ti 7→ 1, funhf, c0 , t0 i 7→ 2, . . . ] and fp = {funhf, c, ti}. The value of x tells us that x must be a local variable in function f which may have been called in contexts c and context c0 , and that x’s value is 1 or 2, respectively. Since fp = {funhf, c, ti}, we know that the current function is defined inside the lexical scope of f in context c, meaning that the value of x must be 1. Examples To better understand free variable partitioning, we provide two examples. The first example (Figure 13) shows how free variable partitioning can preserve precision when a function is called in multiple contexts, in a way that resembles traditional heap specialization [20]. The second example (Figure 14) shows how free variable partitioning can preserve the precision of free variables partitioned with property name partitioning. In Figure 13, lines 22–26 define a function f that returns a closure, which on invocation returns the argument passed to f. Lines 28 and 29 call f with the arguments "foo" and "bar" and store the returned closures in the variables foo and bar, respectively. Line 31 calls the closure stored in bar and asserts that the resulting value is not the string "foo". The two calls to f are analyzed in different contexts c and c0 (due to the context sensitivity mechanism mentioned in Section 3, as "foo" and "bar" are determinate values). For the invocation bar(), the resulting value is the value of the free variable v in the closure stored in bar. If not using heap specialization, the two concrete activation objects at the two calls to f would be modeled by a single abstract object, so the free variable v would have the imprecise abstract value AnyString. To reason precisely about the assertion in line 31, the analysis has to distinguish the value of v at the two calls. The baseline TAJS analyzer accomplishes this by the use of heap specialization [2], which provides two different abstract activation objects for the calls to f, so the two values "foo" and "bar" are kept separate. B. B. Nielsen and A. Møller 16:17 With free variable partitioning we obtain the same degree of precision as with heap specialization in this situation. Since v is a free variable in the closure created in line 23, we apply the top cases in the transfer functions shown in Figure 11 with LC = {any}. This means that v after the call to f("foo") will have the value [any 7→ "foo", funhf, c, anyi 7→ "foo"] and the value written to the foo variable is ({funhf, c, anyi}, `g ) where `g is the location of the closure created in line 23. For the call to f("bar"), the value for v will similarly be [any 7→ "bar", funhf, c0 , anyi 7→ "bar"] and the value written to bar is ({funhf, c0 , anyi}, `g ). Note the difference in the context part of the fun token (c at the "foo" call and c0 at the "bar" call), since the calls to f are in those two different contexts. The value of v then becomes [any 7→ AnyString, funhf, c, anyi 7→ "foo", funhf, c0 , anyi 7→ "bar"], so that the fun partitions preserve the precise values. Now when analyzing bar(), bar has the value ({funhf, c0 , anyi}, `g ), which means the calling context to the function g is augmented with the set of function partitions {funhf, c0 , anyi} as described above. When reading the free variable v in line 24, we use the first case in the transfer function defined in Figure 12, since dom(σ(x)) ∩ fp is {funhf, c0 , anyi}. This means that the resulting value from the variable read is the value [any 7→ "bar"], so we obtain the same precision as with heap specialization. This first example shows how the free variable partitioning mechanism works and how it relates to heap specialization, but it does not demonstrate any precision improvements compared to the existing TAJS analyzer, which does apply heap specialization. The second example, Figure 14, illustrates a simplified version of how free variable partitioning was used in the motivating example in combination with property name partitioning, which leads to a precision improvement of TAJS. Line 32 defines the object o1 with two properties, and line 33 defines o2 as an empty object. Lines 34–41 iterate over the properties of o1. For each property, it writes a function returning the value of o1[p] to the p property of o2. To prove that the assertion at line 42 always holds, it is critical that the values of v are not mixed together in the iterations. Using property name partitioning at line 36, the value of v becomes [valhn, r, "x"i 7→ 1, valhn, r, "y"i 7→ 2] and the value of p becomes [valhn, r, "x"i 7→ "x", valhn, r, "y"i 7→ "y"], where n is the read property node and r is the register storing the property name. (For clarity we ignore the other partition in this example.) When analyzing the closure creation at line 37, we use the top rules in Figure 11 with LC = {valhn, r, "x"i, valhn, r, "y"i}. This means that v is augmented with the additional partitions [funhh, c, valhn, r, "x"ii 7→ 1, funhh, c, valhn, r, "y"ii 7→ 2], and the value being written to o2[p] is [valhn, r, "x"i 7→ ({funhh, c, valhn, r, "x"ii}, `j ), valhn, r, "y"i 7→ ({funhh, c, valhn, r, "y"ii}, `j )]. Here, `j denotes the location of the closure created in line 37. At the dynamic property write, the property name and value to be written share the covering {valhn, r, "x"i, valhn, r, "y"i}, meaning that the write happens as described in Figure 8, so that o2.x becomes ({funhh, c, valhn, r, "x"ii}, `j ) and o2.y becomes ({funhh, c, valhn, r, "y"ii}, `j ). Now when o2.y is called in line 42, the call to j is augmented with the the set of function partitions {funhh, c, valhn, r, "y"ii}. Therefore when reading the value v in line 38, according to Figure 12 we only read the funhh, c, valhn, r, "y"ii partition. The result of reading v is then [any 7→ 2], so the analysis is precise enough to prove that the assertion at line 42 holds. 5.3 Type Partitioning Value partitioning can also be useful for partitioning values based on their types. Since JavaScript does not have function overloading, it is common to reflectively find the type of an argument, and based on the type run different pieces of code (as in line 3 in Figure 1). ECOOP 2020 16:18 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript σ(r3 ) ] [typehn, r3 , tyi 7→ filter(σ(r3 ), ty) | ty ∈ types(σ(r3 ))] σ(r3 ) := if |types(σ(r3 ))| > 1 σ(r ) otherwise 3 Figure 15 Addition to the transfer function for a call node with one argument, r1 ← r2 (r3 ). filter restricts a partitioned value to represent only values that match the given type, and types returns the possible types of the given partitioned value. This is often done through the use of predicate functions, which are one-parameter functions that return a boolean value. By partitioning the arguments at calls to predicate functions, the analysis becomes able to track the relations between the arguments and the return values, and thereby boost the control sensitivity mechanism (see Section 3) at branches that involve such calls. Since the analysis does not know in advance whether a function returns boolean values, we simply perform this partitioning at all function calls with one argument, without considering what values the function may return. Partition tokens for type partitioning We introduce type partitioning tokens of the form typehn, r, tyi, where n ∈ N is a call node r1 ← r2 (r3 ), r ∈ R is the argument register in n (in this case r3 ), and ty ∈ Types using the set of types shown in Figure 6. Such a token represents the set of traces where the type of r is ty at the last occurrence of n. For example, the traces that reach line 7 in Figure 1 are represented by the token typehn, r, functioni where n is the call to isFunction in line 6 and r is the argument register of that call node. Function calls Figure 15 shows an addition to the transfer function for call nodes, r1 ← r2 (r3 ), to partition the argument value before the call takes place. The first case applies if the argument σ(r3 ) abstractly represents values of multiple types (i.e., |types(σ(r3 ))| > 1, where types returns the set of all the types the given abstract value may have). In this case we introduce a partition typehn, r3 , tyi for each ty ∈ types(σ(r3 )), such that the value in that partition is filter(σ(r3 ), ty), where filter restricts σ(r3 ) to only represent values of type ty. Since all the possible types are represented, the new partitions together form a covering. Recall that a typehn, r, tyi token only represents information about the last occurrence of n in a given trace. To ensure this property we always remove all existing typehn, _, _i tokens from the abstract state immediately before applying the modified transfer function for call node n. Example As an example consider the code in Figure 16, and assume x has the abstract value fun1|obj2 (representing either the function fun1 or the object obj2). Without type partitioning, the result of analyzing the isObj(x) call is the abstract value AnyBool (representing true or false), so both branches are analyzed with x being fun1|obj2; however, in a concrete execution, fun1 will never flow to the “true” branch, and obj2 will never flow to the “false” branch. B. B. Nielsen and A. Møller 16:19 47 function isObj(arg) { 48 if (typeof arg == ’object’) 43 function isObj(arg) { 49 return true; 44 return typeof arg == ’object’; 50 else 45 } 51 return false; 46 if (isObj(x)) { ... } else { ... } 52 } 53 if (isObj(x)) { ... } else { ... } Figure 16 Type partitioning example. Figure 17 Type partitioning example with control dependent relations. By using type partitioning, we partition x before calling the predicate function. In this example let n be the call node and let r be its argument register. Then x be- comes [typehn, r, functioni 7→ fun1, typehn, r, objecti 7→ obj2]. Now when analyzing the body of isObj, the expression typeof arg == ’object’ evaluates to the partitioned value [typehn, r, functioni 7→ false, typehn, r, objecti 7→ true]. When reaching the if branch, control sensitivity ensures that only the object partition flows to the “true” branch (i.e., x’s value becomes [typehn, r, functioni 7→ ⊥, typehn, r, objecti 7→ obj2] in that branch), and only the function partition flows to the “false” branch (i.e., x’s value becomes [typehn, r, functioni 7→ fun1, typehn, r, objecti 7→ ⊥] in that branch). Control dependent relations Predicate functions are sometimes implemented with control dependent relations between the argument and the result, as in the example in Figure 17. The example is contrived but it is not uncommon in predicate functions that the result values appear as the literals true or false in branches. With the type partitioning mechanism described above, the returned values will not be partitioned in this situation, since the partitions in arg do not propagate to the values true and false. To mitigate this issue, we augment the abstract states as shown in Figure 18 to keep track of partitions that must be dead or may be live (represented by the two P(T ) components, respectively). A partition is dead if the set of traces it represents is empty, and it is live otherwise. (We only keep track of the live partitions in coverings where there are any dead partitions.) Since the branch condition typeof arg == ’object’ is analyzed with a partitioned value for arg, by control sensitivity we know that the only traces that can reach the “true” branch are those represented by the object partition, so we record that typehn, r, objecti is live and typehn, r, functioni is dead in that branch, and conversely in the other branch. To exploit this information, we also update the transfer function for constants, r1 ← c, as shown in Figure 19. Basically, it assigns ⊥ to all dead partitions and the constant c to all live partitions. If there are no dead partitions, it behaves as usual, where the constant is written to the any partition. When the analysis reaches true (line 49), we obtain the partitioned value [typehn, r, functioni 7→ ⊥, typehn, r, objecti 7→ true], and similarly when analyzing false (line 51) we get [typehn, r, functioni 7→ false, typehn, r, objecti 7→ ⊥]. The join of these two values is [typehn, r, functioni 7→ false, typehn, r, objecti 7→ true], which becomes the result of isObj(x). Due to the control sensitivity mechanism, only obj2 then flows to the “true” branch, and only fun1 flows to the “false” branch in line 53. ECOOP 2020 16:20 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript State 0 = State × P(T ) × P(T ) Figure 18 Abstract states updated to keep track of dead and live partitions. [t 7→ c | t ∈ livePartitions(σ)] ] [t 7→ ⊥ | t ∈ deadPartitions(σ)] σ(r1 ) := if deadPartitions(σ) 6= ∅ [any 7→ c] otherwise Figure 19 Updated transfer function for constant nodes, r1 ← c, for improved type partitioning. 6 Evaluation We have implemented the value partitioning framework (Section 4) and the three instantiations (Section 5) on top of TAJS v0.24. Implementing the general framework in TAJS required 900 lines of code, however most of this is boilerplate code for lifting operations on ordinary abstract values to also work on partitioned values. With the general framework in place, instantiations are easy to implement: property name partitioning (Section 5.1), free variable partitioning (Section 5.2), and type partitioning (Section 5.3) required only around 230, 250, and 60 lines of code, respectively. We disable TAJS’s for-in specialization technique, since it is subsumed by property name partitioning.17 We refer to our new analysis tool as TAJSValPar .18 Using this tool we evaluate our techniques by answering the following research questions: RQ1 How does TAJSValPar compare to existing state-of-the-art analyses for JavaScript? RQ2 What are the effects of the three different instantiations of value partitioning? All our experiments are conducted on an Ubuntu machine with a 2.6 GHz Intel Xeon E5-2697A CPU running a JVM with 10 GB RAM. 6.1 RQ1: Comparison with State-Of-The-Art Analyses We start by comparing TAJSValPar against the current state-of-the-art analyses for JavaScript: the baseline TAJS analyzer with static determinacy [2], TAJSVR [26] with demand-driven value refinement, and the CompAbs analyzer [16] based on the SAFE analyzer [17]. We use the same benchmarks as those used in the evaluation of TAJSVR , which is the most recent related work. Micro benchmarks We first evaluate TAJSValPar against a small collection of micro benchmarks that capture some of the main challenges that appear in analysis of modern JavaScript libraries and are used in previous work [16, 26]. The benchmarks all contain dynamic read/write pairs that 17 The motivation for introducing for-in specialization in [2] was to reason about correlated read/write pairs inside for-in loops. This relational information is now provided by property name partitioning. 18 TAJSValPar : TAJS with Value Partitioning B. B. Nielsen and A. Møller 16:21 Table 1 Micro-benchmarks that check how state-of-the-art analyses handle various dynamic read/write pairs that represent typical challenges in JavaScript library code. A 7 indicates that the analysis mixes together the properties of the object being manipulated, while a X indicates that it is sufficiently precise to keep them distinct. The CF, CG, AF, and AG benchmarks are drawn directly from [16], while M1, M2, and M3 are drawn directly from [26]. Benchmark TAJS CompAbs TAJSVR TAJSValPar CF X X X X CG X X X X AF 7 X X X AG 7 X X X M1 7 7 X X M2 7 7 X X M3 7 7 X X are variations of the pattern shown in the introduction and the motivating example. The results of the comparison are shown in Table 1. For these benchmarks, a test succeeds if it avoids mixing together properties in the dynamic read/write pairs. The first two examples, CF and CG, are loops where the static analyses have enough information to be able to unroll all the iterations and thereby analyze the read/write patterns with precise property names. For CF, property name partitioning in TAJSValPar gives the same degree of precision without loop unrolling. AF and AG are loops where the static analyses are incapable of obtaining a precise value for the property name used in the dynamic read/write pairs. TAJS fails to analyze these, but CompAbs detects the pattern syntactically and therefore applies trace partitioning to analyze the code precisely. TAJSVR also succeeds on these tests, because its backwards abstract interpreter is capable of providing the necessary relational information. In comparison, TAJSValPar can reason about the relational information on its own. Both TAJS and CompAbs fail on the last three tests (M1, M2, and M3). CompAbs fails on M1 and M3 because it does not apply partitioning due to the fragility of syntactic patterns, and it fails on M2 because the partitioning does not provide the necessary precision about free variables. Again, TAJSVR can analyze them all, since the backwards abstract interpreter is powerful enough to reason about all the cases, whereas TAJSValPar successfully preserves the relational properties by the use of value partitioning. These results demonstrate that for these benchmarks, TAJSValPar is capable of providing comparable precision to the demand-driven value refinement technique without the need for a complicated backwards analysis, and provides better precision than the other analyses. Library benchmarks The next set of benchmarks is taken from the evaluation of TAJSVR and consists of small test cases for popular real-world libraries. The libraries include the widely used functional utility library Underscore (which has more than 20 000 dependents in npm) v1.8.3 with 1 548 LoC and the most depended-upon package Lodash (more than 115 000 dependents). We analyze both Lodash3 (v3.0.0, 10 785 LoC) and Lodash4 (v4.17.10, 17 105 LoC), since their code bases are substantially different and therefore pose distinct challenges for static analysis. ECOOP 2020 16:22 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript Table 2 Analysis results for real-world benchmarks (from [26]). For each group of benchmarks and for each of the four analyzers, we show the number of tests that are analyzed successfully and (in parentheses) the average analysis time per successful test. Benchmark group TAJS CompAbs TAJSVR TAJSValPar Underscore (182 tests) 0 (-) 0 (-) 173 (2.9s) 173 (2.7s) Lodash3 (176 tests) 7 (2.4s) 0 (-) 172 (5.5s) 173 (5.3s) Lodash4 (306 tests) 0 (-) 0 (-) 266 (24.7s) 289 (26.3s) Prototype (6 tests) 0 (-) 2 (23.1s) 5 (97.7s) 5 (34.1s) Scriptaculous (1 tests) 0 (-) 1 (62.0s) 1 (236.9s) 1 (55.2s) jQuery (71 tests) 3 (16.0s) 0 (-) 3 (13.5s) 3 (20.4s) The other libraries, Prototype v1.7.2, Scriptaculous v1.9.0, and jQuery v1.10,19 are popular libraries for client-side web programming. The analysis results are shown in Table 2. We classify an analysis of a benchmark as successful if it terminates within 5 minutes and the analysis result to our knowledge is sound. In particular, an analysis run is considered a failure if the analysis result does not have dataflow to the ordinary exit of the program. (All the tests pass in normal execution, so an analysis result is obviously unsound if there is no dataflow to the ordinary exit.) To increase confidence in the soundness of the analysis results for TAJSValPar , we apply thorough soundness testing as described at the end of this section. Increasing the time budget does not help for these benchmarks: as reported previously for JavaScript analysis tools, critical precision losses tend to cause a proliferation of spurious dataflow that drastically increases analysis time and renders the analysis results useless [26, 11, 22, 16]. The results for TAJSValPar are comparable to those of TAJSVR , which outperforms the other analyzers. TAJSValPar succeeds in analyzing all the benchmarks that TAJSVR can handle, plus 24 more (one Lodash3 test and 23 Lodash4 tests). Note the substantial improvement for the Lodash4 tests: the number of Lodash4 tests that are not analyzed successfully is reduced from 40 to 17. None of the analyzers do well on the jQuery benchmarks; a preliminary manual study shows that the reasons are unrelated to relational analysis. The results are as expected, since property name partitioning and free variable partitioning are alternative techniques to provide the relational information that TAJSVR obtains from its demand-driven value refinement. Furthermore, value partitioning is triggered more often during the analysis, which means that the precision improvements are not limited to the few critical cases where value refinement is triggered. On top of this, type partitioning provides some additional precision beyond the capabilities of TAJSVR . Comparing the performance between TAJSValPar and TAJSVR , the most significant differences are for the Prototype and Scriptaculous benchmarks. TAJSValPar is around 3–4 times faster than TAJSVR , which is mainly because property name partitioning makes the for-in specialization technique in TAJS obsolete. For Underscore and Lodash3, TAJSValPar is slightly faster than TAJSVR . This is encouraging, because analyzing dynamic property writes as the one in line 7 in Figure 1 is more expensive in TAJSValPar than in TAJSVR . In TAJSVR such an operation is handled as a single imprecise write (since the precision is recovered on demand), whereas TAJSValPar performs the write for each property that is copied. To soundly handle setters, all the writes happen in different states that are 19 This is the version of jQuery used in [2]. Note that [16] used the older v1.4.4. B. B. Nielsen and A. Møller 16:23 subsequently joined together, which causes TAJSValPar to spend some extra time at such writes. Since the analysis time is nevertheless similar, we can conclude that value partitioning is cheaper for analyzing other parts of the libraries. For Lodash4 and jQuery, TAJSVR is slightly faster than TAJSValPar . For Lodash4, the main reason is the handling of dynamic property writes, and for the jQuery benchmarks, type partitioning adds little performance overhead as seen in Table 3. Precision Previous work [2, 22, 26] established that type analysis and call-graph construction are useful metrics for measuring the precision of an analysis for JavaScript, and therefore we use these metrics to evaluate the analysis precision of TAJSValPar . All locations are treated context-sensitively in these measurements, meaning that we count the same location once for each reachable context. We count the number of possible types for the resulting value in each variable or property read and find that in 99.19% of the reads, a single unique type is read, with the average number of types being 1.02. For measuring precision of the call-graph construction, we measure the number of call-sites with unique callees, and find this number to be 99.95% of all call-sites. These numbers show that when the analysis succeeds, it does so with very high precision. Soundness Formally proving soundness of the three variants of value partitioning is out of scope of this paper, however, we will informally justify that the general approach is sound. Since general trace partitioning is known to be sound, it suffices to argue that the precision gained by value partitioning is equivalent to that obtained through trace partitioning. The key reason why this holds for property name partitioning and type partitioning is that the partition tokens represent the last occurrence of some node, meaning that if two values share partitions, they represent information about the same execution traces. This means that we could (if ignoring performance) instead have applied traditional trace partitioning, with exactly the same partition tokens and at the same nodes, resulting in the same precision. (For further discussion about the connection between value partitioning and trace partitioning, see Section 7.) Similarly for free variable partitioning, since the partitions are only allowed on activation objects, the precision is never higher than what would be obtained using heap specialization (where each partition would be represented by a distinct abstract activation object), and therefore soundness follows from soundness of heap specialization. Furthermore, to increase confidence in the soundness of our implementation, all the TAJSValPar results have been thoroughly soundness tested [3]. This means that the analysis results overapproximate all the dataflow facts that have been observed during concrete executions of the analyzed benchmarks. For every variable and property read observed concretely, we have checked that the concrete value is in the concretization of the corresponding abstract value in the analysis results, and similarly for property writes and function calls. All our benchmarks except one pass in total more than 7.6 million soundness tests. The one benchmark that fails is a Lodash4 test, which uses ES6 iterators in combination with Arrays.from, which is not fully supported in the latest version of TAJS and is unrelated to the use of value partitioning. ECOOP 2020 16:24 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript Table 3 Analysis results for real-world benchmarks (from [26]) using different instantiations of value partitioning. “None” is without value partitioning, “P” is with property name partitioning, “P + FV” is with property name and free variable partitioning, and “F + PV + T” is with property name, free variable, and type partitioning. Benchmark group None P P + FV P + FV + T Underscore (182 tests) 0 (-) 149 (2.0s) 173 (2.5s) 173 (2.7s) Lodash3 (176 tests) 7 (2.4s) 167 (4.7s) 173 (5.1s) 173 (5.3s) Lodash4 (306 tests) 0 (-) 268 (16.8s) 274 (27.7s) 289 (26.3s) Prototype (6 tests) 0 (-) 0 (-) 5 (32.7s) 5 (34.1s) Scriptaculous (1 tests) 0 (-) 0 (-) 1 (53.1s) 1 (55.2s) jQuery (71 tests) 3 (16.0s) 3 (15.2s) 3 (16.5s) 3 (20.4s) 6.2 RQ2: Effects of the Three Instantiations We now investigate how much each of the three uses of value partitioning contributes to the results reported in the previous section. The results from running our analysis with only some instantiations enabled can be seen in Table 3. The column “P” is with only property name partitioning enabled; we see that it is sufficient for analyzing many of the Underscore and Lodash test cases, but not for any of the Prototype or Scriptaculous test cases. (Without property name partitioning but with the other two instantiations enabled, the analysis is not able to analyze more benchmarks than TAJS.) The column “P + FV” uses both property name partitioning and free variable partitioning. Also enabling free variable partitioning makes the analysis capable of analyzing many additional benchmarks: more Underscore and Lodash test cases, as well as some Prototype and Scriptaculous test cases. Compared to only property name partitioning, the analysis times are higher (for the reason discussed above regarding additional state joins). The last column “P + FV + T” is with all instantiations enabled and therefore contains the same numbers as shown in Table 2. We see that type partitioning enables the analysis of 15 additional Lodash4 tests, without significantly increasing the analysis time. We conclude that all three instantiations contribute to the results, where property name partitioning is the most important one, followed by free variable partitioning and then type partitioning. (TAJS already performs filtering at branches, as mentioned in Section 3; without that feature the effect of type partitioning would likely be larger.) 7 Related Work Trace partitioning Value partitioning can be viewed as a variant of trace partitioning [23] as explained in Sections 1, 2 and 4, but there are some important differences. Changing the original abstract domain in Section 4 to support traditional trace partitioning can be done by replacing L → State by L → T → State, so that an abstract state is maintained for each partition, at every location. Thus, different locations can partition the abstract states differently. Value partitioning instead has only one abstract state per location but partitions the individual abstract values, which adds an additional degree of flexibility: different parts of each abstract state can be partitioned differently. In particular, for the large parts of the states where we B. B. Nielsen and A. Møller 16:25 are not interested in relational information, we can use the {any} partitioning,20 while for the important registers and object properties, we can have nontrivial partitions. With traditional trace partitioning, the normal transfer functions are applied for each partition, which causes redundant computations because of the similarities between the different partitions21 ; with value partitioning, we only pay a price for partitioning at operations that involve abstract values with nontrivial partitions. This is the main reason for the low overhead of the technique. Another difference is that the partition tokens in traditional trace partitioning are actually lists of “directives” (the language of directives used by Rival and Mauborgne [23] is similar to our language of tokens in Figure 6), which can lead to a combinatorial explosion. By partitioning at the level of values and allowing multiple coverings in each partitioned value, we avoid the need to maintain such combinations. Relational analysis Traditional techniques for achieving relational analysis, as exemplified by the octagon abstract domain [19], focus on numeric relations, such as, linear inequalities. To reduce the cost of this approach, a syntactic pre-analysis called variable packing is typically used for partitioning the set of program variables, and one octagon is then used for each pack instead of tracking all possible combinations of inequalities. This kind of partitioning is reminiscent of value partitioning, but with the important difference that variable packing and octagons operate on sets of program variables whereas value partitioning works on individual abstract values. In our work with analysis of JavaScript libraries, we have not encountered a critical need for tracking numeric relations. The well-known analyzer Astrée [4] applies not only trace partitioning and octagons, but also a decision tree abstract domain that is used for tracking relations between booleans and numerical variables that affect control flow. That technique has some similarities with our type partitioning mechanism but relies on variable packing to avoid combinatorial explosions, whereas type partitioning uses the more lightweight value partitioning technique in combination with the existing control sensitivity mechanism of TAJS. The main purpose of value partitioning is to be able to reason about relations between different parts of the abstract state (i.e., program variables and registers) at the various program points. Some literature uses the term relational analysis with a slightly different meaning: to relate information across program points, typically relations between the entry and exits of functions [8, 5]. Static analysis for JavaScript Through the last decade, several static analyzers for JavaScript have been developed, including WALA [25, 24, 28], SAFE [17, 22], JSAI [13], and TAJS [11, 2, 26]. Although we focus on TAJS, the designs of SAFE and JSAI are reasonably similar, so we believe value partitioning could also be incorporated into those tools with little effort. As discussed in the introduction, much work has been put into improving precision of the analyses through different kinds of context sensitivity and elaborate abstract domains. The techniques include parameter sensitivity and heap context sensitivity [2], loop unrolling [22], 20 In our experiments, 99.4% of all abstract values have the trivial {any} partitioning. 21 This was shown experimentally in the work on TAJSVR [26, Section 7.1]. ECOOP 2020 16:26 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript and syntactic patterns for detecting correlated read/write pairs and guiding context sensi- tivity [25]. Other works have explored more expressive string abstractions to reason more precisely about property names in dynamic property accesses [18, 1, 21]. Our abstract domain extension for value partitioning has few assumptions about the underlying abstract domain, so most of these techniques can be combined with value partitioning. Despite such precision improvement techniques, imprecision is inevitable, and only a few techniques have been designed to handle dynamic property accesses with imprecise property names, most importantly, CompAbs-style trace partitioning [16] and demand-driven value refinement [26]. Previous work has shown that demand-driven value refinement enables analysis of many more challenging benchmarks than CompAbs-style trace partitioning (as also discussed in Section 6), and that the trace partitioning approach causes a large amount of redundant computation [26, Section 7.1]. The fundamental drawback of demand-driven value refinement is that it requires a separate backwards abstract interpreter for not only the entire JavaScript language but also the standard library. The backwards abstract interpreter of TAJSVR is not simply the dual of TAJS but works goal-directed and with its own abstract domain based on intuitionistic separation logic. In contrast, value partitioning directly leverages the existing forward analyzer and thereby supports both the JavaScript language and the standard library essentially for free, which makes this approach substantially easier to develop and maintain. Furthermore, value partitioning is more general (for example, it enables type partitioning), and the three instantiations we have presented lead to better precision (for the Lodash4 tests). The HOO (heap with open objects) abstract domain [9] is a relational abstraction that is designed to reason more precisely about abstract objects whose properties cannot be known statically. That approach is highly expressive but not scalable to real-world JavaScript libraries as those considered in Section 6. 8 Conclusion We have presented value partitioning, a static analysis technique for reasoning about relational properties. It is a lightweight alternative to traditional trace partitioning techniques that allows relational information to be incorporated into the abstract values instead of requiring separate abstract states for the partitions. We have proposed three instantiations of value partitioning in JavaScript analysis: property name partitioning, free variable partitioning, and type partitioning, which enable precise reasoning for dynamic read/write pairs, free variables, and predicate functions, respectively. The experimental results show that extending the TAJS analyzer with the three variants of value partitioning enables precise and efficient analysis of complex JavaScript libraries including Lodash and Underscore, thereby outperforming a state-of-the-art technique that relies on trace partitioning and without requiring a complicated backwards analysis. For the libraries considered in this study, property name partitioning has the largest effect among the proposed variants. An interesting direction for future research is to investigate whether some of the traditional context sensitivity strategies used in TAJS and other JavaScript analyzers can be reformulated as new value partitioning instantiations, to make analysis faster while retaining precision. B. B. Nielsen and A. Møller 16:27 References 1 Roberto Amadini, Alexander Jordan, Graeme Gange, François Gauthier, Peter Schachte, Harald Søndergaard, Peter J. Stuckey, and Chenyi Zhang. Combining string abstract domains for JavaScript analysis: An evaluation. In Tools and Algorithms for the Construction and Analysis of Systems - 23rd International Conference, TACAS 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, Part I, volume 10205 of Lecture Notes in Computer Science, pages 41–57, 2017. 2 Esben Andreasen and Anders Møller. Determinacy in static analysis for jQuery. In Proc. ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 17–31, 2014. 3 Esben Sparre Andreasen, Anders Møller, and Benjamin Barslev Nielsen. Systematic approaches for increasing soundness and precision of static analyzers. In Proceedings of the 6th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis, SOAP@PLDI 2017, Barcelona, Spain, June 18, 2017, pages 31–36, 2017. 4 Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. A static analyzer for large safety-critical software. In Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation 2003, San Diego, California, USA, June 9-11, 2003, pages 196–207. ACM, 2003. 5 Cristiano Calcagno, Dino Distefano, Peter W. O’Hearn, and Hongseok Yang. Compositional shape analysis by means of bi-abduction. J. ACM, 58(6):26:1–26:66, 2011. 6 David R. Chase, Mark N. Wegman, and F. Kenneth Zadeck. Analysis of pointers and structures. In Proceedings of the ACM SIGPLAN’90 Conference on Programming Language Design and Implementation (PLDI), White Plains, New York, USA, June 20-22, 1990, pages 296–310. ACM, 1990. 7 Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL ’77, pages 238–252, New York, NY, USA, 1977. ACM. 8 Patrick Cousot and Radhia Cousot. Modular static program analysis. In Compiler Construction, 11th International Conference, CC 2002, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2002, Grenoble, France, April 8-12, 2002, Proceedings, volume 2304 of Lecture Notes in Computer Science, pages 159–178. Springer, 2002. 9 Arlen Cox, Bor-Yuh Evan Chang, and Xavier Rival. Automatic analysis of open objects in dynamic language programs. In Static Analysis - 21st International Symposium, SAS 2014, Munich, Germany, September 11-13, 2014. Proceedings, pages 134–150, 2014. 10 Arjun Guha, Claudiu Saftoiu, and Shriram Krishnamurthi. Typing local control and state using flow analysis. In Programming Languages and Systems - 20th European Symposium on Programming, ESOP 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings, volume 6602 of Lecture Notes in Computer Science, pages 256–275. Springer, 2011. 11 Simon Holm Jensen, Anders Møller, and Peter Thiemann. Type analysis for JavaScript. In Proc. 16th International Static Analysis Symposium, pages 238–255, 2009. 12 John B. Kam and Jeffrey D. Ullman. Monotone data flow analysis frameworks. Acta Inf., 7(3):305–317, September 1977. 13 Vineeth Kashyap, Kyle Dewey, Ethan A. Kuefner, John Wagner, Kevin Gibbons, John Sarracino, Ben Wiedermann, and Ben Hardekopf. JSAI: a static analysis platform for JavaScript. In Proc. 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 121–132, 2014. ECOOP 2020 16:28 Value Partitioning: A Lightweight Approach to Relational Static Analysis for JavaScript 14 Vineeth Kashyap, John Sarracino, John Wagner, Ben Wiedermann, and Ben Hardekopf. Type refinement for static analysis of JavaScript. In DLS’13, Proceedings of the 9th Symposium on Dynamic Languages, part of SPLASH 2013, Indianapolis, IN, USA, October 26-31, 2013, pages 17–26. ACM, 2013. 15 Gary A. Kildall. A unified approach to global program optimization. In Conference Record of the ACM Symposium on Principles of Programming Languages, Boston, Massachusetts, USA, October 1973, pages 194–206. ACM Press, 1973. 16 Yoonseok Ko, Xavier Rival, and Sukyoung Ryu. Weakly sensitive analysis for JavaScript object-manipulating programs. Softw., Pract. Exper., 49(5):840–884, 2019. 17 Hongki Lee, Sooncheol Won, Joonho Jin, Junhee Cho, and Sukyoung Ryu. SAFE: formal specification and implementation of a scalable analysis framework for ECMAScript. In Proc. International Workshop on Foundations of Object Oriented Languages, 2012. 18 Magnus Madsen and Esben Andreasen. String analysis for dynamic field access. In Proc. 23rd International Conference on Compiler Construction, volume 8409 of Lecture Notes in Computer Science. Springer, 2014. 19 Antoine Miné. The octagon abstract domain. Higher-Order and Symbolic Computation, 19(1):31–100, 2006. 20 Erik M. Nystrom, Hong-Seok Kim, and Wen-mei W. Hwu. Importance of heap specialization in pointer analysis. In Proceedings of the 2004 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis For Software Tools and Engineering, PASTE’04, Washington, DC, USA, June 7-8, 2004, pages 43–48. ACM, 2004. 21 Changhee Park, Hyeonseung Im, and Sukyoung Ryu. Precise and scalable static analysis of jquery using a regular expression domain. In Proceedings of the 12th Symposium on Dynamic Languages, DLS 2016, pages 25–36, New York, NY, USA, 2016. ACM. 22 Changhee Park and Sukyoung Ryu. Scalable and precise static analysis of JavaScript applica- tions via loop-sensitivity. In Proc. 29th European Conference on Object-Oriented Programming, pages 735–756, 2015. 23 Xavier Rival and Laurent Mauborgne. The trace partitioning abstract domain. ACM Trans. Program. Lang. Syst., 29(5), August 2007. 24 Max Schäfer, Manu Sridharan, Julian Dolby, and Frank Tip. Dynamic determinacy analysis. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, Seattle, WA, USA, June 16-19, 2013, pages 165–174. ACM, 2013. 25 Manu Sridharan, Julian Dolby, Satish Chandra, Max Schäfer, and Frank Tip. Correlation tracking for points-to analysis of JavaScript. In Proc. 26th European Conference on Object- Oriented Programming, 2012. 26 Benno Stein, Benjamin Barslev Nielsen, Bor-Yuh Evan Chang, and Anders Møller. Static analysis with demand-driven value refinement. Proceedings of the ACM on Programming Languages (PACMPL), 3:140:1–140:29, 2019. 27 Mark N. Wegman and F. Kenneth Zadeck. Constant propagation with conditional branches. ACM Trans. Program. Lang. Syst., 13(2):181–210, 1991. 28 Shiyi Wei, Omer Tripp, Barbara G. Ryder, and Julian Dolby. Revamping JavaScript static analysis via localization and remediation of root causes of imprecision. In Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, Seattle, WA, USA, November 13-18, 2016, pages 487–498. ACM, 2016.