Dec 2, 2023

A Deep Dive into Jscrambler's Obfuscation - Part 1: Opaque predicates

Introduction

Jscrambler, the leading client-side security solution specializing in JavaScript in-app protection and real-time webpage monitoring, is primarily recognized for its product known as ‘Code Integrity’. Essentially, it’s a JavaScript obfuscator that provides advanced obfuscation techniques to make the code resilient to tampering and reverse engineering.

The first code transformation we’ll explore is known as Opaque Predicates. A predicate is an expression that evaluates to a boolean value, so it results in either true or false. In this context, ‘opaque’ means hard to see through. Combining these meanings, we can define an opaque predicate as a boolean expression whose value is difficult to determine. The idea is to construct, at obfuscation time, a predicate whose value is known to the obfuscator but is difficult for an attacker to work out from the obfuscated code. There are multiple types of opaque predicates: some are always true or always false regardless of their input parameters, while others depend on the value of the input, so you can’t easily say whether the expression is true or false.

Let’s examine these examples:

- this predicate will always be true
- this predicate is indeterminate
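
To make the two categories concrete, here is a minimal sketch with two hypothetical predicates. They are illustrations of my own choosing, not Jscrambler output: the first is always true for any (safe) integer, the second depends on its input.

```javascript
// Hypothetical examples; not taken from Jscrambler output.

// Always true for any safe integer x: the product of two consecutive
// integers is even, so the remainder is always 0.
function alwaysTrue(x) {
  return (x * (x + 1)) % 2 === 0;
}

// Indeterminate: the result depends on the runtime value of x.
function indeterminate(x) {
  return x % 3 === 0;
}

console.log(alwaysTrue(7), alwaysTrue(-4));        // true true
console.log(indeterminate(9), indeterminate(10));  // true false
```

A reader who does not spot the parity argument cannot tell, just by looking, that the `else` branch guarded by `alwaysTrue(x)` is dead code.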

The motivation behind creating expressions like these is to use them to complicate the understanding and prediction of conditional expressions, specifically whether a particular branch will be taken.

If we take Example 1. and represent it in a control flow graph it would look something like this:

If you’re unaware that the predicate is always true, you won’t be able to predict which branch will be taken. Of course, these are just simple examples of opaque predicates and we will be looking into ones that Jscrambler implemented, which are not that simple.

Jscrambler’s Variations

var b3rer_ = s1vxe.J0f()[1010][936];
for (; b3rer_ !== s1vxe.e1N()[986][344]; ) {
  switch (b3rer_) {
    case s1vxe.J0f()[819][363]:
      var V5C_kD = H_HByh.w$y | s1vxe[540693];
      b3rer_ = s1vxe.e1N()[802][1020];
      break;
    case s1vxe.J0f()[166][23]:
      b3rer_ =
        V5C_kD === (H_HByh.U3b | s1vxe[425594])
          ? s1vxe.e1N()[480][768]
          : s1vxe.e1N()[66][825];
      break;
    // ... lot more switch cases
  }
}

What is shown above is an example of Jscrambler’s Control flow flattening and Opaque predicates transformations combined. A JavaScript switch statement evaluates an expression, in this example b3rer_, and matches its value against a series of case clauses, in this example s1vxe.J0f()[819][363] and s1vxe.J0f()[166][23]. Those case expressions are opaque predicates that hide which case block is going to be executed next. In fact, b3rer_’s initial value, s1vxe.J0f()[1010][936], matches the first case’s expression, s1vxe.J0f()[819][363].

Don’t believe me? This is how the code looks after reversing the Opaque predicates transformation:

var b3rer_ = 948;
for (; b3rer_ !== 428;) {
  switch (b3rer_) {
    case 948:
      var V5C_kD = H_HByh.w$y | s1vxe[540693];
      b3rer_ = 642;
      break;
    case 539:
      b3rer_ = V5C_kD === (H_HByh.U3b | s1vxe[425594]) ? 342 : 627;
      break;
    // ... lot more switch cases
  }
}
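
Why can plain integers stand in for those expressions? A switch statement compares with strict equality, and for arrays that means object identity, so each distinct array can be given a distinct label. Here is a tiny sketch of that behavior; `table` and `dispatch` are hypothetical names, not part of the protected code.

```javascript
// Hypothetical miniature of the pattern: the state variable holds an
// array reference, and each case label is another path to some array.
const table = [[], [], []];
table[0][5] = table[2]; // table[0][5] is an alias for table[2]
table[1][7] = table[2]; // a second alias for the same array

function dispatch(state) {
  switch (state) {
    case table[0][5]: // matches only if state is the very same array object
      return "block A";
    case table[1]:
      return "block B";
    default:
      return "no match";
  }
}

console.log(dispatch(table[1][7])); // "block A": both aliases point at table[2]
console.log(dispatch([]));          // "no match": a fresh array is a different object
```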

Let’s explore what the functions s1vxe.J0f and s1vxe.e1N evaluate to. At first glance, one might assume they return 2D matrices storing values for comparison, but that’s not the case. When we open a debugger and evaluate s1vxe.J0f()[1010][936], we’re met with rather intriguing behavior: the result is an infinitely nested array, which is quite fascinating! We’ll explore ‘how?’ in a moment.

s1vxe.J0f and s1vxe.e1N are proxy functions: when called, they invoke s1vxe[403026].h7IxGlY if it is a function, and otherwise simply return its value. See here:

s1vxe.J0f = function () {
  return typeof s1vxe[403026].h7IxGlY === "function"
    ? s1vxe[403026].h7IxGlY.apply(s1vxe[403026], arguments)
    : s1vxe[403026].h7IxGlY;
};

s1vxe.e1N = function () {
  return typeof s1vxe[403026].h7IxGlY === "function"
    ? s1vxe[403026].h7IxGlY.apply(s1vxe[403026], arguments)
    : s1vxe[403026].h7IxGlY;
};
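
Both branches of that ternary are easy to try out in isolation. The following is a simplified sketch of the same proxy pattern; `holder`, `makeProxy`, `getValue` and `callMethod` are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-ins for s1vxe[403026] and its proxies.
const holder = {
  value: [1, 2, 3], // not a function: the proxy returns it as-is
  method: function (a, b) {
    return a + b;   // a function: the proxy calls it with its own arguments
  },
};

function makeProxy(obj, key) {
  return function () {
    return typeof obj[key] === "function"
      ? obj[key].apply(obj, arguments)
      : obj[key];
  };
}

const getValue = makeProxy(holder, "value");
const callMethod = makeProxy(holder, "method");

console.log(getValue() === holder.value); // true
console.log(callMethod(2, 3));            // 5
```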

Let’s take a look at s1vxe[403026].h7IxGlY:

s1vxe[403026] = (function (z_k, T4N, J2v) {
  var M63 = 2;
  for (; M63 !== 1; ) {
    switch (M63) {
      case 2:
        return {
          h7IxGlY: (function i9t(p9m, A0F, a2S) {
            var P6S = 2;
            for (; P6S !== 32; ) {
              switch (P6S) {
                case 19:
                  n8A = p9m - 1;
                  P6S = 18;
                  break;
                case 33:
                  return W26;
                  break;
                case 22:
                  Z5H = Q_Z + ((n8A - Q_Z + A0F * G9a) % G0b);
                  W26[G9a][Z5H] = W26[n8A];
                  P6S = 35;
                  break;
                case 20:
                  P6S = G9a < p9m ? 19 : 33;
                  break;
                case 35:
                  n8A -= 1;
                  P6S = 18;
                  break;
                case 2:
                  var W26 = [];
                  var W3q;
                  P6S = 5;
                  break;
                case 10:
                  G9a = 0;
                  P6S = 20;
                  break;
                case 15:
                  Q_Z = e8C;
                  P6S = 27;
                  break;
                case 5:
                  var G9a;
                  var n8A;
                  var m9f;
                  var e8C;
                  P6S = 8;
                  break;
                case 12:
                  W26[W3q] = [];
                  P6S = 11;
                  break;
                case 34:
                  G9a += 1;
                  P6S = 20;
                  break;
                case 18:
                  P6S = n8A >= 0 ? 17 : 34;
                  break;
                case 8:
                  var Q_Z;
                  var G0b;
                  var Z5H;
                  P6S = 14;
                  break;
                case 17:
                  m9f = 0;
                  e8C = 0;
                  P6S = 15;
                  break;
                case 11:
                  W3q += 1;
                  P6S = 13;
                  break;
                case 14:
                  W3q = 0;
                  P6S = 13;
                  break;
                case 27:
                  Q_Z = e8C;
                  e8C = a2S[m9f];
                  G0b = e8C - Q_Z;
                  m9f++;
                  P6S = 23;
                  break;
                case 13:
                  P6S = W3q < p9m ? 12 : 10;
                  break;
                case 23:
                  P6S = n8A >= e8C ? 27 : 22;
                  break;
              }
            }
          })(z_k, T4N, J2v),
        };
        break;
    }
  }
})(1029, 3, [15, 1029]);

Control flow flattening transformation has been applied to this function. For the purpose of this post, we won’t delve into the details of how control flow flattening operates. Here’s the deobfuscated version of the function:

s1vxe[403026] = (function (z_k, T4N, J2v) {
  return {
    h7IxGlY: (function i9t(p9m, A0F, a2S) {
      var W26 = [];
      var W3q;
      var G9a;
      var n8A;
      var m9f;
      var e8C;
      var Q_Z;
      var G0b;
      var Z5H;
      W3q = 0;
      while (W3q < p9m) {
        W26[W3q] = [];
        W3q += 1;
      }
      G9a = 0;
      while (G9a < p9m) {
        n8A = p9m - 1;
        while (n8A >= 0) {
          m9f = 0;
          e8C = 0;
          Q_Z = e8C;
          do {
            Q_Z = e8C;
            e8C = a2S[m9f];
            G0b = e8C - Q_Z;
            m9f++;
          } while (n8A >= e8C);
          Z5H = Q_Z + ((n8A - Q_Z + A0F * G9a) % G0b);
          W26[G9a][Z5H] = W26[n8A];
          n8A -= 1;
        }
        G9a += 1;
      }
      return W26;
    })(z_k, T4N, J2v),
  };
})(1029, 3, [15, 1029]);

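Note that the i9t function is immediately invoked with three arguments: 1029, 3, and [15, 1029]. s1vxe[403026].h7IxGlY therefore holds the value that i9t returns, the array W26, rather than the function itself, so the proxies above simply return that array.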

Let’s take a deeper look at the function body:

m9f = 0;
e8C = 0;
Q_Z = e8C;
do {
  Q_Z = e8C;
  e8C = a2S[m9f];
  G0b = e8C - Q_Z;
  m9f++;
} while (n8A >= e8C);

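This loop runs at most twice, because the 3rd argument, [15, 1029], has two elements. Its job is to pick the range that contains n8A. For n8A >= 15 it runs twice: after it finishes, Q_Z is equal to the first element of the 3rd argument (15), e8C is equal to the last element (1029), and G0b is the difference between those two values (1014). For n8A < 15 it exits after a single pass, with Q_Z = 0, e8C = 15 and G0b = 15.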
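
Tracing the loop by hand is easy to get wrong, so here is a small sketch that wraps it in a function and reports the values it ends with. `selectSegment` and the `iterations` field are hypothetical additions; the loop body is the one shown above. With a2S = [15, 1029], indices of 15 and above take two passes, while indices below 15 exit after one.

```javascript
// The loop from above, wrapped in a function so it can be traced.
// selectSegment is a hypothetical name; variable names follow the post.
function selectSegment(n8A, a2S) {
  let m9f = 0;
  let e8C = 0;
  let Q_Z = e8C;
  let G0b;
  do {
    Q_Z = e8C;
    e8C = a2S[m9f];
    G0b = e8C - Q_Z;
    m9f++;
  } while (n8A >= e8C);
  // m9f counts how many passes the loop made.
  return { Q_Z, e8C, G0b, iterations: m9f };
}

console.log(selectSegment(1028, [15, 1029]));
// { Q_Z: 15, e8C: 1029, G0b: 1014, iterations: 2 }
console.log(selectSegment(7, [15, 1029]));
// { Q_Z: 0, e8C: 15, G0b: 15, iterations: 1 }
```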

After computing a value stored in Z5H (we will get back to that line later), the most crucial line follows. Pay close attention:

W26[G9a][Z5H] = W26[n8A]

This line is where the magic happens. It employs a concept known as circular referencing, an idea where an object refers to itself. Let’s explore this concept further with an example:

a = [[]]
a[0][0] = a[0]

If you were to execute this code, you would encounter a fascinating phenomenon: the variable a becomes infinitely nested. Why does this happen? Let’s break it down with a visual representation:

On the left, you see a as a red box containing a green box. Now, in line 2 of our code example, we introduce a new box inside the green one, which we’ll denote as yellow. But here’s the catch: that yellow box is, in fact, the green box itself, as our code example suggests.

JavaScript’s referencing logic dictates that boxes of the same color share the same memory address. This means all green boxes are essentially the same; they are considered equal by JavaScript. In essence, they are like identical twins.

Because of this behavior, when we place a green box inside itself, we create a circular reference. As a result of circular referencing, we find ourselves in a situation where an infinite number of green boxes nest within each other.
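
The same thing can be checked in a console. This short sketch repeats the two-line example and shows that, however deep we index, we keep getting the same object back; the JSON.stringify call is only there to show that the structure really is cyclic.

```javascript
const a = [[]];
a[0][0] = a[0]; // the inner array now contains itself

console.log(a[0][0] === a[0]);          // true
console.log(a[0][0][0][0][0] === a[0]); // true, at any depth

// Cyclic structures cannot be serialized to JSON.
let threw = false;
try {
  JSON.stringify(a);
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true
```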

“Why did they implement something like this, what’s the purpose?” you might ask. Imagine having thousands of these individual boxes nested inside one enormous red box, with each box referencing the others. To gain a better understanding, let’s set a conditional breakpoint on this line that pauses only when G9a is equal to 50. Once the debugger has paused, we can observe that G9a is 50, Z5H is 164, and n8A is 1028. In our analogy with boxes, this means that the 1028th box, along with its internal boxes, is placed in slot 164 of the 50th box.

This capability is incredibly powerful because it allows you to create a large number of opaque predicates, knowing the contents of each box and its nested components.
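
To check the box picture against the real table, here is a sketch that rebuilds it with a readable re-implementation of the deobfuscated function and confirms that two different index paths reach the same array object. The descriptive names (`buildTable`, `rows`, `slot` and so on) are my own; the arithmetic is meant to be identical to the original.

```javascript
// Readable re-implementation of the deobfuscated i9t function.
// Names are hypothetical; the arithmetic follows the original.
function buildTable(size, step, bounds) {
  const rows = [];
  for (let i = 0; i < size; i++) rows[i] = [];
  for (let row = 0; row < size; row++) {
    for (let target = size - 1; target >= 0; target--) {
      // Find the range [lo, hi) that contains target.
      let lo = 0;
      let hi = 0;
      let k = 0;
      do {
        lo = hi;
        hi = bounds[k++];
      } while (target >= hi);
      const slot = lo + ((target - lo + step * row) % (hi - lo));
      rows[row][slot] = rows[target]; // store a reference, not a copy
    }
  }
  return rows;
}

const W26 = buildTable(1029, 3, [15, 1029]);
console.log(W26[1010][936] === W26[948]); // true
console.log(W26[819][363] === W26[948]);  // true
console.log(W26[986][344] === W26[428]);  // true
```

The first two lines are the initial value of b3rer_ and the first case label from the snippet earlier, which is why that case runs first.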

Analysis

Our task is to determine when two given opaque predicates are equal or not equal. In other words, we need to find out if they point to the same box.

Let’s take an example mentioned earlier: s1vxe.J0f()[1010][936]. We know that s1vxe.J0f() is a giant infinitely deep array generated by the function we just examined, while W26 is the array that the function returns (see function code).

That means s1vxe.J0f()[1010][936] is the same as W26[1010][936]. If we look at our magic line again, W26[G9a][Z5H] = W26[n8A], we can deduce that G9a is 1010 and Z5H is 936. We don’t know n8A yet, and finding it is our goal, because it is the index of the top-level green box that this and many other expressions reference. By calculating n8A from G9a and Z5H, we obtain an index for this bigger green box. If we do this for every opaque predicate, we will be able to compare those indexes statically, since they are just integers.

Sounds cool! Let’s see if we can do that. Z5H is computed like this:

Z5H = Q_Z + (n8A - Q_Z + A0F * G9a) % G0b;

As previously said, Q_Z corresponds to the first element of the 3rd argument, while G0b is equal to the difference between the second element of the 3rd argument and Q_Z. This holds whenever n8A is at least a2S[0], which is the case here because Z5H = 936 lies in that same range. We also know that A0F is the 2nd argument. When we insert these values into our equation, we get:

Z5H = a2S[0] + (n8A - a2S[0] + A0F * G9a) % (a2S[1] - a2S[0])

Note: a2S is the 3rd argument (see the function code).

That leaves us with n8A as the only unknown value in our equation. If we substitute all known variables with their values, we are left with:

936 = 15 + (n8A - 15 + 3 * 1010) % 1014
921 = (n8A + 3015) % 1014

We then know that n8A + 3015 = 921 + k * 1014 for some integer k.

Additionally, since n8A must be an integer in the range 15 <= n8A < 1029, we can run a small loop that increments k until n8A lands in that range.

For this set of input parameters, we get n8A = 948 (at k = 3). And that’s it! We now know that s1vxe.J0f()[1010][936] is equal to s1vxe.J0f()[948]. If we then replace every such one-dimensional index expression with its index, e.g. s1vxe.J0f()[948] with 948, we know the next case condition and can remove the control flow flattening.
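
The search over k can also be written in closed form. The sketch below is one way to do it, not necessarily how any existing tool does it: `solveIndex` is a hypothetical name, and the double-modulo step is there because JavaScript’s % operator can return a negative remainder. It relies on the observation that Z5H always falls in the same range as n8A, so the range can be picked from Z5H directly.

```javascript
// Invert Z5H = Q_Z + (n8A - Q_Z + A0F * G9a) % G0b for n8A.
// Parameter names follow the post; solveIndex is a hypothetical name.
function solveIndex(G9a, Z5H, A0F, a2S) {
  // Z5H lands in the same range [Q_Z, e8C) as n8A, so pick it from Z5H.
  let Q_Z = 0;
  let e8C = 0;
  for (const bound of a2S) {
    Q_Z = e8C;
    e8C = bound;
    if (Z5H < e8C) break;
  }
  const G0b = e8C - Q_Z;
  // % can return a negative value in JavaScript, so normalize into [0, G0b).
  const offset = (((Z5H - Q_Z - A0F * G9a) % G0b) + G0b) % G0b;
  return Q_Z + offset;
}

console.log(solveIndex(1010, 936, 3, [15, 1029])); // 948
console.log(solveIndex(986, 344, 3, [15, 1029]));  // 428
```

The second call is the loop’s exit condition from the snippet earlier, s1vxe.e1N()[986][344], which matches the 428 in the deobfuscated version.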

More complicated examples

Let’s take a look at another example: s1vxe.e1N()[108][765][647].

As we know, s1vxe.e1N() represents an infinitely deep array, and we’re attempting to index it three dimensions deep. However, we’ll apply the same logic as before to navigate through this example.

Let’s focus on the first part of the opaque predicate, s1vxe.e1N()[108][765], denoting it as X. By definition, X[647] is equivalent to our opaque predicate.

Applying the previously used logic, we can simplify X with the appropriate parameters, resulting in:

s1vxe.e1N()[108][765] ≡ s1vxe.e1N()[441].

Now, our opaque predicate is transformed to s1vxe.e1N()[441][647]. Applying the same technique one more time we get the final result.

This process is recursive and can be repeated for any number of dimensions.
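
Put together, the whole reduction fits in a few lines. This is a rough sketch under a few assumptions: `solveIndex`, `resolveChain` and `rewritePredicates` are hypothetical names, the parameters 3 and [15, 1029] are the ones from this sample, and the regular expression only handles the exact textual shape seen here (a real tool would more likely work on the AST).

```javascript
// Invert Z5H = Q_Z + (n8A - Q_Z + A0F * G9a) % G0b for n8A.
function solveIndex(G9a, Z5H, A0F, a2S) {
  let Q_Z = 0;
  let e8C = 0;
  for (const bound of a2S) {
    Q_Z = e8C;
    e8C = bound;
    if (Z5H < e8C) break;
  }
  const G0b = e8C - Q_Z;
  return Q_Z + ((((Z5H - Q_Z - A0F * G9a) % G0b) + G0b) % G0b);
}

// Collapse [i0][i1]...[in] into a single top-level index, left to right.
function resolveChain(indices, A0F, a2S) {
  return indices.reduce((row, slot) => solveIndex(row, slot, A0F, a2S));
}

// Replace each s1vxe.J0f()[..].. or s1vxe.e1N()[..].. expression in a
// source string with its integer label (regex-based simplification).
function rewritePredicates(source, A0F, a2S) {
  const pattern = /s1vxe\.(?:J0f|e1N)\(\)((?:\[\d+\])+)/g;
  return source.replace(pattern, (match, chain) => {
    const indices = chain.match(/\d+/g).map(Number);
    return String(resolveChain(indices, A0F, a2S));
  });
}

console.log(resolveChain([108, 765], 3, [15, 1029]));      // 441
console.log(resolveChain([108, 765, 647], 3, [15, 1029])); // 338

const sample =
  "var b3rer_ = s1vxe.J0f()[1010][936];\n" +
  "for (; b3rer_ !== s1vxe.e1N()[986][344]; ) {}";
console.log(rewritePredicates(sample, 3, [15, 1029]));
// var b3rer_ = 948;
// for (; b3rer_ !== 428; ) {}
```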

Conclusion

In this post, we’ve explored the concept of opaque predicates and how they can be used to obfuscate code. We’ve also examined how Jscrambler implements this technique and how we can reverse engineer it. This is only one of many obfuscation transformations that Jscrambler applies to its clients’ code. Stay tuned for more posts on this topic!

If you think I got something wrong or you have any questions, feel free to contact me on Discord: @sveba