Vulnerable JavaScript

This presentation and write-up are inspired by Guy Podjarny, CEO of Snyk.

How is JavaScript vulnerable, and what does this mean for Node apps?

Introduction

Note: some of my notes contain links to websites. These are purely for your curiosity, and you are not expected to know everything in the links.

What are some fundamental properties of JavaScript?

Built in memory management
Native serialization
Event loop
Frequent encoding use
Npm packages

Let's discuss these properties first:

1. Built in memory management

The memory life cycle looks like the following:

Allocate resources
Read and write to resources
Free resources

In lower-level languages, the developer explicitly handles steps 1 and 3 as well as step 2 (for example, malloc() and free() in C). In JavaScript we only explicitly deal with step 2; the runtime allocates memory for us and frees it when it is no longer needed.

2. Native Serialization

Serialization is the process of taking an object or data structure stored in one computer's memory and converting it to a format that can be transmitted and understood by other programs.

If we wanted to send an array stored in our program over the internet, other computers wouldn't be able to read it directly, since it doesn't live in their memory.

JavaScript natively supports JSON, which makes it very easy to send objects over the internet.
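As a small sketch of what this looks like in practice (the todo object and its field names are made up for illustration), JSON.stringify turns an in-memory object into a string that can travel over the network, and JSON.parse rebuilds an object on the other side:

```javascript
// A hypothetical in-memory object; the field names are only for illustration.
const todo = { content: "Call mom", tags: ["family", "phone"] };

// Serialize: the object becomes a plain string that any machine can read.
const wire = JSON.stringify(todo);

// Deserialize: the receiver rebuilds its own copy from the string.
const received = JSON.parse(wire);

console.log(wire);
console.log(received.tags[1]);

// The sender, not the receiver, decides the type of each value.
const fromString = JSON.parse('{"content": "800"}');
const fromNumber = JSON.parse('{"content": 800}');
console.log(typeof fromString.content, typeof fromNumber.content);
```

Note that the types in the parsed object come straight from whoever wrote the JSON. This matters for the type manipulation vulnerability discussed later.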

3. Event Loop

The event loop is what allows JavaScript to be considered a scalable server-side language. It doesn't need to create a new thread for each request, which would cost memory and time. Not having a thread per request also makes requests easier to manage. Imagine millions of users on a platform that creates a new thread for every request.

Here is a visual representation of the event loop:

The important part is that the callbacks for all events (a timer firing, an event listener being triggered, and so on) are added to the message queue, where they wait to be executed by the event loop. Because the event loop is single-threaded, one long-running task can block everything behind it if programmers are not careful.
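Here is a minimal sketch of that blocking behavior. The 50 ms busy loop is a stand-in for any CPU-heavy work; the names are made up for this example:

```javascript
const order = [];

// Ask for a callback as soon as possible; it goes into the message queue.
setTimeout(() => order.push("timer callback"), 0);

// Hypothetical CPU-heavy work: spin for about 50 ms without yielding.
const start = Date.now();
while (Date.now() - start < 50) {
  // busy wait
}
order.push("blocking work finished");

// The timer was due long ago, but it could not run while the loop was busy.
console.log(order);
```

The timer callback only gets its turn after the synchronous code has finished, no matter how overdue it is.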

4. Encoding use

URLs use encoding. For example, you can't put a space in a URL; instead, a space is represented as %20.

HTML uses HTML entities to allow the programmer to display certain restricted characters such as the > character. The programmer needs to be careful to consider all types of encoding of a character.
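Both kinds of encoding can be sketched in a few lines. The URL functions below are built into JavaScript; escapeHtml is a hand-written helper for this example and is not a complete sanitizer:

```javascript
// URL (percent) encoding, using built-in functions.
const encodedSpace = encodeURIComponent("about us");
const decodedDots = decodeURIComponent("%2e%2e");
console.log(encodedSpace, decodedDots);

// HTML entity encoding: a minimal hand-written escaper (a sketch only).
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml("<script>alert(1)</script>"));
```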

5. Npm Packages

The Node Package Manager (npm) allows developers to access and download many JavaScript packages such as express, react, ws and many more.

One of the cons of npm is that it has no vetting process that only admits high-quality packages free of malicious code. As a result, low-quality packages and packages with known vulnerabilities can be found in the npm collection. It relies on users to flag low-quality and malicious packages.

Overview:

Built-in memory management => Buffer vulnerabilities
Native serialization => Type manipulation
Event loop => Regex DoS
Frequent encoding use => Sandbox escaping
Npm packages => Vulnerable packages

Note, if you want to play with the demo, it is available here

St: First Vulnerability (URL encoding)

St is a package that serves static files from the server. This is what it looks like in code:

The problem is that st did not account for all the ways a dot (.) can be encoded.

A dot can be represented as: %2e

Using curl we will send a request to the about page:

curl localhost:3001/public/about.html

Our goal is to escape the sandbox (a tight security boundary that only lets us access the minimum set of files needed). We will try to get the passwd file by going up several directories until we hit the root (../../).

curl localhost:3001/public/../../../../../../../../../etc/passwd

The above fails since st has some security in place that prevents the user from leaving the app's main directory. Let's try with the encoded dot, %2e:

curl localhost:3001/public/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd

The above works, and we get access to the passwd file.

Takeaway:

Always think about all the ways a character can be encoded.
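One way to act on this is to decode the path first and only then decide whether it stays inside the served directory. The following is a minimal sketch of that idea, not the actual fix that st shipped; PUBLIC_ROOT and both function names are assumptions made up for this example:

```javascript
const path = require("path");

// Hypothetical directory that static files are served from.
const PUBLIC_ROOT = path.resolve("/srv/app/public");

// Naive check: inspects the raw URL, so an encoded dot (%2e) slips through.
function naiveIsSafe(urlPath) {
  return !urlPath.includes("..");
}

// Safer check: decode first, resolve to an absolute path, then confirm the
// result is still inside PUBLIC_ROOT. Returns null when the request is rejected.
function resolveInsideRoot(urlPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(urlPath);
  } catch (err) {
    return null; // malformed percent-encoding
  }
  if (decoded.includes("\0")) return null; // reject embedded null bytes
  const resolved = path.resolve(PUBLIC_ROOT, "." + path.sep + decoded);
  const inside =
    resolved === PUBLIC_ROOT || resolved.startsWith(PUBLIC_ROOT + path.sep);
  return inside ? resolved : null;
}

const attack = "/%2e%2e/%2e%2e/%2e%2e/etc/passwd";
console.log(naiveIsSafe(attack)); // true: the naive check is fooled
console.log(resolveInsideRoot(attack)); // null: rejected after decoding
console.log(resolveInsideRoot("/about.html"));
```

The design choice is to compare the final resolved path against the root instead of hunting for suspicious substrings, so the check does not depend on knowing every possible encoding of a dot.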

Marked: Second Vulnerability (HTML encoding)

Marked is a package that converts Markdown to HTML. It gets about 3 million downloads per month.

Basic Syntax:

# Hello corresponds to <h1>Hello</h1> in HTML

In our application we have an input that accepts Markdown and sends it to the back end to be stored in our database.

Whenever we talk about user input and database storage, there is the possibility of Cross Site Scripting (XSS). The gist of XSS:

If we pass JavaScript code wrapped in an HTML script tag (<script></script>) inside the input, it is stored as a string in the database but is executed as HTML in the client's browser.
Since our JavaScript code is now executable, we can steal users' cookies and session tokens, play with the DOM or redirect to a malicious website.

Let's try a few potentially malicious inputs we can feed to our application:

<script>alert(1)</script>

When we enter the above in our input, we simply get back its string version and the JavaScript doesn't run.
That is because marked has a sanitize function that encodes restricted characters such as < and > as their respective HTML entities. The client's browser can't execute the script, since it receives the HTML entities and not actual HTML script tags.

Let's try a more sophisticated approach:

[Gotcha](javascript:alert(1))

This is Markdown syntax for creating an anchor link.

Unsanitized, the above translates to: <a href="javascript:alert(1)">Gotcha</a>

The sanitize function blocks this as well, so it is pretty robust. Let's see if it accounted for HTML entities:

[Gotcha](javascript&#58;alert(1&#41;)

Same as above, except this time we replaced the colon and the closing parenthesis with their HTML entity versions, hoping to bypass the sanitize function.
The result is just an empty paragraph. Damn you, sanitize function.

Now for the ultimate trick, let's exploit the browser's friendliness to our advantage:

[Gotcha](javascript&#58this;alert(1&#41;)

Notice the this.
Marked's sanitize function won't catch this, since it only accounts for HTML entities that include the semicolon at the end (&#58; and not &#58).
The browser is very lenient and will replace the &#58 with a colon (:) even though it really shouldn't.
End result:

Takeaway:

Be very rigorous with your sanitization, and make sure to account for browser quirks.
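The mismatch can be sketched with two tiny entity decoders. This is not marked's actual code; decodeStrict, decodeLenient and isScriptUrl are hypothetical helpers written only to show how a check that expects the semicolon disagrees with a browser that does not need it:

```javascript
// Strict decoder: only understands numeric entities that end with ";".
function decodeStrict(text) {
  return text.replace(/&#(\d+);/g, (_, code) => String.fromCharCode(Number(code)));
}

// Lenient decoder: like a browser, it also accepts a missing ";".
function decodeLenient(text) {
  return text.replace(/&#(\d+);?/g, (_, code) => String.fromCharCode(Number(code)));
}

// Hypothetical link check: flag any href that decodes to a javascript: URL.
function isScriptUrl(href, decode) {
  return decode(href).trim().toLowerCase().startsWith("javascript:");
}

// The href part of the Gotcha link from the last attempt.
const href = "javascript&#58this;alert(1&#41;";

console.log(isScriptUrl(href, decodeStrict)); // false: the check is bypassed
console.log(isScriptUrl(href, decodeLenient)); // true: what the browser sees
console.log(decodeLenient(href)); // javascript:this;alert(1)
```

A sanitizer is only as good as its model of how the browser will read the output, which is why allowing a short list of known-safe URL schemes tends to be more robust than trying to block dangerous ones.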

Ms: Third Vulnerability (Regex DoS)

Ms is a package that converts input strings to milliseconds. This package is vulnerable to a Regular Expression Denial of Service (Regex DoS).

Because ms uses a regular expression that backtracks, if we pass it an incredibly long string that almost matches, the regex engine will backtrack until it has explored all possibilities and determined that our string is not valid.

For this exploit, we will use curl to post data to the web server. Let's start with a simple one:

curl -d 'content=Call mom in 20 minutes' -X POST localhost:3001/create

The result of that call is the following:

Using the following curl command we can effectively block the main thread of the application. This acts as a Denial of Service, since no one can use the service until the regex finishes backtracking.

curl -d 'content=Call mom in '`printf %.0s5 {1..60000}`' minutea' -X POST localhost:3001/create

Your server when you execute that command:

OK, so what does the command do?

It sets the number of minutes to a string of 60,000 5's, so instead of Call mom in 20 minutes we have Call mom in [INSERT 60,000 5's] minutea.
Notice the minutea. This is key: since it won't match minutes, the regex has to backtrack through all possible combinations.
Depending on the length you insert, it can block the thread for as long as you want. This can hike up server bills or simply crash the server.

Takeaway:

The problem here is that ms didn't restrict the number of characters it accepted as input to the regex. Their fix was simply to add a fixed cap (~1000 characters), which prevents us from inserting insanely long strings.
Always be careful when dealing with regexes. Does the regex backtrack? If so, is there a maximum input length?
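The length cap can be sketched as a guard in front of the regex. This is an illustration of the idea, not the ms source: the pattern is a simplified one written for this example, and MAX_INPUT_LENGTH, UNIT_MS and parseDuration are made-up names:

```javascript
// Hypothetical cap; pick one that fits the longest input you genuinely expect.
const MAX_INPUT_LENGTH = 100;

// Simplified duration pattern written for this sketch.
const DURATION = /^((?:\d+)?\.?\d+) *(ms|seconds?|s|minutes?|m|hours?|h)?$/i;

const UNIT_MS = { ms: 1, s: 1000, m: 60 * 1000, h: 60 * 60 * 1000 };

function parseDuration(input) {
  // Reject oversized or non-string input before the regex ever sees it.
  if (typeof input !== "string" || input.length > MAX_INPUT_LENGTH) {
    return undefined;
  }
  const match = DURATION.exec(input);
  if (!match) return undefined;
  const unit = (match[2] || "ms").toLowerCase();
  const key = unit === "ms" ? "ms" : unit[0]; // "minutes" -> "m", "hours" -> "h"
  return parseFloat(match[1]) * UNIT_MS[key];
}

console.log(parseDuration("20 minutes")); // 1200000
console.log(parseDuration("5".repeat(60000) + " minutea")); // undefined
```

With the guard in place, the 60,000-character payload is rejected by a cheap length comparison and never reaches the regex engine.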

Mongoose: Fourth Vulnerability (Type manipulation)

Mongoose is an Object Document Mapper. It allows us to define strongly typed schemas for a MongoDB database. There are multiple data types that Mongoose supports. One of them is the Buffer.

The Buffer allows us to read and manipulate binary data. This is useful for storing pictures and more.

Let's see how the Buffer class works. In the following example, we initialize a buffer with enough space to store the string '100'.

In this next example, we create a buffer with a size of 100 bytes. The contents of a newly created Buffer are unknown and may contain sensitive data. In my case, the Buffer picked up memory that was already zeroed out.
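The two cases can be sketched as follows. This uses the Buffer.from, Buffer.alloc and Buffer.allocUnsafe functions of current Node.js versions, which may differ from the calls used in the original demo:

```javascript
// Buffer is a Node.js global, so no import is needed.

// From a string: the buffer holds the bytes of the text "100" (3 bytes).
const fromText = Buffer.from("100");
console.log(fromText.length, fromText.toString());

// From a size, zero-filled: 100 bytes, all guaranteed to be 0.
const zeroed = Buffer.alloc(100);
console.log(zeroed.every((byte) => byte === 0));

// From a size, uninitialized: 100 bytes of whatever was in that memory.
const unsafe = Buffer.allocUnsafe(100);
console.log(unsafe.length);
```

The important difference is that a string argument determines the contents of the buffer, while a number argument only determines its size.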

ENOUGH Theory

The following curl command sends JSON data to our server's create route. This is the same as submitting text through the user input in the UI, except that we have more control this way.

Having more control over the JSON we send is what allows us to exploit type manipulation.

curl -d '{"content": "800"}' -H "Content-Type: application/json" -X POST localhost:3001/create

Result of above curl command:

To understand how this exploit works, we need to explain a few things about how the app receives this data. The following is the Todo Schema that stores the to-do list items users submit.

Notice the content is of Buffer type.

We have a create route:

In our create route:

First we get the content of the input using the key 'content'. Notice that the data from the body is already an object, so it has been parsed as JSON by a middleware before arriving at our create route.
Then we pass it through our parse function, which in our case just returns the original item.
Then we put that item directly in our buffer.

If you're still confused about how this is vulnerable: through curl we can send a JSON object with content set to a number. On the server side it is parsed as a number, which then creates a new Buffer of size [YOUR NUMBER] that may contain unknown sensitive data from the web server's memory.

curl -d '{"content": 800}' -H "Content-Type: application/json" -X POST localhost:3001/create

If we do this enough times we might be able to extract secrets from the web server such as API keys/passwords.

This is known as Remote Memory Exposure.

Takeaways:

Always be aware of users manipulating the type. In this case we assumed that any input passed to the create route will already be of type string.
If dealing with a Buffer, perhaps make sure that data in the buffer is always zeroed out first.
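Both takeaways can be combined in a small guard in front of the Buffer. This is a minimal sketch; toContentBuffer is a hypothetical helper and not part of Mongoose or the demo app:

```javascript
// Hypothetical helper for a create route: accept only strings as content.
function toContentBuffer(content) {
  if (typeof content !== "string") {
    throw new TypeError("content must be a string");
  }
  // Buffer.from copies the string's bytes; no uninitialized memory is involved.
  return Buffer.from(content, "utf8");
}

console.log(toContentBuffer("800").toString()); // 800

try {
  toContentBuffer(800); // what {"content": 800} becomes after JSON parsing
} catch (err) {
  console.log(err.message); // content must be a string
}
```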

Overall Javascript Takeaways

Consider all types of encoding

URL + HTML encoding
Better yet, whitelist

Prevent algorithms from taking up the event loop too long
Beware of Type Manipulation
Initialize Buffer

How do we protect ourselves from all of this?

Snyk

Snyk can look through your GitHub projects, identify vulnerabilities and help fix them.
It has a Vulnerability Database which is updated very frequently. This database includes vulnerabilities found in pip, npm, Go, Ruby and many more package ecosystems.

Snyk also has a tool called wizard for Node apps. Wizard steps through your Node project, identifies any vulnerabilities and asks you whether you want to patch or upgrade the vulnerable packages. This is highly useful and recommended.

THAT'S IT!!!