Advanced features

1. Semgrep: Flex Mode!

This tutorial covers some of Semgrep's newest features, and is mostly geared toward experienced Semgrep rule authors.

We'll start by reviewing an example from the composition tutorial that introduced metavariable-regex.

This time we have pre-filled the correct answer to get you started.

SEMGREP RULE

rules:
  - id: use-decimalfield-for-money
    patterns:
    - pattern-inside: |
        class $M(...):
          ...
    - pattern: $F = django.db.models.FloatField(...)
    - metavariable-regex:
        metavariable: '$F'
        regex: '.*(price|fee|salary).*'
    message: Found a FloatField used for variable $F. Use DecimalField for currency fields to avoid float-rounding errors.
    languages: [python]
    severity: ERROR

TEST CODE

ANSWER + EXPLAIN

Without the metavariable-regex here, the metavariable $F would match every FloatField assignment in the class, including the one on line 9.

However, you can use metavariable-regex to restrict metavariable matches based on the variable names.

You can also use regular expressions directly in Semgrep patterns. See how in the next example! 🚂

2. Regex in patterns

Sometimes, regular expressions are the best tool for the job.

Semgrep lets you combine all of its other tools with regular expressions, so you can use each where it makes the most sense.

This example currently tries to use pattern-regex to match baseURL variables that do not start with http:// or https://, but it doesn't quite work.

Try using pattern-regex's cousin, pattern-not-regex, to express this more elegantly.

SEMGREP RULE

TEST CODE

ANSWER + EXPLAIN
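
One possible shape for the answer is sketched below. It is not the tutorial's original rule: the rule id, the message, the Python target language, and the exact regexes are assumptions made for illustration. The positive regex matches any double-quoted string assigned to baseURL, and the negative regex removes the assignments whose value starts with http:// or https://. Both regexes are written to cover the whole assignment, on the assumption that a match is filtered out only when it falls within the excluded range.

```yaml
rules:
  - id: base-url-missing-scheme        # hypothetical rule id
    patterns:
      # Match any double-quoted string assigned to baseURL ...
      - pattern-regex: 'baseURL\s*=\s*"[^"]*"'
      # ... then drop the ones that already start with http:// or https://
      - pattern-not-regex: 'baseURL\s*=\s*"https?://[^"]*"'
    message: baseURL should start with http:// or https://
    languages: [python]                # assumed target language
    severity: WARNING
```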

Regex matching is not the only form of logic that can be built into a Semgrep pattern.

Keep going for math comparisons like $X > 0! (Fun fact: 0! = 1)

3. Metavariable comparisons

Just as metavariable-regex could be used to restrict metavariables based on a regex, metavariable-comparison can be used to restrict metavariables based on mathematical comparisons.

This example currently matches on RSA keys that are initialized with 2048 bits or fewer.

Update it to match the message, so that it only flags RSA keys with fewer than 2048 bits.

SEMGREP RULE

TEST CODE

ANSWER + EXPLAIN
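
A minimal sketch of what the updated rule might look like, assuming Python test code that calls rsa.generate_private_key from the cryptography package. The rule id, the matched call, and the message are assumptions, since the original rule is not shown here. The key change is using a strict less-than in the comparison, so that a key of exactly 2048 bits no longer matches.

```yaml
rules:
  - id: rsa-key-too-small              # hypothetical rule id
    patterns:
      # Bind the key size argument to $BITS (assumed API call)
      - pattern: rsa.generate_private_key(..., key_size=$BITS, ...)
      # Keep only matches where the bound value is strictly below 2048
      - metavariable-comparison:
          metavariable: '$BITS'
          comparison: $BITS < 2048
    message: RSA keys should be at least 2048 bits
    languages: [python]
    severity: WARNING
```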

4. Metavariable pattern

Do you remember back in the composition tutorial when we first introduced metavariable-regex and noted how it let you write code about your code about your code?

Metavariable-pattern is like that, but maybe one layer deeper. Or maybe infinite layers deeper. We sort of lost track.

Take a good hard look at this example to figure out what it is doing. There is a pattern-not embedded in the logic for a specific metavariable. This one was too hard to turn into a puzzle, so we are just giving it to you. Enjoy!
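
For readers without the interactive editor, here is a small illustrative rule with the same overall shape. It is a hypothetical sketch, not the tutorial's own example: the rule id, the matched calls, and the BASE_DIR name are all assumptions. The outer pattern binds $PATH, and the nested patterns block (including a pattern-not) is then evaluated against the code that $PATH captured.

```yaml
rules:
  - id: open-joined-path-outside-base  # hypothetical rule id
    patterns:
      # Outer match: any call to open(), binding its first argument to $PATH
      - pattern: open($PATH, ...)
      # Inner logic, applied only to the code bound to $PATH
      - metavariable-pattern:
          metavariable: '$PATH'
          patterns:
            - pattern: os.path.join(...)
            # Exclude joins anchored at BASE_DIR (an assumed project constant)
            - pattern-not: os.path.join(BASE_DIR, ...)
    message: open() called on a joined path that does not start at BASE_DIR
    languages: [python]
    severity: WARNING
```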

SEMGREP RULE

TEST CODE

5. Everybody loves Autofix

Everybody loves autofix because it doesn't just tell you how to fix your code; it does it for you!

Right now, autofix is mostly limited to single-line patterns and single-line fixes, but it still comes in quite handy.

Semgrep's rule format includes a fix: key that supports metavariable replacement, much like the message field.

In this example, try writing a fix: that matches the message description.

SEMGREP RULE

TEST CODE

ANSWER + EXPLAIN
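
As a rough illustration of the fix: key, here is a small rule that is not necessarily the one used in this exercise. It assumes Python code that calls the exit() builtin and rewrites each match to sys.exit(), reusing the captured metavariable $X in both the message and the fix.

```yaml
rules:
  - id: use-sys-exit                   # illustrative rule id
    pattern: exit($X)
    # $X is substituted into the replacement text, just as in the message
    fix: sys.exit($X)
    message: Use sys.exit($X) instead of the exit($X) builtin
    languages: [python]
    severity: WARNING
```

Note that this fix only rewrites the call itself; adding the import for sys, if it is missing, is left to the developer.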

6. Go forth and find things!

You've reached the end of this tutorial. Semgrep has many other features that you can read about in the docs.

Here's a favorite pattern that shows off what Semgrep can do. This rule searches for code injection in Django apps using eval() and has found real vulnerabilities!
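
The rule itself is not reproduced here, so the following is a heavily simplified sketch of the idea rather than the actual rule. The rule id, the view signature, and the specific request accessors are assumptions. It looks inside functions that take a request parameter and flags eval() calls that receive request data, either directly or through an intermediate variable.

```yaml
rules:
  - id: django-eval-of-request-data    # hypothetical rule id
    patterns:
      # Only look inside functions whose first parameter is named request
      - pattern-inside: |
          def $VIEW(request, ...):
            ...
      - pattern-either:
          # Request data passed straight into eval()
          - pattern: eval(..., request.$W.get(...), ...)
          - pattern: eval(..., request.$W[...], ...)
          # Request data stored in a variable first, then evaluated later
          - pattern: |
              $V = request.$W.get(...)
              ...
              eval(..., $V, ...)
    message: Request data reaches eval(), which can lead to code injection
    languages: [python]
    severity: ERROR
```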

SEMGREP RULE

TEST CODE