Storm Reference - Advanced - Variables

Storm supports the use of variables. A Variable is a value that can change depending on conditions or on information passed to the Storm query. (Contrast this with a Constant, which is a value that is fixed and does not change.)

Variables can be used in a variety of ways, from providing simpler or more efficient ways to reference node properties, to facilitating bulk operations, to performing complex tasks or writing extensions to Synapse in Storm.

These documents approach variables and their use from a user standpoint and aim to provide sufficient background for users to understand and begin to use variables. They do not provide an in-depth discussion of variables and their use from a fully developer-oriented perspective.

Storm Operating Concepts
Variable Concepts
Types of Variables
- Built-In Variables
- User-Defined Variables

Storm Operating Concepts

When leveraging variables in Storm, it is important to keep in mind the high-level Storm Operating Concepts. Specifically:

- Storm operations (e.g., lifts, filters, pivots, etc.) are performed on nodes.
- Operations can be chained and are executed in order from left to right.
- Storm acts as an execution pipeline, with each node passed individually and independently through the chain of Storm operations.
- Most Storm operations consume nodes. That is, a given operation (such as a filter or pivot) acts upon the inbound node in some way and returns only the node or set of nodes that result from that operation.
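The pipeline behavior described above can be sketched outside of Storm. The following is a conceptual model in Python only: plain dictionaries stand in for nodes, and the names lift, keep_if, pivot, and STORE are hypothetical helpers invented for this illustration, not part of Synapse or Storm.

```python
from typing import Iterable, Iterator

# Conceptual model only: dicts stand in for nodes. None of these
# function names exist in Synapse or Storm.
STORE = [
    {"form": "inet:dns:a", "value": ("woot.com", "1.2.3.4"),
     "props": {"fqdn": "woot.com", "ipv4": "1.2.3.4"}},
    {"form": "inet:dns:a", "value": ("hurr.net", "5.6.7.8"),
     "props": {"fqdn": "hurr.net", "ipv4": "5.6.7.8"}},
    {"form": "inet:ipv4", "value": "1.2.3.4", "props": {}},
    {"form": "inet:ipv4", "value": "5.6.7.8", "props": {}},
]

def lift(store: list, form: str) -> Iterator[dict]:
    """Yield each stored node of the requested form, one at a time."""
    for node in store:
        if node["form"] == form:
            yield node

def keep_if(nodes: Iterable[dict], prop: str, value) -> Iterator[dict]:
    """Filter: consume each inbound node; emit it only if it matches."""
    for node in nodes:
        if node["props"].get(prop) == value:
            yield node

def pivot(nodes: Iterable[dict], store: list, src_prop: str, dst_form: str) -> Iterator[dict]:
    """Pivot: consume each inbound node; emit the nodes it points to."""
    for node in nodes:
        target = node["props"].get(src_prop)
        for other in store:
            if other["form"] == dst_form and other["value"] == target:
                yield other

# Operations are chained in order: lift, then filter, then pivot.
lifted = lift(STORE, "inet:dns:a")
filtered = keep_if(lifted, "fqdn", "woot.com")
result = list(pivot(filtered, STORE, "ipv4", "inet:ipv4"))
print([n["value"] for n in result])
```

Because each stage is a generator, a node moves through the whole chain before the next node is lifted, and only the nodes emitted by the final stage are returned. The inet:dns:a nodes are consumed along the way.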

These principles apply to variables that reference nodes (or node properties) in Storm just as they apply to nodes, and so affect the way variables behave within Storm queries.

Variable Concepts

Variable Scope

A variable’s scope is its lifetime and the conditions under which it may be accessed. Two dimensions affect a variable’s scope: its call frame and its runtime safety (“runtsafety”).

Call Frame

A variable’s call frame is where the variable is used. The main Storm query starts with its own call frame, and each call to a “pure” Storm command, function, or subquery creates a new call frame. The new call frame gets a copy of all the variables from the calling call frame. Changes to existing variables or the creation of new variables within the new call frame do not impact the calling scope.
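As a rough analogy for the copy behavior described above, the following Python sketch models a call frame as a dictionary of variables. The function run_in_new_frame and the variable names are hypothetical and exist only for this illustration; this is not how Storm is implemented.

```python
def run_in_new_frame(caller_vars: dict) -> dict:
    """Model a new call frame that starts with a copy of the caller's variables."""
    frame = dict(caller_vars)      # the new frame receives a copy
    frame["threshold"] = 10        # change an existing variable
    frame["extra"] = "only here"   # create a new variable
    return frame

main_frame = {"threshold": 5}
child_frame = run_in_new_frame(main_frame)

print(main_frame)
print(child_frame)
```

The calling frame still holds only threshold with its original value of 5, while the new frame holds its own modified copy plus the variable it created.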

Runtsafe vs. Non-Runtsafe

An important distinction to keep in mind when using variables in Storm is whether the variable is runtime-safe (“runtsafe”) or non-runtime-safe (“non-runtsafe”).

A variable that is runtsafe has a value independent of any nodes passing through the Storm pipeline. For example, a variable whose value is explicitly set, such as $string = mystring or $ipv4 = 8.8.8.8, is considered runtsafe because its value is not affected by the specific node passing through the Storm pipeline.

A variable that is non-runtsafe has a value derived from a node passing through the Storm pipeline. For example, a variable whose value is set to a node property value may change based on the specific node passing through the Storm pipeline. In other words, if your Storm query is operating on a set of DNS A nodes (inet:dns:a) and you define the variable $fqdn = :fqdn (setting the variable to the value of the :fqdn secondary property), the value of the variable will change based on the specific value of that property for each inet:dns:a node in the pipeline.

All non-runtsafe variables are scoped to an individual node as it passes through the Storm pipeline. This means that a variable’s value based on a given node is not available when processing a different node (at least not without using special commands, methods, or libraries). In other words, the path of a particular node as it passes through the Storm pipeline is its own scope.

The “safe” in non-runtsafe should not be interpreted as meaning the use of non-runtsafe variables is somehow “risky” or involves insecure programming or processing of data. It simply means the value of the variable is not safe from changing (i.e., it may change) as the Storm pipeline progresses.
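The distinction can be illustrated with a small Python analogy, assuming a loop stands in for the Storm pipeline and dictionaries stand in for inet:dns:a nodes. The function and data below are hypothetical and only model the behavior described in this section.

```python
# Hypothetical stand-ins for inet:dns:a nodes in the pipeline.
nodes = [
    {"form": "inet:dns:a", "props": {"fqdn": "woot.com"}},
    {"form": "inet:dns:a", "props": {"fqdn": "hurr.net"}},
]

def run(pipeline_nodes: list) -> list:
    ipv4 = "8.8.8.8"  # analogous to a runtsafe variable: set once, independent of any node
    observed = []
    for node in pipeline_nodes:
        # analogous to a non-runtsafe variable: its value is taken from the
        # current node, so it is re-derived for every node in the pipeline
        fqdn = node["props"]["fqdn"]
        observed.append((ipv4, fqdn))
    return observed

observed = run(nodes)
print(observed)
```

The first element of each pair never changes, while the second tracks whichever node is currently being processed.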

Types of Variables

Storm supports two types of variables:

- Built-in variables. Built-in variables facilitate many common Storm operations. They may vary in their scope and in the context in which they can be used.
- User-defined variables. User-defined variables are named and defined by the user. They are most often limited in scope and facilitate operations within a specific Storm query.

Built-In Variables

Storm includes a set of built-in variables and associated variable methods (Storm Reference - Advanced - Methods) and libraries (Storm Libraries) that facilitate Cortex-wide, node-specific, and context-specific operations.

Built-in variables differ from user-defined variables in that built-in variable names:

- are initialized at Cortex start,
- are reserved,
- can be accessed automatically (i.e., without needing to define them) from within Storm, and
- persist across user sessions and Cortex reboots.

Global Variables

Global variables operate independently of any node. That is, they can be invoked in a Storm query in the absence of any nodes in the Storm execution pipeline (though they can also be leveraged when performing operations on nodes).

$lib

The library variable ($lib) is a built-in variable that provides access to the global Storm library. In Storm, libraries are accessed using built-in variable names (e.g., $lib.print()).

See the Storm Libraries technical documentation for descriptions of the libraries available within Storm.

Node-Specific Variables

Storm includes node-specific variables that are designed to operate on or in conjunction with nodes and require one or more nodes in the Storm pipeline.

Note

Node-specific variables are always non-runtsafe.

$node

The node variable ($node) is a built-in Storm variable that references the current node in the Storm query. Specifically, this variable contains the inbound node’s node object, and provides access to the node’s attributes, properties, and associated attribute and property values.

Invoking this variable during a Storm query is useful when you want to:

access the raw and entire node object,
store the value of the current node before pivoting to another node, or
use an aspect of the current node in subsequent query operations.

The $node variable supports a number of built-in methods that can be used to access specific data or properties associated with a node. See the technical documentation for the storm:node object or the $node section of the Storm Reference - Advanced - Methods user documentation for additional detail and examples.

$path

The path variable ($path) is a built-in Storm variable that references the path of a node as it travels through the pipeline of a Storm query.

The $path variable is not used on its own, but in conjunction with its methods. See the technical documentation for the storm:path object or the $path section of the Storm Reference - Advanced - Methods user documentation for additional detail and examples.

Trigger-Specific Variables

A Trigger is used to support automation within a Cortex. Triggers use events (such as the creation of a node, setting the value of a node’s property, or applying a tag to a node) to fire (“trigger”) the execution of a predefined Storm query. Storm uses a built-in variable specifically within the context of trigger-initiated Storm queries.

$tag

Within the context of triggers that fire on tag:add events, the $tag variable represents the name of the tag that caused the trigger to fire.

For example:

You write a trigger to fire when any tag matching the expression #foo.bar.* is added to a file:bytes node. The trigger executes the following Storm query:

-> hash:md5 [ +#$tag ]

Because the trigger uses a wildcard expression, it will fire on any tag that matches that expression (e.g., #foo.bar.hurr, #foo.bar.derp, etc.). The Storm snippet above will take the inbound file:bytes node, pivot to the file’s associated MD5 node (hash:md5), and apply the same tag that fired the trigger to the MD5.
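The trigger flow above can be modeled loosely in Python. This sketch assumes that the * wildcard stands for exactly one dot-separated tag element; that matching rule is an assumption made for the illustration, and the helpers tag_matches and on_tag_add are hypothetical, not Synapse APIs.

```python
import re

def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if tag matches pattern.

    Assumption for this sketch: "*" matches exactly one dot-free tag element.
    """
    parts = pattern.split(".")
    regex = r"\.".join("[^.]+" if part == "*" else re.escape(part) for part in parts)
    return re.fullmatch(regex, tag) is not None

def on_tag_add(pattern: str, tag: str, md5_tags: set) -> set:
    """Model the trigger: if the added tag matches, copy that same tag to the MD5."""
    if tag_matches(pattern, tag):
        md5_tags.add(tag)  # plays the role of [ +#$tag ]
    return md5_tags

md5_tags = set()
on_tag_add("foo.bar.*", "foo.bar.hurr", md5_tags)  # matches, so the tag is copied
on_tag_add("foo.bar.*", "foo.baz.derp", md5_tags)  # does not match, so nothing happens
print(sorted(md5_tags))
```

The point of the model is that the value applied to the MD5 is whichever concrete tag caused the trigger to fire, not the wildcard expression itself.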

See the Triggers section of the Storm Reference - Automation document and the Storm trigger command for a more detailed discussion of triggers and associated Storm commands.

CSVTool-Specific Variables

Synapse’s CSVTool is used to ingest (import) data into or export data from a Cortex using comma-separated value (CSV) format. Storm includes a built-in variable to facilitate bulk data ingest using CSV.

$rows

The $rows variable refers to the set of rows in a CSV file. When ingesting data into a Cortex, CSVTool reads a CSV file and a file containing a Storm query that tells CSVTool how to process the CSV data. The Storm query is typically constructed to iterate over the set of rows ($rows) using a “for” loop that uses user-defined variables to reference each field (column) in the CSV data.

For example:

for ($var1, $var2, $var3, $var4) in $rows { <do stuff> }

See Synapse Tools - csvtool for a more detailed discussion of CSVTool use and associated Storm syntax.
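For readers more familiar with general-purpose languages, the row-unpacking pattern used with $rows resembles the following Python sketch. The CSV content and the field names (fqdn, ipv4, seen, source) are invented for illustration; this does not show how CSVTool itself is implemented.

```python
import csv
import io

# Hypothetical four-column CSV data, held in memory for the example.
CSV_TEXT = (
    "woot.com,1.2.3.4,2018-11-27,example\n"
    "hurr.net,5.6.7.8,2018-12-09,example\n"
)

rows = list(csv.reader(io.StringIO(CSV_TEXT)))

records = []
# Each row is unpacked into one variable per column, mirroring
# "for ($var1, $var2, $var3, $var4) in $rows".
for (var1, var2, var3, var4) in rows:
    records.append({"fqdn": var1, "ipv4": var2, "seen": var3, "source": var4})

print(records[0])
```

As in the Storm loop, the number of variables in the unpacking must match the number of columns in each row.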

User-Defined Variables

User-defined variables can be defined in one of two ways:

At runtime (i.e., within the scope of a specific Storm query). This is the most common use for user-defined variables.
Mapped via options passed to the Storm runtime (i.e., when using the --optsfile option from Synapse cmdr or via Cortex API access). This method is less common. When defined in this manner, user-defined variables will behave as though they are built-in variables that are runtsafe.

Variable Names

All variable names in Storm (including built-in variables) begin with a dollar sign ( $ ). A variable name can be any alphanumeric string, except for the name of a built-in variable (see Built-In Variables), as those names are reserved. Variable names are case-sensitive; the variable $MyVar is different from $myvar.

Note

Storm will not prevent you from using the name of a built-in variable to define a variable (such as $node = 7). However, doing so may result in undesired effects or unexpected errors due to the variable name collision.

Defining Variables

Within Storm, a user-defined variable is defined using the syntax:

$<varname> = <value>

The variable name must be specified first, followed by the equals sign and the value of the variable itself.

<value> can be:

an explicit value / literal,
a node secondary or universal property,
a tag or tag property,
a built-in variable or method,
a library function,
a mathematical expression / “dollar expression”, or
an embedded query.

Examples

Two types of examples are used below:

Demonstrative example: the $lib.print() library function is used to display the value of the user-defined variable being set. This is done for illustrative purposes only; $lib.print() is not required in order to use variables or methods.

Keep Storm’s operation chaining, pipeline, and node consumption aspects in mind when reviewing the demonstrative examples below. When using $lib.print() to display the value of a variable, the queries below will:
- Lift the specified node(s).
- Assign the variable. Note that assigning a variable has no impact on the nodes themselves.
- Print the variable’s value.
- Return any nodes still in the pipeline. Because variable assignment doesn’t impact the node(s), they are not consumed and so are returned (displayed) at the CLI.

The effect of this process is that for each node in the Storm query pipeline, the output of $lib.print() is displayed, followed by the relevant node.

Use-case example: the user-defined variable is used in one or more sample queries to illustrate possible practical use cases. These represent exemplar Storm queries for how a variable or method might be used in practice. While we have attempted to use relatively simple examples for clarity, some examples may leverage additional Storm features such as subqueries, subquery filters, or flow control elements such as “for” loops or “switch” statements.

Assign a literal to a user-defined variable:

Assign the value 5 to the variable $threshold:

storm> $threshold=5 $lib.print($threshold)
5

Tag any file:bytes nodes whose number of AV signature hits meets or exceeds a given threshold for review:

storm> $threshold=5 file:bytes +{ -> it:av:filehit } >= $threshold [ +#review ]
file:bytes=sha256:00007694135237ec8dc5234007043814608f239befdfc8a61b992e4d09e0cf3f
        :sha256 = 00007694135237ec8dc5234007043814608f239befdfc8a61b992e4d09e0cf3f
        .created = 2022/04/28 12:34:28.454
        #review

Assign a node secondary property to a user-defined variable:

Assign the :user property from an Internet-based account (inet:web:acct) to the variable $user:

storm> inet:web:acct=(twitter.com,bert) $user=:user $lib.print($user) | spin
bert

Find email addresses associated with a set of Internet accounts where the username of the email address is the same as the username of the Internet account:

storm> inet:web:acct $user=:user -> inet:email +:user=$user
inet:email=bert@gmail.com
        :fqdn = gmail.com
        :user = bert
        .created = 2022/04/28 12:34:28.550

Assign a node universal property to a user-defined variable:

Assign the .seen universal property from a DNS A node to the variable $time:

storm> inet:dns:a=(woot.com,1.2.3.4) $time=.seen $lib.print($time) | spin
(1543289294000, 1565893967000)

Note

In the example above, the raw value of the .seen property is assigned to the $time variable. .seen is an interval (ival) type, consisting of a pair of minimum and maximum time values. These values are stored as Unix epoch milliseconds, which are the values shown by the output of the $lib.print() function.
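To relate the raw values to human-readable times, the following Python sketch converts the two epoch millisecond values printed above into UTC timestamps. The helper name millis_to_text is hypothetical, and the output format is chosen simply to resemble the way times are displayed elsewhere in these examples.

```python
from datetime import datetime, timezone

def millis_to_text(millis: int) -> str:
    """Format Unix epoch milliseconds as a UTC 'YYYY/MM/DD HH:MM:SS.mmm' string."""
    seconds, remainder = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y/%m/%d %H:%M:%S") + f".{remainder:03d}"

raw_seen = (1543289294000, 1565893967000)
readable = tuple(millis_to_text(value) for value in raw_seen)
print(readable)
# ('2018/11/27 03:28:14.000', '2019/08/15 18:32:47.000')
```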

Given a DNS A record, find other DNS A records that pointed to the same IP address in the same time window:

storm> inet:dns:a=(woot.com,1.2.3.4) $time=.seen -> inet:ipv4 -> inet:dns:a +.seen@=$time
inet:dns:a=('woot.com', '1.2.3.4')
        :fqdn = woot.com
        :ipv4 = 1.2.3.4
        .created = 2022/04/28 12:34:28.712
        .seen = ('2018/11/27 03:28:14.000', '2019/08/15 18:32:47.000')
inet:dns:a=('hurr.net', '1.2.3.4')
        :fqdn = hurr.net
        :ipv4 = 1.2.3.4
        .created = 2022/04/28 12:34:28.889
        .seen = ('2018/12/09 06:02:53.000', '2019/01/03 11:27:01.000')
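The time-window comparison in the query above can be thought of as an interval overlap test. The Python sketch below assumes that two intervals match when they share any span of time; treat that as a simplification of the @= filter rather than a precise statement of its semantics. The helpers ival and overlaps, and the late.example entry, are hypothetical.

```python
from datetime import datetime

FMT = "%Y/%m/%d %H:%M:%S"

def ival(start: str, end: str) -> tuple:
    """Build a (minimum, maximum) interval from two time strings."""
    return (datetime.strptime(start, FMT), datetime.strptime(end, FMT))

def overlaps(a: tuple, b: tuple) -> bool:
    """Assumed rule: intervals match if each one starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]

# The .seen window of the starting DNS A record (woot.com).
window = ival("2018/11/27 03:28:14", "2019/08/15 18:32:47")

candidates = {
    "hurr.net": ival("2018/12/09 06:02:53", "2019/01/03 11:27:01"),
    "late.example": ival("2020/01/01 00:00:00", "2020/02/01 00:00:00"),  # hypothetical record
}

matches = [name for name, seen in candidates.items() if overlaps(seen, window)]
print(matches)
```

Under this rule hurr.net is kept because its interval falls inside the window, while the hypothetical later record is filtered out.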

Assign a tag to a user-defined variable:

Assign the explicit tag value cno.infra.anon.tor to the variable $tortag:

storm> $tortag=cno.infra.anon.tor $lib.print($tortag)
cno.infra.anon.tor

Tag IP addresses that Shodan says are associated with Tor with the #cno.infra.anon.tor tag:

storm> $tortag=cno.infra.anon.tor inet:ipv4#rep.shodan.tor [ +#$tortag ]
inet:ipv4=84.140.90.95
        :type = unicast
        .created = 2022/04/28 12:34:28.985
        #cno.infra.anon.tor
        #rep.shodan.tor
inet:ipv4=54.38.219.150
        :type = unicast
        .created = 2022/04/28 12:34:28.992
        #cno.infra.anon.tor
        #rep.shodan.tor
inet:ipv4=46.105.100.149
        :type = unicast
        .created = 2022/04/28 12:34:28.993
        #cno.infra.anon.tor
        #rep.shodan.tor

Assign a tag timestamp to a user-defined variable:

Assign the times associated with Threat Group 20’s use of a malicious domain to the variable $time:

storm> inet:fqdn=evildomain.com $time=#cno.threat.t20.tc $lib.print($time) | spin
(1441670400000, 1504828800000)

Find DNS A records for any subdomain associated with a Threat Group 20 zone during the time they controlled the zone:

storm> inet:fqdn#cno.threat.t20.tc $time=#cno.threat.t20.tc -> inet:fqdn:zone -> inet:dns:a +.seen@=$time
inet:dns:a=('www.evildomain.com', '1.2.3.4')
        :fqdn = www.evildomain.com
        :ipv4 = 1.2.3.4
        .created = 2022/04/28 12:34:29.221
        .seen = ('2016/07/12 00:00:00.000', '2016/12/13 00:00:00.000')
inet:dns:a=('smtp.evildomain.com', '5.6.7.8')
        :fqdn = smtp.evildomain.com
        :ipv4 = 5.6.7.8
        .created = 2022/04/28 12:34:29.226
        .seen = ('2016/04/04 00:00:00.000', '2016/08/02 00:00:00.000')

Assign a tag property to a user-defined variable:

Assign the risk value assigned by DomainTools to an FQDN to the variable $risk:

storm> inet:fqdn=badsite.org $risk=#rep.domaintools:risk $lib.print($risk) | spin
85

Given an FQDN with a risk score, find all FQDNs with an equal or higher risk score:

storm> inet:fqdn=badsite.org $risk=#rep.domaintools:risk inet:fqdn#rep.domaintools:risk>=$risk
inet:fqdn=badsite.org
        :domain = org
        :host = badsite
        :issuffix = False
        :iszone = True
        :zone = badsite.org
        .created = 2022/04/28 12:34:29.334
        #rep.domaintools:risk = 85
inet:fqdn=badsite.org
        :domain = org
        :host = badsite
        :issuffix = False
        :iszone = True
        :zone = badsite.org
        .created = 2022/04/28 12:34:29.334
        #rep.domaintools:risk = 85
inet:fqdn=stillprettybad.com
        :domain = com
        :host = stillprettybad
        :issuffix = False
        :iszone = True
        :zone = stillprettybad.com
        .created = 2022/04/28 12:34:29.401
        #rep.domaintools:risk = 92

Assign a built-in variable to a user-defined variable:

Assign a ps:person node to the variable $person:

storm> ps:person=0040a7600a7a4b59297a287d11173d5c $person=$node $lib.print($person) | spin
Node{(('ps:person', '0040a7600a7a4b59297a287d11173d5c'), {'iden': '1275e1464cdd865aa78fbe45c3a4fee391713063e4015323b6c38cee81370f5b', 'tags': {}, 'props': {'.created': 1651149269446}, 'tagprops': {}, 'nodedata': {}})}

For a given person, find all objects the person “has” and all the news articles that reference that person (uses the Storm tee command):

storm> ps:person=0040a7600a7a4b59297a287d11173d5c $person = $node | tee { edge:has:n1=$person -> * } { edge:refs:n2=$person <- * +media:news }
inet:web:acct=twitter.com/mytwitter
        :site = twitter.com
        :user = mytwitter
        .created = 2022/04/28 12:34:29.608
media:news=00076a3f20808a14cbaa01ad51111edc
        .created = 2022/04/28 12:34:29.610

Note

See the technical documentation for the storm:node object or the $node section of the Storm Reference - Advanced - Methods user documentation for additional detail and examples when using the $node built-in variable.

Assign a built-in variable method to a user-defined variable:

Assign the value of a domain node to the variable $fqdn:

storm> inet:fqdn=mail.mydomain.com $fqdn=$node.value() $lib.print($fqdn) | spin
mail.mydomain.com

Find the DNS A records associated with a given domain where the PTR record for the IP matches the FQDN:

storm> inet:fqdn=mail.mydomain.com $fqdn=$node.value() -> inet:dns:a +{ -> inet:ipv4 +:dns:rev=$fqdn }
inet:dns:a=('mail.mydomain.com', '25.25.25.25')
        :fqdn = mail.mydomain.com
        :ipv4 = 25.25.25.25
        .created = 2022/04/28 12:34:29.779

Assign a library function to a user-defined variable:

Assign a value to the variable $mytag using a library function:

storm> $mytag = $lib.str.format("cno.mal.sofacy") $lib.print($mytag)
cno.mal.sofacy

Assign a value to the variable $mytag using a library function (example 2):

storm> file:bytes=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 for $tag in $node.tags(code.fam.*) { $malfam=$tag.split(".").index(2) $mytag=$lib.str.format("cno.mal.{malfam}", malfam=$malfam) $lib.print($mytag) } | spin
cno.mal.sofacy

The above example leverages:

- three variables ($tag, $malfam, and $mytag);
- the $node.tags() method;
- the split() and index() methods and the $lib.str.format() library function; as well as
- a “for” loop.
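The string handling in the example above has a close analogue in most languages. The Python sketch below mirrors the same three steps (split the tag on dots, take the element at index 2, and format a new tag). The function ecosystem_tags and the prefix check are hypothetical stand-ins for the loop over $node.tags(code.fam.*).

```python
def ecosystem_tags(tags: list) -> list:
    """Derive a cno.mal.<family> tag from each code.fam.<family> tag."""
    derived = []
    for tag in tags:
        parts = tag.split(".")
        # Rough stand-in for the code.fam.* glob: keep three-element code.fam tags.
        if len(parts) == 3 and parts[:2] == ["code", "fam"]:
            malfam = parts[2]                                  # like .split(".").index(2)
            mytag = "cno.mal.{malfam}".format(malfam=malfam)   # like $lib.str.format()
            derived.append(mytag)
    return derived

new_tags = ecosystem_tags(["code.fam.sofacy", "rep.shodan.tor"])
print(new_tags)
```

Only the code.fam.sofacy tag contributes a derived tag; the unrelated tag is skipped.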

If a file is tagged as part of a malicious code (malware) family, then also tag the file to indicate it is part of that malware’s ecosystem:

storm> file:bytes=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 for $tag in $node.tags(code.fam.*) { $malfam=$tag.split(".").index(2) $mytag=$lib.str.format("cno.mal.{malfam}", malfam=$malfam) [ +#$mytag ] }
file:bytes=sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
        :sha256 = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
        .created = 2022/04/28 12:34:29.927
        #cno.mal.sofacy
        #code.fam.sofacy

Use a mathematical expression / “dollar expression” as a variable:

Use a mathematical expression to increment the variable $x:

storm> $x=5 $x=$($x + 1) $lib.print($x)
6

For any domain with a “risk” score from Talos, tag those with a score greater than 75 as “high risk”:

storm> inet:fqdn#rep.talos:risk $risk=#rep.talos:risk if $($risk > 75) { [ +#high.risk ] }
inet:fqdn=woot.com
        :domain = com
        :host = woot
        :issuffix = False
        :iszone = True
        :zone = woot.com
        .created = 2022/04/28 12:34:28.712
        #rep.talos:risk = 36
inet:fqdn=derp.net
        :domain = net
        :host = derp
        :issuffix = False
        :iszone = True
        :zone = derp.net
        .created = 2022/04/28 12:34:30.196
        #high.risk
        #rep.talos:risk = 78
inet:fqdn=hurr.org
        :domain = org
        :host = hurr
        :issuffix = False
        :iszone = True
        :zone = hurr.org
        .created = 2022/04/28 12:34:30.199
        #high.risk
        #rep.talos:risk = 92

Note

In the examples above, the mathematical expressions $($x + 1) and $($risk > 75) are not themselves variables, despite starting with a dollar sign ( $ ). The syntax convention of “dollar expression” ( $( <expression> ) ) allows Storm to support the use of variables (like $x and $risk) in mathematical and logical operations.
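As a point of comparison, the two dollar expressions above correspond to ordinary arithmetic and comparison expressions in a general-purpose language. The Python sketch below reuses the risk scores shown in the sample output; the dictionary layout is an assumption made for the illustration.

```python
# Counterpart of: $x=5 $x=$($x + 1)
x = 5
x = x + 1

# Risk scores taken from the sample output above.
risk_by_fqdn = {"woot.com": 36, "derp.net": 78, "hurr.org": 92}
tags = {fqdn: set() for fqdn in risk_by_fqdn}

for fqdn, risk in risk_by_fqdn.items():
    # Counterpart of: if $($risk > 75) { [ +#high.risk ] }
    if risk > 75:
        tags[fqdn].add("high.risk")

print(x)
print(tags)
```

As in the Storm output, every domain is still present afterward; only those scoring above 75 gain the extra tag.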