Design Concepts - General
In designing both data and analytical models, one of the first choices that must be made is whether something should be represented as:
a form
a property
a light edge
a tag
a tag associated with a form
Every modeling decision is unique, and a full discussion of the modeling process is beyond the scope of these documents. We include some basic guidance below as background.
Forms
In the majority of cases, if there is something you want to represent in Synapse, it should be a form. Synapse’s data model can represent everything from objects, to relationships, to events as forms. (See Data Model - Object Categories for a more detailed discussion.)
As part of Synapse’s data model, forms are more structured and less likely to change. This structure allows you to more easily identify relationships between objects in Synapse and to navigate the data. Forms should be used to represent things that are observable or verifiable at some level - this is true even for more abstract forms like “vulnerabilities” (risk:vuln) or “goals” (ou:goal). If something represents an assessment or conclusion, it is likely a better candidate for a tag.
In designing a form, we recommend not “over-fitting” the form to a specific use case. As a simple example, an email address is an email address - there is no difference between an email address used as an email sender and an email address used to register a domain. Creating two separate objects for email:sender and email:registrant confuses the object (an email address) with how the object is used. The “how” is apparent in other parts of the data model (e.g., when used as an email sender, the email address will be present in the :from property of an inet:email:message).
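The point about not over-fitting can be sketched in plain Python. This is an illustration only, not Synapse code: the dictionaries stand in for nodes, the registration record and its registrant_email property are hypothetical names, and the usages helper is an assumption made for the example. One email address object is shared, and "how it is used" is recovered from the records that point to it.

```python
# One object for the email address itself (form name, value).
addr = ("inet:email", "alice@example.com")

# Records that reference the address. The property that holds the
# reference carries the "how", not the address object.
message = {"form": "inet:email:message", "from": addr}

# Hypothetical record type and property name, for illustration only.
registration = {"form": "example:registration", "registrant_email": addr}


def usages(target, records):
    """Return (form, property) pairs for every record that references target."""
    found = []
    for rec in records:
        for prop, valu in rec.items():
            if prop != "form" and valu == target:
                found.append((rec["form"], prop))
    return found


roles = usages(addr, [message, registration])
```

Here roles lists both contexts for the same address, which is the information that separate email:sender and email:registrant objects would have duplicated.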
We also recommend designing forms broadly - this may require some out-of-the-box thinking to consider how the form may apply to other fields, disciplines, or even locales (“how something works” in the United States may be different from how it works in Argentina or Malaysia).
Properties
Properties are details that further define a form. When creating a form, there are probably a number of “things you want to record” about the form that immediately come to mind. These are obvious candidates for properties.
A few considerations when designing properties:
Properties should be highly “intrinsic” to their forms. The more closely related something is to an object, the more likely it should be a property of that object. Things that are not highly intrinsic are better candidates for their own forms, for “relationship” forms, or for tags.
Consider whether a property has enough “thinghood” to also be its own form (and possibly type).
The data model supports multi-value array properties, but arrays are not meant to store an excessive number of values (largely for performance and visualization purposes). In this situation, a “relationship” form might be preferable. Another option would be to “reverse” the property relationship.
For example, a compromise (risk:compromise) may consist of a number of different attacks (risk:attack nodes) representing steps in the overall compromise. Instead of risk:compromise having an :attacks array with a large number of values, a risk:attack has a :compromise property so that multiple attacks can be linked back to a single compromise.
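A minimal Python sketch of this "reversed" relationship follows. It is not Synapse code: the Attack class, the placeholder identifiers, and the attacks_by_compromise helper are assumptions made for illustration. Each attack stores a single reference to its compromise, and the full list of attacks is derived by grouping rather than stored as an array on the compromise.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attack:
    guid: str        # placeholder identifier, not a real Synapse guid
    compromise: str  # stands in for the :compromise property


attacks = [
    Attack("attack-1", "compromise-A"),
    Attack("attack-2", "compromise-A"),
    Attack("attack-3", "compromise-B"),
]


def attacks_by_compromise(items):
    """Group attack identifiers under the compromise each one points to."""
    index = defaultdict(list)
    for attack in items:
        index[attack.compromise].append(attack.guid)
    return dict(index)


grouped = attacks_by_compromise(attacks)
```

Adding another attack means creating one more small record; the compromise itself never grows.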
Light Edges
In Synapse, it is preferable to represent most relationships as forms in the data model, as forms support the use of additional descriptive properties as well as tags for context. However, light edges can replace “relationship” forms where:
Additional properties or tags are unnecessary. That is, the only thing you need to record is that the relationship exists. In this case, a light edge can provide some performance gains over a relationship form.
The relationship you are representing could exist between a broad range of objects (vs. two specific kinds of objects). This is best illustrated with some examples.
A DNS A record represents a specific relationship between an FQDN (inet:fqdn) and the IP address (inet:ipv4) that the A record points to. This specific relationship will never exist between any other objects - an FQDN’s A record will never point to a MAC address, and a file will never resolve to an IP. A form (inet:dns:a) is appropriate here because the objects in the relationship are consistent - there is a one-to-one “A record” relationship between FQDNs and IPv4 addresses.
Other relationships may be “one-to-many” or “many-to-many” in that the object on one or both sides of the relationship may vary.
A data source (meta:source node) can observe or provide data on various objects (such as a hash or an FQDN). Creating a relationship form to represent each possible combination of meta:source node and object complicates the data model. This “one-to-many” relationship can be represented more efficiently with a seen light edge.
Similarly, a variety of objects such as articles (media:news), presentations (ou:presentation), or files (file:bytes) may contain references to a range of objects of interest, from indicators to people to events. This “many-to-many” relationship can be represented more efficiently with a refs (references) light edge.
See Lightweight (Light) Edge for additional discussion.
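The contrast between a relationship form and a light edge can be sketched in plain Python. This is a conceptual illustration, not the Synapse API: the DnsA class, the edges set, the add_edge and targets helpers, and all node values are assumptions made for the example. The relationship form fixes the kind of object on each side, while the edge store accepts any pair of nodes joined by a verb and records nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsA:
    """Relationship form: both sides have a fixed kind (FQDN and IPv4)."""
    fqdn: str
    ipv4: str


# Light edges: (source node, verb, target node) with no extra properties.
edges = set()


def add_edge(n1, verb, n2):
    edges.add((n1, verb, n2))


def targets(n1, verb):
    """Return the nodes that n1 points to with the given verb, sorted."""
    return sorted(n2 for (src, v, n2) in edges if src == n1 and v == verb)


# Nodes as (form, value) pairs; the values are placeholders.
source = ("meta:source", "example-feed")
article = ("media:news", "example-article")
fqdn = ("inet:fqdn", "example.com")
file_hash = ("hash:md5", "d41d8cd98f00b204e9800998ecf8427e")

record = DnsA("example.com", "192.0.2.1")
add_edge(source, "seen", fqdn)        # one source, many kinds of target
add_edge(source, "seen", file_hash)
add_edge(article, "refs", fqdn)       # many kinds of source and target
```

The same two helpers cover every source and target combination, which is why no per-combination relationship form is needed for seen or refs.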
Note
Digraph nodes (also known as ‘edge nodes’) were previously used to account for these types of arbitrary (one-to-many or many-to-many) relationships but the use of light edges is now preferred. See Digraph (Edge) Form for additional discussion.