
Splunk® Data Onboarding Cheat Sheet V2.1

Authors Aplura LLC.

License CC-BY-SA-4.0

   Splunk® Data Onboarding Cheat Sheet V2.1
   https://www.aplura.com/ocs

   6 props.conf Settings You Should Have

     For greater efficiency at getting data into Splunk, use these six
     props.conf settings when you define a source type.

     [mysourcetype]
     TIME_PREFIX = regex of the text that leads up to the timestamp
     MAX_TIMESTAMP_LOOKAHEAD = how many characters for the timestamp
     TIME_FORMAT = strptime format of the timestamp
     SHOULD_LINEMERGE = false (always false)
     LINE_BREAKER = regular expression for event breaks
     TRUNCATE = 999999 (always a high number)

   Useful strptime Directives
   Year (four digit/two digit)              %Y/%y
   Month (number/name/abbr)                 %m/%B/%b
   Day of month (leading zero/no zero)      %d/%e
   Hour (24 hour/12 hour)                   %H/%I
   Minute                                   %M
   Second/Millisecond                       %S/%3N
   Epoch time                               %s
   Time zone (UTC offset/name)              %z/%Z
   AM/PM                                    %p
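As a worked sketch of the six settings, assume a hypothetical event that begins with a timestamp such as 2024-03-15 14:22:31.123 +0000. The sample timestamp and the values derived from it are illustrative assumptions, not part of the original cheat sheet. Under those assumptions the stanza might look like this:

```ini
# props.conf (sketch; assumes every event starts with a timestamp like
# "2024-03-15 14:22:31.123 +0000")
[mysourcetype]
# The timestamp sits at the very start of the event
TIME_PREFIX = ^
# "2024-03-15 14:22:31.123 +0000" is 29 characters long
MAX_TIMESTAMP_LOOKAHEAD = 29
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N %z
SHOULD_LINEMERGE = false
# One event per line; the text matched by the capture group is dropped
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 999999
```

MAX_TIMESTAMP_LOOKAHEAD is counted from the end of the TIME_PREFIX match, and LINE_BREAKER needs a capturing group to mark where one event ends and the next begins.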
   Useful Regular Expressions
   IP Address                                                                                   \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
   Syslog-ng header                                                                             [\r\n]+|^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+
   Match to the first pipe (negated character class)                                            [^|]+
   Metadata Rewrites (to use, add TRANSFORMS-<classname> to a sourcetype stanza in props.conf, then add rewrite to transforms.conf)

   Host                                               [rewrite_host]
                                                      REGEX = ^Message\s+from\s+(\S+)
                                                      DEST_KEY = MetaData:Host
                                                      FORMAT = host::$1

   Sourcetype                                         [rewrite_sourcetype]
                                                      REGEX = this\s+is\s+another\s+sourcetype
                                                      DEST_KEY = MetaData:Sourcetype
                                                      FORMAT = sourcetype::other_sourcetype

   Index                                              [rewrite_index]
                                                      REGEX = this\s+should\s+go\s+elsewhere
                                                      DEST_KEY = _MetaData:Index
                                                      FORMAT = other_index
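The props.conf side of these rewrites might be wired up as sketched here. The class names after TRANSFORMS- are arbitrary labels chosen for illustration; the values are the transforms.conf stanza names listed above.

```ini
# props.conf (sketch; class names are illustrative)
[mysourcetype]
TRANSFORMS-rewrite_host = rewrite_host
TRANSFORMS-rewrite_sourcetype = rewrite_sourcetype
TRANSFORMS-rewrite_index = rewrite_index
```

These are index-time settings, so they take effect on the instance that parses the data (an indexer or heavy forwarder), and the destination index must already exist.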
   Field Extractions
   Using EXTRACT                               In props.conf:
                                               [mysourcetype]
                                               EXTRACT-user_src = \s(?<user>\S+)\s+logged\s+in IN source_field


   Using REPORT                                In props.conf:
                                               [mysourcetype]
                                               REPORT-user_src = mysourcetype_user_source

                                               In transforms.conf:
                                               [mysourcetype_user_source]
                                               REGEX = \s(\S+)\s+logged\s+in\s+from\s+(\S+)
                                               FORMAT = user::$1 src::$2
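For comparison, the same two fields could be captured inline with named groups instead of a props.conf/transforms.conf pair. This is a sketch against a hypothetical event; the sample line in the comment is an assumption made for illustration.

```ini
# props.conf (sketch)
# Hypothetical event:
#   Mar 15 14:22:31 host01 sshd: alice logged in from 10.1.2.3
# Expected result: user=alice, src=10.1.2.3
[mysourcetype]
EXTRACT-user_src = \s(?<user>\S+)\s+logged\s+in\s+from\s+(?<src>\S+)
```

EXTRACT keeps everything in one file; REPORT is the better fit when the same transform is shared by several source types.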

   Lookups

   props.conf                                          [mysourcetype]
                                                       LOOKUP-mysourcetype-actions = my_lookup event_field OUTPUT lookup_field

   transforms.conf                                     [my_lookup]
                                                       filename = mysourcetype_actions.csv
                                                       case_sensitive_match = false
                                                       max_matches = 1
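When the event field and the CSV column have different names, the LOOKUP setting can map one to the other with AS. In this sketch, vendor_action (the CSV match column) and action (the CSV output column) are hypothetical names, and the CSV layout shown in the comments is an assumption.

```ini
# props.conf (sketch)
# Assumed contents of mysourcetype_actions.csv:
#   vendor_action,action
#   Accepted,success
#   Denied,failure
[mysourcetype]
LOOKUP-mysourcetype-actions = my_lookup vendor_action AS event_field OUTPUT action
```

The name before AS is the column in the lookup table; the name after AS is the field in the event.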

   Field Aliases, SED Commands, Calculated Fields (add to sourcetype stanzas in props.conf)
   Field alias                                         FIELDALIAS-myalias = my_field AS new_field my_field AS new_field2

   SED command                                         SEDCMD-abc_to_xyz = s/abc/xyz/g

   Calculated field                                    EVAL-total_bytes = bytes_in + bytes_out
   Search-Time Operation Order
            EXTRACT -> REPORT -> KV_MODE -> FIELDALIAS -> EVAL -> LOOKUP
   (Legend in the original graphic: colors distinguish search-time from index-time items; gray items are optional.)

    Provided by Aplura, LLC. Splunk Consulting and Application Development Services. sales@aplura.com https://www.aplura.com
   Splunk is a registered trademark of Splunk, Inc.
v2.1.5
                                                              This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.                    Many Solutions, One Goal.
                                                      Getting Data Into Accelerated Data Models
                                             Review The Data
                                             After you have correctly onboarded your data (correct metadata, line breaking, and
                                             timestamping), review the events to determine which data models the events match. A
                                             single sourcetype can contain events that are appropriate for different data models. For
                                             example, a proxy feed can have authentication events for users logging in, web proxy
                                             events showing traffic, and configuration changes as administrators adjust settings.

                                             Extract Fields
                                             Configure field extractions to populate as many of the data model objects (fields) as you
                                             can. See the Splunk Common Information Model Add-on Manual for the expected field
                                             names and values.

                                             Configure Event Types
                                             Configure event types for the data. Event types should use searches that capture all of
                                             the events you expect to fill in a particular data model. For example, to capture all login
                                             events (both successes and failures), you might use a search like:

                                             sourcetype=my_sourcetype "Login for user" ("failed" OR "succeeded")
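Saved as an event type, a search like this might be stored in eventtypes.conf as sketched here. The stanza name my_sourcetype_login is a hypothetical label, not something the cheat sheet prescribes.

```ini
# eventtypes.conf (sketch; stanza name is illustrative)
[my_sourcetype_login]
search = sourcetype=my_sourcetype "Login for user" ("failed" OR "succeeded")
```

Keep the quotes as plain ASCII double quotes; typographic quotes pasted from a document are treated as literal characters.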

                                             Tag The Event Types
                                             Tag the event types you just created. The CIM Add-on Manual lists the tags to use for
                                             the data model you are targeting. While tagging can be done in other ways, the current
                                             best practice is to attach the tags to event types.
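As a sketch, assume an event type named my_sourcetype_login (a hypothetical name) and assume the target is the CIM Authentication data model, which is constrained on the authentication tag. The tag could then be attached in tags.conf like this:

```ini
# tags.conf (sketch; event type name is illustrative)
[eventtype=my_sourcetype_login]
authentication = enabled
```

Check the CIM Add-on Manual for the full set of tags a given data model or child dataset expects before relying on this single tag.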

                                             Review Index Constraints
                                             Newer versions of the CIM Add-on use index constraints to improve performance and
                                             let you control what data to accelerate. Use the CIM Add-on Setup page to confirm that
                                             the constraints include the indexes that contain the data you are working with.


                                             Preview The Data Model
                                             While the data model acceleration might take a while to process, you can preview the
                                             data with the datamodel command. A template for this search looks like:

                                             | datamodel <data model name> <data model child object> search |
                                             search sourcetype=<new sourcetype> | table <data model name>.*
