Troubleshooting Symantec File Reader Restarts

By | Symantec DLP Endpoint Prevent, Symantec DLP Network Discover | No Comments

FileReader restarts usually occur due to timeout issues.  These timeouts are generally caused by the following:

  1. Connectivity issues with the Monitor Controller or Network
  2. Poorly written RegEx Rules
  3. A bad email message.  Bad messages can be caused by incorrect header information or foreign characters in the message that Symantec DLP is unable to process

If the FileReader restarts itself occasionally, this is normal behavior.  However, if you are experiencing consistent FileReader restarts in your environment, there are a few things you can do to determine the cause:

  1. FileReader may fail to start (and restart) if it cannot receive all the configuration information it needs. To find the exact cause, look in the FileReader log first to identify which FileReader subsystem is not starting. Once you have identified the subsystem that is not receiving its configuration, look in the MonitorController log to see whether the corresponding subsystem initialized successfully. A common failure is the inability to ignite cryptographic keys in the MonitorController because the ignition password on disk is out of sync with the Administrator password in the database. In this case, fix the password issue first and only then restart the MonitorController. Also look at the VontuMonitorController.conf file in the config directory and check the Java heap size. If it is still at the default values of 128 and 256, you will probably need to increase the heap memory setting, depending on the RAM available on the server.
  2. Too many exceptions in policies. Each new exception, while improving the ability to “short-circuit” detection, also has the knock-on effect of multiplying the size of the overall detection matrix, all of which is loaded into memory when FileReader starts. While there is no exact limit on the total number of exceptions, a good rule of thumb is that more than 10 exceptions in a policy will start to have an impact at some point, after which the next exception may result in FileReader being unable to load.
  3. Check that MessageChain.CacheSize and MessageChain.NumChains match the number of CPU cores.
  4. Check your policies.  Oftentimes FileReader restarts will occur because of a particular policy.  For example, if a Regex in a particular policy exceeds given thresholds (such as maximum component time), then the FileReader will restart.  Look at the log files for the “intentionally restarting process” message which identifies the message chain component causing the restart.  If this component is “Detection” the most likely cause is a poorly written regular expression.
  5. Check for “bad” messages. Save the *.vpcap file that contains the message in question. You can use the file for testing without having to actually send the message again.
  6. Check for locked *.vpcap files.
    1. Stop Packet Capture so that you do not get noise in the test. Start FileReader process. If the *.vpcap file gets picked up, the inductor is working. If the inductor is not working, find out why. The most common problem is that some process has a lock on the files. Other than that, collect the FileReader log and contact support.
  7. If the inductor is working, the problem may be in the Layer 7 Parser or the Content Extractor. Visually inspect the FileReader log for any exceptions, warnings or severe log messages.
  8. While the Content Extractor can often have problems processing various file formats, it can rarely, if ever, be blamed for a FileReader restart.
  9. Dying threads can cause FileReader to stop reporting heartbeats and eventually be restarted. Look in VontuMonitor.log for exceptions. Each exception in that log file is an indicator of a serious problem (Java crash or other defect) and is a likely cause of a FileReader restart.
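
The log check in step 4 can be partly automated. The following is a minimal sketch, assuming only that the restart message contains the phrase “intentionally restarting process” quoted above. The rest of the log line layout and the component names other than “Detection” are assumptions, so adjust them to what your logs actually contain.

```python
import re
from collections import Counter

# Phrase quoted in step 4; everything else about the log line layout is an assumption.
RESTART_MARKER = "intentionally restarting process"

# "Detection" comes from the text above; the other names are hypothetical examples.
COMPONENTS = ("Detection", "ContentExtraction", "Inductor")


def find_restart_lines(lines):
    """Return (line_number, text) for every line that contains the restart marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if RESTART_MARKER in line.lower()
    ]


def count_components(restart_lines):
    """Count how often each assumed component name appears on the restart lines."""
    counts = Counter()
    for _, text in restart_lines:
        for name in COMPONENTS:
            if re.search(rf"\b{name}\b", text):
                counts[name] += 1
    return counts
```

To use it, open a FileReader log file and pass the file object to find_restart_lines, then pass the result to count_components. If “Detection” dominates the counts, the regular expressions in your policies are the first place to look.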

Symantec Network Detection is not working for DLP User Groups that index the Domain Users AD Security Group

By | Symantec DLP Network Discover | No Comments

Do not reference the Domain Users Security Group from a DLP User Group. Any users whose primary group is Domain Users will not be indexed by Enforce. Instead add users to a separate security group and reference that group from the DLP User Group. In addition you can also point the DLP User Group to the individual AD user or an OU(Organizational Unit) that contains that user.

For example, in order to index the user below from a DLP User Group, you would need to add the Engineering AD group as that user is a member of that group. However if you were to add the Domain Users group, it would not index that user because Domain Users is that user’s primary group.

User's Primary Group
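
Active Directory stores a user's primary group in the primaryGroupID attribute rather than in memberOf, which is consistent with the behavior described above. The sketch below only builds LDAP filter strings that you could paste into an LDAP query tool to find affected users. The group DN in the usage note is a hypothetical example, and 513 is the well-known RID of Domain Users.

```python
# Well-known relative ID (RID) of the built-in Domain Users group.
DOMAIN_USERS_RID = 513


def primary_group_filter(rid=DOMAIN_USERS_RID):
    """Build an LDAP filter matching user accounts whose primary group has this RID."""
    return f"(&(objectCategory=person)(objectClass=user)(primaryGroupID={rid}))"


def member_of_filter(group_dn):
    """Build an LDAP filter matching users that list group_dn in memberOf.

    group_dn is inserted as-is; escape special filter characters first if the
    DN contains any of ( ) * or a backslash.
    """
    return f"(&(objectCategory=person)(objectClass=user)(memberOf={group_dn}))"
```

For example, member_of_filter("CN=Engineering,OU=Groups,DC=example,DC=com") lists the users that a DLP User Group pointing at that (hypothetical) Engineering group would be expected to pick up.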

 

How To troubleshoot DLP Network Discover scan common errors

By | Symantec DLP Network Discover | No Comments

Access is denied – Permissions error. The scan does not have the correct credentials to access the file, additional access controls exist on the file, or AD access may have been interrupted. Try to manually access the files using the DLP scan user to see why the download failed.

Failed to Read – Generally related to files being locked and not available to be copied or downloaded.

Failed to Download – often related to permissions errors (try to manually access the files using the DLP scan user to see why the download failed)

No such mount exists – This is related to FileReader restarts: when the FileReader does not fully unmount the drive during the restart, the scan cannot mount the drive to continue scanning.  This is a bug I have seen in the forums.

Exception occurred during initialization. Ensure that Microsoft Outlook is configured properly (.pst files). – Outlook needs to be installed on the server doing the scan to be able to read .pst files

 

Best Practices for Scanning Files Larger Than 30MB Using Discover

By | Symantec DLP Network Discover | No Comments

NOTE: These settings will slow down processing, so we do not recommend making these changes for non-Discover Detection Servers.

To process files that are larger than the 30MB standard limit, you must modify several settings in Discover.  The plan is to use two different Discover Servers.  One server is for files 30MB or smaller.  The first server should be using default settings.  The second server is for files larger than 30MB.  The configuration in this KB is for the second server.

The idea is to reduce the number of message chains while increasing the capacity of the chain (max file size, number of tokens, etc.). You must also adjust timeouts.

Regarding the servers

In the example below the modified parameters are based on a max size of 120 MB:

  • The fast path Discover Server (which could eventually run on the same machine as the Enforce Server) must have the standard configuration settings for a Discover server.
  • The slow path Discover server should be configured with the following Advanced Settings found on the Advanced Settings (Server Settings) Page in the UI.
  • BoxMonitor.FileReaderMemory = -Xms1578M -Xmx1578M (default = -Xrs -Xms1200M -Xmx1200M)

    This increases the available FileReader Memory.

  • BoxMonitor.HeartbeatGapBeforeRestart = 2100000 (default = 960000)
    The time interval (in milliseconds) that the BoxMonitor waits for a monitor process (for example, FileReader, IncidentWriter) to report the heartbeat. If the heartbeat is not received within this time interval the BoxMonitor restarts the process.  Increasing this value gives FileReader more time to respond to the BoxMonitor.
  • ContentExtraction.LongTimeout = 300000 (default = 120000)
    The time interval (in milliseconds) given to the ContentExtractor to process a document larger than ContentExtraction.LongContentSize. If the document cannot be processed within the specified time it’s reported as unprocessed. This value should be greater than ContentExtraction.ShortTimeout and less than ContentExtraction.RunawayTimeout.
  • ContentExtraction.MaxContentSize = 120M (default = 30M)
    The maximum size (in MB) of the document that can be processed by the ContentExtractor. This increases the maximum file size limitation during Content Extraction.
  • ContentExtraction.RunawayTimeout = 600000 (default = 300000)
    The time interval (in milliseconds) given to the ContentExtractor to finish processing of any document. If the ContentExtractor does not finish processing some document within this time it will be considered unstable and it will be restarted. This value should be significantly greater than ContentExtraction.LongTimeout.
  • FileReader.MaxFileSize = 125829120 (default = 30000000)
    The maximum size of a message to be processed. Larger messages are truncated to this size.  This should match the ContentExtraction.MaxContentSize.
  • FileReader.MaxReadGap = 45 (default = 15)
    The time that a child process can have data but not have read anything before it stops sending heartbeats.  Increasing this value gives FileReader more time.
  • IncidentDetection.MaxContentLength = 20000000 (default = 2000000)
    Applies only to regular expression rules. On a per component basis, only the first MaxContentLength number of characters are scanned for violations. The default (2,000,000) is equivalent to > 1000 pages of typical text. The limiter exists to prevent regular expression rules from taking too long. This allows us to look throughout the document for regular expressions.
  • Lexer.MaximumNumberOfTokens = 120000 (default = 30000)
    Maximum number of tokens (including separators) extracted from each message component for detection. Applicable to all detection technologies where tokenization is required, e.g. System patterns, EDM, DGM. Increasing this value may cause the detection to run out of memory and restart.
  • MessageChain.CacheSize =  1 (default = 8)
    Limits the number of messages that can be queued in the message chains.
  • MessageChain.MaximumComponentTime = 1200000 (default = 600000)
    The time interval (in milliseconds) allowed before any chain component is restarted.  Increasing this value gives each component more time for processing.
  • MessageChain.NumChains = 1 (default = 8)
    Note: For normal usage, it is recommended to set  MessageChain.NumChains = # of processors on the Discover box.

    The number of messages that the FileReader processes in parallel. Setting this number higher than 8 (with the other default settings) is not recommended. A higher setting does not substantially increase performance and there is a much greater risk of running out of memory. Setting this to less than 8 (in some cases 1) helps when processing big files, but it may slow down the system considerably.
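
Several of the settings above depend on each other, so a quick consistency check before applying them can help. The following is a minimal sketch using the example values from the list. It assumes the “M” suffix means 2**20 bytes, which is what makes 120M equal 125829120. The defaults (30M and 30000000) would not match exactly under that assumption, so treat the equality check as illustrative.

```python
MIB = 1024 * 1024  # assumption: the "M" suffix means 2**20 bytes

# Example values copied from the settings list above.
SETTINGS = {
    "BoxMonitor.HeartbeatGapBeforeRestart": 2_100_000,  # milliseconds
    "ContentExtraction.LongTimeout": 300_000,           # milliseconds
    "ContentExtraction.RunawayTimeout": 600_000,        # milliseconds
    "ContentExtraction.MaxContentSize": "120M",
    "FileReader.MaxFileSize": 125_829_120,              # bytes
    "MessageChain.MaximumComponentTime": 1_200_000,     # milliseconds
}


def size_to_bytes(value):
    """Convert a value such as '120M' to bytes under the MIB assumption."""
    text = str(value).strip().upper()
    if text.endswith("M"):
        return int(text[:-1]) * MIB
    return int(text)


def check_settings(settings):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    long_timeout = settings["ContentExtraction.LongTimeout"]
    runaway_timeout = settings["ContentExtraction.RunawayTimeout"]
    if long_timeout >= runaway_timeout:
        problems.append("LongTimeout must be less than RunawayTimeout")
    max_content = size_to_bytes(settings["ContentExtraction.MaxContentSize"])
    if max_content != settings["FileReader.MaxFileSize"]:
        problems.append("FileReader.MaxFileSize should match MaxContentSize")
    return problems


def ms_to_minutes(milliseconds):
    """Convert a millisecond setting to minutes for easier reading."""
    return milliseconds / 60_000
```

With the example values, check_settings(SETTINGS) returns an empty list, and ms_to_minutes shows that the heartbeat gap is 35 minutes and the maximum component time is 20 minutes.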

Additionally: Add the following line to the \vontu\protect\config\crawler.properties on the Discover server machine:

filesystemcrawler.workqueue.max.memory = 120000000

This value defaults to 60000000, but it must be the same or larger than the maximum message size.  All other settings should be standard.

NOTE: As per other settings, changes to the Advanced Server Settings and to properties files require a recycling of the Vontu Monitor in order to take effect.

Targets should be configured for each set of shares to scan: One target is assigned to the slow path server and only scans files larger than 10MB; the other target is assigned to the fast path and scans files smaller than 10MB. This setup allows you to scan all file types up to 120MB.

Text files larger than 120MB will be truncated, but the first 120MB will be processed.

Other file types: *.doc, *.xls, *.ppt, *.pdf, *.zip, etcetera will be ignored if they are larger than 120MB because Vontu’s Message Cracking technology cannot recognize them.
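
The size rules above can be summarized in a small function. This is only a sketch of the expected outcome per file, not of how Discover decides internally. It assumes decimal megabytes, a hypothetical list of plain-text extensions, and that a file of exactly 10MB goes to the slow path, which the text above does not specify.

```python
from pathlib import PurePath

MB = 1_000_000              # assumption: decimal megabytes
FAST_PATH_LIMIT = 10 * MB   # split between the two targets described above
SLOW_PATH_LIMIT = 120 * MB  # maximum size configured in this example

# Hypothetical list of extensions treated as plain text.
TEXT_SUFFIXES = {".txt", ".csv", ".log"}


def expected_handling(name, size_bytes):
    """Describe how a file of this name and size is expected to be handled."""
    suffix = PurePath(name).suffix.lower()
    if size_bytes < FAST_PATH_LIMIT:
        return "fast path"
    if size_bytes <= SLOW_PATH_LIMIT:
        return "slow path"
    if suffix in TEXT_SUFFIXES:
        return "slow path, truncated to 120MB"
    return "ignored"
```

For instance, a 50MB .zip file lands on the slow path, while a 200MB .pdf file is ignored.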

If you must include .xls files, you must disable formula extraction.

To disable formula extraction:

  1. Edit the formats.ini file in the following directory:
    1. 11.5 and earlier: \Vontu\Protect\lib\native\
    2. 11.6 and later: \SymantecDLP\Protect\plugins\contentextraction\Verity\x64\
  2. Change “getformulastring=2” to “getformulastring=0”.
  3. Restart the Monitor Server.
  4. Disable formula extraction on all the detection servers.
    Note: If you use Indexed Document Matching (IDM) on Excel files, disable formula extraction on the Vontu Enforce Server as well, for consistency between IDM indexing and detection.
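
If you need to make the formats.ini change on several servers, step 2 can be scripted. The following is a minimal sketch that operates on the file contents as text. It assumes the setting appears on a line of its own as getformulastring=2, possibly with spaces around the equals sign. The section name in the usage note is made up for illustration.

```python
import re

# Matches a line such as "getformulastring=2", allowing spaces around "="
# and an optional carriage return for Windows line endings.
_FORMULA_LINE = re.compile(
    r"^([ \t]*getformulastring[ \t]*=[ \t]*)2([ \t]*\r?)$",
    re.MULTILINE,
)


def disable_formula_extraction(ini_text):
    """Return (new_text, replacements) with getformulastring=2 set to 0."""
    return _FORMULA_LINE.subn(r"\g<1>0\g<2>", ini_text)
```

Calling disable_formula_extraction("[example]\ngetformulastring=2\n") returns the text with getformulastring=0 and a replacement count of 1. A count of 0 means the line was not found, in which case leave the file alone and check it by hand. Back up formats.ini before writing the result back, then restart the Monitor Server as described in step 3.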

 

If you are using IDM and it does not work on files that exceed the content extraction limit, this problem is addressed in E-track 2229997. This is the workaround:
“You can change the advanced setting ‘DDM.MaxBinMatchSize’ to 30,000,000 (instead of 300,000,000) on each Detection Server and matching of large binary files will work. This will only fix the issue with files that Verity cannot extract any partial text for.”

Please note (pre-12.0): It is advisable to use 64-bit environments, since you may need to increase the size of the FileReader JVM beyond the out-of-the-box settings to accommodate the number of detection chains. Otherwise, you may run into an out-of-memory (OOM) error.

Default ports used by Symantec DLP

By | Symantec DLP Enforce, Symantec DLP Network Discover | No Comments

Default ports for Symantec DLP:

Purpose | Protocol | Default Port
Enforce Server Console | TCP | 443
Enforce Upgrade Wizard | TCP | 8300
Communications from Enforce to Oracle Database | TCP | 1521
Communications from Enforce to Detection Servers | TCP | 8100
Communications from Endpoint Agents to Endpoint Servers (version 12.5+) | TCP | 10443
Ports Used by Network Discover Crawlers and Scanners | Many | Many
Ports Used by Network Prevent for Email (MTAResubmitPort) | TCP | 10026
Ports Used by Network Prevent for Email (ServerSocketPort) | TCP | 10025
Ports Used by Network Prevent for Web | TCP | 1344
Kerberos port for Enforce AD Authentication | UDP | 88
SMTP server for system alerts and response rule email notifications | TCP | 25
Syslog server for system alerts | TCP | 514
Syslog server for response rule notifications | TCP | 514
Active Directory connection for LDAP lookup plug-ins, user groups, and user list, user risk summary (not secure) | TCP | 389
Active Directory connection for LDAP lookup plug-ins, user groups, and user list, user risk summary (secure) | TCP | 636
Connection to Data Insight Server | TCP | 443
OCR Server Port | TCP | 8555
Network Discover Grid Leader Port | TCP | 61616
DLP 15.0+ Embedded Apache Tomcat (communication between Enforce Server processes related to DLP appliance management) | TCP | 8080
Connection between Enforce and Domain Controller Agent | TCP | 443
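
When you suspect a firewall or connectivity problem, a quick TCP reachability test against these defaults can narrow things down. The following is a minimal sketch that uses only the Python standard library. The dictionary holds a subset of the TCP ports from the list above, and any host names you pass in are your own. A successful connection only shows that something is listening on the port, not that the DLP service behind it is healthy.

```python
import socket

# Subset of the default TCP ports listed above.
DEFAULT_TCP_PORTS = {
    "Enforce Server Console": 443,
    "Enforce to Oracle Database": 1521,
    "Enforce to Detection Servers": 8100,
    "Endpoint Agents to Endpoint Servers (12.5+)": 10443,
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, running tcp_port_open("detection01.example.com", 8100) from the Enforce Server checks the Enforce-to-Detection-Server port, where the host name is a hypothetical placeholder. Run each check from the machine that initiates the connection in the list above.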

How to filter incidents and Summarise in Symantec DLP

By | Symantec DLP Enforce, Symantec DLP Network Discover | No Comments

To filter and summarise incidents in Symantec DLP, you have several options.

1. You can filter a scan via Status, specific Scan, Target ID and/or a time frame (Detection Date) by using the drop-down box.

 

2. These can be further filtered by choosing a specific Severity:

 

3. In order to display any changes made click the Apply button:

4. Clicking on the Advanced Filters & Summarization button will open a new box.

5. From here you can choose to add a filter or to summarise by a certain category.

6. Adding a filter will give you the following options below. In this example the policy filter is applied which would allow you to filter via a specific policy.

7. There are plenty of different ways to filter scans and multiple filters can be applied at a given time.

8. If you would like to remove a filter, click its red cross.

9. Scans may also be summarised by selecting a primary and secondary summary.

10. Once selected the scan will be summarised as shown below, with numbers of incidents and matches etc. shown both individually and overall.