# c42eventextractor - Utilities to extract and record Code42 security events and alerts

The `c42eventextractor` package provides modules that assist in the retrieval and logging of Code42 security events and
alerts. It exposes handlers that let developers supply custom behavior to run when events are retrieved. By default,
the extractors simply print their results to stdout, but the handlers can be extended to record event data in
whatever location or format you need.
## Requirements
- Python 2.7.x or 3.5.0+
- Code42 Server 6.8.x+
- py42 version 1.7.0+
## Installation
You can install the `c42eventextractor` package from PyPI, from source, or from distribution.
### From PyPI
The easiest and most common way is to use `pip`:
```bash
pip install c42eventextractor
```
To install a previous version of `c42eventextractor` via `pip`, add the version number. For example, to install version
0.2.9, enter:
```bash
pip install c42eventextractor==0.2.9
```
Visit the [project history](https://pypi.org/project/c42eventextractor/#history) on PyPI to see all published versions.
### From source
Alternatively, you can install the c42eventextractor package directly from
[source code](https://github.com/code42/c42eventextractor):
```bash
git clone https://github.com/code42/c42eventextractor.git
```
When it finishes downloading, from the root project directory, run:
```bash
python setup.py install
```
### From distribution
If you want to create a tarball (`.tar.gz`) for installing elsewhere, run this command from the project's root directory:
```bash
python setup.py sdist
```
After it finishes building, the tarball will be located in the newly created `dist` directory. To install it, enter:
```bash
pip install c42eventextractor-[VERSION].tar.gz
```
## Usage - Code42 Security Events
To get all security events, use the `FileEventExtractor`:
```python
from c42eventextractor.extractors import FileEventExtractor
from c42eventextractor import ExtractionHandlers
import py42.sdk

code42_sdk = py42.sdk.from_local_account(
    "https://example.authority.com",
    "admin@example.com",
    "password",
)

handlers = ExtractionHandlers()

# Add implementations for customizing handling response and getting/setting insertion timestamp cursors:
def handle_response(response):
    pass

def record_cursor_position(cursor):
    pass

def get_cursor_position():
    pass

handlers.handle_response = handle_response
handlers.record_cursor_position = record_cursor_position
handlers.get_cursor_position = get_cursor_position

extractor = FileEventExtractor(code42_sdk, handlers)
extractor.extract()

# To get all security events in a particular time range, provide an EventTimestamp filter.
# Note that if you use `record_cursor_position`, your event timestamp filter may not apply.
from py42.sdk.queries.fileevents.filters import EventTimestamp

time_filter = EventTimestamp.in_range(1564694804, 1564699999)
extractor.extract(time_filter)

# If your timestamps are in string format, you can convert them by doing:
from datetime import datetime

begin_date_string = "21 June, 2020"
end_date_string = "22 June, 2020"
begin_date = datetime.strptime(begin_date_string, "%d %B, %Y")
end_date = datetime.strptime(end_date_string, "%d %B, %Y")
# Seconds elapsed since the Unix epoch (1970-01-01 00:00:00 UTC)
begin_timestamp = (begin_date - datetime.utcfromtimestamp(0)).total_seconds()
end_timestamp = (end_date - datetime.utcfromtimestamp(0)).total_seconds()
time_filter = EventTimestamp.in_range(begin_timestamp, end_timestamp)
extractor.extract(time_filter)

# You can put filters in an iterable and unpack them (using the `*` operator) in the `extract()`
# method. This is a common use case for programs that need to conditionally build up filters.
from py42.sdk.queries.fileevents.filters import DeviceUsername, FilePath

_NEEDS_DEVICE_USERNAME_FILTER = False
_NEEDS_FILE_PATH_FILTER = True

filters = []
if _NEEDS_DEVICE_USERNAME_FILTER:
    filters.append(DeviceUsername.eq("test.user@example.com"))
if _NEEDS_FILE_PATH_FILTER:
    # `is` is a reserved word in Python, so use the `eq` filter method
    filters.append(FilePath.eq("path/to/file"))
extractor.extract(*filters)
```
## Usage - Code42 Security Alerts
Getting alerts is similar to getting security events: use the `AlertExtractor` with appropriate alert filters from the
`py42.sdk.queries.alerts.filters` module:
```python
from c42eventextractor.extractors import AlertExtractor
from py42.sdk.queries.alerts.filters import AlertState
# set up your sdk and handlers here
extractor = AlertExtractor(code42_sdk, handlers)
open_filter = AlertState.eq(AlertState.OPEN)
extractor.extract(open_filter)
```
### Using "OR" queries
By default, the extractor combines all filter groups passed in for extraction with "AND". To build an "OR" query across
multiple filters, set the `.use_or_query` attribute on the extractor instance to `True`. The extractor then joins all
filters with "OR", except those listed in `.or_query_exempt_filters`, which stay in the "AND" part of the query. By
default, this list contains all of the timestamp filter classes, because joining a timestamp filter with "OR" would
negate the effect of every other filter. To keep other filters in the "AND" part of the query, append either a base
filter class (which exempts every instance of that filter type) or a constructed filter with its value(s) (which
exempts only that exact filter).

The following example joins two filters (`FileName` and `FileSize`) with "OR" but keeps the whole query restricted to Exposure events:
```python
from c42eventextractor.extractors import FileEventExtractor
from py42.sdk.queries.fileevents.filters import ExposureType, FileName, FileSize

# set up your sdk and handlers here
extractor = FileEventExtractor(code42_sdk, handlers)
file_name_filter = FileName.eq("document.txt")
file_size_filter = FileSize.greater_than(1024 * 1024 * 1024)
exposure_event_filter = ExposureType.exists()

# convert to OR query
extractor.use_or_query = True
# make sure the exposure_event_filter is included in the "AND" group
extractor.or_query_exempt_filters.append(exposure_event_filter)
extractor.extract(file_name_filter, file_size_filter, exposure_event_filter)
```
OR queries can be done with both the `FileEventExtractor` and `AlertExtractor`.
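
The difference between exempting a filter class and exempting one constructed filter can be illustrated with a small, self-contained sketch. It is a conceptual model only, not the extractor's implementation: `NameFilter`, `SizeFilter`, `ExposureFilter`, `is_exempt`, and `split_filters` are hypothetical stand-ins introduced here, and the matching rule is an assumption based on the description above.

```python
class StandInFilter(object):
    """Hypothetical stand-in for a constructed filter; not a py42 class."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return type(self) is type(other) and self.value == other.value

    def __ne__(self, other):
        return not self == other


class NameFilter(StandInFilter):
    pass


class SizeFilter(StandInFilter):
    pass


class ExposureFilter(StandInFilter):
    pass


def is_exempt(query_filter, exempt_list):
    """Assumed rule: a class exempts all its instances, an instance exempts only its equal."""
    for exempt in exempt_list:
        if isinstance(exempt, type):
            if isinstance(query_filter, exempt):
                return True
        elif query_filter == exempt:
            return True
    return False


def split_filters(filters, exempt_list):
    """Partition filters into the "OR" group and the exempt "AND" group."""
    or_group = [f for f in filters if not is_exempt(f, exempt_list)]
    and_group = [f for f in filters if is_exempt(f, exempt_list)]
    return or_group, and_group


name = NameFilter("document.txt")
size = SizeFilter(1024)
exposure = ExposureFilter("exists")

# Exempting one constructed filter keeps only that exact filter in the "AND" group.
or_group, and_group = split_filters([name, size, exposure], [exposure])
```

Passing `[SizeFilter]` as the exempt list instead would move every `SizeFilter` instance into the "AND" group, regardless of its value.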
### Handlers
A basic set of handlers is provided in the `c42eventextractor.extraction_handlers.ExtractionHandlers` class.
These default to printing the response data and any errors to the console and storing the cursor position in memory.
`c42eventextractor` also provides some common logging and formatting implementations that you may find useful for
reporting on security data.
For example, to extract and submit file events to a syslog server in CEF format, use the following as your
`handle_response` implementation:
```python
import json
import logging
from c42eventextractor.logging.handlers import NoPrioritySysLogHandler
from c42eventextractor.logging.formatters import FileEventDictToCEFFormatter
my_logger = logging.getLogger("MY_LOGGER")
handler = NoPrioritySysLogHandler("examplehostname.com")
handler.setFormatter(FileEventDictToCEFFormatter())
my_logger.addHandler(handler)
my_logger.setLevel(logging.INFO)

def handle_response(response):
    # Pass each file event dict to the logger, which applies the CEF formatter
    events = json.loads(response.text)["fileEvents"]
    for event in events:
        my_logger.info(event)
```
To customize processing of results/errors further, or to persist cursor data to a location of your choosing, override
the methods on the provided handlers or create your own handler class with the same method signature as
`c42eventextractor.extraction_handlers.ExtractionHandlers`.
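
As one way to persist cursor data, the sketch below stores the latest checkpoint in a text file. `FileCursorStore` and the file name `c42_cursor.txt` are hypothetical names introduced here. The sketch also assumes that the cursor passed to `record_cursor_position` is a numeric timestamp and that returning `None` signals "no checkpoint yet"; check both against the handlers in your installed version.

```python
import os
import tempfile


class FileCursorStore(object):
    """Hypothetical helper that keeps the latest cursor position in a text file."""

    def __init__(self, path):
        self._path = path

    def record_cursor_position(self, cursor):
        # Overwrite the file so it always holds only the most recent checkpoint.
        with open(self._path, "w") as cursor_file:
            cursor_file.write(str(cursor))

    def get_cursor_position(self):
        # Return None when no checkpoint has been stored yet.
        if not os.path.exists(self._path):
            return None
        with open(self._path) as cursor_file:
            text = cursor_file.read().strip()
        # Assumption: the cursor is numeric, so it round-trips through float().
        return float(text) if text else None


cursor_store = FileCursorStore(os.path.join(tempfile.gettempdir(), "c42_cursor.txt"))

# Attach the bound methods to an existing ExtractionHandlers instance:
# handlers.record_cursor_position = cursor_store.record_cursor_position
# handlers.get_cursor_position = cursor_store.get_cursor_position
```

Because the file is rewritten on every call, a restarted process resumes from the last recorded checkpoint rather than from the beginning.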
### Cursor Behavior
Extractors automatically check the provided handlers for a cursor checkpoint and, when one exists, add their own
timestamp filter to the query. For this reason, calling the `.extract()` method with a filter of the same class used
to store the checkpoint position (`DateObserved` for alerts and `InsertionTimestamp` for file events) raises an
exception if a cursor checkpoint already exists.
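
One way to avoid this exception is to supply your own checkpoint-type timestamp filter only when no checkpoint is stored. The helper below is a minimal sketch: `explicit_time_filters` is a hypothetical name, and it assumes `get_cursor_position()` returns `None` when nothing has been recorded.

```python
def explicit_time_filters(get_cursor_position, make_timestamp_filter):
    """Return a one-item list with a timestamp filter only if no checkpoint is stored."""
    if get_cursor_position() is not None:
        # A checkpoint exists, so the extractor will add its own timestamp filter.
        return []
    return [make_timestamp_filter()]


# Usage with the objects from the usage sections (not executed here):
# from py42.sdk.queries.fileevents.filters import InsertionTimestamp
# filters = explicit_time_filters(
#     handlers.get_cursor_position,
#     lambda: InsertionTimestamp.on_or_after(1564694804),
# )
# extractor.extract(*filters)
```

The filter is built lazily through a callable so that nothing is constructed when a checkpoint already exists.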
## CEF Mapping
`c42eventextractor` includes mappings from JSON field names to Common Event Format (CEF) field names. The formatters
are available by importing the `c42eventextractor.logging.formatters` module. To create a logger that writes file
events in CEF format to a file, set it up as follows:
```python
import logging
from c42eventextractor.logging.formatters import FileEventDictToCEFFormatter
formatter = FileEventDictToCEFFormatter()
handler = logging.FileHandler("output.txt", delay=True, encoding="utf-8")
logger = logging.getLogger("extractor_logger")
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```
The following tables map JSON field names to CEF field names and Forensic Search field names.
### Attribute mapping
The table below maps JSON fields, CEF fields, and [Forensic Search fields](https://support.code42.com/Administrator/Cloud/Administration_console_reference/Forensic_Search_reference_guide)
to one another.
| JSON field | CEF field | Forensic Search field |
|:--------------------------:|:-------------------------------------:|:--------------------------------------:|
| actor | suser | Actor |
| cloudDriveId | aid | n/a |
| createTimestamp | fileCreateTime | File Created Date |
| deviceUid | deviceExternalId | n/a |
| deviceUserName | suser | Username (Code42) |
| domainName | dvchost | Fully Qualified Domain Name |
| eventId | externalID | n/a |
| eventTimestamp | end | Date Observed |
| exposure | reason | Exposure Type |
| fileCategory | fileType | File Category |
| fileName | fname | Filename |
| filePath | filePath | File Path |
| fileSize | fsize | File Size |
| insertionTimestamp | rt | n/a |
| md5Checksum | fileHash | MD5 Hash |
| modifyTimestamp | fileModificationTime | File Modified Date |
| osHostName | shost | Hostname |
| processName | sproc | Executable Name (Browser or Other App) |
| processOwner | spriv | Process User (Browser or Other App) |
| publicIpAddress | src | IP Address (public) |
| removableMediaBusType | cs1 (Code42AEDRemovableMediaBusType) | Device Bus Type (Removable Media) |
| removableMediaCapacity | cn1 (Code42AEDRemovableMediaCapacity) | Device Capacity (Removable Media) |
| removableMediaName | cs3 (Code42AEDRemovableMediaName) | Device Media Name (Removable Media) |
| removableMediaSerialNumber | cs4 | Device Serial Number (Removable Media) |
| removableMediaVendor | cs2 (Code42AEDRemovableMediaVendor) | Device Vendor (Removable Media) |
| sharedWith | duser | Shared With |
| syncDestination | destinationServiceName | Sync Destination (Cloud) |
| url | filePath | URL |
| userUid | suid | n/a |
| windowTitle | requestClientApplication | Tab/Window Title |
| tabUrl | request | Tab URL |
| emailSender | suser | Sender |
| emailRecipients | duser | Recipients |
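
To make the table concrete, the sketch below applies a few of its rows as a plain dictionary lookup. It only renames keys and is not how `FileEventDictToCEFFormatter` is implemented; `JSON_TO_CEF`, `to_cef_keys`, and the sample event are hypothetical and introduced for illustration.

```python
# A few rows copied from the attribute mapping table.
JSON_TO_CEF = {
    "fileName": "fname",
    "filePath": "filePath",
    "fileSize": "fsize",
    "md5Checksum": "fileHash",
    "osHostName": "shost",
    "publicIpAddress": "src",
}


def to_cef_keys(event):
    """Rename known JSON keys to their CEF names and drop keys without a mapping."""
    return {
        JSON_TO_CEF[key]: value
        for key, value in event.items()
        if key in JSON_TO_CEF
    }


# Made-up sample event, for illustration only.
sample_event = {"fileName": "document.txt", "fileSize": 2048, "unmappedField": "x"}
cef_fields = to_cef_keys(sample_event)
```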
### Event mapping
See the table below to map exfiltration events to CEF signature IDs.
| Exfiltration event | CEF signature ID |
|:------------------:|:----------------:|
| CREATED | C42200 |
| MODIFIED | C42201 |
| DELETED | C42202 |
| READ_BY_APP | C42203 |
| EMAILED | C42204 |
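
If you need the same lookup in your own code, the table can be written as a dictionary. This is a minimal sketch with hypothetical names (`EVENT_TYPE_TO_SIGNATURE_ID`, `signature_id_for`); returning `None` for an unlisted event type is a choice made here, not documented package behavior.

```python
# Rows copied from the event mapping table.
EVENT_TYPE_TO_SIGNATURE_ID = {
    "CREATED": "C42200",
    "MODIFIED": "C42201",
    "DELETED": "C42202",
    "READ_BY_APP": "C42203",
    "EMAILED": "C42204",
}


def signature_id_for(event_type):
    """Return the CEF signature ID for an event type, or None if it is not in the table."""
    return EVENT_TYPE_TO_SIGNATURE_ID.get(event_type)
```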