For Developers

Samuel Garcés Marín edited this page Jan 15, 2022 · 2 revisions

Some notes before starting.

You will see Cow used throughout the code, particularly Cow<'static, str>. A Cow can hold either an owned String or a borrowed &'static str, which reduces the number of memory allocations: static values such as product=ASA or vendor=Cisco can be stored without allocating memory.

let mut log : SiemLog = SiemLog(.....);
log.add_field("event_message", SiemField::Text(Cow::Owned(supermessage.to_string())));
log.add_field("firewall_name", SiemField::Text(Cow::Borrowed("superfirewall")));
log.add_field("dropped_packets", SiemField::I64(12));
log.set_product("ASA"); // Usage of Into<Cow<'static, str>>
log.set_vendor(Cow::Borrowed("Cisco"));
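
To make the allocation difference concrete, here is a minimal, self-contained sketch. DemoLog, is_borrowed and demo are hypothetical names used only for illustration and are not part of the uSIEM API; only std::borrow::Cow and its From conversions come from the standard library.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a log record (NOT the real `SiemLog`).
#[derive(Debug, Default)]
pub struct DemoLog {
    pub vendor: Cow<'static, str>,
    pub product: Cow<'static, str>,
}

impl DemoLog {
    /// Accepts a `&'static str`, a `String`, or a ready-made `Cow`.
    pub fn set_vendor<S: Into<Cow<'static, str>>>(&mut self, vendor: S) {
        self.vendor = vendor.into();
    }

    /// Same conversion pattern as `set_vendor`.
    pub fn set_product<S: Into<Cow<'static, str>>>(&mut self, product: S) {
        self.product = product.into();
    }
}

/// True when the value points at static data instead of owning a `String`.
pub fn is_borrowed(value: &Cow<'static, str>) -> bool {
    matches!(value, Cow::Borrowed(_))
}

/// Builds a log with one static field and one dynamically built field.
pub fn demo() -> DemoLog {
    let mut log = DemoLog::default();
    // A string literal converts into Cow::Borrowed: no allocation.
    log.set_vendor("Cisco");
    // A String built at runtime converts into Cow::Owned.
    log.set_product(format!("ASA-{}", 5500));
    log
}
```

Calling demo() yields a log whose vendor is a Cow::Borrowed and whose product is a Cow::Owned, so only the dynamically built value costs an allocation.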

Custom Component

To build your own component, you only need to implement the SiemComponent trait, which allows the Kernel to manage the component.

pub trait SiemComponent : Send {
    fn id(&self) -> u64 {
        return 0
    }
    fn set_id(&mut self, id: u64);
    fn name(&self) -> &str {
        return &"SiemComponent"
    }
    /// Get the channel to this component
    fn local_channel(&self) -> Sender<SiemMessage>;
    /// Sets the channel of this component. It's the kernel who sets the channel
    fn set_log_channel(&mut self, sender : Sender<SiemLog>, receiver : Receiver<SiemLog>);

    /// Sets the channel to communicate with the kernel.
    fn set_kernel_sender(&mut self, sender : Sender<SiemMessage>);

    /// Execute the logic of this component in an infinite loop. Must be stopped using Commands sent using the channel.
    fn run(&mut self);

    /// Allows storing information about this component, such as its state or configuration.
    fn set_storage(&mut self, conn : Box<dyn SiemComponentStateStorage>);

    /// Capabilities and actions that can be performed by this component
    fn capabilities(&self) -> SiemComponentCapabilities;

    /// Allows the Kernel to duplicate this component
    fn duplicate(&self) -> Box<dyn SiemComponent>;
    
    /// Initialize the component with the datasets before executing run
    fn set_datasets(&mut self, datasets : Vec<SiemDataset>);
}

Each component runs in its own thread. The Kernel is responsible for cloning the component and running it in that thread.

use usiem::components::{SiemComponent};
pub struct BasicComponent {
    kernel_sender: Sender<SiemMessage>,
    local_chnl_snd: Sender<SiemMessage>,
    local_chnl_rcv : Receiver<SiemMessage>,
    log_receiver: Receiver<SiemLog>,
    log_sender: Sender<SiemLog>,
    id: u64,
}
impl SiemComponent for BasicComponent {
   ...
}

The Kernel is the piece of code responsible for connecting the channels to the components. Some Kernels will allow the SIEM to work in a cluster. The default Kernel is a simple one that can only run locally.

Be aware that you will need to process the messages (SiemMessage) sent to your component. The possible messages are:

  • Command: Another component wants to execute a command in this component.
  • Response: You executed a command and now you receive a response.
  • Log: If your component is responsible for parsing, indexing or enriching logs, it can receive logs directly from the kernel.
  • Notification: The local logging system. Use it instead of println! to report problems.
  • Dataset: The latest dataset reference, which gives you access to a fast in-memory shared datastore. Datasets don't use a mutex; they are references to a read-only object (a BTreeMap in some cases). The dataset also has a channel that lets you send updates to the DatasetManager, which then updates the Dataset.
  • Alert: Only for the AlertingComponent.
  • Task: Like a Command, but tasks can take much longer.
  • TaskResult: The result of the Task with the given ID. If the task has not finished, the value will be None. You can poll for the result with a Task message filled with the ID of the Task.
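
The message handling described above can be sketched as a blocking loop over the component's local channel. This is only an illustration under stated assumptions: DemoMessage is a hypothetical, heavily reduced stand-in for SiemMessage, the "STOP" and "PING" command names are invented, and the standard library's std::sync::mpsc channels are used to keep the example self-contained (the Sender and Receiver types in the trait may come from a different crate).

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical, reduced message set (NOT the real `SiemMessage`).
#[derive(Debug, PartialEq)]
pub enum DemoMessage {
    Command(String),
    Log(String),
    Notification(String),
}

/// Hypothetical component holding only what the loop needs.
pub struct DemoComponent {
    pub local_rcv: Receiver<DemoMessage>,
    pub kernel_sender: Sender<DemoMessage>,
    pub processed_logs: usize,
}

impl DemoComponent {
    /// Blocks on the local channel until a "STOP" command arrives
    /// or every sender has been dropped.
    pub fn run(&mut self) {
        while let Ok(msg) = self.local_rcv.recv() {
            match msg {
                DemoMessage::Command(cmd) if cmd == "STOP" => break,
                DemoMessage::Command(cmd) => {
                    // Report problems through the kernel, not println!.
                    let note = format!("unknown command: {}", cmd);
                    let _ = self.kernel_sender.send(DemoMessage::Notification(note));
                }
                DemoMessage::Log(_) => self.processed_logs += 1,
                DemoMessage::Notification(_) => {}
            }
        }
    }
}

/// Plays the role of the kernel: wires the channels, runs the component in
/// its own thread, and returns the log count plus the notifications received.
pub fn run_demo() -> (usize, Vec<DemoMessage>) {
    let (local_snd, local_rcv) = channel();
    let (kernel_snd, kernel_rcv) = channel();
    let mut component = DemoComponent {
        local_rcv,
        kernel_sender: kernel_snd,
        processed_logs: 0,
    };
    let handle = thread::spawn(move || {
        component.run();
        component.processed_logs
    });
    local_snd.send(DemoMessage::Log("log 1".to_string())).unwrap();
    local_snd.send(DemoMessage::Command("PING".to_string())).unwrap();
    local_snd.send(DemoMessage::Log("log 2".to_string())).unwrap();
    local_snd.send(DemoMessage::Command("STOP".to_string())).unwrap();
    let processed = handle.join().unwrap();
    // The component has finished, so everything it sent is already queued.
    let notifications: Vec<DemoMessage> = kernel_rcv.try_iter().collect();
    (processed, notifications)
}
```

A real component would match on every SiemMessage variant; the point of the sketch is only that run keeps looping until it is told to stop through the channel.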

New parsers

The parsers do the real work: they parse the incoming logs. As with components, you will need to implement a trait, in this case LogParser. A good parser will not only parse a log; it will also return a schema with all the fields that can be extracted from a log, as well as a LogGenerator, which, as the name suggests, randomly creates logs that the parser can parse.

pub trait LogParser: DynClone + Send {
    /// Parse the log. If it fails it must give a reason why. This allows optimization of the parsing process.
    fn parse_log(&self, log: SiemLog) -> Result<SiemLog, LogParsingError>;
    /// Check if the parser can parse the log. Must be fast.
    fn device_match(&self, log: &SiemLog) -> bool;
    /// Name of the parser
    fn name(&self) -> &str;
    /// Description of the parser
    fn description(&self) -> &str;
    /// Get parser schema
    fn schema(&self) -> &'static FieldSchema;
    /// Get a log generator to test this parser
    fn generator(&self) -> Box<dyn LogGenerator>;
}

The device_match function needs to be fast, but it is not strictly required, because parse_log will return an error if the log cannot be parsed. Using the LogParsingError object allows the ParsingComponent to detect errors and to optimize the selection of the correct parser. If the error is of type FormatError or ParserError, a notification will be sent to the analyst so that the parser can be updated, and the component should also mark the log with the appropriate tag so that it can be reprocessed later.
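
One way a parsing component could use device_match and the returned error to pick a parser is sketched below. This is not the actual ParsingComponent logic: DemoParser, DemoParsingError, CsvParser, KeyValueParser and parse_with_first_match are hypothetical, and the log is reduced to a plain String so the example compiles on its own.

```rust
/// Hypothetical, reduced error type; the log is a plain String here.
#[derive(Debug, PartialEq)]
pub enum DemoParsingError {
    NoValidParser(String),
    FormatError(String, String),
}

/// Hypothetical, reduced version of the parser trait.
pub trait DemoParser {
    fn name(&self) -> &str;
    fn device_match(&self, log: &str) -> bool;
    fn parse_log(&self, log: String) -> Result<String, DemoParsingError>;
}

pub struct CsvParser;
pub struct KeyValueParser;

impl DemoParser for CsvParser {
    fn name(&self) -> &str { "csv" }
    fn device_match(&self, log: &str) -> bool { log.contains(',') }
    fn parse_log(&self, log: String) -> Result<String, DemoParsingError> {
        if !log.contains(',') {
            return Err(DemoParsingError::NoValidParser(log));
        }
        Ok(log.replace(',', "|"))
    }
}

impl DemoParser for KeyValueParser {
    fn name(&self) -> &str { "key_value" }
    fn device_match(&self, log: &str) -> bool { log.contains('=') }
    fn parse_log(&self, log: String) -> Result<String, DemoParsingError> {
        if !log.contains('=') {
            return Err(DemoParsingError::NoValidParser(log));
        }
        if log.ends_with('=') {
            // The log belongs to this parser but its format is unexpected.
            return Err(DemoParsingError::FormatError(log, "missing value".to_string()));
        }
        Ok(log)
    }
}

/// Tries each parser in order. `NoValidParser` hands the log back so the next
/// parser can try; any other error stops the search and is reported.
pub fn parse_with_first_match(
    parsers: &[Box<dyn DemoParser>],
    mut log: String,
) -> Result<(String, String), DemoParsingError> {
    for parser in parsers {
        if !parser.device_match(&log) {
            continue; // cheap pre-check, skips the full parse
        }
        match parser.parse_log(log) {
            Ok(parsed) => return Ok((parser.name().to_string(), parsed)),
            Err(DemoParsingError::NoValidParser(returned)) => log = returned,
            Err(other) => return Err(other),
        }
    }
    Err(DemoParsingError::NoValidParser(log))
}
```

Because every error variant carries the log, ownership returns to the caller on failure, so the log can be passed to another parser or tagged for later reprocessing without cloning it.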

pub enum LogParsingError {
    /// The parser can't be used with this log
    NoValidParser(SiemLog),
    /// The log is for this parser but there is a bug in the code
    ParserError(SiemLog, String),
    /// The log is for this parser but the sub module has not been implemented.
    NotImplemented(SiemLog),
    /// The log has changed format and the parser can't process it.
    FormatError(SiemLog, String)
}

For example, in the PaloAlto parser, if we can't find a comma (,), then this parser cannot be used.

pub fn parse_log(log: SiemLog) -> Result<SiemLog, LogParsingError> {
    let log_line = log.message();
    let start_log_pos = match log_line.find(",") {
        Some(val) => val,
        None => return Err(LogParsingError::NoValidParser(log)),
    };
    ...

Later in the code, if the "module" is not one of "TRAFFIC", "THREAT" and so on, we return a NoValidParser error.
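
The module check could look roughly like the following sketch. It is not the real PaloAlto parser: detect_module and PaloAltoDemoError are hypothetical, the log is a plain &str, the position of the module (assumed here to be the fourth comma-separated field) is an assumption made for illustration, and only the two modules named above are listed.

```rust
/// Hypothetical error type mirroring only the variant used here.
#[derive(Debug, PartialEq)]
pub enum PaloAltoDemoError {
    NoValidParser(String),
}

/// Returns the module name when it is one this sketch supports.
pub fn detect_module(log_line: &str) -> Result<&str, PaloAltoDemoError> {
    // ASSUMPTION: the module is the fourth comma-separated field.
    let module = match log_line.split(',').nth(3) {
        Some(val) => val,
        // Too few fields: this log is not for this parser.
        None => return Err(PaloAltoDemoError::NoValidParser(log_line.to_string())),
    };
    match module {
        "TRAFFIC" | "THREAT" => Ok(module),
        _ => Err(PaloAltoDemoError::NoValidParser(log_line.to_string())),
    }
}
```

For a line such as "1,2021/01/01 00:00:00,0001,TRAFFIC,end" the function returns Ok("TRAFFIC"); a line without enough commas, or with an unlisted module, yields NoValidParser.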