Extended XBL (eXBL)

eXBL is the metadata-enriched version of the XBL list and, as such, focuses on indicators of compromise obtained through behavioral heuristics. Each record is composed of the following fields:

  • ipaddress It’s the IP identified as the source of the bot-generated traffic. Always provided.

  • botname The name associated with the bot whose activity has been detected; “unknown” if the detection can’t be clearly associated with a specific bot. Always provided.

  • seen The Unix timestamp of the last detected event for the given IP and the given botname. Always provided.

  • firstseen It’s the Unix timestamp of the first detection event for this IP+botname combination. It will match the value of seen if it’s the first sighting of this type on this IP. It’s reset whenever the given combination has seen no activity for at least a month. Always provided.

  • listed It’s the Unix timestamp of when the entry reached our database. It’s usually very close to the value of seen, except when the data comes from batched processes. Always provided.

  • valid_until It’s the Unix timestamp of when the given entry will be considered “expired” from our dataset. Always provided.

  • detection A human-readable string briefly describing how the data was collected; appears only when the heuristic can collect such data in more than one way.

  • rule It’s an internal ID pointing to the rule that performed the detection. Detections performed by different means or rules will show different IDs, even when they refer to the same detection. Always provided.

  • dstip Destination IP of the traffic that triggered the detection; not always disclosed/available.

  • dstport Destination port of the traffic that triggered the detection; not always disclosed/available.

  • helo When the detection is based on SMTP traffic, it’s the HELO string used in the SMTP session that triggered the detection.

  • helos An array enumerating all the HELO strings involved in detecting the behavior; appears only in records for the MPD heuristic.

  • heuristic It’s the heuristic applied to generate the detection, and as such has a limited number of possible values.

  • asn It’s the Autonomous System announcing the IP; obtained mostly from routeviews data.

  • lat Geographic latitude of the IP; only provided when geolocation data is available.

  • lon Geographic longitude of the IP; only provided when geolocation data is available.

  • cc The ISO country code of the country where the IP is located; only provided when geolocation data is available.

  • protocol IP protocol of the traffic triggering the detection. Usually either UDP or TCP.

  • srcip Source IP of the traffic triggering the detection. Except in very unusual corner cases, it matches the listed IP (ipaddress).

  • srcport Source port of the traffic triggering the detection, when the detection is based on a single TCP/UDP session. Not always available.

  • subject Specific to detections based on SMTP traffic, and therefore limited to the heuristics “SPAMBOT”, “IMPERSONATE” and “SMTPAUTH”. It’s the subject line (in its original encoding) of the message that triggered the detection.

  • uri Specific to the “SINKHOLE” heuristic, and to HTTP sinkhole detections only; it’s the URI of the HTTP request triggering the listing. Not always available.

  • useragent Specific to the “SINKHOLE” heuristic, and to HTTP sinkhole detections only; it’s the User-Agent header of the HTTP request triggering the listing. Not always available.

  • domain Mostly specific to the “SINKHOLE” heuristic, and to HTTP sinkholes in particular; it’s the domain/hostname that the traffic triggering the detection is trying to reach or, in other words, the sinkholed domain. Often obtained from the “Host” header of the HTTP request triggering the listing. Not always available.
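
As an illustration of how the fields above might be consumed, here is a minimal Python sketch. It assumes each record is delivered as a JSON object keyed by the field names listed above; the delivery format is not specified in this section, so treat that as an assumption. The helper names (parse_record, to_utc, is_expired, is_first_sighting) and every value in the sample record are made up for the example and are not taken from real eXBL data.

```python
import json
from datetime import datetime, timezone

# Fields that the list above marks as "Always provided".
REQUIRED_FIELDS = ("ipaddress", "botname", "seen", "firstseen",
                   "listed", "valid_until", "rule")


def parse_record(raw):
    """Decode one record (assumed to be JSON) and check the mandatory fields."""
    record = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"record lacks required fields: {missing}")
    return record


def to_utc(timestamp):
    """Convert a Unix timestamp (int or numeric string) to an aware UTC datetime."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)


def is_expired(record, now):
    """True once 'now' (a Unix timestamp) has reached valid_until."""
    return int(now) >= int(record["valid_until"])


def is_first_sighting(record):
    """firstseen equals seen on the first sighting of an IP+botname combination."""
    return int(record["firstseen"]) == int(record["seen"])


# Hypothetical sample record: all values are placeholders for illustration.
SAMPLE = """
{
  "ipaddress": "192.0.2.10",
  "botname": "unknown",
  "seen": 1700000000,
  "firstseen": 1700000000,
  "listed": 1700000060,
  "valid_until": 1700600000,
  "rule": "EXAMPLE-1",
  "heuristic": "SINKHOLE",
  "domain": "sinkholed.example"
}
"""

if __name__ == "__main__":
    rec = parse_record(SAMPLE)
    print(rec["ipaddress"], rec["botname"], to_utc(rec["seen"]).isoformat())
    # Optional fields may be absent, so read them with .get().
    print("domain:", rec.get("domain"), "helo:", rec.get("helo"))
```

Only the fields described as always provided are treated as mandatory here; everything else is read with .get(), since most of the remaining fields appear only for specific heuristics or when the data is available.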