GeekWeek brings together key players in the field of cyber security to generate solutions to vital problems facing the entire community in an innovative and collaborative format. Participants have the opportunity to pick which area they would like to focus on, based on the themes and sub-themes listed below. Teams will then be formed according to these preferences. Participants are also encouraged to suggest projects associated with each of the themes and sub-themes listed below. We value the community’s expertise and encourage participants to share their ideas with us.
If you have a specific area of interest or theme you would like to work on, please indicate the number associated with it, as per the list below, in your application or once you receive your acceptance email. Each sub-theme has an area of cybersecurity associated with it, but keep in mind that most themes will still touch on several areas, giving you exposure to a multitude of subjects. For example, as this is a cybersecurity workshop, a theme focused on programming will also touch on cyber data analysis activities.
Areas listed in the sub-themes:
[N] Network & Infrastructure
[P] Programming & Advanced Algorithms
[R] Reverse Engineering
[O] Operating Systems Internals
[D] Cyber Data Analysis and Visualization & Data Mining
National Domain Name System (DNS) [D]
In GeekWeek 3, we explored the possibilities offered by data visualization technologies to better understand Cyber Health: we looked at abuse data, diverse open source databases and BGP traffic, among other things.
Cyber Security Posture [D | N]
Canada now has a national DNS operated by the Canadian Internet Registration Authority (CIRA). Every day, a large number of domains are blocked to protect Canadian Shield users online. This is only the first step. How could we develop better analytics and information sharing processes to improve Canadian Shield and protect industry and government partners, as well as private citizens, from unintentionally accessing these malicious domains?
Cyber Sovereignty [D]
One would expect that communications between Canadian endpoints do not need to be routed outside of the country. However, for efficiency purposes or other practical reasons, this is not always the case, which can pose security risks. The Border Gateway Protocol (BGP) manages the routes followed by internet communications from their source to their destination. By manipulating BGP, attackers can reroute data in their favour and intercept or modify traffic. How can we develop analytics to detect malicious behaviours and changes in traffic routes in real time? How can we prevent traffic hijacking to help protect Canadian communications?
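As one illustration of the kind of analytic asked about here, the sketch below flags BGP announcements whose origin AS differs from an expected baseline for the same prefix. It is a minimal sketch under stated assumptions, not a proposed solution: the `Announcement` type, the `find_origin_changes` function and the baseline dictionary are hypothetical names, the prefixes and AS numbers are reserved documentation or private-use values, and real data would come from a BGP feed that is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    """A simplified BGP announcement (hypothetical structure for illustration)."""
    prefix: str      # announced prefix, e.g. "192.0.2.0/24"
    as_path: tuple   # AS numbers on the path, with the origin AS last


def find_origin_changes(baseline, announcements):
    """Return (prefix, expected_origin, seen_origin) for each announcement
    whose origin AS differs from the baseline origin of that prefix.

    Prefixes missing from the baseline are ignored in this sketch.
    """
    alerts = []
    for ann in announcements:
        expected = baseline.get(ann.prefix)
        seen = ann.as_path[-1]
        if expected is not None and seen != expected:
            alerts.append((ann.prefix, expected, seen))
    return alerts


# Assumed baseline: which AS normally originates each prefix.
baseline = {"192.0.2.0/24": 64500}

updates = [
    Announcement("192.0.2.0/24", (64510, 64500)),  # origin matches the baseline
    Announcement("192.0.2.0/24", (64510, 64666)),  # origin differs from the baseline
]

for prefix, expected, seen in find_origin_changes(baseline, updates):
    print(f"{prefix}: expected origin AS{expected}, saw AS{seen}")
```

An origin mismatch is only a signal worth investigating; legitimate changes such as a customer moving providers would trigger the same alert, so a real system would need additional context before calling it a hijack.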
Preparing for a wireless future [D]
From cellphones to home appliances, an increasing number of devices are wirelessly connected to the internet, and as the use of 5G becomes widespread, we can expect this trend to continue at full speed. However, with increasing wireless connectivity comes a growing number of cybersecurity threats targeting cellphones, such as SIM swapping, and other wirelessly connected devices. Telecommunication data can offer invaluable insight into the threats targeting mobile devices. How could we parse and analyze wireless network data, such as Signaling System No. 7 (SS7) data, to gain a better understanding of mobility threats, especially those related to 4G/LTE or 5G? How could we build processes and systems to better prevent wireless threats and better protect Canadians?
Detecting and Decoding Advanced Persistent Threat (APT) Malware [R | P]
Malicious APT artifacts are usually stealthy, well-crafted and possess strong anti-analysis and evasion techniques, making their detection and decoding complex. Using knowledge gained from in-depth reverse engineering of a chosen APT malicious sample, how could we build tools in order to detect malicious activity from these samples in local environments? Also, how could we develop the ability to evaluate IPv4 spaces for the presence of APT implants?
Honeypots/Improving Internet Emulator [N | P | D]
Cyber threats are constantly evolving, and malicious actors keep finding innovative ways to infect a system. Honeypots are systems that mimic real environments to fool threat actors and gather intelligence on emerging infection vectors. To be effective, honeypots need to keep up with the latest infection techniques. Moreover, an Internet emulator for dynamic malware analysis is also a kind of honeypot. It is not always possible to let malware communications use the Internet when performing dynamic analysis. Although INetSim is an Internet emulator that could be used to address this problem, it was developed more than 10 years ago and does not leverage the capabilities of the latest technologies. How could we develop a new and better Internet simulation for isolated dynamic malware analysis tools (not only simulating services, but also content)? How could we also simulate Industrial Control Systems and user interactions? Could we build a system that serves as both an Internet emulator and a honeypot? How could we leverage new concepts like machine learning or the cloud, or improve the honeypot/Internet emulator response rates, to collect sophisticated malicious artifacts and emerging threats?
SPAM/Phishing/Smishing [D | P]
Analyzing SPAM emails sent from botnets, as well as phishing and smishing URLs, allows us to manually extract Indicators of Compromise (IOCs). How could we develop techniques to find and extract relevant and actionable information automatically from billions of SPAM emails and recently created domains in real time? Further, how could we develop analytics to identify SPAM/phishing/smishing campaigns?
Memory Analysis [O | P]
Configurations embedded in malware artifacts are a wealth of actionable information that can be leveraged in further analyses. For example, the configuration of many malware families contains the addresses of their Command & Control servers. How could we leverage the configuration information that is available at runtime in order to derive valuable forensics data about the state of the system? How could this process be structured in a framework to permit automation and scaling (e.g., using tools like CAPE or malscan)?
Cyber Neighbourhood Watch [P]
It is not always easy to accurately validate the maliciousness of an indicator, and evaluate the possible negative side effects of blocking the traffic accordingly. During a previous Cyber Centre event, an initiative was developed in collaboration with Canadian telecommunication companies to create a method in which different organizations can communicate useful information regarding network traffic to other partner organizations, using MISP. Could we democratize determining indicator quality and relevancy through a member voting system? How could we create a system supporting such a community? Further, how could this knowledge be used to enrich a national DNS?
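To make the member voting idea concrete, the sketch below tallies votes per indicator and reports a verdict only once enough members have voted and a strict majority agrees. It is a minimal sketch under stated assumptions: the function name `indicator_consensus`, the verdict labels, the one-vote-per-member rule and the `min_votes` threshold are all hypothetical, and nothing here uses or represents the MISP API.

```python
from collections import Counter


def indicator_consensus(votes, min_votes=3):
    """Summarize member votes into a verdict per indicator.

    votes: iterable of (member, indicator, verdict) tuples, where verdict is
    "malicious" or "benign". Each member counts once per indicator; a later
    vote from the same member replaces the earlier one.
    """
    latest = {}
    for member, indicator, verdict in votes:
        latest[(member, indicator)] = verdict

    tallies = {}
    for (_member, indicator), verdict in latest.items():
        tallies.setdefault(indicator, Counter())[verdict] += 1

    result = {}
    for indicator, tally in tallies.items():
        total = sum(tally.values())
        if total < min_votes:
            result[indicator] = "undecided"   # not enough members have voted
        elif tally["malicious"] > total / 2:
            result[indicator] = "malicious"
        elif tally["benign"] > total / 2:
            result[indicator] = "benign"
        else:
            result[indicator] = "undecided"   # tie
    return result


votes = [
    ("org-a", "bad.example", "malicious"),
    ("org-b", "bad.example", "malicious"),
    ("org-c", "bad.example", "benign"),
    ("org-a", "cdn.example", "benign"),
]
print(indicator_consensus(votes))
```

Requiring a minimum number of voters before reporting a verdict is one way to limit the side effects of blocking on a single organization's opinion; weighting votes by member reputation would be a natural variation.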
Malicious Infrastructure and Threat Hunting [P | R]
There are many cyber threats in the wild, and it can be difficult to automatically recognize malicious infrastructure and phishing attempts from the large amounts of data. One method of detection is by switching the path section of a URL with another path known to be associated with malicious infrastructure. How could we distill data to identify recipes we can use to find rules that will, in turn, find more data and/or add value to the existing data? How could we validate that our recipes for rules are correct? How could we automatically recognize and attribute gathered malicious data and infrastructure?
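The path-switching method mentioned above can be sketched as follows: take the scheme and host of an observed URL and substitute a path already associated with malicious infrastructure, producing candidate URLs for later validation. This is a minimal sketch; the function names `swap_path` and `candidate_urls`, the example hosts and the example paths are hypothetical, and the code only generates candidates without making any network request.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_path(url, known_path):
    """Keep the scheme and host of `url` but replace its path with `known_path`.

    The query string and fragment are dropped in this sketch.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, known_path, "", ""))


def candidate_urls(observed_urls, known_paths):
    """Combine every observed host with every known path, without duplicates."""
    seen = set()
    candidates = []
    for url in observed_urls:
        for path in known_paths:
            candidate = swap_path(url, path)
            if candidate not in seen:
                seen.add(candidate)
                candidates.append(candidate)
    return candidates


# Hypothetical inputs: URLs seen in the data and paths tied to known infrastructure.
observed = ["http://example.test/index.html?x=1", "http://example.test/about"]
known_paths = ["/panel/login.php", "/gate.php"]

for candidate in candidate_urls(observed, known_paths):
    print(candidate)
```

Each candidate would still need to be checked, and the check itself validated, before it is treated as evidence of malicious infrastructure.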
Cloud monitoring and analytics [D]
As more and more government services move to the Microsoft Azure cloud, it becomes essential for the Cyber Centre to understand how users can secure and monitor their cloud tenancies, as well as its physical infrastructure. Microsoft Azure provides logs for a client to monitor their cloud activities and resources. How could we create a framework to extract, analyze and visualize cloud logs into actionable information? How could we derive a set of good practices for Microsoft Azure cloud users and work with cloud providers to improve their services?
Infrastructure Mapping [R | D]
To be able to devise a proper response to malicious actors’ wrongdoings, it is necessary to start by understanding the malicious infrastructure they are leveraging. How could we pool IOCs from industry, government and law enforcement agencies and enrich them to their fullest degree to determine the architecture of the malicious infrastructure?
Operationalize Hunting Malicious Sample [R | D]
Operational hunting is different from typical IOC hunting. It specifically looks for actionable intelligence that can be utilized to build a case against the malicious actor behind a sample. How could we effectively hunt samples (through host analysis, surface analysis, reverse engineering, etc.) to determine the internal composition of the cyber threat?
Actor Attribution [R | D]
To be able to take action against malicious actors, a strong case must be built, backed by admissible evidence that clearly identifies the individual or individuals responsible for the cyber threat. How could we identify specific individuals behind the attacks by researching IOCs and actor monikers?
Malicious Infrastructure and Threat Hunting [P | R]
One of the biggest issues behind cross-organizational collaborative hunting is attempting to find a mutual method for sharing, storing, normalizing and visualizing the data. How could we ingest the vast data on the malicious infrastructure, the malware and the actor attribution? How could we leverage the activities done by the other GeekWeek teams to produce actionable output?
Follow the money [P | D]
Cryptocurrencies are often the method of choice for threat actors to receive payments from their victims. How could we mine the blockchain and gather key information about this ecosystem? How could we use this data to trace back the actors or group cyber threats together? How could we find a repeatable and sustainable way to measure the costs and the revenues of cyber threat actors and the cyber threat economy?
Contributing to Community Tools [P]
Open source tools and services built by the cyber community play an important role in making cyber security more accessible, therefore strengthening our response capacity against cyber threats. For instance, how could we work together to improve community services and tools, such as MISP, Malpedia or Ghidra, and connect more people to those services? How could we contribute to open source tools with code that could benefit the entire cyber community?
Advanced Genetic Malware Analysis [R | P | D]
“Information Retrieval” is an emerging malware analysis and reverse engineering technique that decomposes new unknown malware into existing known components and wheels from the existing data. The existing tools, such as Intezer, rely on exact code matching algorithms and cannot retrieve information that slightly differs from the original code. How could we develop an information retrieval system able to perform inexact matching and able to adapt to different CPU architectures? How could we leverage existing tools, such as Kam1n0 and Ghidra, to design a system both flexible and scalable?
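To illustrate what inexact matching could mean in practice, the sketch below compares two functions by the overlap of their instruction n-grams (Jaccard similarity), so that a small insertion lowers the score instead of breaking the match entirely. This is a minimal sketch of one generic technique and is not how Intezer, Kam1n0 or Ghidra work; the function names and the mnemonic sequences are hypothetical, and working on mnemonics from a single instruction set does not address the cross-architecture question.

```python
def ngrams(tokens, n=3):
    """Return the set of n-token windows found in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(func_a, func_b, n=3):
    """Score two instruction sequences between 0.0 (disjoint) and 1.0 (same n-grams)."""
    return jaccard(ngrams(func_a, n), ngrams(func_b, n))


# Hypothetical instruction mnemonics for a known function and a slightly modified variant.
known = ["push", "mov", "sub", "call", "add", "pop", "ret"]
variant = ["push", "mov", "sub", "call", "xor", "add", "pop", "ret"]

print(similarity(known, known))
print(similarity(known, variant))
```

An exact-match approach would treat the variant as unrelated; a similarity score lets the analyst rank known components by closeness and choose a threshold suited to the data.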
Advanced malware clustering [P]
New malware is discovered every day, but a majority of discoveries are variants of existing malware. How can we automatically cluster those variants based on their behaviour, and then group them with their affiliated family? How can we extract shareable IOCs and signatures from a cluster of similar malware?
Advanced malicious infrastructure clustering [P]
Malicious infrastructure changes constantly and makes significant efforts to hide its affiliation. Similarly, new phishing websites that impersonate companies are created every day, always containing minor differences to evade automated detection. How could we use machine learning and image recognition algorithms to automatically cluster and attribute malicious infrastructure and websites?
Validating Cyber Threat Infrastructure [P]
As malicious actors change and relocate their infrastructure to avoid detection, we need to constantly verify that our data is still valid. How could we build a system that regularly browses the Internet to validate the information on malicious infrastructure and ensure it is still active?
Automated Knowledge Extraction (a.k.a. It is all about graphs!) [P]
A cyber threat story usually starts with only a handful of indicators. The analyst then has to manually create relationships with other information sources to complete the story and discern the full picture. How could we automate this process? How could we automatically draw a graph of indicators that shows the analyst the entire story without manual processing?
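The idea of growing a story from a handful of indicators can be sketched as a graph traversal: relationships between indicators become edges, and a breadth-first search from a seed indicator collects everything connected to it. This is a minimal sketch; the function names, the indicator values and the relationship list are hypothetical, and in practice the edges would come from enrichment sources that are not modelled here.

```python
from collections import defaultdict, deque


def build_graph(relations):
    """Build an undirected adjacency map from (indicator, indicator) pairs."""
    graph = defaultdict(set)
    for a, b in relations:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def related_indicators(graph, seed):
    """Return every indicator reachable from `seed`, including the seed itself."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical relationships, e.g. "domain resolves to IP" or "sample contacts IP".
relations = [
    ("evil.example", "192.0.2.10"),
    ("192.0.2.10", "sample-a"),
    ("other.example", "198.51.100.7"),
]

graph = build_graph(relations)
print(sorted(related_indicators(graph, "evil.example")))
```

Starting from the single domain, the traversal pulls in the IP address and the sample linked to it while leaving the unrelated pair out, which is the manual pivoting step the paragraph asks to automate.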